Home > Perl6 > It’s Classes All The Way Down

It’s Classes All The Way Down

While building a cache for a web api that spits out JSON I found myself walking over the same data twice to fix a lack of proper typing. The JSON knows only about strings even though most of the fields are integers and timestamps. I’m fixing the types after parsing the JSON with JSON::Fast by coercively .map-ing .

@stations.=hyper.map: { # Here be heisendragons!
    .<lastchangetime> = .<lastchangetime>
        ?? DateTime.new(.<lastchangetime>.subst(' ', 'T') ~ 'Z', :formatter(&ISO8601))
        !! DateTime;
    .<clickcount> = .<clickcount>.Int;
    .<lastcheckok> = .<lastcheckok>.Int.Bool;

    (note "$_/$stations-count processed" if $_ %% 1000) with $++;

    .Hash
};

The hyper helps a lot to speed things up but will put a lot of stress on the CPU cache. There must be a better way to do that.

Then lizmat showed where Rakudo shows its guts.

m: grammar A { token a { }; rule a { } }
OUTPUT: «5===SORRY!5=== Error while compiling <tmp>␤Package 'A' already has a regex 'a' 
(did you mean to declare a multi-method?)␤

Tokens are regex or maybe methods. But if tokens are methods then grammars must be classes. And that allows us to subclass a grammar.

grammar WWW::Radiobrowser::JSON is JSON {
    token TOP {\s* <top-array> \s* }
    rule top-array      { '[' ~ ']' <station-list> }
    rule station-list   { <station> * % ',' }
    rule station        { '{' ~ '}' <attribute-list> }
    rule attribute-list { <attribute> * % ',' }

    token year { \d+ } token month { \d ** 2 } token day { \d ** 2 } token hour { \d ** 2 } token minute { \d ** 2 } token second { \d ** 2}
    token date { <year> '-' <month> '-' <day> ' ' <hour> ':' <minute> ':' <second> }

    token bool { <value:true> || <value:false> }

    token empty-string { '""' }

    token number { <value:number> }

    proto token attribute { * }
    token attribute:sym<clickcount> { '"clickcount"' \s* ':' \s* '"' <number> '"' }
    token attribute:sym<lastchangetime> { '"lastchangetime"' \s* ':' \s* '"' <date> '"' }
    token attribute:sym<lastcheckok> { '"lastcheckok"' \s* ':' \s* '"' <bool> '"' }
}

Here we overload some tokens and forward calls to tokens that got a different name in the parent grammar. The action class follows suit.

class WWW::Radiobrowser::JSON::Actions is JSON::Actions {
    method TOP($/) {
        make $<top-array>.made;
    }
    method top-array($/) {
        make $<station-list>.made.item;
    }
    method station-list($/) {
        make $<station>.hyper.map(*.made).flat; # Here be heisendragons!
    }
    method station($/) {
        make $<attribute-list>.made.hash.item;
    }
    method attribute-list($/) {
        make $<attribute>».made.flat;
    }
    method date($_) { .make: DateTime.new(.<year>.Int, .<month>.Int, .<day>.Int, .<hour>.Int, .<minute>.Int, .<second>.Num) }
    method bool($_) { .make: .<value>.made ?? Bool::True !! Bool::False }
    method empty-string($_) { .make: Str }

    method attribute:sym<clickcount>($/) { make 'clickcount' => $/<number>.Int; }
    method attribute:sym<lastchangetime>($/) { make 'lastchangetime' => $/<date>.made; }
    method attribute:sym<lastcheckok>($/) { make 'lastcheckok' => $/<bool>.made; }
}

In case you wonder how to call a method with such a funky name, use the quoting version of postfix:<.>.

class C { method m:sym<s>{} }
C.new.'m:sym<s>'()

I truncated the examples above. The full source can be found here. The .hyper-Version is still quite a bit faster but also heisenbuggy. In fact .hyper may not work at all when executed to fast after a program starts or when used in a recursive Routine. This is mostly due to the grammer being one of the oldest parts of Rakudo with the least amount of work to make it fast. That is a solvable problem. I’m looking forward to Grammar All The Things.

If you got grammars please don’t hide them. Somebody might need them to be classy.

 

Categories: Perl6