It’s Classes All The Way Down
While building a cache for a web API that spits out JSON, I found myself walking over the same data twice to fix a lack of proper typing. The JSON knows only about strings, even though most of the fields are integers and timestamps. I’m fixing the types after parsing the JSON with JSON::Fast by coercively .map-ing.
@stations.=hyper.map: { # Here be heisendragons!
    .<lastchangetime> = .<lastchangetime>
        ?? DateTime.new(.<lastchangetime>.subst(' ', 'T') ~ 'Z', :formatter(&ISO8601))
        !! DateTime;
    .<clickcount> = .<clickcount>.Int;
    .<lastcheckok> = .<lastcheckok>.Int.Bool;
    (note "$_/$stations-count processed" if $_ %% 1000) with $++;
    .Hash
};
The hyper helps a lot to speed things up but will put a lot of stress on the CPU cache. There must be a better way to do that.
Then lizmat showed where Rakudo shows its guts.
m: grammar A { token a { }; rule a { } }
OUTPUT: «===SORRY!=== Error while compiling <tmp>
Package 'A' already has a regex 'a'
(did you mean to declare a multi-method?)
So tokens are regexes, or maybe methods. But if tokens are methods, then grammars must be classes. And that allows us to subclass a grammar.
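A minimal sketch of that claim on a toy grammar, before the real thing. The grammar names KeyValue and KeyNumber and the input strings are made up for illustration; only the `is` inheritance and the method lookup are the point.

```raku
# Toy grammars for illustration only; the names are hypothetical.
grammar KeyValue {
    token TOP   { <key> '=' <value> }
    token key   { \w+ }
    token value { \w+ }
}

# A grammar is a class, so `is` gives ordinary inheritance.
grammar KeyNumber is KeyValue {
    # Overrides the parent's token of the same name, just like a method.
    token value { \d+ }
}

say KeyValue ~~ Grammar;          # grammars sit in the normal type hierarchy
say KeyNumber ~~ KeyValue;        # the child is a subtype of its parent
say KeyNumber.^can('key').so;     # an inherited token is found by method lookup

say so KeyValue.parse('a=b');     # True
say so KeyNumber.parse('a=b');    # False, <value> now insists on digits
say so KeyNumber.parse('a=42');   # True
```

The child only restates the token it wants to change; TOP and key come from the parent unchanged.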
grammar WWW::Radiobrowser::JSON is JSON {
    token TOP {\s* <top-array> \s* }
    rule top-array { '[' ~ ']' <station-list> }
    rule station-list { <station> * % ',' }
    rule station { '{' ~ '}' <attribute-list> }
    rule attribute-list { <attribute> * % ',' }

    token year { \d+ }
    token month { \d ** 2 }
    token day { \d ** 2 }
    token hour { \d ** 2 }
    token minute { \d ** 2 }
    token second { \d ** 2 }
    token date { <year> '-' <month> '-' <day> ' ' <hour> ':' <minute> ':' <second> }

    token bool { <value:true> || <value:false> }
    token empty-string { '""' }
    token number { <value:number> }

    proto token attribute { * }
    token attribute:sym<clickcount> { '"clickcount"' \s* ':' \s* '"' <number> '"' }
    token attribute:sym<lastchangetime> { '"lastchangetime"' \s* ':' \s* '"' <date> '"' }
    token attribute:sym<lastcheckok> { '"lastcheckok"' \s* ':' \s* '"' <bool> '"' }
}
Here we overload some tokens and forward calls to tokens that got a different name in the parent grammar. The action class follows suit.
class WWW::Radiobrowser::JSON::Actions is JSON::Actions {
    method TOP($/) {
        make $<top-array>.made;
    }
    method top-array($/) {
        make $<station-list>.made.item;
    }
    method station-list($/) {
        make $<station>.hyper.map(*.made).flat; # Here be heisendragons!
    }
    method station($/) {
        make $<attribute-list>.made.hash.item;
    }
    method attribute-list($/) {
        make $<attribute>».made.flat;
    }
    method date($_) { .make: DateTime.new(.<year>.Int, .<month>.Int, .<day>.Int, .<hour>.Int, .<minute>.Int, .<second>.Num) }
    method bool($_) { .make: .<value>.made ?? Bool::True !! Bool::False }
    method empty-string($_) { .make: Str }
    method attribute:sym<clickcount>($/) { make 'clickcount' => $/<number>.Int; }
    method attribute:sym<lastchangetime>($/) { make 'lastchangetime' => $/<date>.made; }
    method attribute:sym<lastcheckok>($/) { make 'lastcheckok' => $/<bool>.made; }
}
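To see the payoff of typed actions in isolation, here is a small self-contained sketch. Toy-Attribute, Toy-Actions and the key=value input format are hypothetical stand-ins, not the Radiobrowser code; the sketch only assumes that action methods named after the proto token variants return already-coerced values.

```raku
# Toy grammar and actions for illustration; names and input format are made up.
grammar Toy-Attribute {
    token TOP { <attribute> }
    proto token attribute { * }
    token attribute:sym<clickcount>  { 'clickcount='  (\d+)    }
    token attribute:sym<lastcheckok> { 'lastcheckok=' (<[01]>) }
}

class Toy-Actions {
    method TOP($/) { make $<attribute>.made }
    # Each variant coerces its own value, so no second pass over the data is needed.
    method attribute:sym<clickcount>($/)  { make 'clickcount'  => $0.Int }
    method attribute:sym<lastcheckok>($/) { make 'lastcheckok' => $0.Int.Bool }
}

my $actions = Toy-Actions.new;
my $clicks  = Toy-Attribute.parse('clickcount=42', :$actions).made;
my $checked = Toy-Attribute.parse('lastcheckok=1', :$actions).made;

say $clicks.value.^name;   # Int
say $checked.value.^name;  # Bool
```

The values come out of the parse with their final types, which is what the coercive .map pass was doing after the fact.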
In case you wonder how to call a method with such a funky name, use the quoting version of postfix:<.>.
class C { method m:sym<s>{} }
C.new.'m:sym<s>'()
I truncated the examples above. The full source can be found here. The .hyper version is still quite a bit faster but also heisenbuggy. In fact, .hyper may not work at all when executed too fast after a program starts or when used in a recursive Routine. The speed difference is mostly due to the grammar being one of the oldest parts of Rakudo, with the least amount of work put into making it fast. That is a solvable problem. I’m looking forward to Grammar All The Things.
If you got grammars please don’t hide them. Somebody might need them to be classy.