Tracing what’s missing
I have a logfile of the following form that I would like to parse.
[ 2016.03.09 20:40:28 ] (MessageType) Some message text that depends on the <MessageType>
Since the form of the text depends on the message type, I need a rule to identify the message type and a rule to parse the message body itself. To aid my struggle through the Grammar in question I use Grammar::Tracer from jnthn’s Grammar::Debugger module. It’s a fine module that will tell you how far a match got and at which point the Grammar gave up parsing. In the case of a successful match it shows part of the substring that was successfully parsed. If parsing a rule or token fails it will tell you but won’t show the offending string. The whole purpose of Grammar wrangling is to identify the bits that won’t match and change the Grammar until they go away. Not showing the offending string is not overly helpful.
But fear not as Grammars are classes and as such can have methods. Let’s define one and add it to a chain of options.
method parse-fail {
    # self is a subclass of Grammar
    say self.postmatch.substr(0, 100);
    exit 0;
}
rule body-line { '[' <timestamp> ']' [ <body-notify> | <body-question> | <body-info> | <body-warning> || <parse-fail> ] }
So when none of the known message types match, the Grammar stops and shows the string that still needs to be handled. With that I could parse all 8768 files until I got them all covered. Also this is much faster than running with Grammar::Tracer.
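The fallback trick can be tried in isolation with a small sketch. The grammar LogLine, its tokens and the helper offending-text are made up for illustration and are not the grammar used for the real logfiles. Instead of printing and exiting, the fallback method dies with the unparsed text, so the caller can decide what to do with it.

```raku
# A minimal sketch, assuming a much simpler log format than the real one.
grammar LogLine {
    token TOP {
        '[' \s* <timestamp> \s* ']' \s*
        [ <body-info> | <body-warning> || <parse-fail> ]
    }
    token timestamp    { \d+ '.' \d+ '.' \d+ \s+ \d+ ':' \d+ ':' \d+ }
    token body-info    { '(Info)' \s+ \N* }
    token body-warning { '(Warning)' \s+ \N* }

    # Fallback: reached only when no known message type matched.
    # self.pos is the position where the grammar got stuck.
    method parse-fail {
        die self.orig.substr(self.pos, 100);
    }
}

# Hypothetical helper: returns the text that no body token could handle,
# or Nil if the line parsed (or failed before reaching the message body).
sub offending-text(Str $line) {
    try LogLine.parse($line);
    $! ?? $!.message !! Nil
}

say offending-text('[ 2016.03.09 20:40:28 ] (Info) service started');
say offending-text('[ 2016.03.09 20:40:28 ] (Debug) cache flushed');
```

Dying instead of calling exit keeps the sketch usable in a loop over many files, at the cost of having to catch the exception.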
It seems to be very useful to have folk implement a language they would like to use to implement that language.
Whatever whenever does
Jnthn answered the question why $*IN.lines blocks in a react block. What isn’t explained is what whenever actually does before it starts blocking.
react {
    whenever $*IN.lines { .say }
}
Looking at the syntax of a whenever block, we see that whenever takes a variable immediately followed by a block. The only place where a structure like that can be defined is Grammar.nqp.
rule statement_control:sym<whenever> {
    <sym><.kok>
    [
    || <?{
           nqp::getcomp('perl6').language_version eq '6.c'
        || $*WHENEVER_COUNT >= 0
       }>
    || <.typed_panic('X::Comp::WheneverOutOfScope')>
    ]
    { $*WHENEVER_COUNT++ }
    <xblock($PBLOCK_REQUIRED_TOPIC)>
}
Here the grammar just checks a few things without actually generating any code. So we head to Actions.nqp.
method statement_control:sym<whenever>($/) {
    my $xblock := $<xblock>.ast;
    make QAST::Op.new(
        :op<call>, :name<&WHENEVER>, :node($/),
        $xblock[0], block_closure($xblock[1])
    );
}
The whenever block is converted to a call to sub WHENEVER, which we find in Supply.pm6.
sub WHENEVER(Supply() $supply, &block) {
There we go. A whenever block takes its first argument of any type and calls .Supply on it, as long as Any is a parent of that type. In the case of $*IN that type will typically be whatever IO::Handle.lines returns.
Seq.new(self!LINES-ITERATOR($close))
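The coercion can be observed without touching $*IN. The following is a minimal sketch: as-supply is a made-up helper that uses the same Supply() coercion type as the signature of WHENEVER, and the list (1, 2, 3) stands in for the lines of a file.

```raku
# Hypothetical helper: Supply() calls .Supply on whatever is passed in.
sub as-supply(Supply() $supply) { $supply }

my $from-list = as-supply((1, 2, 3));
say $from-list.^name;

# whenever performs the same coercion on its first argument.
my @seen;
react {
    whenever (1, 2, 3) { @seen.push($_) }
}
say @seen;
```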
To turn a Seq into a Supply, Any.Supply calls self.list.Supply. Nowhere in this fairly long chain of method lookups (this can’t be fast) are there any threads to be found. If we want to fix this we need to sneak a Channel into $*IN.lines, which does exactly that.
$*IN.^can('lines')[1].wrap(my method {
    my $channel = Channel.new;
    start {
        for callsame() {
            last if $channel.closed;
            $channel.send($_)
        }
        LEAVE $channel.close unless $channel.closed;
    }
    $channel
});
Or if we want to be explicit:
use Concurrent::Channelify;
react {
    whenever signal(SIGINT) {
        say "Got signal";
        exit;
    }
    whenever $*IN.lines⇒ {
        say "got line";
    }
}
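The Channel idea also works without wrapping a method or loading a module. This is only a sketch: channelify is a made-up sub, not the implementation behind Concurrent::Channelify, and a Range stands in for $*IN.lines so the example terminates on its own.

```raku
# Hypothetical helper: feed any Iterable into a Channel from another thread.
sub channelify(Iterable \values --> Channel) {
    my $channel = Channel.new;
    start {
        $channel.send($_) for values;
        $channel.close;          # closing the Channel ends the whenever block
    }
    $channel
}

my @got;
react {
    # The Channel is coerced to a Supply, so react is not blocked
    # while the producer thread is still busy.
    whenever channelify(1..5) { @got.push($_) }
}
say @got;
```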
We already use ⚛ to indicate atomic operations. Maybe using prefix:<∥> to indicate concurrency makes sense. Anyway, we got lucky once again that Rakudo is implemented (mostly) in Perl 6, so we can find out where we need to poke it whenever we want to change it.
Nil shall warn or fail but not both
As announced earlier I went to write a module to make Nil.list behave a little better. There are basically two ways Nil could be turned into a list. One should warn the same way as Nil.Str does and the other should end the program loudly. Doing both at the same time however does not make sense.
There are a few ways this could be done. One is augmenting Nil with a list method and have this method check a dynamic variable to pick the desired behaviour. That would be slow and might hurt if Nil.list is called in a loop. The other is by using a custom sub EXPORT and a given switch.
# lib/NoNilList/Warning.pm6
use NoNilList 'Warning';
# lib/NoNilList/Fatal.pm6
use NoNilList 'Fatal';
# lib/NoNilList.pm6
sub EXPORT($_?) {
    given $_ {
        when 'Warning' {
            # augment Nil with a warning .list
        }
        when 'Fatal' {
            # augment Nil with a failing .list
        }
        default {
            die 'Please use NoNilList::Warning or NoNilList::Fatal.';
        }
    }
    %() # Rakudo complains without this
}
Now use NoNilList; will yield a compile time error with a friendly hint how to avoid it.
I left the augmenting part out because it does not work. I thought I stepped on #2779 again but was corrected that this is actually a different bug. Jnthn++ fixed part of that new bug (Yes, Perl 6 bugs are so advanced they come in multiple parts.) and proposed the use of the MOP instead. That resulted in #2897. The tricky bit is that I have to delay augmentation of Nil to after the check on $_ because augment is a declarator and as such executed at compile time, which in a module can be months before the program starts to run. Both an augment in an EVAL string and the MOP route would lead there. I wanted to use this module as my debut on 6PAN. That will have to wait for another time.
If you find a bug please file it. It will lead to interesting discoveries for sure.
MONKEY see no Nil
In a for loop Nil is turned into a List with one element that happens to be Any. This really bugged me so I went to find out why. As it turns out the culprit is the very definition: Nil is Cool. To be able to turn any single value into a List, Cool implements method list(), which takes a single value and turns it into a List with that one value. Nil indicates the absence of a value and turning it into a value doesn’t make sense. Luckily we can change that.
use MONKEY-TYPING;
augment class Nil {
    method list() {
        note 'Trying to turn Nil into a list.';
        note Backtrace.new.list.tail.Str;
        Empty
    }
}
Nil.HOW.compose(Nil);
sub niler() { Nil }
for niler() { say 'oi‽' }
We can’t just warn because that would show the wrong point in the stack trace. So we note (which also goes to $*ERR) and pull the last value out of the Backtrace.
Interestingly Failure throws both in .list and in .iterator. Nil implements push, append, unshift and prepend by immediately die-ing. Adding more to nothing is deadly, but turning nothing first into something vaguely undefined and then allowing more stuff to be added to it is inconsistent at best. This leads me to believe that Nil.list as it is specced today is just an oversight.
At least I can now write a simple module to protect my code from surprising Nils.
Parallel permutations
Jo Christian Oterhals asked for a parallel solution for challenge 2. I believe he had trouble finding one himself, because his code sports quite a few for loops. By changing those to method call chains, we can use .hyper to run at least some code concurrently.
use v6.d;
constant CORES = $*KERNEL.cpu-cores;
# workaround for #1210
sub prefix:<[max]>(%h){ %h.sort(-*.value).first }
my %dict = "/usr/share/dict/words".IO.lines.map({ .lc => True });
my %seen;
%dict.keys».&{ %seen{.comb.sort.join}++; };
with [max] %seen {
    say .value, .key.comb.hyper(:batch(1024), :degree(CORES)).permutations».join.grep({ %dict{$_}:exists }).Str
}
My approach is a little different from Jo’s. I don’t try to keep all combinations around but just count the anagrams for each entry in the word list. Then I find a word with the most anagrams (there are more candidates with the same count, which I skip) and reconstruct the anagrams for that word.
The only operation where any form of computation happens is the generation of permutations. Anything else is just too memory bound to get a boost by spinning up threads. With the .hyper call the program is a tiny wee bit faster than with just one thread on my Threadripper box. A system with slow cores/smaller caches should benefit a little more. The main issue is that the entire word list fits into the 3rd level cache. With a bigger dataset a fast system might benefit as well.
In many cases multi-core systems are fairy dust, which makes the wallets of chip makers sparkle. Wrangling Hashes seems to be one of those.
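For readers without a word list at hand, the two building blocks can be sketched on a tiny made-up list. The words and the batch and degree values are arbitrary, and on input this small the threads cost more than they gain. The point is only to show that .hyper slots into a method call chain and keeps the order of results.

```raku
# A minimal sketch with a hypothetical five-word dictionary.
my @words = <listen silent enlist google tinsel>;

# Sorted letters serve as the anagram key; .hyper spreads the work over threads.
my @keys = @words.hyper(:batch(2), :degree(2)).map({ .comb.sort.join }).List;

# Grouping by key is hash wrangling and stays single-threaded.
my %groups = @words.classify({ .comb.sort.join });

say @keys;
say %groups;
```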
Nil is a pessimist
Guifa was unhappy with $xml.elements returning a list with one undefined element if there are no child nodes. That led me to the conclusion that Nil is only half Empty.
Let’s consider this piece of code.
sub nilish() { Nil };
for nilish() { say 'oi‽' }
my $nil := nilish();
for $nil { say 'still oi‽' }
sub no-return() { }
for no-return() { say 'even more oi‽' }
sub return-a-list( --> List:D ) { Nil }
for return-a-list() { say 'Should this happen?' }
# OUTPUT:
# oi‽
# still oi‽
# even more oi‽
# Should this happen?
We are iterating over the special container-reset-value called Nil because there is no container to reset. Since for binds to its arguments it binds to Nil. A type object, even a very special one like Nil, is a single item, which is treated as a list with one element by for.
We can solve this problem with a multi sub that turns unreasonable values into the empty list.
multi sub undefined-to-empty(Nil) { Empty }
multi sub undefined-to-empty(@a where all @a.map(!*.defined)) { Empty }
multi sub undefined-to-empty(\item) { item }
for undefined-to-empty(Nil) { say 'nothing happens' }
for undefined-to-empty((Any,)) { say 'nothing happens' }
By adding a candidate that checks if there are only undefined values in a list we can also solve guifa’s problem.
This is of course just a shim. The real solution is to stop turning null into Nil in native bindings. If you write a sub that has to return a list but can’t, either fail or return Empty if there is nothing to return. To help avoid that mistake in the future I filed #2721.
If it looks empty or sounds empty or tastes empty make it Empty!
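What returning Empty looks like in practice can be sketched with a made-up sub. The sub evens and its inputs are hypothetical and have nothing to do with the XML module in question.

```raku
# A minimal sketch: return Empty instead of Nil when there is nothing to return.
sub evens(@numbers) {
    my @found = @numbers.grep(* %% 2);
    @found ?? @found.List !! Empty
}

my $iterations = 0;
for evens((1, 3, 5)) { $iterations++ }   # nothing found, the loop body never runs
say $iterations;

my @hits = evens((1, 2, 4));
say @hits;
```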
Wrapping a scope
Intricate instruments like *scopes can be quite temperature sensitive. Wrapping them in some cosy insulator can help. With Perl 6 it is the other way around. When we wrap a Callable we need to add insulation to guard anything that is in a different scope.
On IRC AndroidKitKat asked a question about formatting Array-stringification. It was suggested to monkeytype another gist-method into Array. He was warned that precompilation would be disabled in this case. A wrapper would avoid this problem. For both solutions the problem of interference with other coders’ code (when in doubt, that is you half a year younger) remains. Luckily we can use dynamic variables to take advantage of stack magic to solve this problem.
Array.^can('gist')[0].wrap(
    sub (\a){
        print 'wrapped: ';
        $*dyn ?? a.join(',') !! nextsame
    }
);
my @a = [1,2,3];
{
    my $*dyn = True;
    say @a;
}
say @a;
# output:
# wrapped: 1,2,3
# wrapped: [1 2 3]
Dynamic variables don’t really have a scope. They live on the stack and their assigned value travels up the call tree. A wrapper can check if that variable is defined or got a specific value and fall back to the default behaviour by calling nextsame if need be. Both .wrap and dynamic variables work across module boundaries. As such we can make the behaviour of our code much more predictable.
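The dynamic variable part can be tried without wrapping anything. In this sketch render and $*compact are made-up names. The defined-or operator covers the case where no caller has declared the variable.

```raku
# A minimal sketch: the callee changes behaviour only if a caller asked for it.
sub render(@values) {
    # $*compact is hypothetical; an undeclared dynamic variable is undefined,
    # so // False selects the default behaviour.
    ($*compact // False) ?? @values.join(',') !! @values.gist
}

my @a = 1, 2, 3;
my $inside;
{
    my $*compact = True;     # visible to everything called from this block
    $inside = render(@a);
}
my $outside = render(@a);    # $*compact is gone again

say $inside;
say $outside;
```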
This paragraph was meant to wrap things up. But since blogs don’t support dynamic variables I better stop before I mess something up.