Things I found out while finding

Concurrent::File::Find is now in the ecosystem. It lacks loop detection, because Rakudo has no readlink and my attempt to get __lxstat to cooperate failed (GNU’s libc has a strange taste when it comes to symbol names). Besides that, it’s working quite nicely. There are a couple of things I learned that I would like to share.

A where-clause on a sub-signature cannot see the variables declared inside that sub-signature. If a where-clause has to operate on all arguments, to check exclusiveness or other interdependence, it has to be applied to the very last argument. This limitation is by design, so I told the docs. That where-clauses don’t work on captures was not by design, and jnthn kindly fixed it the next day. I know that whining gets you stuff, but I didn’t expect it to be so fast.
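A minimal sketch of the "put it on the last argument" rule. The sub span-of and its parameters are made up for illustration and are not part of Concurrent::File::Find. The where-clause on the last parameter can refer to the parameter bound before it, while the value being checked is available as $_.

```raku
# Hypothetical example: the constraint lives on the last parameter,
# so $low is already bound when the where-block runs.
sub span-of(Int $low, Int $high where { $low <= $_ }) {
    $high - $low
}

say span-of(2, 5);                # 3

# Arguments that violate the interdependence are rejected at bind time.
my $result = try span-of(5, 2);
say $result.defined;              # False, the call never ran
```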

I also found that a method called .close almost always requires a LEAVE. Let’s have a look at some code.

sub find-simple ( IO(Str) $dir,
    :$keep-going = True,
    :$no-thread = False
) is export {
    my $channel = Channel.new;

    my &start = -> ( &c ) { c } if $no-thread;

    my $promise = start { 
        for dir($dir) {
            CATCH { default { if $keep-going { note .Str } else { .rethrow } } }
            
            if .IO.l && !.IO.e {
                X::IO::StaleSymlink.new(path=>.Str).throw;
            }
            {
                CATCH { when X::Channel::SendOnClosed { last } }
                $channel.send(.IO) if .IO.f;
                $channel.send(.IO) if .IO.d;
            }
            .IO.dir()».&?BLOCK if .IO.e && .IO.d;
        }
        LEAVE $channel.close unless $channel.closed;
    }

    return $channel.list but role :: { method channel { $channel } };
}

This basically takes the List returned by dir and returns an IO::Path for each file and directory via a Channel. For any directory it recurses into the for-block (that’s what ».&?BLOCK means). Since any of those file tests may throw an exception, we may leave the start-block early. With a Promise in play, that would send us behind the return-statement, where there is nothing. The return-statement itself is in the main thread, and some consumer will be blocking on the returned .list.

.say for find-simple('/home/you');

For the consumer of that Channel it is by no means clear that this may block forever. Hence the LEAVE-statement that ensures a closed Channel if something goes wrong.
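The effect of LEAVE can be seen in isolation with a stripped-down sketch. The producer below is hypothetical and dies on purpose after sending three values. Because the LEAVE phaser runs however the block is left, the Channel is still closed and the consumer’s .list terminates instead of blocking.

```raku
my $channel = Channel.new;

my $producer = start {
    LEAVE $channel.close;          # runs on normal exit and on exceptions
    $channel.send($_) for 1..3;
    die 'something went wrong';    # simulated failure in the worker
}

# Reads until the Channel is closed. Without the LEAVE this would hang.
my @received = $channel.list;

# await rethrows the worker's exception, so wrap it in try.
try await $producer;

say @received;
say $producer.status;
```

Dropping the LEAVE line from this sketch leaves the consumer waiting on a Channel that nobody will ever close.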

Mixing the role into the .listified Channel provides a side channel to hand the Channel over to the consumer, so it can be closed on that end.

my @l := find-simple(%*ENV<HOME>, :keep-going); # binding to avoid eagerness

for @l {
    @l.channel.close if $++ > 5000; # hard-close the channel after 5000 found files
    .say if $++ %% 100 # print every 100th file
}

The binding is required or the Array would eat up the entire Channel, and with a scalar in its place we would have to flatten. Please note how well the anonymous state variables count in a most untypo-possible manner. This is one of those cases where a boatload of Perl 6 features play together. Sigils allow us to indicate a variable, even if it has no name. Autovivification on Any by postfix:<++> gets us a 0 that is incremented to 1. Since there is no name that is spilled into the local scope (the 2nd $++ introduces a separate variable), we don’t need a declarator. I never really liked one-shot variables because they are so easy to miss when moving code around. Then the compiler would complain about a missing declaration or, worse, we would end up with a variable that is initialised with a wrong value.
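A small sketch of two anonymous state variables counting independently in the same block. The arrays @even and @third are only there to make the result visible.

```raku
my @even;
my @third;

for 'a'..'f' {
    # Each $++ is its own anonymous state variable, starting at 0.
    @even.push($_)  if $++ %% 2;   # counter no. 1: every 2nd element
    @third.push($_) if $++ %% 3;   # counter no. 2: every 3rd element
}

say @even;    # [a c e]
say @third;   # [a d]
```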

There are problems though. While testing in a clean VM (Debian Jessie) I found that symlink loops (/sys/ has plenty of those) will cause the Promised thread to consume 100% CPU while the main thread is doing nothing. There seems to be something amiss with threads, maybe in conjunction with plenty of exceptions. I would not be surprised if there was a bug. Nobody is testing exception-spamming in a synthetic test.

That’s what $no-thread is for. Simply declaring &start to be a nearly empty pointy will disable any threading and turn the Channel into a glorified Array.

Anyway, both writing and using concurrent code is really easy, as long as we turn a gather/take into a start/send to channel a List through a Channel.
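To illustrate that last point, here is the same toy producer written both ways. The subs squares-lazy and squares-channel are made up for this sketch. The first uses gather/take in the calling thread, the second uses start/send and hands the results back through a Channel.

```raku
# gather/take: values are produced lazily in the consumer's thread.
sub squares-lazy(@n) {
    gather for @n { take $_ ** 2 }
}

# start/send: values are produced in a worker and travel via a Channel.
sub squares-channel(@n) {
    my $channel = Channel.new;
    start {
        LEAVE $channel.close;            # always close, or .list blocks
        $channel.send($_ ** 2) for @n;
    }
    $channel.list
}

say squares-lazy(1..3).join(',');        # 1,4,9
say squares-channel(1..3).join(',');     # 1,4,9
```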

UPDATE: I just managed to golf and report the bug. At the time you read this it may be fixed already.
