Archive

Archive for the ‘Perl6’ Category

Threading nqp through a channel

February 3, 2019

Given that nqp is faster than plain Perl 6, combining it with threads should give us some decent speed. Using a Supply as promised in the last post wouldn’t really help. The emit will block until the internal queue of the Supply is cleared. If we want to process files recursively, the filesystem might stall just after the recursing thread is unblocked. If we are putting pressure on the filesystem in the consumer, we are better off with a Channel that is swiftly filled with file paths.

Let’s start with a simulated consumer that will stall every now and then and takes the Channel in $c.

my @files;
react {
    whenever $c -> $path {
        @files.push: $path;
        sleep 1 if rand < 0.00001;
    }
}

If we pumped out paths as quickly as possible, we could fill quite a bit of RAM and put a lot of pressure on the CPU caches. After some trial and error I found that sleeping before the .send on the Channel helps when there are more than 64 worker threads waiting to be put onto machine threads. That information is accessible via Telemetry::Instrument::ThreadPool::Snap.new<gtq>.

use nqp;
use Telemetry;

my $c = Channel.new;
start {
    my @dirs = '/snapshots/home-2019-01-29';
    while @dirs.shift -> str $dir {
        my Mu $dirh := nqp::opendir(nqp::unbox_s($dir));
        while my str $name = nqp::nextfiledir($dirh) {
            next if $name eq '.' | '..';
            my str $abs-path = nqp::concat( nqp::concat($dir, '/'), $name);
            next if nqp::fileislink($abs-path);
            # Back off while too many workers are queued.
            if Telemetry::Instrument::ThreadPool::Snap.new<gtq> > 64 {
                say Telemetry::Instrument::ThreadPool::Snap.new<gtq>;
                say 'sleeping';
                sleep 0.1;
            }
            $c.send($abs-path) if nqp::stat($abs-path, nqp::const::STAT_ISREG);
            @dirs.push: $abs-path if nqp::stat($abs-path, nqp::const::STAT_ISDIR);
        }
        CATCH { default { put BOLD .Str, ' ⟨', $dir, '⟩' } }
        nqp::closedir($dirh);
    }
    $c.close;
}

Sleeping for 0.1s before sending the next value is a bit naive. It would be better to watch the number of waiting workers and only continue when it has dropped below 64. But that is a task for a different module. We don’t really have a middle ground in Perl 6 between Supply with its blocking nature and the value-pumping Channel. So such a module might actually be quite useful.
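A minimal sketch of that idea, assuming a hypothetical helper called send-throttled that is not part of any existing module. It takes the probe for the number of waiting workers as a callable, so that in real use one could pass { Telemetry::Instrument::ThreadPool::Snap.new<gtq> } and in a test a fake counter. The names, the limit and the pause length are all assumptions.

```raku
# Hypothetical helper: block the sender until the backlog is small enough.
# &queued is any callable that returns the current number of waiting workers.
sub send-throttled(Channel:D $c, $value, &queued, Int :$limit = 64, Real :$pause = 0.01) {
    # Poll until the backlog is no longer above the limit.
    sleep $pause while queued() > $limit;
    $c.send($value);
}

# Usage with a fake probe that reports a shrinking backlog.
my $backlog = 70;
my $c = Channel.new;
send-throttled($c, '/some/path', { $backlog-- });
$c.close;

my @sent = $c.list;
say @sent;
```

Polling is still crude, but the sender now waits exactly as long as the backlog stays above the limit instead of sleeping a fixed 0.1s per path.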

But that will have to wait. I seem to have stepped on a bug in IO::Handle.read while working with large binary files. We have tons of tests in roast that deal with small data. Working with large data isn’t well tested and I wonder what monsters are lurking there.

Categories: Uncategorized, Perl6

nqp is faster than threads

February 2, 2019

After having too much fun with a 20 year old filesystem and the inability of unix commands to handle odd filenames, I decided to replace find /somewhere -type f | xargs -P 10 -n 1 do-stuff with a Perl 6 script.

The first step is to traverse a directory tree. I don’t really need to keep a list of paths, but I certainly want to run stuff in parallel. Generating a supply in a thread seems to be a reasonable thing to do.

start my $s = supply {
    for '/snapshots/home-2019-01-29/' {
        emit .IO if (.IO.f & ! .IO.l);
        .IO.dir()».&?BLOCK if (.IO.d & ! .IO.l);
        CATCH { default { put BOLD .Str } }
    }
}

{
    my @files;
    react whenever $s {
        @files.push: $_;
    }
    say +@files;
    say now - ENTER now;
}

Recursion is done by calling the for block on the topic with .&?BLOCK. It’s very short and very slow. It takes 21.3s for 200891 files; find will do the same in 0.296s.

The OS won’t be the bottleneck here, so maybe threading will help. I don’t want to overwhelm the OS with filesystem requests though. The built-in Telemetry module can tell us how many worker threads are sitting on their hands at any given time. If we use Promise to start workers by hand, we can decide to avoid threading when workers are still idle.

sub recurse(IO() $_) {
    my @ret;
    @ret.push: .Str if (.IO.f & ! .IO.l);
    if (.IO.d & ! .IO.l) {
        if Telemetry::Instrument::ThreadPool::Snap.new<gtq> > 4 {
            @ret.append: do for .dir() { recurse($_) }
        } else {
            @ret.append: await do for .dir() {
                Promise.start({ recurse($_) })
            }
        }
    }
    CATCH { default { put BOLD .Str } }
    @ret.Slip
}
{
    say +recurse('/snapshots/home-2019-01-29');
    say now - ENTER now;
}

That takes 7.65s, which is a big improvement but still miles from the performance of a 20 year old C implementation. Also, find can do the same and more on a single CPU core instead of producing a load of ~800%.

Poking around in Rakudo’s source, one can clearly see why. There are loads of IO::Path objects created and C strings concatenated, just to unbox those C strings and hand them over to some VM opcodes. All I want are absolute paths I can call open with. We have to go deeper!

use nqp;

my @files;
my @dirs = '/snapshots/home-2019-01-29';
while @dirs.shift -> str $dir {
    my Mu $dirh := nqp::opendir(nqp::unbox_s($dir));
    while my str $name = nqp::nextfiledir($dirh) {
        next if $name eq '.' | '..';
        my str $abs-path = nqp::concat( nqp::concat($dir, '/'), $name);
        next if nqp::fileislink($abs-path);
        @files.push: $abs-path if nqp::stat($abs-path, nqp::const::STAT_ISREG);
        @dirs.push: $abs-path if nqp::stat($abs-path, nqp::const::STAT_ISDIR);
    }
    CATCH { default { put BOLD .Str, ' ⟨', $dir, '⟩' } }
    nqp::closedir($dirh);
}
say +@files; say now - ENTER now;

And this finishes in 2.58s with just 1 core and should play better in situations where not many filehandles are available. Still 9 times slower than find but workable. Wrapping it into a supply is a task for another day.

So for the time being — if you want fast you need nqp.

UPDATE: We need to check the currently waiting workers, not the number of spawned workers. Example changed to Snap.new<gtq>.

Categories: Perl6

A picky caller

January 23, 2019

I have got myself a paradoxical harddrive by combining a fast SSD and a sizeable disk, setting up a bcache on my Linux box. Now I have a harddrive that is really fast with small files. There are a few stats bcache provides via sysfs. To watch them one has to read a few text files, a well suited task for slurp. I ended up with a bunch of pointy blocks that look like this:

-> $cache {
    slurp("/sys/fs/bcache/$cache/cache_available_percent").chomp ~ '%'
}

I put them together with a name into an Array to be able to loop over them. As it turned out there are two groups of pointies. One group needs the name of the bcache block device which holds the actual filesystem. The other group gets the UUID of the cache group. I need to group the output of the two groups so I get:

bcache0:
dirty data: 36.0k
ebc67019-9d50-4042-8080-b173e2ba802f:
hit ratio: 62% 66% 50% 0%
bypassed: 7.0G 2.4G 65.1M 9.1M
cache available: 94%

I could have split up the array but figured that I can check the signature of the pointy instead to select what is being output.

for bcache-devs() -> $dev {
    with $dev {
        say BOLD $dev, ':';
        for @stats -> $name, &f {
            next unless &f.signature.params».name eq '$dev';
            put "\t", $name, ': ', .&f
        }
    }
}

If the name of the positional doesn’t fit I just skip the output.
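A self-contained sketch of the same selection, without touching sysfs. The two blocks in @stats are hypothetical stand-ins for the real stat readers, and stats-for is a made-up helper that returns the names of all stats whose block takes a first parameter with the requested name.

```raku
# Stand-ins for the real pointy blocks, which would slurp from /sys/fs/bcache.
my @stats =
    'dirty data',      -> $dev   { "value for $dev" },
    'cache available', -> $cache { "value for $cache" };

# Pick the stats by the name of the first parameter of their block.
sub stats-for(Str $param-name) {
    @stats.map(-> $name, &f {
        # A false statement-modifier 'if' yields Empty, which map drops.
        $name if &f.signature.params[0].name eq $param-name
    })
}

say stats-for('$dev');
say stats-for('$cache');
```

Note that the parameter name includes the sigil, so the comparison is against '$dev' and not 'dev'.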

Next I tried to match the signature by adding subsets of Str to the signature of the pointies. Sadly matching a signature literal like so doesn’t work in this case.

subset Dev of Str;
say 'match' if &f.signature ~~ :(Dev $dev);

If I defined my own classes, that would certainly work. It seems the sloppy matching of subsets is working as intended. A bit of a shame because subsets are so easy to set up. For my example just matching the parameter name is fine because it saves time when typing the pointies.

Nonetheless it’s really neat that the caller can have a say in whether it likes the signature of the callee in Perl 6.

Categories: Perl6

Iterating past the finish

January 11, 2019

A while ago the question was raised how to augment Any. As it turns out the augmenting part is working but the newly added method is not propagated to children of built-in types. One can force the propagation by calling .compose on all type objects. Getting a list of all parents is done with .^mro and the check is done with ∈.

augment class Cool { method HTML { HTML.new(self) } }
if Cool ∈ T.HOW.mro(T) { T.HOW.compose(T); }

I stepped on a Dragon

The tricky part is to get all predefined classes, mostly because there is a lot of stuff in CORE:: that doesn’t even implement the interfaces to call methods. We can call .DEFINITE because that’s not a method. So we can weed out all predefined objects and are left with type objects and stuff that’s leaking from NQP into Perl 6-land. Those beasties don’t implement .mro, so by guarding with try we can build a list of all Perl 6 type objects. Those type objects contain IterationEnd. Hence we can’t trust for or anything else that is using iterators to loop over a list. There is also Slip in the list. We can help that by using binding everywhere.

my @a = CORE::.values;
my @types;
for 0..^@a.elems -> $i {
    my \T := @a[$i];
    try @types[$++] := T if not T.DEFINITE;
}

for 0..^@types.elems -> $i {
    my \T := @types[$i];
    try if Cool ∈ T.HOW.mro(T) {
        T.HOW.compose(T);
    }
}

And there we have it. All children of Cool have been re-.composed.

It’s no magic!

There are a few things I learned. Firstly, much of the magic behind Perl 6 is just type checks. Anything that deals with iteration of lists or similar constructs is checking for Slip or IterationEnd and branching out to deal with their special nature.

And secondly there are a lot of interfaces leaking into spec-land that have no business there. I’m worried that might bite us later because any useful interface will be abused by programmers sooner or later. I would prefer the inner workings of Rakudo to be well hidden.

I found a way to deal with augmenting built-in types, so it can’t be long before the core devs fix that bug.

Categories: Perl6

Deconstructing Simple Grammars

May 10, 2018

Last year I wrote an egg timer that was parsing command line arguments similar to GNU sleep. I was happy with the stringent form of the parser as follows.

my Seconds $to-wait = @timicles»\
    .split(/<number>/, :v)\
    .map(-> [$,Rat(Any) $count, Str(Any) $unit] --> Seconds { %unit-multipliers{$unit} * $count })\
    .sum;

It does a few simple things and does them one after another. A grammar with an action class would be overkill. I wasn’t happy with using split’s ability to return the needle with the parts. It certainly does not improve readability.

After quite a few iterations (and stepping on a bug), I came up with a way to use Str.match instead. If I convert each Match-object into a Hash I can use deconstruction in a signature of a pointy block.

my Seconds $to-wait = @timicles»\
    .match(/<number> <suffix>+/)».hash\ # the +-quantifier is a workaround
    .map(-> % ( Rat(Any) :$number, Str(Any) :$suffix ) { %unit-multipliers{$suffix} * $number })\
    .sum;

Instead of using positionals I can use named arguments that correspond to the named regexes inside the match arguments.
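A runnable sketch of the named-capture deconstruction, under a few assumptions: the post does not show the definitions of number, suffix or %unit-multipliers, so the ones here are guesses, the coercion types are left out, and to-seconds is a hypothetical wrapper. Input that doesn’t match is not handled.

```raku
# Assumed definitions; the originals are not shown in the post.
my regex number { \d+ [ '.' \d+ ]? }
my regex suffix { <[smhd]> }
my %unit-multipliers = s => 1, m => 60, h => 3600, d => 86400;

sub to-seconds(*@timicles) {
    # Turn each Match into a Hash of its named captures.
    my @parts = @timicles.map({ .match(/ <number> <suffix> /).hash });
    # Deconstruct each Hash into named arguments that mirror the regex names.
    @parts.map(-> % (:$number, :$suffix) {
        %unit-multipliers{~$suffix} * +$number
    }).sum
}

say to-seconds('1m', '30s');
```

The names in the sub-signature have to line up with the names of the regexes, since those become the keys of the Hash.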

Even in such a small piece of code things fall into place. Hyper-method-calls get rid of simple loops. The well crafted built-in types allow signature deconstruction to actually work without loads of temporary variables. It’s almost as if certain language designers were aiming to make a most elegant language.

Categories: Perl6

Expensive Egg-Timers

December 31, 2017

If you use a CLI you might have done something along these lines.

sleep 1m 30s; do-the-next-thing

I have a script called OK that will display a short text in a hopeful green and morse code O-K via the PC speaker. By doing so I turn my computer into an expensive egg-timer.

As of late I found myself waiting for longer periods of time and was missing a count-down so I could estimate how much more time I can waste playing computer games. The result is a program called count-down.

Since I wanted to mimic the behaviour of sleep as closely as possible I had a peek into its source code. That made me realise how lucky I am to be allowed to use Perl 6. If I strip all the extra bits a count-down needs I’m at 33 lines of code compared to 154 lines of GNU sleep. The boilerplate I have is mostly for readability, like defining a subset called Seconds and a Regex called number.

Errors in the arguments to the script will be caught by the where clause in MAIN’s signature. Since there are no further multi candidates for MAIN that might interfere, the usage message will be displayed automatically if arguments are not recognized. Pretty much all lines in the C implementation deal with argument handling and the fact that they can’t trust their arguments until the last bit of handling is done. With a proper signature a Perl 6 Routine can fully trust its arguments and no further error handling is needed. Compared to the C version (that does a lot less) the code can be read linearly from top to bottom and is much more expressive. After changing a few identifiers I didn’t feel the need for comments anymore. Even some unclear code, like the splitting on numbers and keeping the values, becomes clear in the next lines where I sum up a list of seconds.
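To illustrate how a signature takes over the argument checking, here is a small sketch with an ordinary sub instead of MAIN. The definition of Seconds is an assumption (the post only says that such a subset exists), and count-down here is a hypothetical stand-in, not the real program.

```raku
# Assumed definition; the real subset is not shown in the post.
subset Seconds of Numeric where * >= 0;

# The body can rely on $to-wait being a non-negative number.
sub count-down(Seconds $to-wait) {
    "waiting for $to-wait seconds"
}

say count-down(90);

# A value that fails the constraint never reaches the body.
my $bad = -5;
try count-down($bad);
say "rejected: {$!.^name}" if $!;
```

With MAIN the same binding failure is what triggers the automatic usage message.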

Now I can comfortably count down the rest of a year that was made much better by a much better Perl 6. I wish you all a happy 2018.

Categories: Perl6

Racing Rakudo

November 5, 2017

In many racing sports telemetry plays a big role in getting faster.  Thanks to a torrent of commits by lizmat you can use telemetry now too!

perl6 -e 'use Telemetry; snapper(½); my @a = (‚aaaa‘..‚zzzz‘).pick(1000); say @a.sort.[*-1 / 2];'
zyzl
Telemetry Report of Process #30304 (2017-11-05T17:24:38Z)
No supervisor thread has been running
Number of Snapshots: 31
Initial Size:        93684 Kbytes
Total Time:          14.74 seconds
Total CPU Usage:     15.08 seconds

wallclock  util%  max-rss  gw      gtc  tw      ttc  aw      atc
   500951  53.81     8424
   500557  51.92     9240
   548677  52.15    12376
   506068  52.51      196
   500380  51.94     8976
   506552  51.74     9240
   500517  52.45     9240
   500482  52.33     9504
   506813  51.67     6864
   502634  51.63
   500520  51.78     6072
   500539  52.13     7128
   503437  52.29     7920
   500419  52.45     8976
   500544  51.89     8712
   500550  49.92     6864
   602948  49.71     8712
   500548  50.33
   500545  49.92      320
   500518  49.92
   500530  49.92
   500529  49.91
   500507  49.92
   506886  50.07
   500510  49.93     1848
   500488  49.93
   500511  49.93
   508389  49.94
   508510  51.27      264
    27636  58.33
--------- ------ -------- --- -------- --- -------- --- --------
 14738710  51.16   130876

Legend:
wallclock  Number of microseconds elapsed
    util%  Percentage of CPU utilization (0..100%)
  max-rss  Maximum resident set size (in Kbytes)
       gw  The number of general worker threads
      gtc  The number of tasks completed in general worker threads
       tw  The number of timer threads
      ttc  The number of tasks completed in timer threads
       aw  The number of affinity threads
      atc  The number of tasks completed in affinity threads

The snapper function takes an interval at which data is collected. On termination of the program the table above is shown.

The module comes with plenty of subs to collect the same data by hand and file your own report. That may be sensible in long-running processes. Or you call the reporter sub by hand every now and then.
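As a rough sketch of collecting data by hand, assuming the interface described in the Telemetry documentation: two snapshots are taken with Telemetry.new and subtracted to get the values for the period in between. The module was still changing when this was written, so treat the exact calls as an assumption.

```raku
use Telemetry;

my $before = Telemetry.new;                     # first snapshot
my @sorted = ('aaa'..'zzz').pick(1000).sort;    # some work to measure
my $after  = Telemetry.new;                     # second snapshot

# The difference of two snapshots describes the period between them.
my $period = $after - $before;
say "Sorting took $period<wallclock> microseconds";
```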

use Telemetry;

react {
    snapper;
    whenever Supply.interval(60) {
        say report;
    }
}

If the terminal won’t cut it you can use http to fetch telemetry data.

Documentation isn’t finished nor is the module. So stay tuned for more data.

Categories: Perl6