Sneaking into a loop

Zoffix answered a question about Perl 5's <> operator (with a few steps in between, as happens on IRC) with a one-liner.

slurp.words.Bag.sort(-*.value).fmt("%10s => %3d\n").say;

While looking at how this actually works I stumbled on a few idioms and a hole. But first let's look at how it works.

The sub slurp will read the whole “file” from STDIN and return a Str. The method Str::words will split the string on whitespace into a list of words. Coercing the list into a Bag creates a counting Hash and is a shortcut for the following expression.

my %h; %h{$_}++ for <peter paul marry>; dd %h
# OUTPUT«Hash %h = {:marry(1), :paul(1), :peter(1)}␤»

Calling .sort(-*.value) on an Associative will sort by values in descending order and return an ordered list of Pairs. List::fmt will call Pair::fmt, which passes the format string to sprintf together with the .key and the .value. List::fmt then joins the formatted Pairs with a single space, and say writes the result to STDOUT. The joining step is a bit wrong because there will be an extra space in front of each line but the first.
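The stray space is easy to see on a fixed input. The following is only a sketch: the literal string stands in for STDIN, and the variable names are made up for illustration.

```raku
# A fixed string stands in for the text that slurp would read from STDIN.
my $text  = "peter paul marry peter paul paul";
my @pairs = $text.words.Bag.sort(-*.value);

# The format string ends in a newline, but fmt still joins the formatted
# Pairs with its default separator, a single space.
my $with-default = @pairs.fmt("%10s => %3d\n");

# Passing the newline as the separator instead avoids the stray space.
my $with-separator = @pairs.fmt("%10s => %3d", "\n");

print $with-default;
say $with-separator;
```

In the first output every line after the first is shifted right by one column; in the second the columns line up.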

slurp.words.Bag.sort(-*.value).fmt("%10s => %3d").join("\n").say;

Joining by hand is better. That’s a lot for a fairly short one-liner. There is a problem, though, that I have tripped over before. The format string uses %10s to indent and right-align the first column, which works nicely until a word is longer than ten characters. So we need to find the longest word and use .chars to get the column width.

The first step towards solving the problem was a hack. Often you want to write a small CLI tool that reads from STDIN, and that requires a test. There is currently no way to have an IO::Handle that reads from a Str (there is an incomplete module for writing). We don’t really need a whole class in this case because slurp will simply call .slurp-rest on $*IN.

$*IN = <peter paul marry peter paul paul> but role { method slurp-rest { self.Str } };

It’s a hack because it will fail any form of type check and it won’t work for anything but slurp. Also, we actually untie STDIN from $*IN. Don’t try this at homework.

Now we can happily slurp and start counting.

my %counted-words = slurp.words.Bag;
my $word-width = [max] %counted-words.keys».chars;

And continue the chain where we broke it apart.

%counted-words.sort(-*.value).fmt("%{$word-width}s => %3d").join("\n").say;

Solved but ugly. We broke a one-liner apart‽ Let’s fix fmt to have it whole again.

What we want is a method fmt that takes a Positional, a printf-style format string and a block per %* in the format string. Also we may need a separator to forward to self.fmt.

8 my multi method fmt(Positional:D: $fmt-str, *@width where *.all ~~ Callable, :$separator = " "){
9     self.fmt(
10         $fmt-str.subst(:g, "%*", {
11             my &width = @width[$++] // Failure.new("missing width block");
12             '%' ~ (&width.count == 2 ?? width(self, $_) !! width(self))
13         }), $separator);
14 }

The expression *.all ~~ Callable simply checks whether every element of the slurpy array is Callable, that is, something you can invoke (CALL-ME is the real method that is executed when you write foo()).

We then use subst on the format string to replace %*, where the replacement is a (closure) block that is called once per match. And here we have the nice idiom I stumbled on.

say "1-a 2-b 3-c".subst(:g, /\d/, {<one two three>[$++]});
# OUTPUT«one-a two-b three-c␤»

The anonymous state variable $ counts up by one, starting from 0, every time the block is executed. What we actually do here is remove a loop by sneaking an extra counter and an array subscript into the loop that subst must have anyway. Or we could say that we inject an iterator pull into the loop inside subst. One could argue that subst should accept a Seq as its 2nd positional, which would make the block redundant. Anyway, we got a hole plugged.
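To make the hidden counter visible, here is a small sketch that runs the same substitution twice: once with the anonymous state variable and once with an ordinary counter declared outside the block. The variable names are made up for illustration.

```raku
my @names = <one two three>;

# The anonymous state variable $ lives inside the block and remembers its
# value between calls; $++ yields 0, then 1, then 2.
my $with-state = "1-a 2-b 3-c".subst(:g, /\d/, { @names[$++] });

# The same substitution with an explicit counter kept outside the block.
my $i = 0;
my $with-counter = "1-a 2-b 3-c".subst(:g, /\d/, { @names[$i++] });

say $with-state;
say $with-counter;
```

Both calls produce the same string; the state variable merely saves the extra declaration.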

In line 11 we take one element out of the slurpy array, or create a Failure if there is none left. We store the block in a variable because we want to introspect it in line 12. If the block takes two Positionals, we pass the topic that subst calls the replacement block with as a 2nd argument to our stored block. That topic happens to be a Match and may be useful for reacting to what was matched. In our case we know that we matched on %*, and the current position is counted by $++ anyway. With that done we have a format string augmented with a column width provided by the user of our version of fmt.
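The introspection in line 12 can be tried in isolation. In this sketch call-width, &by-list and &by-match are hypothetical names that are not part of the method above, and a plain Str stands in for the Match.

```raku
# Hypothetical helper: call &width with one or two arguments, depending on
# how many positionals it can take.
sub call-width(&width, $list, $match) {
    &width.count == 2 ?? width($list, $match) !! width($list)
}

my &by-list  = { .elems };                         # implicit $_, so .count is 1
my &by-match = -> $list, $match { $match.chars };  # two parameters, .count is 2

say &by-list.count;
say &by-match.count;
say call-width(&by-list,  <a b c>, '%*');   # only the list is passed
say call-width(&by-match, <a b c>, '%*');   # list and "match" are passed
```

A bare block that only uses the topic reports a count of 1, which is why the width block used further down is called with the list alone.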

The user supplied block is called with a list of Pairs. We have to go one level deeper to get the biggest key.

{[max] .values».keys».chars}

That’s what we have to supply to get the first column’s width from the list of Pairs dropping out of Bag.sort.

print %counted-words.sort(-*.value).&fmt("%*s => %3d", {[max] .values».keys».chars}, separator => "\n");
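For comparison, the same width can be computed with a plain map over the Pairs and spliced into the format string by hand. This is only a sketch with a hard-coded list of Pairs in place of the sorted Bag; the variable names are assumptions.

```raku
# A hard-coded list of Pairs stands in for %counted-words.sort(-*.value).
my @pairs = (paul => 3, peter => 2, marry => 1);

# Length of the longest key, i.e. the width of the first column.
my $width = @pairs.map(*.key.chars).max;

# Build the format string by concatenation to avoid any interpolation.
my $format = '%' ~ $width ~ 's => %3d';

say @pairs.fmt($format, "\n");
```

With these three words the first column is five characters wide, so only paul gets a leading space.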

The fancy .&fmt call is required because our free-floating method is not a method of List. That spares us from having to augment List.

And there you have it. Another hole in CORE plugged.

Categories: Perl6
  1. August 11, 2016 at 12:30

    I think you can avoid the custom fmt method by just inlining the max expression into the fmt string:

    my %words = slurp.words.Bag; %words.sort(-*.value).fmt("%{ [max] %words.keys».chars }s => %3d").join("\n").say;

    • August 11, 2016 at 21:16

      %words.sort(-*.value).fmt("%{ [max] %words.keys».chars }s => %3d", "\n").say;

      Putting the separator at the right spot makes that work. Not the most reusable way to do it and it defies the idea of the format string. It’s meant to be brief.
