Keep your types safe and sound

Type safety can save your bum, both by repelling evildoers who want to inject and by repelling bugs that want to make the users of your code unhappy. Luckily Perl 6 can help here with a fairly nice type system and operators that don’t make many assumptions about their operands. Since type checks are (mostly) done at runtime, they come with a cost. I wanted to know how big that cost is in a real-life example.

To ease my finger-hurt I wrote a little module that can output XHTML called XHTML::Writer. Having a drop-in replacement that prevents the most basic injection woes would be nice. Before I could write that, I needed a type for the type system to work with.

Enter: Typesafe::HTML

class HTML is export {
    has $.the-str is rw;
    method new (Str $s = '') { self.bless(the-str=>$s) }
    method Str () is nodal { $.the-str }
    method perl (){ "HTML.new('{$.the-str.subst(Q{\}, Q{\\}, :g).subst(<'>, <\'>, :g).subst("\n",Q{\\n}, :g)}');"; }
    proto method utf8-to-htmlentity (|) is export {*};
    multi method utf8-to-htmlentity (Str:D \s) is nodal {
        s.subst('&', '&amp;', :g).subst('<', '&lt;', :g)
    }
    multi method utf8-to-htmlentity (Str:D @a) is nodal {
        # Str has no utf8-to-htmlentity method, so quote each element via self
        @a.map({ self.utf8-to-htmlentity($_) })
    }
    multi method utf8-to-htmlentity (HTML:D \h) is nodal {
        h
    }
}
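The quoting that utf8-to-htmlentity applies to a Str can be tried in isolation. The following is a minimal sketch, not part of the module: quote-html is a hypothetical stand-alone sub that repeats the same two substitutions as the class above.

```raku
# Hypothetical helper (not in Typesafe::HTML); mirrors the Str candidate above.
sub quote-html(Str:D $s --> Str:D) {
    # '&' has to be replaced first, otherwise the '&' of a freshly
    # inserted '&lt;' would be escaped a second time.
    $s.subst('&', '&amp;', :g).subst('<', '&lt;', :g)
}

say quote-html('<br>');        # &lt;br>
say quote-html('a < b & c');   # a &lt; b &amp; c
```

Only & and < are replaced, so a > passes through unchanged.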

This class holds a Str that contains either bare HTML or text that is meant to look like HTML but should not be rendered as such. We achieve the latter by quoting with HTML entities. We can set bare HTML either by accessing $.the-str (I don’t believe in shotguns) or via the constructor .new. The method utf8-to-htmlentity is meant to quote HTML. So far no actual type safety is enforced. For that we need the right operator.

multi sub infix:<~>(HTML:D \l, Str:D \r) is export {
    l.new( l.Str ~ l.utf8-to-htmlentity(r) );
}

multi sub infix:<~>(HTML:D \l, HTML:D \r) is export {
    l.new( l.the-str ~ r.the-str );
}

multi sub infix:<~>(Str:D \l, HTML:D \r) is export {
    r.new( r.utf8-to-htmlentity(l) ~ r.Str );
}

Those three fellows quote any Str that is concatenated with unquoted HTML, as long as either operand is already of type HTML.
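To see the three candidates at work without installing anything, here is a self-contained sketch. It assumes a trimmed stand-in for the HTML class (only the constructor, .Str and the Str quoting are kept), so it illustrates the dispatch rather than reproducing the module.

```raku
# Trimmed stand-in for the HTML class above (assumption: demo only).
class HTML {
    has $.the-str;
    method new(Str $s = '') { self.bless(the-str => $s) }
    method Str() { $.the-str }
    method utf8-to-htmlentity(Str:D $s) {
        $s.subst('&', '&amp;', :g).subst('<', '&lt;', :g)
    }
}

# The same three candidates as above, in compact form.
multi sub infix:<~>(HTML:D \l, Str:D \r)  { l.new(l.Str ~ l.utf8-to-htmlentity(r)) }
multi sub infix:<~>(HTML:D \l, HTML:D \r) { l.new(l.Str ~ r.Str) }
multi sub infix:<~>(Str:D \l, HTML:D \r)  { r.new(r.utf8-to-htmlentity(l) ~ r.Str) }

my $br = HTML.new('<br/>');

say ($br ~ '<b>').Str;   # <br/>&lt;b>   Str on the right gets quoted
say ('<b>' ~ $br).Str;   # &lt;b><br/>   Str on the left gets quoted
say ($br ~ $br).Str;     # <br/><br/>    HTML stays bare on both sides
say '<b>' ~ '<i>';       # <b><i>        plain Str ~ Str is untouched
```

Every result that involves an HTML operand is itself of type HTML, so the quoting keeps propagating through longer chains of ~.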

use Typesafe::HTML;

my $quoted = HTML.new ~ '<br>'; # this will be '&lt;br>'

It’s a bit unwieldy to create an empty HTML string. Most of the time we don’t want to quote a lone string anyway.

use Typesafe::HTML;
use Typesafe::XHTML::Writer :p, :br;

my $mixed = p(class=>'quotedhtml', '<br/>');
# q{<p class="quotedhtml">&lt;br/></p>}

my $also-mixed = br ~ '<br/>' ~ br;
# q{<br/>&lt;br/><br/>}

Typesafe::XHTML::Writer comes with a little helper called xhtml-skeleton that surrounds its arguments with XHTML 1.1 boilerplate and converts anything that isn’t HTML to quoted XHTML.

Next step was to find out how big the performance hit for using all those type checks would be. A little script provided plenty of HTML-generating function calls. With 50000 tags it takes about 284s to get the script compiled and 73.1s to execute. The typesafe variant takes 2s longer to compile (could be noise) and 82.8s to execute, so the type checks account for 11.7% of the typesafe run time. Given that any real-world script would likely spend most of its time doing something other than generating HTML, we can assume that 11.7% is an upper bound in any case. I asked jnthn if he expects type checks to get faster with more work on Rakudo. The answer was: “yes”.
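For reference, here is the arithmetic behind the 11.7% figure, using only the two execution times quoted above. The figure relates the extra 9.7s to the typesafe run. Measured against the untyped run instead, the same difference comes out at roughly 13.3%.

```latex
\[
\frac{82.8 - 73.1}{82.8} = \frac{9.7}{82.8} \approx 0.117
\qquad\text{versus}\qquad
\frac{82.8 - 73.1}{73.1} = \frac{9.7}{73.1} \approx 0.133
\]
```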