Monday 2 July 2012

Programming languages should be self documenting

Perl 5.16.0, the latest and greatest release in Perl's 20+ year history of being awesome, added a new keyword that I never knew I wanted. I will illustrate by starting in the past.

The following is a complicated way to calculate triangular numbers that uses recursion, the best technique in computer science:

sub triangle {
  my $num = shift;
  return 1 if $num == 1;
  return $num + triangle($num-1);
}


That's all very trivial, and in this short example it's obvious that we're using recursion: the subroutine is so short that its name sits right next to the recursive call.

Now, recursion is very useful in more complex problems, especially those operating on graphs, where the subroutine body might be more than a couple of lines long. In that case I'd have to add an annoying comment to remind the reader what's going on:

sub triangle {
  my $num = shift;
  return 1 if $num == 1;
  return $num + triangle($num-1); # recurses!
}

and, if and when I decide to refactor this subroutine, I need to remember to change its name in two places: where it is declared and where it calls itself. This is entirely pointless busywork.

Instead, Perl has now introduced the __SUB__ keyword (enabled by use v5.16 or use feature 'current_sub'), which evaluates to a reference to the subroutine you're currently in:

use v5.16;
sub triangle {
  my $num = shift;
  return 1 if $num == 1;
  return $num + __SUB__->($num-1);
}

which is both clearer to read and immune to me changing the name of the routine, and, as a bonus, it's helpful if Perl wants to transform the subroutine for the sake of efficiency.
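
To push the renaming point to its extreme, here is a minimal sketch of the same routine with no name at all. The variable name $triangle and the extra base case for 0 are my own choices for illustration, not something from the release notes:

```perl
use v5.16;    # enables strict, say, and the current_sub feature

# An anonymous subroutine has no name to call, so __SUB__ is the
# direct way for it to refer to itself.
my $triangle = sub {
    my $num = shift;
    return $num if $num <= 1;            # assumption: treat 0 and 1 as base cases
    return $num + __SUB__->($num - 1);   # recurse through the current sub
};

say $triangle->(4);    # 4 + 3 + 2 + 1
```

Since nothing in the body mentions a name, the code reference can be stored in any variable, or passed around, without touching the recursive call.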

I never missed this little feature, but now that it exists it feels so natural that I'm surprised it took until now to arrive. That's the sign of a perfect addition to a language.

1 comment:

  1. While it may make it more maintainable, I find this actually hurts the clarity of the code. The expression __SUB__->($num-1) no longer means anything on its own. You have to refer back to the name of the function to understand what it is doing.

    If you have a two-function recursion it won't help either. It is limited only to a single function, which means ultimately you'll have to mix this with the traditional method of giving the name. This is of course a loss of consistency.
