Previous Up Next

This HTML version of Think Perl 6 is provided for convenience, but it is not the best format of the book. You might prefer to read the PDF version.

Appendix A  Solutions to the Exercises

This (long) chapter provides solutions to the exercises suggested in the main matter of this book. However, it contains much more than that.

First, in many cases, it provides several different solutions, illustrating different approaches to a problem, discussing their respective merits or drawbacks and often showing solutions that may be more efficient than others.

Second, it often provides a lot of additional information or complementary examples.

Just the sheer volume of code examples of this chapter is likely to teach you a lot about programming in general and about the Perl 6 language in particular.

Finally, this chapter sometimes introduces (with examples) new concepts that are covered only in later chapters of the book. Having seen such examples may help you to get a smoother grasp to these new ideas when you get to these chapters. In a few cases, this chapter covers or introduces notions that will not be covered anywhere else in the book.

When you solve an exercise, even if you’re confident that you did it successfully, please make sure to consult the solutions in this chapter and to try them: you’re likely to learn quite a bit from them.

A.1  Exercises of Chapter ??: Functions and Subroutines

A.1.1  Exercise ??: Subroutine right-justify (p. ??)

The aim is to write a subroutine that prints a string with enough leading spaces so that the last letter of the string is in column 70 of the display.

This is the first real exercise of this book, so let’s do it step by step:

use v6;
sub right-justify ($input-string) {
    my $str_length = chars $input-string;
    my $missing_length = 70 - $str_length;
    my $leading-spaces = " " x $missing_length;
    say $leading-spaces, $input-string;
}
right-justify("Larry Wall");
right-justify("The quick brown fox jumps over the lazy dog");

This subroutine:

  • Assigns the input string length to the $str_length variable;
  • Computes into the $missing_length variable the number of spaces that will need to be added at the beginning of the displayed line to have it end in column 70;
  • Creates the $leading-spaces string with the needed number of spaces;
  • Prints out the $leading-spaces and $input-string one after the other to obtain the desired result.

This displays the following:

                                                            Larry Wall
                           The quick brown fox jumps over the lazy dog

We can, however, make this code shorter by composing some of the statements and expressions:

sub right-justify ($input-string) {
    my $leading-spaces = " " x (70 - $input-string.chars);
    say $leading-spaces, $input-string;
}

It could even be boiled down to a shorter single-line subroutine:

sub right-justify ($input-string) {
    say " " x 70 - $input-string.chars, $input-string;
}

This works fine, but it may be argued that this last version is less clear. In fact, the $leading-spaces temporary variable used in the previous version had a name that self-documented what the subroutine is doing. You can make very concise code as in the last example above, but sometimes it may become a little bit too terse, so there is a tradeoff between concision and clarity.

Note that there are two built-in functions, printf and sprintf, that can perform a similar task, but we will cover them later. There is also a .fmt method for producing formatted output.

A.1.2  Exercise ??: Subroutine do-twice (p. ??)

To add an addressee to the greeting, we need to:

  • Pass a second argument in the call to do-twice (the string “World”)
  • Add a new parameter in the do-twice subroutine signature ($addressee)
  • Add this new parameter as an argument in the calls to $code
  • Add a signature with one parameter ($addr) in the definition of the greet subroutine
  • Use this new parameter in the print statement

This leads to the following code:

sub do-twice($code, $addressee) {
    $code($addressee); 
    $code($addressee);
}
sub greet (Str $addr) {
    say "Hello $addr!";
}
do-twice &greet, "World";

This displays:

Hello World!
Hello World!

For the next question, we replace the greet subroutine by the print-twice subroutine:

sub do-twice($code, $message) {
    $code($message); 
    $code($message);
}
sub print-twice($value) {
    say $value;
    say $value;
}
do-twice &print-twice, "What's up doc";

This prints “What’s up doc” four times.

Finally, we add the new do-four subroutine and let it call the do-twice subroutine twice, printing the message eight times:

sub do-twice($code, $message) {
    $code($message); 
    $code($message);
}
sub print-twice($value) {
    say $value;
    say $value;
}
sub do-four ($code, $message) {
    do-twice $code, $message;
    do-twice $code, $message;
}
do-four &print-twice, "What's up doc";

A.1.3  Exercise ??: Subroutine print-grid (p. ??)

To print a grid such as the one requested in the exercise, we need to print each line one by one, and we basically have two types of lines: the three “dotted lines” and the eight lines without dashes, which we’ll call “empty lines” for lack of a better name, because they are are partly empty (no dashes).

To avoid code repetition, one way to do it is to create a string for each of the two line types and to print these strings in accordance with the needs.

This is one possible solution:

use v6;

my $four-dashes = "- " x 4;
my $dotted_line = ("+ " ~ $four-dashes ) x 2 ~ "+" ;
my $spaces = " " x 9;
my $empty-line = ("|" ~ $spaces ) x 2 ~ "|" ;

sub say-four-times($value) {
    say $value;
    say $value;
    say $value;
    say $value;
}
sub print-grid {
    say $dotted_line;
    say-four-times $empty-line;
    say $dotted_line;
    say-four-times $empty-line;
    say $dotted_line;
}
print-grid;

There are obviously better ways to do something four times than just repeating say $value; four times as in the say-four-times subroutine above, but this will be covered in the Chapter?? (see Section ??).

To draw a similar grid with four rows and four columns, we first need to modify the strings used for printing the lines:

my $dotted_line = ("+ " ~ $four-dashes ) x 4 ~ "+" ;
# ...
my $empty-line = ("|" ~ $spaces ) x 4 ~ "|" ;

In addition to that, we could modify print-grid to just print each line the required number of times. But that would involve quite a bit of code repetition, and the aim of this exercise is to use subroutines to permit code reuse.

There are now two things that we repeatedly need to do four times. It makes sense to write a do-four-times subroutine that will be used both for creating the say-four-times subroutine (in charge of printing the four empty lines) and for calling entire rows four times. This subroutine will be passed the code reference for doing the specific actions required:

my $four-dashes = "- " x 4;
my $dotted_line = ("+ " ~ $four-dashes ) x 4 ~ "+" ;
my $spaces = " " x 9;
my $empty-line = ("|" ~ $spaces ) x 4 ~ "|" ;

sub do-four-times ($code) {
    $code();
    $code();
    $code();
    $code();
}
sub say-four-times($value) {
    do-four-times(sub {say $value});
}
sub print-bottom-less-grid {
    say $dotted_line;
    say-four-times $empty-line;
}
sub print-grid {
    do-four-times(&print-bottom-less-grid);
    say $dotted_line;
}
print-grid;

In addition, rather than declaring global variables for the line strings, it is better practice to declare and define them within the subroutines where they are used. We also no longer need the say-four-times subroutine; we can just pass the relevant arguments to the do-four-times subroutine to the same effect. This could lead to the following program:

sub do-four-times ($code, $val) {
    $code($val);
    $code($val);
    $code($val);
    $code($val);
}
sub print-bottom-less-grid($dot-line) {
    say $dot-line;
    my $spaces = " " x 9;
    my $empty-line = ("|" ~ $spaces ) x 4 ~ "|" ;
    do-four-times(&say, $empty-line);
}
sub print-grid {
    my $four-dashes = "- " x 4;
    my $dotted_line = ("+ " ~ $four-dashes ) x 4 ~ "+" ;
    do-four-times(&print-bottom-less-grid, $dotted_line);
    say $dotted_line;
}
print-grid;

A.2  Exercises of Chapter ??: Conditionals and Recursion

A.2.1  Subroutine do-n-times, Exercise Suggested in Section ?? (p. ??)

We need a subroutine that takes a function and a number, $num, as arguments, and that calls the given function $num times.

The do-n-times subroutine is recursive and is calling itself each time with a decremented argument. It stops “recursing” when this argument is 0. $subref is an anonymous subroutine called within the body of do-n-times:

sub do-n-times ($coderef, Int $num) {
     return if $num <= 0;
     $coderef();
     do-n-times $coderef, $num - 1;
}

my $subref = sub { say "Carpe diem";}

do-n-times $subref, 4;

This prints:

Carpe diem
Carpe diem
Carpe diem
Carpe diem

A.2.2  Exercise ??: Days, Hours, Minutes, and Seconds (p. ??)

The following is one possible way of converting a number of seconds into a number of days, hours, minutes, and seconds:

days-HMS(240_000);

sub days-HMS (Int $seconds) {
    my $minutes = $seconds div 60;
    my $sec_left = $seconds mod 60;
    my ($hours, $min_left) = $minutes div 60, $minutes mod 60;
    my ($days, $hours_left) = $hours div 24, $hours mod 24;
    say "$days $hours_left $min_left $sec_left"; 
              #  prints: 2 18 40 0
}

The first two lines do the integer division and modulo operation separately. For the next two cases, we do both operations in one single line, using a list syntax.

The $minutes, $hours, and $days variables are all computed in essentially the same way. The code could be made more modular by using a subroutine to compute $minutes, $hours, and $days. Although fruitful subroutines will really be studied in the course of the next chapter, we have seen a couple of examples of them and can provide the gist about how they could be used:

sub div_mod (Int $input, Int $num-base) {
    return $input div $num-base, $input mod $num-base;
}
sub days-HMS (Int $seconds) {
    my ($minutes, $sec_left) = div_mod $seconds, 60;
    my ($hours, $min_left)   = div_mod $minutes, 60;
    my ($days, $hours_left)  = div_mod $hours, 24;
    say "$days $hours_left $min_left $sec_left"; 
}

To ask a user to enter a number of seconds, you might do this:

my $sec = prompt "Please enter the number of seconds: ";
days-HMS $sec.Int;

In real life, it would usually be good to verify that the user-provided input is a positive integer and ask again if it is not. As a further exercise, you might try to insert the above code into a recursive subroutine that prints an error message and calls itself again if the user input is not valid. The solution to the next exercise (Section ??) gives an example of a recursive subroutine designed to prompt the user to supply input again; this might help you figure out how to do it if you encounter difficulty.

Try replacing the following code line:

say "$days $hours_left $min_left $sec_left"; 

with this one:

printf "%d days %d hours %d minutes %d seconds \n", days-HMS 240_000;

to see better-formatted output.

A.2.3  Exercise ??: Fermat’s Theorem (p. ??)

The check-fermat subroutine checks whether:

an + bn = cn 

is true for the supplied values of a, b, c, and n.

sub check-fermat (Int $a, Int $b, Int $c, Int $n) {
    if $a**$n + $b**$n == $c**$n {
        if $n > 2 {
            say "Holy smokes, Fermat was wrong!" if $n > 2;
        } elsif $n == 2 or $n ==1 {
            say "Correct";
        }
        return True; 
    }
    return False
}

say "Correct for 3, 4, 5, 2" if check-fermat 3, 4, 5, 2;
get-input();

sub get-input {
    say "Your mission, Jim, should you decide to accept it, is to ";
    say "provide values of A, B, C and n satisfying Fermat's equation:";
    say "  A ** n + B ** n = C * *n";
    my $a = prompt "Please provide a value for A: ";
    my $b = prompt "Please provide a value for B: ";
    my $c = prompt "Please provide a value for C: ";
    my $n = prompt "Please provide a value for the exponent: ";
    if check-fermat($a.Int, $b.Int, $c.Int, $n.Int) {
        say "The equation holds true for your values";
    } else {
        say "Nope. The equation is not right."
    }
    my $try-again = prompt "Want to try again (Y/N)?";
    get-input if $try-again eq 'Y';
}

Fermat’s last theorem has been proven and, needless to say, the mission is truly impossible if n > 2; perhaps this time Jim Phelps should decline to accept the mission.

A.2.4  Exercise ??: Is it a Triangle? (p. ??)

This is a possible routine to find out whether you can make a triangle with three given stick lengths:

sub is-triangle ( Numeric $x, Numeric $y, Numeric $z) {
    my $valid = True;
    $valid = False if $x > $y + $z;
    $valid = False if $y > $x + $z;
    $valid = False if $z > $x + $y;
    if $valid {
        say "Yes"; 
    } else {
        say "No";
    }
}
is-triangle 1, 3, 4;  # -> Yes
is-triangle 1, 3, 6;  # -> No

Another way to do this would be to start by finding the greatest length and test only that one, but that does not make the algorithm significantly simpler.

Prompting the user to input three length has been shown in the previous two exercises; there is nothing really new here. However, this is one new way of doing it:

my ($a, $b, $c) = split " ", 
    prompt "Please enter three lengths (separated by spaces): ";
is-triangle $a.Int , $b.Int , $c.Int;

A.2.5  Exercise ??: The Fibonacci Numbers (p. ??)

The Fibonacci numbers are a sequence of numbers in which the first two numbers are equal to 1 and any subsequent number is the sum of the two preceding ones, for example:

1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Printing the first 20 Fibonacci numbers:

sub n_fibonacci (Int $n) {
    my $fib1 = 1;
    my $fib2 = 1;
    say $_ for $fib1, $fib2;
    for 3..$n {
        my $new_fib = $fib1 + $fib2;
        say $new_fib;
        ($fib1, $fib2) = $fib2, $new_fib;
    }
}
n_fibonacci 20;

Printing the nth Fibonacci number:

my $n = prompt "Enter the searched Fibonacci number: ";
$n = $n.Int;
say fibo($n);

sub fibo (Int $n) {
    my ($fib1, $fib2) = 1, 1;
    for 3..$n {
        my $new_fib = $fib1 + $fib2;
        ($fib1, $fib2) = $fib2, $new_fib;
    }
    return $fib2;
}    

A.2.6  Exercise ??: The recurse Subroutine (p. ??)

Examining the code of the recurse subroutine, the first thing you should notice is that each time it is called recursively, the first argument ($n) is decremented by one compared to the previous call. If the initial value of $n is a positive integer, the succession of calls will eventually lead to the base case where $n == 0, and the cascade of recursive calls will eventually stop.

If $n is not an integer or if it is negative, we will get into infinite recursion.

One way to visualize how the program runs is to display the subroutine parameters at each call:

sub recurse($n,$s) {
    say "Args : n = $n, s = $s";
    if ($n == 0) {
        say $s;
    } else {
        recurse $n - 1, $n + $s;
    }
}
recurse 3, 0;

This would print:

Args : n = 3, s = 0
Args : n = 2, s = 3
Args : n = 1, s = 5
Args : n = 0, s = 6
6

To guard against arguments leading to infinite recursion, we can add integer type constraints to the subroutine signature and some code to stop recursion if the first argument is negative, for example:

sub recurse(Int $n, Int $s) {
    say "Args : n = $n, s = $s";
    if $n == 0 {
        say $s;
    } elsif $n < 0 {
        die 'STOP! negative $n, we need to give up';
    } else {
        recurse $n - 1, $n + $s;
    }
}

Now, if we call recurse with a negative value for $n, we get an error message:

Args : n = -1, s = 0
STOP! negative $n, we need to give up
  in sub recurse at recurse2.pl6 line 6
  in block <unit> at recurse2.pl6 line 12

And if we call it with a non integer value for $n:

===SORRY!=== Error while compiling recurse2.pl6
Calling recurse(Rat, Int) will never work with declared
signature (Int $n, Int $s)
at recurse2.pl6:12
------> <BOL><HERE>recurse 1.5, 0;

Another possibility might be to use a feature of Perl 6 which we haven’t covered yet, multi subroutines, described in Section ?? (p. ??). The idea is to declare two versions of the recurse subroutine, which have the same name but different signatures. The compiler will figure out which version of recurse to call depending on which signature applies to the arguments passed to the subroutine:

multi recurse(Int $n where $n >= 0, $s) {
    say "Args : n = $n, s = $s";
    if ($n == 0) {
        say $s;
    } else {
        recurse $n - 1, $n + $s;
    }
}

multi recurse($n , $s) {
    say "Args : n = $n, s = $s";
    # do something else for such a case, for example:
    # recurse (abs $n.Int), $s; # calling the 1st version of recurse
    # or simply:
    say 'STOP! invalid $n, we need to give up';
}

If the first parameter is a positive integer, the first version of recurse will be called. Otherwise, the second version that will run:

$ perl6 recurse.pl6
Args : n = 6.1, s = 0
STOP! negative $n, we need to give up

$ perl6 recurse.pl6
Args : n = -1, s = 0
STOP! invalid $n, we need to give up

Try running the following code for the second definition of recurse:

multi recurse($n , $s) {
    say "Args : n = $n, s = $s";
    recurse (abs $n.Int), $s;
}

to see what is happening in that case.

A.3  Exercises of Chapter ??: Fruitful Functions

A.3.1  Compare, exercise at the end of Section ?? (p. ??)

Here’s a subroutine that takes two numbers and compares them, and returns 1 if the first one is larger than the second, 0 if they are equal, and -1 otherwise (i.e., if the second is larger than the first):

sub compare (Numeric $x, Numeric $y) {
    return 1 if $x > $y;
    return 0 if $x == $y;
    return -1;
}

say compare 4, 7;   # -1
say compare 7, 4;   # 1
say compare 5, 5;   # 0

Note: this exemplifies a three-way compare function commonly used for sorting in a number of programming languages, including older versions of Perl (such as Perl 5). In Perl 6, the operators implementing this functionality (the three-way comparators cmp, leg and <=>) return special types of values: Order::More, Order::Less, and Order::Same. (See Section ?? on sorting in the chapter about arrays and lists for more details.)

A.3.2  Hypotenuse, exercise at the end of Section ?? (p. ??)

The aim is to use an incremental development plan for calculating the hypotenuse of a right triangle (using the Pythagorean theorem).

We could start with an outline of the subroutine:

sub hypotenuse(Numeric $x, Numeric $y) {
    return 0;
}
say hypotenuse 3, 4;

This will obviously print 0.

Next, we calculate the hypotenuse and print it within the subroutine:

sub hypotenuse(Numeric $x, Numeric $y) {
    my $hypotenuse = sqrt ($x ** 2 + $y ** 2);
    say "hypotenuse = $hypotenuse";
    return 0.0;
}
say hypotenuse 3, 4;

This prints:

hypotenuse = 5
0

The subroutine is calculating correctly the hypotenuse (5), but is still returning the 0 dummy value. We can now return safely the result (and remove the scaffolding):

sub hypotenuse(Numeric $x, Numeric $y) {
    my $hypotenuse = sqrt ($x ** 2 + $y ** 2);
    return $hypotenuse;
}
say hypotenuse 3, 4;

This prints correctly the value of the hypotenuse.

Finally, we can, if we wish, remove the temporary variable to further simplify the subroutine:

sub hypotenuse(Numeric $x, Numeric $y) {
    return sqrt ($x ** 2 + $y ** 2);
}
say hypotenuse 3, 4;

A.3.3  Chained Relational Operators(in Section ??)

We need a subroutine to figure out whether xyz is true or false. We simply need to test it with a chained relational operator and return that:

sub is-between(Numeric $x, Numeric $y, Numeric $z) {
    return $x <= $y <= $z;
}
say is-between 3, 5, 6; # True
say is-between 3, 8, 7; # False
say is-between 6, 5, 6; # False
say is-between 6, 6, 7; # True

Note that the tests provided here are just a limited number of examples, given for illustration purposes. A more complete test suite might be needed (testing for example negative and non integer numbers). We will see later better ways of building more robust test suites (see for example Section ?? and the exercise solution in Section ??).

A.3.4  The Ackermann Function (Exercise ??)

Write a subroutine to compute the Ackermann function. The Ackermann function, A(m, n), is defined as follows:

A(mn) = 




              n+1if  m = 0 
        A(m−1, 1)if  m > 0  and  n = 0 
A(m−1, A(mn−1))if  m > 0  and  n > 0.

Here’s one way to compute the Ackermann function in Perl:

sub ack (Int $m, Int $n) {
    return $n + 1 if $m == 0;
    return ack($m - 1, 1) if $n == 0;
    return ack($m - 1, ack($m, $n-1));
}
say ack 3, 4;    # -> 125

We have used parentheses to better show the structure, but it works well without them. Even in the last code line with two calls to the subroutine, the subroutine signature with two integer numbers is sufficient for the Perl compiler to understand which arguments are associated with which call:

sub ack (Int $m, Int $n) {
    # say "m n = $m, $n";
    return $n + 1 if $m == 0;
    return ack $m - 1, 1 if $n == 0;
    return ack $m - 1, ack $m, $n-1;
}

The Ackermann function is defined for nonnegative integers. As a further exercise, modify the ack subroutine to prevent negative arguments. We discussed two different ways of doing that in Section ??.

A.3.5  Palindromes (Exercise ??)

Write a recursive subroutine that checks if a word is a palindrome:

sub first_letter(Str $word where $word.chars >= 2){
    return substr $word, 0, 1;
}

sub last_letter(Str $word){
    return substr $word, *-1, 1;
}

sub middle_letter(Str $word){
    return substr $word, 1, *-1;
}

sub is_palindrome(Str $word) {
    return True if $word.chars <= 1;
    return False if first_letter($word) ne last_letter($word);
    return is_palindrome(middle_letter($word))
}
for ("bob", "otto", "laurent", "redivider", "detartrated") -> $x {
    say "Is $x a palindrome? Answer: ", is_palindrome($x);
}

Result:

Is bob a palindrome? Answer: True
Is otto a palindrome? Answer: True
Is laurent a palindrome? Answer: False
Is redivider a palindrome? Answer: True
Is detartrated a palindrome? Answer: True

The third parameter (length) of the substr built-in function is optional. In that case, substr will return all characters from a given position. So the first_letter subroutine could be simplified as follows:

sub first_letter(Str $word where $word.chars >= 2){
    return substr $word, 0;
}

And the last_letter subroutine could benefit from the same simplification.

Note: the built-in flip function or .flip method of Perl returns a reversed version of a string and would provide a much easier solution:

sub is_palindrome(Str $word) {
    return $word eq $word.flip;
}

A.3.6  Powers (Exercise ??)

Write a recursive subroutine checking whether a number is a power of another number:

sub is-power-of (Int $a, Int $b) {
    return False unless $a %% $b;
    return True if $a == $b;
    return is-power-of Int($a/$b), $b;
}

say is-power-of 16, 4;
say is-power-of 25, 5;
say is-power-of 125, 5;
say is-power-of 600, 20;
say is-power-of 8000, 20;

Example run:

True
True
True
False
True

Adding an execution trace to visualize the recursive calls:

sub is-power-of (Int $a, Int $b) {     
    return False unless $a %% $b;      
    return True if $a == $b;           
    say "$a\t$b";                      
    return is-power-of Int($a/$b), $b; 
}                                      

Running is-power-of with arguments 1024 and 2, with a printed trace of $a and $b:

1024    2
512     2
256     2
128     2
64      2
32      2
16      2
8       2
4       2
True

A.3.7  Finding the GCD of Two Numbers, Exercise ?? (p. ??)

Write a subroutine that returns the greatest common divisor of two numbers:

sub gcd(Int $a, Int $b) {
    return $a if $b == 0;
    return $b if $a == 0;
    return gcd($b, $a mod $b);
}

say gcd 125, 25;
say gcd 2048, 256;
say gcd 256, 4096;
say gcd 2048, 1;
say gcd 0, 256;
say gcd 33, 45;

Note that there is a simpler method to find the GCD of two numbers without using the modulo function. It is known as Euclid’s algorithm and is considered as the oldest known algorithm (see https://en.wikipedia.org/wiki/Euclidean_algorithm). The Euclidean algorithm is based on the observation that the GCD of two numbers does not change if the larger number is replaced by its difference with the smaller number.

This might be implemented in Perl with the following recursive subroutine:

sub gcd(Int $a, Int $b) { 
    return gcd($b, $a - $b) if $a > $b;
    return gcd($a, $b - $a) if $b > $a;
    return $a;
}

This code works perfectly well in almost all cases, at least for all strictly positive input values, but try to follow the flow of execution if one of the two arguments passed to the subroutine, say $b, is zero. In this case, gcd enters in an infinite recursion. This is often called an edge case or a corner case, i.e., a special input value for which an apparently well-working program ceases to function properly.

We have a similar problem for negative input values.

One solution might be to add a signature constraint (or use a type subset):

sub gcd(Int $a where $a > 0, Int $b where $b > 0) {
    ...
}

but this is not really satisfactory because the GCD of any nonzero integer and 0 is well defined mathematically and is equal to the first number.

Leaving aside for the moment the case of negative numbers, we could rewrite our subroutine as follows:

sub gcd(Int $a, Int $b) { 
    return $a if $b == 0;
    return $b if $a == 0;
    return gcd($b, $a - $b) if $a > $b;
    return gcd($a, $b - $a) if $b > $a;
    return $a;
}

Concerning negative numbers, there is a theorem stating that the GCD of a and b is the same as the GCD of a and −b:

gcd(a,b) = gcd(−a,b) = gcd(a,−b) = gcd(−a,−b)

We can modify further the gcd subroutine:

sub gcd(Int $a is copy, Int $b is copy) { 
    $a = -$a if $a < 0;
    $b = -$b if $b < 0;
    return $a if $b == 0;
    return $b if $a == 0;
    return gcd($b, $a - $b) if $a > $b;
    return gcd($a, $b - $a) if $b > $a;
    return $a;
}

This is now working fine, but remember that a recursive subroutine may be called many times and, for each call, the first four code lines in the program above are executed, although they are really useful only at the first call: once these conditions have been checked during the first call to the subroutine, we know that the arguments must be and remain valid in the chain of recursive calls, so these checks are useless after the first call. This is somewhat wasteful and may lead to unnecessary performance problems.

Ideally, it might be better to separate these four lines that check the preconditions from the cascade of recursive calls. For example, might write two subroutines:

sub gcd1(Int $c, Int $d) {
    return gcd1($d, $c - $d) if $c > $d;
    return gcd1($c, $d - $c) if $d > $c;
    return $c;
}

sub gcd(Int $a is copy, Int $b is copy) { 
    $a = -$a if $a < 0;
    $b = -$b if $b < 0;
    return $a if $b == 0;
    return $b if $a == 0;
    return gcd1 $a, $b;
}

Now, gcd is making all the necessary checks on the initial arguments and calls the recursive gcd1 subroutine with arguments that have been sanitized and will not lead to infinite recursion. Note that we have renamed the parameters within gcd1 for better clarity, but this was not necessary; it would just work the same if we had kept $a and $b.

The preceding code works perfectly well.

There may be a last problem, though. Someone being not careful enough (or wanting to be too clever) might decide to call directly gcd1, thus annihilating the benefits of the checks made by gcd. To prevent that, we can make good use of the fact that subroutines have lexical scope in Perl 6 and can be made local to another subroutine: we can declare and define gcd1 within the body of the gcd subroutine, so that gcd1 can be called only from within the gcd subroutine:

sub gcd(Int $a is copy, Int $b is copy) { 
    sub gcd1($c, $d) {
        return gcd1($d, $c - $d) if $c > $d;
        return gcd1($c, $d - $c) if $d > $c;
        return $c;
    }
    $a = -$a if $a < 0;
    $b = -$b if $b < 0;
    return $a if $b == 0;
    return $b if $a == 0;
    return gcd1 $a, $b;
}

say gcd 125, 25;       # 25
say gcd 2048, 256;     # 256
say gcd 256, 4096;     # 256
say gcd 2048, 1;       # 1
say gcd 0, 256;        # 256
say gcd 33, 45;        # 3
say gcd -4, 6;         # 2

Chapter ?? will come back to lexical scoping.

You may be interested to know that there is a builtin gcd function in Perl 6.

A.4  Exercises of Chapter ?? (Iteration)

A.4.1  Exercise ??: Square Root (p. ??)

We need a subroutine to find the square root of a number by computing successively better approximations of the root, using Newton’s method.

For this exercise, I’ve made the following somewhat arbitrary decisions:

  • I have chosen an epsilon value of 10−11 (or 1e-11).
  • I have used $a/2 as the initial estimate of √$a.

Note that it might make more sense to make this initial estimate within the my-sqrt subroutine, rather than having the caller pass it as an argument. The rationale for doing it in the caller is that, in some cases, the caller might have information on the range of the input value and might therefore be able to provide a better initial estimate, leading the algorithm to converge toward the solution slightly faster.

Here’s an implementation of Newton’s method for computing the square root of a number:

sub my-sqrt ($a, $estimate is copy) {
    my $epsilon = 1e-11;
    while True {
        # say "-- Intermediate value: $estimate";
        my $y = ($estimate + $a/$estimate) / 2;
        last if abs($y - $estimate) < $epsilon;
        $estimate = $y;
    }
    return $estimate;
}

sub print-result ($a, $r, $s, $d) {
    printf "%d  %.13f  %.13f  %.6e \n", $a, $r, $s, $d;
}

sub test-square-root {
    say "a  mysqrt(a)\t    sqrt(a)\t     diff";
    for 1..9 -> $a {
        my $init-estimate = $a/2;
        my $result = my-sqrt $a, $init-estimate;
        my $sqrt = sqrt $a;
        my $diff = abs($result - $sqrt);
        print-result($a, $result, $sqrt, $diff);
    }
}
    
test-square-root;

The printf ("formatted print") function used in the print-result subroutine is derived from the C programming language. Its first argument is a format string, which describes how each of the following arguments should be formatted. Here, the format string requests the compiler to output the first subsequent argument as a signed integer (the %d part of the format string), the next two arguments as floating-point numbers with 13 digits after the decimal point (the %.13f part), and the last argument as a floating-point number in scientific notation with 6 digits after the decimal point (%.6e).

A.4.2  Exercise ??: Pi Estimate (p. ??)

Pi estimate according to Srinivasa Ramanujan’s algorithm:

sub factorial(Int $n) {
    return 1 if $n == 0;
    return $n * factorial $n-1;
}

sub estimate-pi {
    #`{ ======================================
        Algorithm by Srinivasa Ramanujan 
        (see http://en.wikipedia.org/wiki/Pi)
        ======================================
      }
    my $factor = 2 * 2.sqrt / 9801;
    my $k = 0;
    my $sum = 0;
    while True {
        my $num = factorial(4*$k) * (1103 + 26390*$k);
        my $den = factorial($k)**4 * 396**(4*$k);
        my $term += $factor * $num / $den;
        # say "Intermediate term = $term";
        last if abs($term) < 1e-15;
        $sum += $term;
        $k++;
    }
    return 1 / $sum;
}

say estimate-pi;
# say pi - estimate-pi;

This prints: 3.14159265358979.

Notice how we have used a multiline comment to give some additional information about the subroutine.

Uncommenting the intermediate print statement shows the steps toward the solution:

Intermediate term = 0.31830987844047
Intermediate term = 7.74332048352151e-009
Intermediate term = 6.47985705171744e-017
-4.44089209850063e-016

A.5  Exercises of Chapter ?? (Strings)

A.5.1  Exercise in Section ??: String Traversal (p. ??)

The backward traversal of a word with a while loop may be written as follows:

my $fruit = "banana";
my $index = $fruit.chars;
while $index > 0 { 
    $index--;
    my $letter = substr $fruit, $index, 1; 
    say $letter; 
}

The chars method returns the length of the string. The substr function will find letters under $index between 0 and $length - 1. It is therefore practical to decrement the $index variable before using the substr function.

The while loop of the preceding code example can be written more concisely:

my $fruit = "banana";
my $index = $fruit.chars;
while $index > 0 { 
    say substr $fruit, --$index, 1; 
}

Here, we print directly the value returned by substr, without using a temporary variable, and we decrement the $index variable within the expression using substr. We need to use the prefix form of the decrement operator because we need $index to be decremented before it is used by substr.

The loop would be even more concise if we used a while with a statement modifier (or the postfix syntax of while):

my $fruit = "banana";
my $index = $fruit.chars;
say substr $fruit, --$index, 1 while $index;

This is the same idea, using the flip function to reverse the string:

my $fruit = flip "banana";
my $index = 0;
say substr $rev_fruit, $index++, 1 while $index < $rev_fruit.chars;

The aim of this exercise was to train you to use loops to traverse the string. Combining the flip and comb functions or methods would of course make our solution much simpler (and probably faster):

.say for "banana".flip.comb;

A.5.2  Exercise in Section ??: The Ducklings (p. ??)

The first idea that may come to mind for this exercise is to build a modified list of prefixes this way:

for 'J' .. 'N', 'Ou', 'P', 'Qu' -> $letter { #...}

But this does not work properly because it creates a list of four elements in which the first element is itself a sublist “J” to “N”:

> say ('J'..'N', 'Ou', 'P', 'Qu').perl;
("J".."N", "Ou", "P", "Qu")

We will come back to this later in the book, but let us just say that we need to flatten this combination of lists into one single iterable list, which can be done with the flat method or function or the “|” operator:

for ('J' .. 'N', 'Ou', 'P', 'Qu').flat     -> $letter {#...}
# or: for flat 'J' .. 'N', 'Ou', 'P', 'Qu' -> $letter {...}
# or: for |('J' .. 'N'), 'Ou', 'P', 'Qu'   -> $letter {...}
# Note: parentheses needed in the last example above with | 
# to overcome precedence problem

With this small difficulty removed, the solution is now easy:

my $suffix = 'ack';
for ('J' .. 'N', 'Ou', 'P', 'Qu').flat -> $letter {
    say $letter ~ $suffix;
}

Here again, we could make the code slightly more concise with the postfix syntax of for and the $_ topical variable:

my $suffix = 'ack';
say "$_$suffix" for flat 'J' .. 'N', 'Ou', 'P', 'Qu';

Here, we introduced another simple and common way of concatenating two strings: simply inserting the two variables one after the other within double quotes and letting variable interpolation do the work.

A.5.3  Exercise in Section ??: Counting the Letters of a String (p. ??)

This subroutine counts the number of occurrences of a specific letter within a word (or any string):

sub count (Str $word, Str $letter) {
    my $count = 0;
    for $word.comb -> $letter {
        $count++ if $letter eq 'a';
    }
    return $count;   
}
say count "banana", "a";   # -> 3

The solution to Exercise ?? (p. ??) below uses the index and substr functions to perform the same count .

A.5.4  Section ??: Simulating a Regex with a Loop (p. ??)

The aim is to find in a string any letter that is immediately preceded by the letter “l” and followed by the letter “w”.

If you try to do the specified search with the techniques we’ve seen so far, you’ll find out that there are a number of edge cases making it quite complicated.

This is a possible solution:

sub traverse (Str $word, Str $start_letter, Str $end_letter) {
    my $found_start = False;
    my $capture_next = False;
    my $target_letter;
    for 0..$word.chars - 1 -> $idx {
        my $letter = substr $word, $idx, 1;
        next unless $letter eq $start_letter or $found_start;
        if ($capture_next) {
            $target_letter = $letter;
            $capture_next = False;
            next;
        }
        if $letter eq $start_letter and not $found_start {
            $found_start = True;
            $capture_next = True;
            next;
        }
        # if we get there, we have found a candidate target letter
        if $letter eq $end_letter {
            return $target_letter
        } else {
            # wrong match, let's start again, we need to backup
            if $target_letter eq $start_letter {
                 $target_letter = $letter;
                 $capture_next = False;
            } elsif $letter eq $start_letter {
                 $capture_next = True;
            } else {
               $capture_next = False;
               $found_start = False;
            } 
        } 
    }
    return;  # not found!
}          

for <s b   l w   l o   s m   y l   a z> -> $st, $end {
    say "$st $end: ", traverse "yellow submarine", $st, $end;  
}

As you can see, this is quite complicated because of the various edge cases that need to be handled. Compare this with the one-line regex that does the same:

say ~$0 if "yellow submarine" ~~ /l(.)w/;

To tell the truth, I haven’t chosen the simplest way of doing it.

It is much easier to loop on every letter of the string except the first one and the last one and, for each such letter, to check what the previous letter and the next are. Then you simply need to return the current letter if the previous and the next match the conditions:

sub traverse (Str $word, Str $start_letter, Str $end_letter) {
    my $found_start = False;
    my $capture_next = False;
    my $target_letter;
    for 1..$word.chars - 2 -> $idx {
        if $start_letter eq substr $word, $idx - 1, 1
          and $end_letter eq substr $word, $idx + 1, 1 {
            return substr $word, $idx, 1;
        }
     }
    return;  # not found!
}          

for <s b   l w   l o   s m   y l   a z> -> $st, $end {
    say "$st $end: ", traverse "yellow submarine", $st, $end;  
}

In the test cases at the end, I use a for loop with a pointy block construct in which I pick two of the items in the list each time through the loop. The numbers of spaces between the items of the list are technically useless and irrelevant to the way the syntactic construct works; they are just a formatting help for the reader to better see how the letters will be grouped in the process.

This displays:

s b: u
l w: o
l o: l
s m: Nil
y l: e
a z: Nil

This is much simpler than the previous attempt, but it would still be quite difficult to change something, for example to add a new condition: the structure of the code would probably need to be reworked quite a bit.

Even compared with this simpler solution, the regex solution really shines orders of magnitude brighter.

A.5.5  Exercises in Subsection ??: Regex Exercises (p. ??)

As is often the case in Perl, and even more so with regexes, there is more than one way to do it (TIMTOWTDI). Most of the exercises suggested here have more than one solution (and sometimes many).

With regexes, you also have to think carefully about the input data to figure out what should be matched and what should be rejected.

A.5.5.1  Ten digits in a row

Here’s a way to find 10 consecutive digits in a string:

my $string = "567867 8778689 6765432 0123456789 897898";
say ~$0 if $string ~~ /(\d ** 10)/; # -> 0123456789

We are simply using the \d (digit) character class together with a quantifier specifying this class 10 times .

Note that we have used capturing parentheses here in order to populate the matched number into $0. We could also omit parentheses and retrieve the number from the match object:

my $string = "567867 8778689 6765432 0123456789 897898";
say ~$/ if $string ~~ /\d ** 10/; # -> 0123456789

The above solutions would match any 10-digit sequence within a longer sequence of digits, which may or may not be what you need. For example:

my $string = "567867 87786896765432 0123456789 897898";
say ~$0 if $string ~~ /(\d ** 10)/; # -> 8778689676

If you want to match more precisely a sequence of 10 digits (not more than 10), you need to specify what you want to have “around” the match. For example, to match the sole 10-digit sequence above, you might use the nondigit character class:

my $string = "567867 87786896765432 0123456789 897898";
say ~$0 if $string ~~ /\D (\d ** 10) \D/; # -> 0123456789

But that would not match a 10-digit sequence at the start or the end of the string:

my $string = "5678670001 87786896765432 0123456789 897898";
say ~$0 if $string ~~ /\D (\d ** 10) \D/; # -> 0123456789

A better solution might be to use word boundary anchors:

my $string = "5678670001 87786896765432 0123456789 897898";
say ~$0 if $string ~~ /<< (\d ** 10) >>/; # -> 5678670001

Quite a bit of reflection may sometimes be needed to ensure that we match exactly what we want.

A.5.5.2  An octal number

Here’s a possible solution for finding an octal number (i.e., a number composed only of digits between 0 and 7) in a string:

my $string = "567867 8778689 6765432 0123456789 897898";
say ~$0 if $string ~~ /\D (<[0..7]>+) \D/; # -> 6765432

The character class is <[0..7]> for digits between 0 and 7. The + quantifiers means: as many as possible of this character class. And the \D (non digit) are there to prevent the regex from matching part of a larger number with nonoctal digits (for example from matching 567 in the first number). Depending on the exact requirement, using word boundary anchors as in the previous exercise’s solution might be better.

A.5.5.3  First word at the start of the string

To find the first word in a string, we can just search the first sequence of word characters (characters belonging to the \w character class) in the string:

my $string = "The greatest thing you'll ever learn " ~
             "is just to love and be loved in return. " ~
             "(Nature Boy, Nat King Cole)";
say ~$0 if $string ~~ /(\w +)/;  # -> The

A.5.5.4  First word starting with an “a”

Here’s a way to find the first word starting with the letter “a” in a sentence:

my $string = "Four scores and seven years ago our fathers ...";
say ~$0 if $string ~~ /\W (a \w+)/;  # -> and

A.5.5.5  First word starting with a lowercase vowel

To make sure that the match does not start with a vowel in the middle of a word, we might start the pattern with a \W (nonword character) or, better, with a << left word boundary:

my $string = "Democracy is the worst form of government, " ~
             "except for all the others. (Churchill)";
say ~$0 if $string ~~ /<< (<[aeiouy]> \w*)/;  # -> is

Here we use a * (rather than +) quantifier because a word containing only one vowel is eligible as a word starting with a vowel.

A.5.5.6  A mobile number

For a 10-digit number starting with “06” or “07”, the easiest solution is probably to use a <[67]> character class:

my $string = "567867 8778689 0123456789 0723456789 3644";
say ~$0 if $string ~~ /(0<[67]>\d ** 8)/;  # -> 0723456789

A.5.5.7  First word starting with a vowel (lower- or uppercase)

We can simply ignore case for the whole word:

my $string = " Ask not what your country can do for you — " ~
             " ask what you can do for your country. (JFK)";
say ~$0 if $string ~~ /:i << (<[aeiouy]> \w*)/;  # -> Ask

A.5.5.8  Repeated letters

We can capture any letter and check whether the next one is the same as the capture:

say ~$0 if 'appeal' ~~ /((\w)$0);  # -> pp

For capturing the second group of repeated letters:

say ~$1 if 'coffee' ~~ /(\w)$0.*((\w)$0)/;  # -> ee

And for the third group:

say ~$2 if 'Mississippi' ~~ /(\w)$0.*(\w)$0.*((\w)$0)/; # -> pp

A.5.6  Exercise in Section ??: is-reverse Subroutine (p. ??)

The second bug in the is-reverse subroutine is located on this line:

    while $j > 0 {

The $j index should be allowed to loop down until 0 (included) if we want to compare the first letter of $word2 with the last letter of $word1.

The corrected version of the is-reverse subroutine might be:

sub is-reverse(Str $word1, Str $word2) {
    return False if $word1.chars != $word2.chars;
    
    my $i = 0;
    my $j = $word2.chars - 1;

    while $j >= 0 {
        return False if substr($word1, $i, 1) ne substr($word1, $j, 1);
        $i++; $j--;
    }
    return True;
}

A.5.7  Exercise ??: Counting Letters (p. ??)

Counting the number of “a” letters in a word with the index function implies looking for the first “a” from the beginning of the string, then looking for the next one from the position immediately after, and so on until no more “a” letters are found.

Here, we make an infinite loop from which we break out with the last statement when index no longer finds an “a”. The $count counter is incremented each time an “a” is found, and the $idx keeps track of the current position within the string:

sub count_a {
    my $word = "banana";
    my $count = 0;
    my $idx = 0;
    while True {
       $idx = index $word, 'a', $idx;
       last unless $idx.defined;
       $idx++;
       $count++;
    }
    return $count;
}
say count_a();  # -> 3

Adapting it for any string and any letter is just a matter of passing the right arguments to the subroutine and using within the subroutine its parameters instead of hard-coded values:

sub count_index (Str $word, Str $letter) {
    my $count = 0;
    my $idx = 0;
    while True {
       $idx = index $word, $letter, $idx;
       last unless $idx.defined;
       $idx++;
       $count++;
    }
    return $count;
}
say count_index "When in the Course of human events...", "n"; # 5

Counting a given letter in a given word with the substr function is straight forward: we just need to loop over each letter of the word and increment a counter when needed:

sub count_substr (Str $word, Str $letter) {
    my $count = 0;
    for 0..$word.chars - 1 {
        $count++ if $letter eq substr $word, $_, 1;
    }
    return $count;
}
say count_substr "I have a dream that one day...", "a"; # -> 4

A.5.8  Exercise ??: Lowercase Letters (p. ??)

Only any_lowercase5 and any_lowercase7 are correctly checking whether the input string contains at least one lower case letter.

If you did not determine that yourself, really try to find by yourself the mistakes in the others before reading on; you should be able to find the errors in the other subroutines (except perhaps any_lowercase4, which is admittedly a bit tricky).

The any_lowercase5 and any_lowercase7 subroutines perform the search as follows:

  • any_lowercase5 sets $flag to False before the loop, changes it to True if any character in the string is lowercase, and returns $flag after the completion of the loop.
  • any_lowercase7 is also correct (and probably slightly better than any_lowercase5). It returns True if any character is lower-case and return False only if it gets a chance to go to the end of the loop.

The other subroutines have the following mistakes (some have arguably several mistakes; we’re going to list at least one of them):

  • any_lowercase1 is only checking the first character of its argument and exiting the loop thereafter.
  • any_lowercase2 is calling the is-lower subroutine on the string "char", not on the $char variable (it also has the same defect as any_lowercase1).
  • any_lowercase3 is returning True or False depending on only the last character of the input string.
  • any_lowercase4 suffers from a somewhat nasty operator precedence problem: the assignment $flag = $flag ... is executed before the or relational operator is executed, so that the latter part has no effect. Changing the faulty line to:
    $flag = $flag || is-lower $char;   # higher priority operator
    # or
    $flag = ($flag or is-lower $char); # parens to override precedence
    
    would solve the problem.
  • any_lowercase6 is almost correct in terms of its algorithm, but returns the strings "True" or "False" instead of the Boolean values True or False.
  • any_lowercase8 returns False if any character is not lowercase.
  • any_lowercase9 also returns False if any character is not lowercase.

The following is an example of the loop you could write to test each subroutine, each with three input strings:

for <FOO bar Baz> -> $str {
    say "1. $str: ", any_lowercase1 $str;
    say "2. $str: ", any_lowercase2 $str;
    say "3. $str: ", any_lowercase3 $str;
    say "4. $str: ", any_lowercase4 $str;
    say "5. $str: ", any_lowercase5 $str;
    say "6. $str: ", any_lowercase6 $str;
    say "7. $str: ", any_lowercase7 $str;
    say "8. $str: ", any_lowercase8 $str;
    say "9. $str: ", any_lowercase9 $str;
}

It would be possible to replace the nine print statements with a simple loop, but this requires using features that we haven’t studied yet.

You’ll see in other chapters ways to better organize test cases, for example in Section ?? (p. ??).

A.5.9  Exercise ??: Caesar’s Cipher (p. ??)

Implementing a letter rotation cipher:

sub rotate-one-letter (Str $letter, Int $shift) {
    my $upper-end = 'Z'.ord;        # last uppercase letter
    my $lower-end  = 'z'.ord;       # last lowercase letter

    my $rotated-ord = $letter.ord + $shift;
    if $letter ~~ /<[a..z]>/ {                # lower case
        $rotated-ord -= 26 if $rotated-ord > $lower-end;
    } elsif $letter ~~ /<[A..Z]>/ {           # upper case
        $rotated-ord -= 26 if $rotated-ord > $upper-end;
    } else {
        return $letter;
    }
    return $rotated-ord.chr;
}

sub rotate-one-word (Str $word, Int $shift is copy) {
    $shift = $shift % 26;
    $shift = 26 + $shift if $shift < 0;
    my $rotated-word = "";
    for 0..$word.chars - 1 {
        $rotated-word ~=  rotate-one-letter substr($word, $_, 1), $shift;
    }
    return $rotated-word;
}

sub rot13 (Str $word) {
    return rotate-one-word $word, 13;
} 

say rotate-one-word "ABDCabcd", 25;
say rotate-one-word "cheer", 7;
say rotate-one-word "melon", -10;

say rot13("Fbzr cebsnavgl");

If you are interested in decoding only ROT13, the tr transliteration operator can give you much shorter code. For example tr/a..m/n..z/ will transliterate all letters in the a..m range into their respective equivalents in the n..z range.

We can code a ROT13 in a simple Perl one-liner (see Section ??):

$ perl6 -e 'my $w = "foobar"; $w ~~ tr/a..mn..z/n..za..m/; say $w;'
sbbone

$ perl6 -e 'my $w = "sbbone"; $w ~~ tr/a..mn..z/n..za..m/; say $w;"
foobar

It is quite easy to add the ranges for capital letters. You might want to do it as a further exercise.

A.6  Exercises of Chapter ?? (Word Play)

A.6.1  Exercise ??: Consecutive Double Letters (p. ??)

With the looping techniques used in Chapter ??, we could write this:

sub is_triple_double (Str $word) {
    # Tests if a word contains three consecutive double letters.
    my $i = 0;
    my $count = 0;
    while $i < $word.chars - 1 {
        if substr($word, $i, 1) eq substr($word, $i + 1, 1) {
            $count++;
            return True if $count == 3;
            $i += 2;
        } else {
            $count = 0;
            $i++;
        }
    }
    return False;
}

for 'words.txt'.IO.lines -> $word {
    say $word if is_triple_double $word;
}

This is, however, a typical case where regexes might prove more efficient than looping (in terms of coding efficiency, i.e., as an approximation, how many code lines are needed for performing a given task).

We discussed in Chapter ?? that regex captures are populating the $0, $1, $2, etc. special variables. A regex pattern matching any repeated letter might therefore be /(.) $0/, where the character found by $0 is the same as the character found by the dot.

Similarly, a regex pattern matching three pairs of repeated letters in a row might be:

say ~$/ if "abbccdde" ~~ /(.)$0 (.)$1 (.)$2/; # -> bbccdd 

With this, the program to find the words with three double letters in the words.txt file takes just three code lines:

for 'words.txt'.IO.lines -> $word {
    say $word if $word ~~ /(.) $0 (.) $1 (.) $2/;
}

Both programs find four words, which are variations on “bookkeeper” and “bookkeeping.”

The regex version is so simple that you can code it directly at the operating system command line prompt as a one-liner (see Section ??):

$ perl6 -ne '.say if /(.) $0 (.) $1 (.) $2/' words.txt
bookkeeper
bookkeepers
bookkeeping
bookkeepings

A.6.2  Exercise ??: Palindromes in Odometers (p. ??)

The following is a possible program for solving the palindromic odometer puzzle:

sub is-palindrome ($number, $start, $len) {
    # checks if the relevant substring is a palindrome
    my $substring = substr $number, $start, $len;
    return $substring eq flip $substring;
}
    
sub check ($num) {
    # Checks whether the integer num has the properties described
    return (is-palindrome($num, 2, 4) and
        is-palindrome($num + 1, 1, 5) and
        is-palindrome($num + 2, 1, 4) and
        is-palindrome($num + 3, 0, 6));
}

say 'The following are the possible odometer readings:';
for 1e5..1e6 - 4 -> $number {
    say $number if check $number;
}

Another way to do it would be to use regexes to find out whether we have palindromes:

sub check ($num) {
    # Checks whether the integer num has the properties described
        $num ~~ /^..(.)(.)$1$0/ and 
        $num + 1 ~~ /^.(.)(.).$1$0/ and 
        $num + 2 ~~ /^.(.)(.)$1$0/ and 
        $num + 3 ~~ /^(.)(.)(.)$2$1$0/;
}

say 'The following are the possible odometer readings:';
for 1e5..1e6 - 4 -> $number {
    say $number if check $number;
}

This code is shorter, but is also slower: it takes almost twice as long to execute on my computer. So there is a tradeoff here: The first, faster, way is probably better if you need to run your program many times or often, but you might prefer the second version if this is just a one-off computation. It’s up to you to decide.

A.6.3  Exercise ??: Palindromes in Ages (p. ??)

The following program iterates over possible age differences between 15 and 75 and, for each age, calculates all palindromic possibilities.

say 'diff #instances';
check_diffs();
say 'daughter  mother';
num_instances(18, True);

sub are_reversed(Int $i, Int $j) {
    # $j (mother's age) will always be 2 digits
    return $j eq flip sprintf '%02d', $i; # format $i on 2 digits
}

sub num_instances (Int $diff, Bool $flag) {
    # computes and counts all possibilities for one age difference
    my $daughter = 0;
    my $count = 0;
    while True {
        my $mother = $daughter + $diff;
        if are_reversed($daughter, $mother) or 
      are_reversed($daughter, $mother+1) {
            $count++;
            printf "%02d\t%d\n", $daughter, $mother if $flag;
        }
        last if $mother > 99;
        $daughter++;
    }
    return $count;
}

sub check_diffs () {
    # enumerates all possible age differences
    for 15..75 -> $diff {
        my $nb_cases = num_instances $diff, False;
        say "$diff   $nb_cases" if $nb_cases > 0;
    }
}

The while True statement creates an infinite loop. The loop is stopped, however, by the last control flow statement when the mother’s age exceeds 99. We will see in section ?? a more idiomatic way to build an infinite loop, but this is sufficient for now.

The sprintf function used here transforms any number below 10 into a two-digit number string with a leading 0. Its syntax is similar to that of the printf function seen earlier. The difference is that it only creates a new string, but does not print it.

Using the .fmt method instead of the sprintf function, as well as the method syntax for flip, in may render the are_reversed subroutine somewhat nicer:

sub are_reversed(Int $i, Int $j) {
    return $j eq $i.fmt('%02d').flip; # format $i on 2 digits
}

A.7  Exercises of Chapter ?? (Arrays and Lists)

A.7.1  Exercise of Section ??: Implementing a Queue (p. ??)

This a somewhat simplistic implementation of a queue using an array and the unshift and pop functions:

sub enqueue (@queue, $new_item) {
    unshift @queue, $new_item;
}
sub dequeue (@queue) {
    my $item = pop @queue;
    return $item;
}
my @line = 1, 2, 3, 4, 5;
enqueue @line, 6;
say @line;
say dequeue @line for 1..3;

A.7.1.1  Improving the Queue with Subroutine Signatures

Let us try to make our queue a bit more robust.

First, we want to add some signatures to our subroutines. We might be tempted to write something like:

sub enqueue (Array @queue, $new_item) {
    unshift @queue, $new_item;
}

But that does not work, because that would essentially tell Perl that the @queue parameter is an array of arrays. What we need here is the following signature syntax:

sub enqueue (@queue where Array, $new_item) {
    unshift @queue, $new_item;
}
sub dequeue (@queue where Array) {
    my $item = pop @queue;
    return $item;
}

We probably don’t want any type signature here for the $new_item parameter of enqueue, because we want our queue to be able to operate on any data type in order to make it as generic as possible. But, just as we said it about stacks (Section ??), we might want to be able to add several items to the data structure in one go.

A.7.1.2  Slurpy (or variadic) parameters

There are several ways to insert several elements to the queue, but the simplest is probably to use a signature with a slurpy parameter (or variadic parameter): an array or hash parameter is marked as slurpy by a leading asterisk, which means it can bind to an arbitrary amount of arguments (zero or more). These are called "slurpy" because they slurp up any remaining arguments to a function, like someone slurping up noodles. This also means that a positional slurpy parameter can only be the last one in the signature:

sub enqueue (@queue where Array, *@new_items) {
    unshift @queue, $_ for @new_items;
    # or: unshift @queue, |@new_items;
}
sub dequeue (@queue where Array) {
    my $item = pop @queue;
    return $item;
}
my @line = 4, 5, 6, 7, 8;
enqueue @line, 3, 2, 1;
say @line;
say dequeue @line for 1..3;

This will display:

[1 2 3 4 5 6 7 8]
8
7
6

See also Section ?? for more details on slurpy parameters.

Note that, for an enqueue subroutine, we can’t simply write:

sub enqueue (@queue where Array, *@new_items) {
    unshift @queue, @new_items;
}

because, when given an array as a second argument, unshift inserts the new items as a sublist. Using the for loop or the “|” flattening operator solves this slight difficulty.

Another possibility is to use the prepend built-in function instead of unshift, since it does add the flattened elements of the array at the beginning of the queue:

sub enqueue (@queue where Array, *@new_items) {
    prepend @queue, @new_items;
}

A.7.1.3  A queue using shift and append

The order in which the arguments are passed is a bit counterintuitive. Also, we might prefer not having to use a loop to add the new elements. It is slightly easier to use the push and shift combination, and to replace push by append, which does more or less the same thing as push but flattens the list just as prepend did earlier:

sub enqueue (@queue where Array, *@new_items) {
    append @queue, @new_items;
}
sub dequeue (@queue where Array) {
    my $item = shift @queue;
    return $item;
}
my @line = 1, 2, 3, 4;
enqueue @line, 6, 7, 8;
say @line;
say dequeue @line for 1..3;

This will display:

[1 2 3 4 6 7 8]
1
2
3

A.7.1.4  Exceptions

Finally, one additional weakness needs to be fixed: what happens if the queue is empty when we try to dequeue an item? Raising an exception or aborting the program might be what’s needed. We might also decide to return an undefined value and let the caller deal with it:

sub enqueue (@queue where Array, *@new_items) {
    append @queue, @new_items;
}
sub dequeue (@queue where Array) {
    return unless @queue;
    my $item = shift @queue;
    return $item;
}
my @line;
enqueue @line, 1, 2, 3;
say @line;
for 1..4 -> $count {
    my $item = dequeue @line;
    if defined $item {
        say $item;
    } else {
        say "ERROR: The queue is empty !";
    }
}

This produces the following output:

[1 2 3]
1
2
3
ERROR: The queue is empty !

The dequeue subroutine could be made simpler by using the ternary conditional operator (see Section ??) and returning the Nil value if the queue is empty:

sub dequeue (@queue where Array) {
    @queue ?? @queue.shift !! Nil
}

As a further exercise, you might want to apply to the example code for stacks (seen in Section ?? on p. ??) the changes we have made above to the management of queues.

A.7.1.5  Encapsulating the data

Another problem with our implementation of queues is that the @file queue is fully accessible to the developer, who might be tempted to peek directly into the array or, worse, to modify it, without using the enqueue and dequeue subroutines designed to keep the queue consistent.

We might want to prevent that and make it impossible for the user to tamper with the queue or otherwise access it by any other means than the adequate subroutines. Hiding the information about the implementation or otherwise making it inaccessible by other means than those that have been designed for that purpose is often called data encapsulation. One common way to achieve data encapsulation is through object-oriented programming, which we cover in Chapter ??.

We can, however, obtain a similar result by combining variable scoping and some material briefly covered in Section ?? about subroutines as first-class objects.

Consider the following implementation of a queue:

sub create-fifo {
    my @queue;
    return (
        sub {return shift @queue;}, 
        sub ($item) {push @queue, $item;}
        ) ;
}
my ($fifo-get, $fifo-put) = create-fifo();
$fifo-put($_) for 1..10;
print " ", $fifo-get() for 1..5; # ->  1 2 3 4 5

The center piece here is the create-fifo subroutine. The @queue array holding the data is lexically scoped to this subroutine and cannot be accessed directly from anywhere else in the program. create-fifo returns two anonymous subroutines, one to dequeue items and one to enqueue them. These subroutines are lexical closures, which means in simple terms that they can access @queue, because they have been defined within its scope, even if they are called from somewhere else. Even when create-fifo has completed, those subroutines can still access to it because they sort of give an extra life to the array as long as the subroutines are accessible.

The rest of the code should be clear: when create-fifo is called, it manufactures the two anonymous subroutines that are stored into the $fifo-get and $fifo-put variables. A subroutine such as create-fifo is sometimes called a function factory because it generates other subroutines at run time. Finally, $fifo-put is called ten times to populate the queue with integers from 1 to 10, and $fifo-get is called five times to get the first five items of the queue. The queue is encapsulated: there is no way to access to its data other than using the two anonymous subroutines.

Making it possible to enqueue a list of items (rather than a single one) and managing exceptions (such as trying to get an item from an empty queue) are left as an exercise for the reader.

The techniques used here borrow heavily on a programming paradigm called functional programming, a model of programming used by languages such as Lisp, Caml, Clojure, and Haskell. This paradigm is quite different from almost everything we have seen so far, just as object-oriented programming is yet another different paradigm. As you gain more experience as a programmer, you should make a point to understand these different paradigms, because they offer different ways of doing things, and they all have specific advantages for specific types of problems. Knowing all of them gives you more expressive power. One of the good things with Perl 6 is that it gives you a modern and powerful tool to use each of these programming paradigms. Chapter ?? is all about functional programming. Meanwhile, make sure to read Subsection ?? in the array and list chapter.

A.7.2  Exercise of Section ??: Other Ways to Modify an Array (p. ??)

A.7.2.1  Simulating the pop function

The my-pop subroutine uses splice to simulate the pop function:

sub my-pop (@array where @array > 0) {
    my @result = splice @array, @array.end, 1;
    return @result[0];
}
my @letters = 'a'..'j';
my $letter = my-pop @letters;
say $letter;             # -> j
say @letters;            # -> [a b c d e f g h i]

Here, the expression @array.end returns the index of the last item of the array. It is also possible to count the array items from the end and to access to the last and penultimate items of a list or an array using the following syntax:

> say (1..9)[*-1];
9
> say (1..9)[*-2];
8

The my-pop subroutine could be rewritten as follows:

sub my-pop (@array where @array > 0) {
    my @result = splice @array, *-1, 1;
    return @result[0];
}

You don’t have to specify the number of elements with splice if you just want the rest. We can also avoid using the @result intermediate array. So we could simplify my-pop as:

sub my-pop (@array where @array > 0) {
    @array.splice(*-1)[0]
}

A.7.2.2  Simulating the push function

The only slight difficulty in this exercise is to manage a signature with a “variadic” list of parameters (or slurpy parameters). This was explained above in subSubsection ??: (p. ??).

sub my-push (@array, *@list) {
    my @result = splice @array, @array.end + 1, 0, @list;
    return @array; # push returns the modified list
                   # (seldom used for arrays)
}
my @letters = 'a'..'j';
my-push  @letters, 'k', 'l', 'm';
say @letters;      # -> [a b c d e f g h i j k l m]

A.7.2.3  Simulating the unshift function

To simulate the unshift function, we can again use slurpy parameters:

sub my-unshift (@array, *@list) {
    my @result = splice @array, 0, 0, @list;
    return @array; # unshift returns the modified list
                   # (seldom used for arrays)
}
my @letters = 'd'..'j';
my-unshift @letters, 'a'..'c';
say @letters;      # -> [a b c d e f g h i j]

A.7.2.4  Simulating the delete subscript adverb

Remember the delete adverb removes the value, but leaves the slot undefined within the array. The splice function would also remove the slot, so this might not be what is really needed here if we want to simulate the behavior of delete (although, in a sense, it might also be considered to be an improvement to remove the slot altogether). To really simulate delete, it is probably better to just “undefine” the value:

sub my-delete (@array, $idx) {
    my $item = @array[$idx];
    @array[$idx] = Nil;
    return $item;
}
my @letters = 'a'..'j';
my $letter = my-delete @letters, 4;
say $letter;       # -> e
say @letters;      # -> [a b c d (Any) f g h i j]

A.7.3  Exercise of Section ??: Mapping and Filtering the Elements of a List (p. ??)

Producing an array containing the square of the numbers in the input list is very straight forward:

my @squares = map { $_ ** 2 }, 3, 5, 7;     # -> [9 25 49]

To keep the elements of a list that are perfect squares, one way is to check for each number whether its square root is an integer. For example:

my @filt = grep { my $sq = sqrt $_; $sq == $sq.Int}, 3, 9, 8, 16;
say @filt;                                  # -> [9 16]

This is working fine with the sample data of the example test, but the program will abort if we try it with a negative input value. We want to avoid that exception and just consider that a negative number can never be a perfect square.

Since the code block here would be getting a bit more complicated, we might prefer to use a function reference instead:

sub is-square (Numeric $num} { 
    return False if $num < 0;
    my $sq = sqtr $num;
    return $sq == $sq.Int;
} 
my @filt = grep &is-square, 3, 9, -6, 16;   # -> [9 16]

A.7.4  Exercise of Section ??: Advanced Sorting Techniques (p. ??)

The transformation subroutine that can extract the letter groups from the strings is quite straight forward:

sub my_comp (Str $str) {
    return $0 if $str ~~ /^\d+ (\w+)/; 
    Nil;   # returning Nil if the regex did not match
}

The sort is just the same as in the original chapter:

say sort &my_comp, <22ac 34bd 56aa3 12c; 4abc( 1ca 45bc>;
     # -> (56aa3 4abc( 22ac 45bc 34bd 12c; 1ca)

The transformation subroutine is simple enough to be easily replaced by a code block:

my @unsorted = <22ac 34bd 56aa3 12c; 42acd 12cd; 4abc( 1ca 45bc 3dab!>;
my @sorted = sort {/\d+ (\w+)/; $0 // Nil}, @unsorted;
say @sorted; 
     # -> [56aa3 4abc( 22ac 42acd 45bc 34bd 12c; 1ca 12cd; 3dab!]

This can also be written with a method invocation syntax:

my @sorted = @unsorted.sort: {/\d+ (\w+)/; $0 // Nil};

A.7.5  Exercise ??: Nested Sum (p. ??)

The most obvious way to compute the sum of all values contained in nested lists or arrays is to use nested loops. For example:

my @AoA = [[1, 2], [3], [4, 5, 6]];
sub nested-sum (@array) { 
    my $sum; 
    for @array -> $item { 
        for $item.flat -> $nested_item {
            $sum += $nested_item;
        }
    } 
    return $sum
}
say nested-sum @AoA;  # -> 21

The only slight syntactical difficulty here is that we need to flatten the $item sublists in order to traverse them. This could also be done with the “|” operator:

        for |$item -> $nested_item {

Here is another way to do it, using a for loop for traversing the outer array and a reduction operator for adding the elements of the nested lists:

my @AoA = [[1, 2], [3], [4, 5, 6]];
sub nested-sum (@array) { 
    my $sum; 
    for @array -> $item { 
        $sum += [+] $item;
    } 
    return $sum
}
say nested-sum @AoA;  # -> 21

Using map for flattening the nested lists and the reduction operator can make this code considerably shorter:

my @AoA = [[1, 2], [3], [4, 5, 6]];
sub nested-sum (@array) { 
    return  [+]  map {|$_}, @array;
}
say nested-sum @AoA;  # -> 21

Comparing this solution needing one line of actual code with the first one shows how expressive the functional programming style can be for handling arrays and lists and hopefully tells you one of the reasons why I have been insisting on this programming style in this chapter.

These solutions work well because it is known that there are at most lists and nested lists (lists of lists). What if the level of “nestedness” is not known in advance and can be higher than two? A solution would be to use a recursive subroutine to explore the tree of lists:

my @AoA = [[1,2], [3], [4,5,6], [3, [7,6, [3,2]]]];
sub nested-sum ($input) { 
    my $sum = 0; 
    for |$input -> $item { 
        if $item.WHAT ~~  Int {
            $sum += $item;
        } else {
            $sum += nested-sum $item;
        }
    } 
    return $sum;
}
say nested-sum @AoA;  # -> 42

Remember that a recursive approach is often an efficient tool when dealing with nested or chained data.

A.7.6  Exercise ??: Cumulative Sum (p. ??)

To compute the cumulative sum of a list of numeric values, we just need an accumulator and we push the value of the accumulator each time through the iteration on the array:

my @numbers = <2 5 7 6 5 3 6 8>;
say cumul-sum(@numbers); # -> [2 7 14 20 25 28 34 42]

sub cumul-sum (@array) {
    my @cumulative;
    my $partial_sum = 0;
    for @array -> $element {
        $partial_sum += $element;
        push @cumulative, $partial_sum;
    }
    return @cumulative;
}

But guess what? The code can be much shorter with functional programming. Remember that the reduction metaoperator can give you a list of partial results:

my @numbers = <2 5 7 6 5 3 6 8>;
say [\+] @numbers;    # -> (2 7 14 20 25 28 34 42)

You might think at this point that I have designed these exercises to make my point about the expressive power of functional programming. Not at all! Both this exercise and the previous one are straight from the list chapter of Allen Downey’s Think Python book on which this book is loosely based. I haven’t written these two exercises, but only the solutions presented here.

A.7.7  Exercise ??: Middle (p. ??)

The easiest way to produce a new list that contains all but the first and last elements of a given list is probably to simply use a slice:

say middle(5..10);    # -> (6 7 8 9)
sub middle (@array) { 
    return @array[1..*-2] 
}

Note that *-1 refers to the index of the last element of an array. To discard the last element, we limit the range to *-2.

A.7.8  Exercise ??: Chop (p. ??)

The basic difference with the previous exercise is that the array should be modified in place, rather than returned from the function.

Here’s one possible soluution, which uses the shift and pop functions to remove respectively the first and the last element of the array:

my @nums = 5..10;
chop-it(@nums); 
say @nums;   # -> [6 7 8 9]

sub chop-it (@array) { 
    shift @array; 
    pop @array; 
    return;
}

Using a slice is somewhat simpler; just make sure to assign the slice to the array in order to modify the array in place:

sub chop-it (@array) { 
    @array = @array[1..*-2];
    return;
}

A.7.9  Exercise ??: Subroutine is-sorted (p. ??)

To check whether a list is sorted, we just need to iterate over its items, keep track of the previous value and compare the current value with the previous one. Return false if any pair of values does not satisfy the comparison, and return true upon getting to the end of the iteration:

sub is-sorted (@array) {
    my $previous = @array[0];
    for @array -> $current {
        return False if $current < $previous;
        $previous = $current;
    }
    return True;
}
say is-sorted < 2 4 5 7 7 8 9>;    # -> True
say is-sorted < 2 4 5 7 6 8 9>;    # -> False

Another approach might be to simply compare the input list with a sorted version of the same:

sub is-sorted (@array) {
    return @array eqv @array.sort;
}
say is-sorted < 2 4 5 7 7 8 9>;    # -> True
say is-sorted < 2 4 5 7 6 8 9>;    # -> False

While this leads to short and simple code, this is not an optimal solution, because it forces the program to sort the input array, which is significantly more costly than just traversing the array, at least when the array to be checked is large.

Once again, functional programming and especially the reduction hyperoperator can lead to much shorter code than the first solution, without incurring the cost of an additional sort:

sub is-sorted (@array) {
    return [<=] @array }
}
say is-sorted < 2 4 5 7 7 8 9>;    # -> True
say is-sorted < 2 4 5 7 6 8 9>;    # -> False

By the way, this last version of is-sorted will “short-circuit” and return False as soon as it has found values not in the proper order, without iterating over the rest of the list.

A.7.10  Exercise ??: Subroutine is-anagram (p. ??)

When comparing two words to see if they are anagrams, we can start by returning false if they are not the same length, since anagrams obviously have the same letter count. This might make the process faster if the detailed process to compare two strings is time consuming, by avoiding the time-consuming part for cases that will obviously not match.

We don’t want to try every permutation of letters since it would take a long time. The easiest way to check for anagrams is probably to start by normalizing the input strings, i.e., reorganizing them in such a way that they can easily be compared. The most obvious way is just to sort the letters of the two words and compare the results:

sub is-anagram (Str $word1, Str $word2) {
    return False if $word1.chars != $word2.chars;
    return False if $word1.comb.sort ne $word2.comb.sort;
    True;
}
for <ban bane post stop pots stop post pots pots taps> -> $w1, $w2 {
    say "$w1 $w2:\t", is-anagram $w1, $w2;
}

This produces the following output:

$ perl6 anagrams.pl6
ban bane:       False
post stop:      True
pots stop:      True
post pots:      True
pots taps:      False

Note that this works correctly because the ne operator coerces its argument into a string before performing the comparison.

This code can be made shorter (but possibly slightly less efficient) by returning directly the comparison of the sorted versions:

sub is-anagram (Str $word1, Str $word2) {
    return $word1.comb.sort eq $word2.comb.sort;
}

A.7.11  Exercise ??: Subroutine has-duplicates (p. ??)

Within the context of what we have seen so far, the easiest way to find out if a list of strings has duplicates is probably to sort the list, so that possible duplicates will be adjacent, and to compare each item of the sorted array with the previous (or next) one:

say has-duplicates( < a b c df g xy z r e >); # -> False
say has-duplicates( < a b c df g xy z c e >); # -> True

sub has-duplicates (@array) {
    my @sorted = sort @array;
    for 1..@sorted.end -> $i {
        return True if @sorted[$i] eq @sorted[$i - 1];
    }
    False;
}

Here, the loop starts on index 1 (and not 0) because each item is compared with the previous one.

Another way is to iterate on the elements of the sorted array and to keep track of the previous item to enable the comparison:

say has-duplicates( < a b c d f y z r e >); # -> False
say has-duplicates( < a b c d f y z c e >); # -> True

sub has-duplicates (@array) {
    my @sorted = sort @array;
    my $previous = shift @sorted;
    for @sorted -> $item {
        return True if $item eq $previous;
        $previous = $item;
    }
    False;
}

Another possibility is to use the unique function of Perl 6, which returns a sequence of unique values from the input list or array. Comparing the item count of the output of unique with the element count of the original list will tell us whether some duplicates were removed by unique:

sub has-duplicates (@array) {
    my @unique-items = unique @array;
    return False if @unique-items.elems == @array.elems;
    True;
}

This could be rewritten more concisely by chaining the method invocations:

sub has-duplicates (@array) {
    @array.unique.elems != @array.elems;
}

Note that Perl also has a repeated built-in function, which is the counterpart of unique and returns the duplicates of a list:

say <a b c d b f d>.repeated;  # -> (b d)

The has-duplicates subroutine can just coerce the output of repeated into a Boolean:

sub has-duplicates (@array) {
    ?@array.repeated
}

Another efficient way of finding or removing duplicates from a list or an array is to use hashes, a built-in data structure which we cover in Chapter ?? (see Exercise ??).

A.7.12  Exercise ??: Simulating the Birthday Paradox (p. ??)

For simulating the birthday paradox, we need to generate random integers between 1 and 365 (each integer representing a date in the year). For the sake of simplicity, we will generate random integers between 0 and 364, which is equivalent for our purposes.

We will run the simulation 1,000 times:

sub has-duplicates (@array) {
    return ?@array.repeated
}

sub has-duplicate-birthdays (Int $num-students) {
    my @blist;
    for 1..$num-students {
        push @blist, 365.rand.Int; # numbers between 0 and 364
    }
    return has-duplicates(@blist);
}

my $dupl-count = 0;
my $nb-tests = 1000;
for 1..$nb-tests {
    $dupl-count++ if has-duplicate-birthdays 23; # 23 students
}
say "On $nb-tests tests, $dupl-count had at least one duplicate birthday";

Note that we have reused the has-duplicates subroutine of the previous exercise. It is so short that its code could have been inlined in the populate-birthdays subroutine, but it is generally considered good practice to reuse software components that have been developed and tested.

Running the program four times gave the following results:

$ perl6 birthdays.pl6
On 1000 tests, 498 had at least one duplicate birthday

$ perl6 birthdays.pl6
On 1000 tests, 505 had at least one duplicate birthday

$ perl6 birthdays.pl6
On 1000 tests, 527 had at least one duplicate birthday

$ perl6 birthdays.pl6
On 1000 tests, 491 had at least one duplicate birthday

This simulation confirms that with a sample of 23 persons, there is an approximate 50% probability that at least two will have the same birthday.

Note that Perl has a roll built-in that returns randomly selected elements from a list. This can make the populate-birthdays subroutine significantly more concise:

sub has-duplicate-birthdays (Int $num-students) {
    has-duplicates( (^365).roll($num-students) )
}

A.7.13  Exercise ??: Comparing push and unshift (p. ??)

Populating an array with either push or unshift is something you’ve seen before. The only new thing here is to compare run times of various solutions. The now function returns the number of seconds elapsed since a theoretical start point called “the Epoch,” usually January 1, 1970. Calling now once before running a script and once after will tell us how long it ran through a simple subtraction.

my $start_push = now;
my @push_array;
for 'words.txt'.IO.lines -> $line {
    push @push_array, $line;
}
say "push took " ~ now - $start_push ~ " seconds.";
@push_array = ();

my $start_unsh = now;
my @unsh_array;
for 'words.txt'.IO.lines -> $line {
    unshift @unsh_array, $line;
}
say "unshift took " ~ now - $start_unsh ~ " seconds.";
@unsh_array = ();

This is a sample run of this program:

push took 1.870107 seconds.
unshift took 2.2291266 seconds.

Try it for yourself and run it several times. You should probably notice that push is consistently faster than unshift, even though the difference is not that large.

The reason is presumably that since unshift is inserting items at the start of the array, Perl has to move data around in order to reorganize the whole array many times over, whereas, using push for inserting items at the end of the array implies less internal house keeping.

As a further exercise, you may try to explore other ways to populate an array, such as append or splice.

If you are just going to insert each line from the input file into an array without changing anything to those lines, then slurping the data into the array without a loop will be simpler and much faster:

my $start_slurp = now;
my @slurp_array = 'words.txt'.IO.lines;
say "slurp took " ~ now - $start_slurp ~ " seconds.";

This is four to five times faster:

slurp took 0.42602506 seconds.

Note that you don’t really need to call the now function at the beginning of the program: you can use INIT now to retrieve the time when the program began to run:

my @slurp_array = 'words.txt'.IO.lines;
say "slurp took " ~ (now - INIT now) ~ " seconds.";

A.7.14  Exercise ??: Bisection Search in a List (p. ??)

We can start with a recursive bisection algorithm:

sub bisect (@word_list, Str $word) {
    my $index = (@word_list.elems / 2).Int;
    return False if $index == 0 and @word_list[$index] ne $word;
    my $found = @word_list[$index];
    if $word lt $found {
        # search the first half
        return bisect @word_list[0..$index-1], $word;
    } elsif $word gt $found {
        # search the second half
        return bisect @word_list[$index+1..*-1], $word;
    }
    True;     # if we get there, we've found the word
}

for <a f w e q ab ce> -> $search { 
    if bisect [<a b d c e f g>], $search {
        say "found $search";
    } else {
        say "did not find $search";
    }
}

This will display the following output:

found a
found f
did not find w
found e
did not find q
did not find ab
did not find ce

There are a couple of weaknesses, though, in this implementation. First, on each recursive call, bisect passes as an argument an array that may be quite large, and this is not very efficient both in terms of memory usage (to store the successive subsets of the original array) and in terms of CPU cycles (the time to copy these arrays).

In addition, we can figure out whether the target word can be found in the list (and there are many cases where we don’t need more information than that), but we don’t know where it was found (i.e., on which subscript of the original array), which is often what is really needed.

A better option might be to have only one copy of the original array, say as a global variable, and to pass around subscript ranges. But global variables are usually very much frowned upon because they tend to go against the tenets of structured programming and can be dangerous (even though this would arguably be a case where a global variable does make some sense). We can actually do better than global variables and still have the benefit of not passing the whole array around again and again thanks to the fact that, in Perl 6, subroutines are closures, which means that they can use variables that exist in the environment where they are created.

In the following code, bisect is no longer a recursive subroutine; it is a very simple subroutine that just sets up the environment for bisect2, which is the recursive routine and is defined within the body of bisect. Because the array and the searched word exist within bisect, bisect2 will be able to access to them. The parameters to bisect2 are now just two subscripts representing the range in which it will have to look up for the word:

sub bisect (Str $word, @word_list) {
    sub bisect2 ($low_idx, $high_idx) {
        my $mid_idx = (($low_idx + $high_idx) /2).Int;
        my $found = @word_list[$mid_idx];
        return $mid_idx if $word eq $found;
        return -1 if $low_idx >= $high_idx;
        if $word lt $found {
            # search the first half
            return bisect2 $low_idx, $mid_idx - 1;
        } else {
            # search the second half
            return bisect2 $mid_idx+1, $high_idx;
        }
    }
    my $max_index = @word_list.end;
    return bisect2 0, $max_index;
}

for <a f w e q ab ce g> -> $search { 
    my $result = bisect $search, [<a b d c e f g>];
    if $result == -1 {
        say "did not find $search";
    } else {
        say "found $search on position $result";
    }
}

As a further exercise, adapt the above program to search for English words in words.txt. Notice how fast this is. Please be aware that it works correctly because the words in this file are listed in alphabetical order.

Try to change the code to count and display the number of steps necessary to find a given word. Compare this with the number of steps it would take on average for a linear search (i.e., traversing the array linearly until the word is found or can be declared to be absent). Can you guess why this search algorithm is sometimes called a logarithmic search?

You may also want to try to write a nonrecursive solution using a loop.

A.7.15  Exercise ??: Reverse Pairs (p. ??)

Finding reverse pairs requires reading each word of the list and checking the list to see whether the reversed words exist in the list. This means that you are going to look up about 113,000 words in a list having 113,000 words. Your lookup method needs to be efficient. The obvious solution is to use the bisection search implemented in the previous exercise:

sub bisect (Str $word, @word_list) {
    # see the code in the previous exercise
}

my @array = 'words.txt'.IO.lines;

for @array -> $word {
    my $reverse = $word.flip;
    my $res = bisect $reverse, @array;
    say "$word and $reverse form a reverse pair" if $res >= 0;
}
say now - INIT now;

On my laptop (a decent box, but not a racehorse), the whole process ran in about 42 seconds, i.e., less than 0.4 millisecond per lookup.

If you think about it, the for loop in the code above is really filtering from the word list those words belonging to a reverse pair. This could be implemented with a grep using the bisect subroutine to select the matches:

say "$_ and $_.flip() form a reverse pair" 
    for @array.grep( { bisect( .flip, @array ) >= 0 } );

With the algorithm used here, each reverse pair is found twice (once for each word of the pair). When examining any given word from the list, we actually don’t need to search backward in the part of the list before that word because if that word forms a pair with another word that comes before in alphabetic order, we’ve already found the pair when processing that other word. So it would be more efficient and faster to search only forward, i.e., to look for the reverse word in the part of the list coming after the word being examined. As a further exercise, modify the for loop to search words forward.

A.7.15.1  Comparing bisection search with hash lookup

Bisection search is fairly fast, but hash lookup is even faster. Although we haven’t studied hashes yet, try the following code:

my %hash = map { $_ => 1}, 'words.txt'.IO.lines;
for %hash.keys -> $word {
    my $reverse = $word.flip;
    say "$word and $reverse form a reverse pair" 
        if %hash{$reverse}:exists;
}
say now - INIT now;

Don’t worry about understanding the code for the time being, but notice how much shorter it is. And how much faster it runs: on the same laptop, the execution time is about 16 seconds (less than 0.15 millisecond per lookup). I hope this will whet your appetite for Chapter ??.

Note that the output of this example is not sorted because a hash does not keep the order of the input data, as we will see in Chapter ??. It would be fairly easy to keep the sort order, for example by using an array in addition to the hash, but that is not really the subject here.

A.7.15.2  Creating and using a module

Coming back to the bisect subroutine, copying and pasting this subroutine from the program of the previous exercise into the code of this exercise is not the best way to reuse code. Suppose a bug is found in that subroutine; it now needs to be fixed in two different programs; the chance that the bug gets corrected in one program and forgotten for the other is quite significant. Even if it is not forgotten, this is twice the same work, and this increases the chance of making another mistake in the process. The bug fix also needs testing twice. Even if there is no bug, we might need an enhancement and this again has to be done twice.

A good way to reuse software while maintaining only one copy of the reused subroutine is to insert it into a Perl module, i.e., in a separate file that will be loaded into our programs needing to use it.

The module file might be named BisectSearch.pm and contain the following code:

unit module BisectSearch;

sub bisect (Str $word, @word_list) is export {
    sub bisect2 ($low_idx, $high_idx) {
        my $mid_idx = (($low_idx + $high_idx) /2).Int;
        my $found = @word_list[$mid_idx];
        return $mid_idx if $word eq $found;
        return -1 if $low_idx >= $high_idx;
        if $word lt $found {
            # search the first half
            return bisect2 $low_idx, $mid_idx - 1;
        } else {
            # search the second half
            return bisect2 $mid_idx+1, $high_idx;
        }
    }
    my $max_index = @word_list.end;
    return bisect2 0, $max_index;
}

sub some-other-sub is export {
    # does something useful
}

Note that the module name given at the top of the code and the root file name have to correspond. The only other change to the code of the subroutine is adding the is export trait to the signature of the subroutine.

Now a Perl program will be able to load this module and to use the bisect and some-other-sub subroutines. For example:

use lib ".";  # tells Perl to look for modules in the current dir
use BisectSearch;

my @array = 'a'..'m';
for < a f w e q ab ce g > -> $search { 
    my $result = bisect $search, @array;
    if $result == -1 {
        say "did not find $search";
    } else {
        say "found $search : item # $result";
    }
}

Perl has a list of places to look for modules, which may vary from one Perl installation to another. The first line use lib "."; tells Perl to also look for modules into the current directory. This is just an example; you might prefer using a dedicated directory for your modules. The second line use BisectSearch; tells Perl to load the module and import the exported subroutines. Now, the program can use the bisect subroutine just as if it had been defined within the program.

That’s it, folks! Simple, isn’t it? Just try it! Well, there are a few more things to know about modules, but you already know enough to create and use modules.

You might want to review some of the other subroutines we have written so far and stick those that might be useful again into a module. Hint: some of the array and string subroutines we’ve seen are likely candidates.

A.7.16  Exercise ??: Interlocking Words (p. ??)

First, it seems that it was a good idea to create the BisectSearch module, it’s going to be reused immediately.

Second, we need some thinking. The first idea that might come to mind to solve the problem might be to have a nested loop on the word list in order to find all pairs of two words, interlock them, and see whether the resulting combined string exists in the list. But this is quite bad because this means creating 113,000 squared pairs, i.e., more than 12.5 billion pairs. Even if a relatively large part of these pairs can be eliminated before having to look up in the word list since a pair can be interlocked only if the letter count difference between the two words is 0 or 1, checking all these pairs will take ages.

Let us see what happens if we work the other way around: for each word on the word list, we “intersplit” the word into one string with the even-rank letters and one with the odd-rank letters, and then check if these substrings belong to the list. At most, we will need 226,000 searches–in fact much less because we don’t need to look up for the second string if the first string did not match anything.

This is our suggested solution:

use lib ".";
use BisectSearch;

my @array = 'words.txt'.IO.lines;
for @array -> $word {
    my ($word1, $word2) = intersplit($word);
    say "$word: $word1, $word2" if bisect($word1, @array) >= 0 
        and bisect($word2, @array) >= 0;
}

sub intersplit (Str $word) {
    my @letters = $word.comb;
    my $evens = join '', map {@letters[$_] if $_ %% 2}, @letters.keys;
    my $odds = join '', map {@letters[$_] if $_ % 2}, @letters.keys;
    return ($evens, $odds);
}

The intersplit subroutine is not optimal in the sense that it traverses the @letters array twice each time it is called. We can improve it using a pointy block taking two parameters (one odd- and one even-rank letters):

sub intersplit (Str $word) {
    my (@evens, @odds);
    for $word.comb -> $even, $odd {
        push @evens, $even;
        push @odds, $odd;
    }
    @evens.join, @odds.join;
}

As a further exercise, can you find any words that are three-way interlocked, that is, every third letter forms a word, starting from the first, second, or third? Hint: it will probably be easier if you start from the revised version of intersplit just above.

A.8  Exercises of Chapter ?? (Hashes)

A.8.1  Exercise at the end of Section ??: A hash Is a Mapping (p. ??)

Here’s how to populate one pair at a time:

my %wages;
%wages{"Liz"} = 3000;
%wages{"Bob"} = 2500;
%wages{"Jack"} = 2000;
%wages{"Betty"} = 1800;
say "Bob's salary is %wages{'Bob'}";
for <Liz Jack> -> $employee {
    say "The salary of $employee is %wages{$employee};
}

You can avoid quotation marks around the keys by using the <...> angle brackets operator:

my %wages;
%wages<Liz> = 3000;
%wages<Bob> = 2500;
# ...
say "Bob's salary is %wages<Bob>";

And here’s how to assign the full hash in one go:

my %wages = Liz => 3000, Bob => 2500, Jack => 2000, Betty => 1800;
say %wages; # -> Betty => 1800, Bob => 2500, Jack => 2000, Liz => 3000

A.8.2  Exercise ??: Storing the Word List into a Hash (p. ??)

The standard way to store the word list in a hash might be to read each line of the file in a for loop and store each word as the key of the hash. The content of the value is not important; we will store 1 (it may also make sense to store the True Boolean value):

my %words;
for 'words.txt'.IO.lines -> $line {
    %words{$line} = 1
}

An alternative approach is to assign to the hash the output of a map expression returning a pair for each line of the file:

my %hash = map { $_ => 1}, 'words.txt'.IO.lines;

A.8.3  Exercise ??: Memoizing the Ackermann Function (p. ??)

The original implementation of the Ackermann function looked like this:

sub ack ($m, $n) {
    return $n + 1 if $m == 0;
    return ack($m - 1, 1) if $n == 0;
    return ack($m - 1, ack($m, $n-1));
}

It is not possible to memoize the cases where either $m or $n is zero, because the other value is unknown. Only the code corresponding to the last code line can be memoized, but that’s okay because it does the bulk of the work anyway.

The next problem is that the hashes seen so far had only one key, but the Ackermann function takes two parameters. The simple workaround is to create a composite key, i.e., to concatenate the two parameters with a separator to create the keys of the hash. This leads to this possible solution:

my %ack-memo;
sub mem-ack (Int $m, Int $n) {
    return $n + 1 if $m == 0;
    return mem-ack($m - 1, 1) if $n == 0;
    %ack-memo{"$m;$n"} = mem-ack($m - 1, mem-ack($m, $n-1))
        unless %ack-memo{"$m;$n"}:exists;
    return %ack-memo{"$m;$n"};
}
say mem-ack 3, 4;

To benchmark the two solutions, you may use the following code:

my %ack-memo;
sub mem-ack (Int $m, Int $n) {
    return $n + 1 if $m == 0;
    return mem-ack($m - 1, 1) if $n == 0;
    %ack-memo{"$m;$n"} = mem-ack($m - 1, mem-ack($m, $n-1)) 
        unless %ack-memo{"$m;$n"}:exists;
    return %ack-memo{"$m;$n"};
}
my $start = now;
say mem-ack 3, 4;
say "mem-ack runtime: ", now - $start;
dd %ack-memo;

sub ack ($m, $n) {
    return $n + 1 if $m == 0;
    return ack($m - 1, 1) if $n == 0;
    return ack($m - 1, ack($m, $n-1));
}
$start = now;
say ack 3, 4;
say "ack runtime: ", now - $start;

But don’t try to run it with values of $m greater than 3; it is useless. If we were to find an Ackermann value for a pair of numbers already seen, that would mean that we have entered an infinite loop. So there is in fact no point trying to memoize the Ackermann function.

We have used composite keys for %ack-memo, but we can have multidimensional hashes just as there are multidimensional arrays (see Section ??. We only need to have two keys, each between its pair of curly brackets:

my %h;
%h{'a'}{'b'}= 'ab';
%h{'a'}{'c'}= 'ac';
%h{'a'}{'d'}= 'ad';
%h{'b'}{'c'}= 'bc';
dd %h; 
# -> Hash %h = {:a(${:b("ab"), :c("ac"), :d("ad")}), :b(${:c("bc")})}

or use a semi-colon to separate the keys:

my %i;
%i{'a';'b'} = 'ab';
%i{'a';'c'} = 'ac';
%i{'b';'c'} = 'bc';
dd %i; # -> Hash %i = {:a(${:b("ab"), :c("ac")}), :b(${:c("bc")})}

A.8.4  Exercise ??: Finding Duplicates with a Hash (p. ??)

We need to loop on the array, store the array elements in a hash and detect whether an element is found in the hash. Here’s one way to do that:

sub has-duplicates (@array) {
    my %seen;
    for @array -> $elmt {
        return True if %seen{$elmt}:exists;
        %seen{$elmt} = 1;
    }
    return False;
}

As a further exercise, generate a list of 50,000 random integers between 0 and 1,000,000,000, and then, using the various methods we have demonstrated, check to see whether this list contains any duplicates and measure the runtime of these various methods. If you encounter difficulties doing this, take a look at the solutions to the “has duplicates” (see Subsection ??) and “birthday paradox” (see Subsection ??) exercises to get some coding clues. An example of simple benchmarking is presented in the exercise just above.

Once your subroutines are working properly, launch the whole process at least 10 times to see if the differences are significant.

A.8.5  Exercise ??: Rotate Pairs (p. ??)

Consider the word “iron” and rotate it by three letters. This gives the word “folk”. This also means that if “folk” is rotated by 23 letters, we will get “iron.” Since we are going to scan all the words of our word list, we will find this “rotate pair” when we try a shift of three letters on “iron”, so that there no need to try a 23-letter rotation on “folk.” More generally, we need to try only rotations between 1 and 13 letters.

The following code iterates through the words of the list, rotates each of them by every shift between 1 and 13, and looks up the result in the hash:

sub rotate-one-letter (Str $letter, Int $shift) {
    my $last  = 'z'.ord;       # last lower-case letter
    my $rotated-ord = $letter.ord + $shift;
    if $letter ~~ /<[a..z]>/ { 
        $rotated-ord -= 26 if $rotated-ord > $last;
    } else {
        return $letter;
    }
    return $rotated-ord.chr;
}

sub rotate-one-word (Str $word, Int $shift) {
    my $rotated-word = "";
    for 0..$word.chars - 1 {
        $rotated-word ~=  rotate-one-letter substr($word, $_, 1), $shift;
    }
    return $rotated-word;
}

my %words = map { $_ => 1}, 'words.txt'.IO.lines;

for %words.keys -> $string {
    for 1..13 -> $shift {
        my $rotated = rotate-one-word $string, $shift;
        say " $string and $rotated are shifted by $shift"
            if %words{$rotated}:exists;
    }
}

Rotating each word of a 113,000 list by each shift between 1 and 13 is quite long. Running the program on the word list will take some time, probably about 10 to 15 minutes. Using the .trans built-in (see documentation in https://docs.perl6.org/routine/trans) might speed up the process. Try it and judge for yourself.

A.8.6  Exercise ??: Homophones (p. ??)

We are looking for words that sound the same when we remove either the first or the second letter.

This is a solution using both the words.txt word list used before and the CMU phonetic dictionary:

my %phonetic;

sub load-phonetic ($file-name) {
    for $file-name.IO.lines -> $line {
        next if $line !~~ /^\w/; 
        my ($key, $val) = $line.split("  ", 2);
        $key = lc $key;
        %phonetic{$key} = $val;
    }
}

load-phonetic('cmu_dict.txt');
my %words = map { $_ => 1}, 'words.txt'.IO.lines;

say "Starting the search";

for %words.keys -> $word {
    next unless %phonetic{$word}:exists;
    my $shorter = $word.substr(1);
    next unless %words{$shorter}:exists;
    next unless %phonetic{$shorter}:exists;
    next unless %phonetic{$word} eq %phonetic{$shorter};
    my $other-shorter = $word.substr(0, 1) ~ $word.substr(2);
    next unless %words{$other-shorter}:exists;
    next unless %phonetic{$other-shorter}:exists;
    next unless %phonetic{$other-shorter} eq %phonetic{$shorter};
    say "$word $shorter $other-shorter %phonetic{$shorter}"
}

But this is somewhat inefficient because we don’t actually need the word list, since the CMU dictionary is another word list that we can use (and we can’t use words that would be in the word list and not in the CMU dictionary, because the program wouldn’t be able to figure out how they sound). The following program uses only the CMU dictionary and saves the time to load the word list and do checks on it:

my %phonetic;

sub load-phonetic ($file-name) {
    for $file-name.IO.lines -> $line {
        next if $line !~~ /^\w/; 
        my ($key, $val) = $line.split("  ", 2);
        $key = lc $key;
        %phonetic{$key} = $val;
    }
}

load-phonetic('cmu_dict.txt');

for %phonetic.keys -> $word {
    my $shorter = $word.substr(1);
    next unless %phonetic{$shorter}:exists;
    next unless %phonetic{$word} eq %phonetic{$shorter};
    my $other-shorter = $word.substr(0, 1) ~ $word.substr(2);
    next unless %phonetic{$other-shorter}:exists;
    next unless %phonetic{$other-shorter} eq %phonetic{$shorter};
    say "$word $shorter $other-shorter %phonetic{$shorter}"
}

A.9  Exercises of Chapter ??

A.9.1  Exercise in Section ??: the given ... when Switch Statement (p. ??)

To test the switch statement with various values, you might write something like this:

for <5 42 43 101 666 1024 2048> -> $value {
    given $value {
        when 0..9      { say "$_: One digit"}
        when 10..99    { say "$_: Two digits" ; proceed; }
        when 42        { say "$_: Response to the question" }
        when /^\d**3$/ { say "$_: Three digits" }
        default        { say "$_: More than three digits" }
    }
    say '';
}

This will display the following result:

5: One digit

42: Two digits
42: Response to the question

43: Two digits
43: More than three digits

101: Three digits
(...)

You can see the error when the input value is 43.

As a solution, it is possible to change the order of the when clauses:

for <5 42 43 101 666 1024 2048> -> $value {
    given $value {
        when 0..9      { say "$_: One digit"}
        when 42        { say "$_: Response to the question"; proceed; }
        when 10..99    { say "$_: Two digits"}
        when /^\d**3$/ { say "$_: Three digits" }
        default        { say "$_: More than three digits" }
    }
    say '';
}

This now works correctly, but the output for 42 is no longer in the same order. If we want to keep the original order, we may need to add a when statement with an empty block:

for <5 42 43 101 666 1024 2048> -> $value {
    given $value {
        when 0..9      { say "$_: One digit"}
        when 10..99    { say "$_: Two digits"; proceed}
        when 42        { say "$_: Response to the question"; }
        when 10..99    { }
        when /^\d**3$/ { say "$_: Three digits" }
        default        { say "$_: More than three digits" }
    }
    say '';
}

Or we could remove the need for proceed by inserting the code for the 42 case into the two-digit block:

for <5 42 43 101 666 1024 2048> -> $value {
    given $value {
        when 0..9      { say "$_: One digit"}
        when 10..99    { say "$_: Two digits"; 
                         say "$_: Response to the question" if $_ == 42
                       }
        when /^\d**3$/ { say "$_: Three digits" }
        default        { say "$_: More than three digits" }
    }
    say '';
}

It would also be possible to nest a when subexpression within the when 10..99 expression:

for <5 42 43 101 666 1024 2048> -> $value {
    given $value {
        when 0..9      { say "$_: One digit"}
        when 10..99    { say "$_: Two digits"; 
                         when 42 {say "$_: Response to the question";}
                       }
        when /^\d**3$/ { say "$_: Three digits" }
        default        { say "$_: More than three digits" }
    }
    say '';
}

A.9.2  Exercise in Section ??: Constructing New Operators (p. ??)

The “!” negation operator is a prefix operator (i.e., placed before the term that it negates). For the factorial operator, we need a postfix operator (placed after the term upon which it acts), so this difference will be sufficient to enable the Perl compiler to distinguish between the two operators.

We use the reduction metaoperator to compute the result:

sub postfix:<!> (Int $n) {
    [*] 2..$n;
}
say 5!; # -> 120

The signature ensures the operand is an integer (failing which we get an error). We may want to guard against a negative integer, which we can do by raising an error if $n is negative. In addition, we can use the Test standard module to automatize our tests:

sub postfix:<!> (Int $n) {
    fail "The operand is not a positive integer" if $n < 0;
    [*] 2..$n
}
use Test;
plan 5;
dies-ok {(-1)!}, "Factorial fails for -1";
eval-dies-ok "(2.5)!", "Factorial fails for 2.5";
ok 0! == 1, "Factorial 0";
ok 1! == 1, "Factorial 1";
ok 5! == 120, "Factorial of a larger integer";
done-testing;

The plan 5; line says that the test plan contains five individual tests. Then the two first tests check that the factorial operator fails for invalid input values. And it checks the output for some valid input.

The done-testing specifies that the test has finished. This function is really useful when you don’t have a plan, for example when you don’t know yet how many test you’ll run. Here, we have a plan, so using done-testing isn’t necessary.

The following is the output of the tests:

1..5
ok 1 - Factorial fails for -1
ok 2 - Factorial fails for 2.5
ok 3 - Factorial 0
ok 4 - Factorial 1
ok 5 - Factorial of a larger integer

If we had a test error on test 3, we would have obtained something like this:

ok 1 - Factorial fails for -1
ok 2 - Factorial fails for 2.5
not ok 3 - Factorial 0

# Failed test 'Factorial 0'
# at test_fact.pl6 line 8
ok 4 - Factorial 1
ok 5 - Factorial of a larger integer
1..5
# Looks like you failed 1 test of 5

Here, we have put the tests in the same file as the subroutine definition for the sake of simplicity of the example. Normally, the tests would be in a separate file, usually in a “t” directory and with a .t extension.

Testing and the Test module are further discussed in Section ?? (p. ??). More information about testing can be found at: https://doc.perl6.org/language/testing.

A.9.3  Exercise in Section ??: Sets, Bags and Mixes (p. ??)

We can’t just replace the %histogram with a bag, because bags are immutable (i.e., cannot be changed after creation) and the %histogram hash is populated progressively as the lines of the book are being read from the file. You may use a baghash (the mutable version of a bag) and are encouraged to try it.

However, the aim here is to extract the words of the book that are not in the word list. In other words, we no longer care about word frequencies, but just need a unique list of words that appear at least once in the book, so a set would be sufficient to satisfy our needs. The question is how to populate the set at creation time.

We can change the process-line subroutine so that it processes the line as previously but, instead of populating a hash, just returns the list of words. And we can create the set with a map function calling that subroutine:

my $skip = True;                    # flag to skip the header
sub process-line(Str $line is copy) {
    $skip = False if defined index $line, "*END*THE SMALL PRINT!";
    next if $skip;
    $line ~~ s:g/<[-']>/ /;         # Replacing dashes and 
                                    # apostrophes with spaces
    $line ~~ s:g/<[;:,!?.()"_`]>//; # removing punctuation symbols
    $line = $line.lc;               # setting string to lower case
    return $line.words;
}

my $book-set = set map { process-line $_},  "emma.txt".IO.lines; 
my $word-list = set "words.txt".IO.lines;
my $unknown-words = $book-set (-) $word-list;
say $unknown-words.keys.head(20);

This works well, but once we’ve done that, we can also get rid of the $book-set data structure and just filter directly the words extracted from the book:

my $skip = True; # flag to skip the header

sub process-line($line is copy) {
    # (same as above)
}

my $word-list = set "words.txt".IO.lines;
my @unknown-words = unique grep {$_ ∉ $word-list}, 
                    grep { $_ }, 
                    map { | process-line $_},  
                    "emma.txt".IO.lines; 
say @unknown-words.head(20);

Testing such a program may take some time, because it has to process the full book each time. For the purposes of initial testing, one tip is to reduce the amount of input data to speed up the tests. You may achieve that by preparing a smaller file with just a limited number of lines from the original emma.txt file. Another simple way is to read only some lines from the book, which you can do with a slice on the code line that reads the file. For example, to read only the first 2,000 lines of the book file:

my @unknown-words = unique grep {$_ ∉ $word-list}, 
                    grep { $_ }, 
                    map { | process-line $_},  
                    ("emma.txt".IO.lines)[0..1999]; 

This can also be used to get rid of the header. Since the actual text of the book starts on line 254, we can have:

my @unknown-words = unique grep {$_ ∉ $word-list}, 
                    grep { $_ }, 
                    map { | process-line $_},  
                    ("emma.txt".IO.lines)[253..1999]; 

and remove from process-line the code to skip the header.

A.9.4  Exercise in Section ??: Random Words (p. ??)

We have made a BisectSearch module containing a bisect subroutine. It would be great to reuse it, but we can’t because it is currently doing string comparisons and we need numerical comparisons.

The best solution at this point is probably to make a copy of the subroutine and modify it to make numeric comparisons. The subroutines can have the same name provided they are declared as multi subroutines and have a different signature: the first parameter of the new multi subroutine should be an Int instead of a Str. Since the changes to be made are quite small and easy, this is left as an exercise for the reader.

The program using that module might look like this:

use lib ".";
use BisectSearch;
my %histogram;

sub process-line(Str $line is copy) {
    $line ~~ s:g/<[-']>/ /; 
    $line ~~ s:g/<[;:,!?.()"_`]>//; 
    $line = $line.lc; 
    return $line.words;
}
%histogram{$_}++ for grep {$_},
                     map { | process-line $_}, 
                     ("emma.txt".IO.lines)[253..*]; 
my (@words, @freqs);
my $total_freq = 0;
for %histogram.kv -> $word, $freq {
    $total_freq += $freq;
    push @words, $word;
    push @freqs, $total_freq;
}
my $rand_int = $total_freq.rand.Int;
my $idx = bisect $rand_int, @freqs;
say @words[$idx];

A.9.5  Exercise in Section ??: Markov Analysis (p. ??)

Before we present our solution to the exercise, we want to briefly introduce a functionality that is useful for retrieving and validating the command-line arguments passed to a program: the MAIN subroutine.

A.9.5.1  The MAIN subroutine

The arguments passed to a program are usually stored in the @*ARGS special array. You can just browse the items of this array to retrieve the arguments. The following one-liner is an example of this:

$ perl6 -e 'say $_ for reverse @*ARGS' one two three
three
two
one

There is however another way to do it, the MAIN subroutine that we briefly discussed in Section ?? (p. ??). If there is a subroutine called MAIN in the program, then the program will start by executing this subroutine, whose parameters will be the arguments passed to the program. This means that the signature of the MAIN subroutine will make it possible to retrieve the parameters and check their validity.

In our example solution below, the MAIN subroutine is declared as follows:

sub MAIN (Str $book, Int $word-count, Int $order = 2, 
            Int $start-line = 0) {
    # body of subroutine here
}

The program will thus check that the arguments passed to it match the MAIN subroutine signature. In the example, the first parameter has to be a string and the second one an integer; the third and fourth parameters are optional and will be defaulted respectively to 2 and 0 if the corresponding arguments are not provided.

If the arguments passed to the program don’t match the MAIN signature, the program will die after having printed an automatically generated usage message:

$ perl6  markov.pl6 emma.txt 100 2 foo
Usage:
  markov.pl6 <book> <word-count> [<order>] [<start-line>]

The $start-line parameter has to be an integer. Since the corresponding argument (“foo”) is not an integer, the program displays a message showing the program usage.

Validating the command-line arguments passed to a program can sometimes be a relatively tedious task. But, with this MAIN subroutine signature mechanism, it can often be reduced to a single line of code, the MAIN signature.

A.9.5.2  Solution to the Markov analysis exercise

This is a possible way to perform a Markov analysis of a text file:

my %prefixes;

sub MAIN (Str $book, Int $word-count, Int $order = 2, 
              Int $start-line = 0) {
    process-line($order, $_) for ($book.IO.lines)[$start-line..*]; 
    say make-text($order, $word-count);
}

sub process-line($order, Str $line is copy) {
    $line ~~ s:g/<[-']>/ /; 
    $line ~~ s:g/<[;:,!?.()"_`]>//; # removing punctuation symbols
    $line = $line.lc;               # setting string to lower case
    return unless $line ~~ /\w/;
    process-words($order, $line.words);
}

sub process-words ($order, @new-words) {
    state @word-buffer = ();
    push @word-buffer, |@new-words;
    while (@word-buffer.elems >= $order * 2) {
        my $key = @word-buffer.shift ~ " " ~ 
             (join ' ', @word-buffer[0..$order - 2]);
        my $value = @word-buffer[$order -1];
        push %prefixes{$key}, $value;
    }
}

sub make-text (Int $order, Int $w-count) {
    my @prefix = %prefixes.keys.pick.words;
    my $count = 0;
    my $text = join " ", @prefix;
    while $count <= $w-count {
        my @possible-suffixes = |%prefixes{join " ", @prefix};
        last unless @possible-suffixes;
        my $new-word = |@possible-suffixes.pick;
        $text ~= " $new-word";
        shift @prefix;
        push @prefix, |$new-word;
        $count++
    }
    return $text;
}     

This program may be called on the emma.txt file with the following syntax:

$ perl6  markov.pl6 emma.txt 100 2 253

A.9.6  Exercises on the Huffman Code in Section ?? (p. ??)

A.9.6.1  The Frequency Table (Section ??)

We have already seen problems similar to this one. This is a possible solution using the pipeline programming model described in Section ?? (page ??):

my %frequencies;
%frequencies{$_}++ for grep {/<[a..z]>/}, map {.lc}, 
    "goldbug.txt".IO.lines.comb;
my $total_count = [+] values %frequencies;
say "$_ :\t%frequencies{$_} \t", 
    sprintf "%5.2f", %frequencies{$_}*100/$total_count  
    for reverse sort {%frequencies{$_}}, %frequencies.keys;

This displays:

e :     7625    13.10
t :     5485     9.42
a :     4477     7.69
o :     4208     7.23
i :     4183     7.18
n :     3912     6.72
s :     3516     6.04
h :     3372     5.79
r :     3278     5.63
d :     2533     4.35
l :     2324     3.99
u :     1893     3.25
c :     1523     2.62
m :     1499     2.57
f :     1392     2.39
w :     1303     2.24
p :     1169     2.01
y :     1146     1.97
g :     1143     1.96
b :     1031     1.77
v :     525      0.90
k :     351      0.60
x :     120      0.21
j :     111      0.19
q :     60       0.10
z :     44       0.08

Remember that Edgar Allan Poe’s character claimed the succession of the most commonly used letters in English ran as follows:

e a o i d h n r s t u y c f g l m w b k p q x z

So it appears that Poe’s character was approximately right, but certainly not very accurate, in his estimates of the letter frequencies in an English text. It appears that he especially grossly underestimated the frequency of the “t” letter. Running the same program against the text of Jane Austen’s novel Emma that we have used previously produces very close results:

e :     87029   12.57
t :     60035    8.67
a :     54884    7.93
o :     53877    7.78
n :     47773    6.90
i :     47172    6.82
s :     42920    6.20
h :     42819    6.19
r :     41453    5.99
d :     28870    4.17
l :     27971    4.04
(...)

A.9.6.2  Huffman Coding of a DNA Strand (Section ??)

At each step in the algorithm, we need to look for the two letters with the lowest frequencies. Rather than having to repeatedly go through all the items in the frequency hash (or to sort the values each time), we will use a data structure maintaining the values sorted according to our needs.

We start with the %frequencies hash built in the previous exercise and transform it into a sorted collection of pairs mapping each letter to its frequency. We create a insert-pair subroutine that adds the newly created pairs (the dummy letters) at the right place in the pair array to keep the array sorted according to our needs:

my %code;
my @pairs;
push @pairs, $_ => %frequencies{$_} for 
    sort {%frequencies{$_}}, %frequencies.keys;

sub insert-pair (@list, $new-elem) {
    my $val = $new-elem.value;
    for @list.keys -> $i {
        if @list[$i].value >= $val {
            splice @list, $i, 0, $new-elem;
            return;
        }
    }
    push @list, $new-elem; # putting the new element at the end of 
                           # the list if right place not found earlier
}

We loop over the pairs, pick up the two with the smallest frequencies, merge them into a new pair, and add it at the right place with the insert-pair subroutine. The loop ends when there are only two pairs left. At the same time, we populate at each step of the loop the new %code hash with the partial codes found:

loop {
    my $least1 = shift @pairs;
    my $least2 = shift @pairs;
    my $new-pair = $least1.key ~ $least2.key => $least1.value + $least2.value;
    insert-pair @pairs, $new-pair;
    %code{$least1.key} =  $least1.key ~ $least2.key ~ "|.";
    %code{$least2.key} =  $least1.key ~ $least2.key ~ "|-";
    last if @pairs <= 2;
}
%code{@pairs[0].key} = ".";
%code{@pairs[1].key} = "-";

At the end of the loop, the pair array contains two pairs:

[c => 10 tga => 11]

and the %code hash contains the partial codes for each letter or dummy letter:

{a => ga|-, c => ., g => ga|., ga => tga|-, t => tga|., tga => -}

We then use another loop to substitute the pseudo-letters and get rid of them, until we are left with only the actual letters of the original input string:

loop {   
    my $done = True;
    for %code.keys -> $letter {
        next if $letter.chars > 1;
        my ($val, $code) = split '|', %code{$letter};
        next unless defined $val and defined $code;
        $done = False;
        my $result = %code{$val} ~ $code;
        %code{$letter} = $result;
    }
    last if $done;
}
my %encode;
%encode{$_} = %code{$_} for grep {$_.chars < 2 }, %code.keys;

The %encode hash contains the Huffman table:

c => .
t => -.
g => --.
a => ---

A.9.6.3  Huffman Coding of a More Complex String (Section ??)

For this question, we will use a small paragraph specially written to contain only a few letters of the alphabet:

Eastern Tennessee anteaters ensnare and eat red ants, detest ant antennae (a tart taste) and dread Antarean anteater-eaters. Rare Andean deer eat tender sea reeds, aster seeds and rats’ ears. Dessert? Rats’ asses.

As a first step, we will simplify a bit the problem by folding all letters to lowercase and use only the letters, eliminating spaces and punctuation from the computation of the frequency table:

my $string = "Eastern Tennessee anteaters ensnare and eat red ants, detest ant
antennae (a tart taste) and dread Antarean anteater-eaters. Rare
Andean deer eat tender sea reeds, aster seeds and rats’ ears. Dessert?
Rats’ asses."; 

my %frequencies;
%frequencies{$_}++ for grep { /\w/ }, $string.lc.comb;

This eloquent treatise on the eating habits of various fauna yields the following frequency table:

e :     40      23.53
a :     32      18.82
t :     24      14.12
s :     22      12.94
n :     20      11.76
r :     19      11.18
d :     13      7.65

Using the same code as in the previous question generates the following Huffman table:

a => ..
e => .-
s => -.-
n => -..
t => --.
d => ---.
r => ----

A.9.6.4  Encoding the Input String (Section ??)

We want not only to encode an input string with the Huffman code, but we also want to then be able to decode it and to recognize the original input. Because of that, we no longer want to filter out punctuation from the translation table, which will therefore grow much larger than before. Spaces (both horizontal spaces and line returns) will be handled differently: we’ll keep them unchanged in the encoded pseudo-Morse string, as this will make it easier to check and to display the result.

The frequency table now includes punctuation characters which exist in the input string:

%frequencies{$_}++ for grep {/<[\w] + [.,()’?-]>/}, $string.lc.comb;

The frequency table now has 14 entries:

e :     40      22.10
a :     32      17.68
t :     24      13.26
s :     22      12.15
n :     20      11.05
r :     19      10.50
d :     13       7.18
. :     3        1.66
, :     2        1.10
’ :     2        1.10
) :     1        0.55
- :     1        0.55
? :     1        0.55
( :     1        0.55

And the Huffman table (%encode hash) now looks like this:

e => .-
a => ---
s => -..
n => ..-
t => --.
r => ...
d => -.--
. => -.-.-.
( => -.-..-.
’ => -.-.--.
? => -.-..--
) => -.-...-
- => -.-....
, => -.-.---

The encoding subroutine is very simple:

sub encoding (Str $input, %encode) {
    my $output;
    for $input.lc.comb -> $letter {
        $output ~= %encode{$letter} // $letter;
    }
    return $output;
}

Each letter of the input is converted to lowercase (since we have limited our table to lower case), translated into its pseudo-Morse code equivalent, and concatenated to the output string. If a letter is not found in the %encode hash, then it is stored into the output as it is: this makes it possible to insert the spaces and end-of-line characters into the output string.

The result is as follows (slightly reformatted to fit in this book):

.-----..--..-.....- --..-..-..-.--..-...-.- ---..---..------..-...-..
.-..--....----....- ---..--.-- .------. ....--.-- ---..---.-..-.-.---
-.--.---..--..--. ---..---.
---..---..-..-..----.- -.-..-.--- --.---...--. --.----..--..--.-...- 
---..--.-- -.--....-----.-- ---..---.---....----..-
---..---..------..-...-.-.....------..-...-..-.-.-. ...---....-
---..--.--.----..- -.--.-.-... .------. --..-..--.--.-... -...----
 ....-.--.---..-.-.--- ----..--..-... -...-.--.---.. ---..--.-- 
 ...-----.-..-.-.--. .----...-..-.-.-. -.--.--..-...-...--.-.-..--
...-----.-..-.-.--. ----..-...--..-.-.-.

Interestingly, the input string has 213 characters and the output string has 589 bits. If we were storing the 14 different characters of the input with equal-length codes, we would need four bits per character, which would require 1052 bits. So Huffman coding achieved a compression ratio 1.78 times better than the best possible equal-length codes. And the ASCII encoding of the input string required 213 bytes, i.e., 1704 bits; the Huffman-encoded output required almost three times less.

A.9.6.5  Decoding the Pseudo-Morse String (Section ??)

For decoding efficiently the pseudo-Morse string, we need to reverse the Huffman table, i.e., create a hash in which the pseudo-Morse codes are the keys and the letters are the values. Reversing the %encode hash is straight forward:

my %decode = reverse %encode.kv;

The %encode.kv expression produces a list of keys and values, and the reverse statement transforms it into a list of values and keys. Assigning that list to a new hash produces a new hash in which keys and values are swapped. Note that this works because we know that the values are unique, so that there is no problem of duplicates when promoting them to hash keys.

Decoding the pseudo-Morse string is a bit more complicated than its encoding, because we don’t know in advance how many dots and dashes will be needed to obtain a letter. So we need to look at the first character (say a dot) of the pseudo-Morse string. If this character alone constitutes an entry in the translation table, then we have found our first letter, and we can start afresh with the next character as a starting point of a new letter; if not, we need to pick up the next character and see whether the two first characters together form an entry; if yes, we have found a letter and can start from the beginning again; if not we need to see whether the first three characters together form an entry in the table, and so on.

For example, with the beginning of the pseudo-Morse string:

.-----..--..-.....-

the first dot is not an entry but the “.-” combination is an “e”. The next dash is not an entry and neither is “--”, but “---” is an “a”. The next dash is not an entry and neither is “-.”, but “-..” is a “s”. Similarly, the next three characters, “--.”, form a “t”, and we can go on to decode the word “eastern”.

We might implement this with two nested loops: one to go through the string and the second one to consume the necessary number of dots and dashes until the end of a letter:

sub decoding (Str $input, %decode) {
    my @codes = $input.comb;
    my $output;
    loop {
        last unless @codes;
        my $current = shift @codes;
        $output ~= $current and next if $current ~~ /\s/;
        $output ~= %decode{$current} and next if %decode{$current}:exists;
        loop {           # we need more characters to complete a letter
            $current ~= shift @codes;
            if %decode{$current}:exists {
                $output ~= %decode{$current};
                last;    # we're done with a letter, go back to main loop
            }
        }
    }
    return $output;
 }

This works properly and the output is the same as the original input (except for the fact that we have folded everything to lowercase):

eastern tennessee anteaters ensnare and eat red ants, detest ant
antennae (a tart taste) and dread antarean anteater-eaters. rare
andean deer eat tender sea reeds, aster seeds and rats’ ears. dessert?
rats’ asses.

However, if you think about it, we don’t really need two nested loops in the decoding subroutine, which can be made a bit more concise as follows:

sub decoding (Str $input, %decode) {
    my ($output, $current);
    for $input.comb -> $in-code {
        $output ~= $in-code and next if $in-code ~~ /\s/;
        $current ~= $in-code;
        if %decode{$current}:exists {
            $output ~= %decode{$current};
            $current = "";
        }
    }
    return $output;
}

Here, the $current variable accumulates the dots and dashes from the input until it is found to be an entry in the translation table, at which point it is reset to an empty string to prepare for the next letter.

The solution presented above for finding the Huffman code uses the insert-pair subroutine to keep ordered the @pairs array of pairs. This makes it easy to find the remaining least common letters or pseudo-letters. You might remember from Section ?? that heaps are a good data structure when the aim is to access rapidly the smallest items of a collection. As a further exercise, you may want to rewrite the solution using a binary heap. David Huffman’s original solution actually used a tree (called the Huffman tree) very similar to a heap.

A.10  Exercises of Chapter ??: Regexes and Grammars

A.10.1  Exercise in Section ??: Getting the February Dates Right (p. ??)

We want to check whether the February dates are valid.

To begin with, let’s exclude February dates that are larger than 29. This can be done by just expanding the code assertion shown in the code to recognize dates:

my $string = "Leap day : 2016-02-29.";                                         
my token year { \d ** 4 }                                        
my token month {   
    1 <[0..2]>                            # 10 to 12                     
    || 0 <[1..9]>                         # 01 to 09                     
};
my token day { (\d ** 2) <?{1 <= $0 <= 31 }> }  
my token sep { '/' || '-' }                                                 
my rule date { [   <year> (<sep>) <month> $0 <day> 
                 || <day> (<sep>) <month> $0 <year> 
                 || <month>\s<day>',' <year>
                ] <!{ ($<day> > 30 and $<month> ==  4|6|9|11) or 
                       $<day> > 29 and $<month> eq '02' }>
}                         

if $string ~~ /<date>/ {
    say ~$/;                              # 2016-02-29
    say "Day\t= "   , ~$/<date><day>;     # 29
    say "Month\t= " , ~$/<date><month>;   # 02
    say "Year\t= "  , ~$/<date><year>;    # 2016
}                    

This is fine. February has 29 days since 2016 is a leap year. But this code would validate Feb. 29 for 2015 or 2017, which is wrong since they are not leap years.

A.10.1.1  Recognizing a leap year

In the old Julian calendar (named after Julius Caesar), leap years are years that are divisible by 4. It turned out that the Julian calendar had too many leap years to reflect the astronomical reality, so that the calendar drifted about 3 days for every period of four centuries.

The Gregorian calendar, introduced by Pope Gregory XIII in 1582, corrected the Julian calendar with the following additional rule: years divisible by 100 should be leap only if they are also divisible by 400. So, by the Gregorian calendar, 1700, 1800, 1900, and 2100 are not leap, but 2000 and 2400 are leap.

Depending on what kind of dates your program is going to encounter, you might decide to simplify the rules. If you are writing a module that is supposed to be accurate for any date far in the past or in the future, you probably want to implement the exact Gregorian rule. But if you know that you’re going to meet only dates of the current period, you might choose a much simpler rule.

In particular, since 2000 is an exception to the exception and is leap, any year between 1901 and 2099 is leap if it is divisible by 4 and not leap otherwise. This rule is likely sufficient for any business application written in 2017. There is probably no reason to make it more complicated than it needs to be (although it may be argued that it is the same type of reasoning that led to the great fear of the “Y2K” bug).

With this simplification in mind, a subroutine to find out if a year is leap should simply return true if it is divisible by 4 and might thus look like this:

sub is-leap ($year) { # works for years between 1901 and 2099
    return True if $year %% 4; 
    return False;
}

Or simpler:

sub is-leap ($year) { # works for years between 1901 and 2099
    return $year %% 4; 
}

If you want to implement the full Gregorian rule, it might look like this:

sub is-leap ($year) { # Gregorian rule for any year
    return False if $year % 4;   # no if not divisible by 4
    return True if $year % 100;  # yes if divisible by 4 and not by 100
    return False if $year % 400; # no if divisible by 100 and not by 400
    True;                        # yes if divisible by 400
}

or, if you like concision (or obfuscation):

sub is-leap ($y) { $y %% 400 or ($y %% 4 and not $y %% 100) }

The code above is given as an example on how to compute whether a year is leap, since it is an interesting and classical problem, but Perl actually provides a method for that in the Dateish role. For example:

> say Dateish.is-leap-year(2016)
True
> say Dateish.is-leap-year(2015)
False

A.10.1.2  Back to the February date validation

You can add the rules for Feb. 29 in the code example above if you wish, but we would suggest this is getting slightly too complicated for a code assertion within the date rule: adding a quick Boolean condition in a code assertion within a rule is fine, but when the condition becomes more complicated, it will tend to make the rule more difficult to understand. Think about the person who will have to maintain the code in a year from now (and that person might be you).

We prefer to move the code performing the validation out of the date rule into a dedicated subroutine checking all dates for February:

sub feb-date-not-valid ($year, $day) {
    return False if $day <= 28;
    return True if $day > 29;
    return False if Dateish.is-leap-year($year);
    True;
}

The date rule now looks like this:

my rule date { [   <year> (<sep>) <month> $0 <day> 
                 || <day> (<sep>) <month> $0 <year> 
                 || <month>\s<day>',' <year>
               ] <!{ ($<day> > 30 and $<month> ==  4|6|9|11) or 
                     $<month> eq '02' and feb-date-not-valid $<year>, $<day>}>
} 

I had originally called the new subroutine check-feb-29 but I changed it to feb-date-not-valid in order to better show that it returns a true value if the date is not valid. This may seem secondary, but choosing good names for your identifier is important because that self-documents your programs and clarifies their semantics.

Once we’ve introduced this minimal subroutine, we might go one step further and move the rest of the code assertion into the subroutine, so that the final code assertion would contain only a call to the new version of the subroutine. This is left as a further exercise for the reader.

A.10.2  Exercise ?? (p. ??): A Grammar for an Arithmetic Calculator

Here’s one possible way to implement an arithmetic calculator.

A.10.2.1  The Grammar

Here’s one way to write the grammar:

my grammar Calculator {
    rule TOP            { <expr> }
    rule expr           { <term> + % <plus-minus-op> }
    token plus-minus-op { [< + - >] }
    rule term           { <atom> + % <mult-div-op> }
    token mult-div-op   { [< * / >] }
    rule atom {
        | <num> { make +$<num> }
        | <paren-expr> { make $<paren-expr>.made}
    }
    rule num            { <sign> ? [\d+ | \d+\.\d+ | \.\d+ ] }
    rule paren-expr     { '(' <expr> ')' }
    token sign          { [< + - >] }
}

This solution is quite simple.

An expression (expr) is made of one or several terms separated by “+” or “-” operators. A term is made of one or several atoms separated “*” or “/” operators. An atom may be a bare number or a parenthesized expression.

This guarantees that precedence rules are satisfied. Multiplications and divisions will be evaluated before additions and subtractions, since, when parsing an expression, you need to evaluate the individual terms before you can complete the expression evaluation. Similarly, since a parenthesized expression is an atom, it will have to be evaluated before the term in which it appears can be fully evaluated. Note that, in the case of a parenthesized expression, the expr rule is called recursively.

A.10.2.2  The actions

Notice that we have included two actions in the grammar (in the atom rule). One reason was for convenience: since the atom rule covers two very different named subrules, it is a bit easier to include the action just in the context of the sub-rules. If an action had been attached to the atom rule, it would have required finding out which sub-rule had been matched to know which action to perform. Nothing difficult, but doing so would have made the code slightly more complex. The other reason was for pedagogical purposes: although it often makes sense to create an actions class, it is useful to know that actions may be inserted in the grammar part. For a very simple grammar, it might be over-engineering to create an actions class with just one or two actions.

The actions class might look like this:

class CalcActions {
    method TOP ($/) {
        make $<expr>.made
    }
    method expr ($/) {
        $.calculate($/, $<term>, $<plus-minus-op>)
    }
    method term ($/) {
        $.calculate($/, $<atom>, $<mult-div-op>)
    }
    method paren-expr ($/) {
         make $<expr>.made;
    }
    method calculate ($/, $operands, $operators) {
        my $result = (shift $operands).made;
        while my $op = shift $operators {
            my $num = (shift $operands).made;
            given $op {
                when '+' { $result += $num; }
                when '-' { $result -= $num; }
                when '*' { $result *= $num; }
                when '/' { $result /= $num; }
                default  { die "unknown operator "}
            }
        }
        make $result;
    }
}

The calculate method computes expressions (terms separated by addition or subtraction operators) and terms (atoms separated by multiplication or division operators) from left to right, since these operators are left associative.

This grammar for a calculator and its associated actions class may be tested with the following code:

for |< 3*4 5/6 3+5 74-32 5+7/3 5*3*2 (4*5) (3*2)+5 4+3-1/5 4+(3-1)/4 >,
    "12 + 6 * 5", " 7 + 12 + 23", " 2 + (10 * 4) ", "3 * (7 + 7)" { 
    my $result = Calculator.parse($_, :actions(CalcActions));
    # say $result;
    printf "%-15s %.3f\n", $/,  $result.made if $result;
}

which will display the following results:

3*4             12.000
5/6             0.833
3+5             8.000
74-32           42.000
5+7/3           7.333
5*3*2           30.000
(4*5)           20.000
(3*2)+5         11.000
4+3-1/5         6.800
4+(3-1)/4       4.500
12 + 6 * 5      42.000
 7 + 12 + 23    42.000
 2 + (10 * 4)   42.000
3 * (7 + 7)     42.000

You might wonder whether this code works correctly with nested parenthesized expressions. I originally thought, when I wrote this code, that it might malfunction and that I might have to change or add something to get nested parenthesized expressions right and properly balanced. It turns out that it works fine out of the box. For example, consider the following test code with relatively deeply nested parenthesized expressions:

for "(((2+3)*(5-2))-1)*3", "2 * ((4-1)*((3*7) - (5+2)))"  { 
    my $result = Calculator.parse($_, :actions(CalcActions));
    printf "%-30s %.3f\n", $/,  $result.made if $result;
}

The result is correct:

(((2+3)*(5-2))-1)*3            42.000
2 * ((4-1)*((3*7) - (5+2)))    84.000

As an additional exercise, you might want to add exponentiation to the list of allowed operators. Remember that exponentiation has higher precedence than multiplication and division (so you probably want to put it somewhere near the atom level). In the event that you want to handle nested exponentiation operators, also remember that they are usually right associative:

2**3**2 = 2**(3**2) = 2 ** 9 = 512;  # Not: (2**3)**2 or 64

A.11  Exercises of Chapter ??: Functional Programming

A.11.1  Exercise ??: Making a Functional Implementation of Quick Sort)

Here’s one way to implement the quick sort algorithm in functional programming style.

sub quicksort (@input) {
    return @input if @input.elems <= 1;
    my $pivot = @input[@input.elems div 2];
    return flat quicksort(grep {$_ < $pivot}, @input), 
        (grep {$_ == $pivot}, @input), 
        quicksort(grep {$_ > $pivot}, @input);
}

This functional version of the program reflects directly the approach of the quick sort algorithm:

  • If the array has less than two items, it is already sorted, so return it immediately (this is the base case stopping the recursion).
  • Else, choose an item as a pivot (here, we pick the middle element or one immediately near the middle).
  • Partition the array into three sublists containing items respectively smaller than, greater than, and equal to the pivot.
  • Sort the first two sublists by a recursive call to the quicksort function, but don’t call quicksort on the sublist containing items equal to the pivot: not only is it already sorted (all elements are equal), but it would fail to meet the base case and would enter into infinite recursion.
  • Return the list obtained by concatenating the three sublists.

As noted earlier, the ideal pivot would be the median of the values, but the cost of finding the median would be prohibitive.

In principle, you could choose any item as the pivot, including for example the first or the last item of the array. But for some specific input (such as arrays already almost sorted, forward or backward), this can increase significantly the run time because the partitioning becomes totally unbalanced, thereby losing the advantage of the divide and conquer strategy. Picking an element in the middle, as we did here, strongly reduces the probability of such pathological behavior. Another possible way to prevent such risk is to select the pivot at random among the array elements.

Are you using one of our books in a class?

We'd like to know about it. Please consider filling out this short survey.


Think DSP

Think Java

Think Bayes

Think Python 2e

Think Stats 2e

Think Complexity


Previous Up Next