This HTML version of Think Perl 6 is provided for convenience, but it is not the best format of the book. You might prefer to read the PDF version. Chapter 5 Fruitful SubroutinesMost of the Perl functions we have used, such as the math functions, produce return values. But most of the subroutines we’ve written so far are void: they have an effect, like printing a value, but they don’t have a return value. In this chapter you will learn to write fruitful functions. 5.1 Return ValuesCalling a fruitful function generates a return value, which we usually assign to a variable or use as part of an expression: my $pi = 4 * atan 1; my $height = $radius * sin $radians; Many of the subroutines we have written so far are void. Speaking casually, they have no usable return value; more precisely, their return value may be Any, Nil, (), or True. In this chapter, we are (finally) going to write fruitful subroutines. The first example is area, which returns the area of a circle with the given radius: sub area($radius) { my $circular_area = pi * $radius**2; return $circular_area; } We have seen the return statement before, but in a fruitful function the return statement includes an expression. This statement means: “Return immediately from this function and use the following expression as a return value.” The expression can be arbitrarily complicated, so we could have written this function more concisely: sub area($radius) { return pi * $radius**2; } On the other hand, temporary variables like
Sometimes it is useful to have multiple return statements, for example one in each branch of a conditional: sub absolute_value($num){ if $num < 0 { return -$num; } else { return $num; } } Since these return statements are in an alternative conditional, only one runs. This could also be written more concisely using the statement modifier syntax: sub absolute_value($num){ return -$num if $num < 0; return $num; } Here again, only one of the return statements runs: if the number is negative, the first return statement is executed and the subroutine execution stops there; if the number is positive or zero, then only the second return statement is executed. As soon as a return statement runs, the function terminates without executing any subsequent statements. Code that appears after an unconditional return statement, or any other place the flow of execution can never reach, is called dead code. In a fruitful function, it is a good idea to ensure that every possible path through the program hits a return statement. For example: # WARNING: faulty code sub absolute_value($num){ if $num < 0 { return -$num; } if $num > 0 { return $num; } } This subroutine is incorrect because if $num happens to be 0, neither condition is true, and the subroutine ends without hitting a return statement. If the flow of execution gets to the end of a function, the return value is (), which basically means “not defined” and is clearly not the absolute value of 0: > absolute_value(0) () By the way, Perl provides a built-in function called abs that computes absolute values. As an exercise, write a compare subroutine that takes two numbers, $x and $y, and returns 1 if $x > $y, 0 if $x == $y, and -1 if $x < $y. Solution: ?? 5.2 Incremental DevelopmentAs you write larger functions, you might find yourself spending more time debugging. To deal with increasingly complex programs, you might want to try a process called incremental development. The goal of incremental development is to avoid long debugging sessions by adding and testing only a small amount of code at a time. As an example, suppose you want to find the distance between two points, given by the Cartesian or rectangular coordinates (x1, y1) and (x2, y2). By the Pythagorean theorem, the distance is:
The first step is to consider what a distance function should look like in Perl. In other words, what are the inputs (parameters) and what is the output (return value)? In this case, the inputs are two points, which you can represent using four numbers. The return value is the distance represented by a numeric value. Immediately you can write an outline of the function: sub distance($x1, $y1, $x2, $y2) { return 0.0; } Obviously, this version doesn’t compute distances; it always returns zero. But it is syntactically correct, and it runs, which means that you can test it before you make it more complicated. To test the new function, call it with sample arguments: > distance(1, 2, 4, 6); 0.0 I chose these values so that the horizontal distance is 3 and the vertical distance is 4; that way, the result is 5, the hypotenuse of a 3-4-5 triangle. When testing a function, it is useful to know the right answer. At this point we have confirmed that the function is syntactically correct, and we can start adding code to the body. A reasonable next step is to find the differences x2 − x1 and y2 − y1. The next version stores those values in temporary variables and prints them: sub distance($x1, $y1, $x2, $y2) { my $dx = $x2 - $x1; my $dy = $y2 - $y1; say '$dx is', $dx; say '$dy is', $dy; return 0.0; } If the function is working, it should display Next we compute the sum of squares of $dx and $dy: sub distance($x1, $y1, $x2, $y2) { my $dx = $x2 - $x1; my $dy = $y2 - $y1; my $dsquared = $dx**2 + $dy**2; say '$dsquared is: ', $dsquared; return 0.0; } Again, you would run the program at this stage and check the output (which should be 25). Finally, you can use the sqrt built-in function to compute and return the result: sub distance($x1, $y1, $x2, $y2) { my $dx = $x2 - $x1; my $dy = $y2 - $y1; my $dsquared = $dx**2 + $dy**2; my $result = sqrt $dsquared; return $result; } If that works correctly, you are done. Otherwise, you might want to print the value of $result before the return statement. The final version of the subroutine doesn’t display anything when it runs; it only returns a value. The print statements we wrote are useful for debugging, but once you get the function working, you should remove them. Code like that is sometimes called scaffolding because it is helpful for building the program but is not part of the final product. When you start programming, you should add only a line or two of code at a time. As you gain more experience, you might find yourself writing and debugging bigger chunks. Either way, incremental development can save you a lot of debugging time. The key aspects of the process are:
Note that, at least for relatively simple cases, you can also use the REPL to test expressions and even multiline statements or subroutines in interactive mode before you commit them to your program code. This is usually fast and can save you some time. As an exercise, use incremental development to write a function called hypotenuse that returns the length of the hypotenuse of a right triangle given the lengths of the other two legs as arguments. Record each stage of the development process as you go. Solution: ??. 5.3 CompositionAs you should expect by now, you can call one function from within another. As an example, we’ll write a function that takes two points, the center of the circle and a point on the perimeter, and computes the area of the circle. Assume that the center point is stored in the variables $x-c and $y-c, and the perimeter point is in $x-p and $y-p. The first step is to find the radius of the circle, which is the distance between the two points. We just wrote a function, distance, that does that: my $radius = distance($x-c, $y-c, $x-p, $y-p); The next step is to find the area of a circle with that radius; we just wrote that, too: my $result = area($radius); Encapsulating these steps in a function, we get: sub circle-area($x-c, $y-c, $x-p, $y-p) { my $radius = distance($x-c, $y-c, $x-p, $y-p); my $result = area($radius) return $result; } The temporary variables $radius and $result are useful for development and debugging, but once the program is working, we can make it more concise by composing the function calls: sub circle-area($x-c, $y-c, $x-p, $y-p) { return area distance($x-c, $y-c, $x-p, $y-p); } The last line of the previous example now works like a data
pipeline from right to left: the 5.4 Boolean FunctionsFunctions can return Boolean values, which is often convenient for hiding complicated tests inside functions. For example: sub is-divisible(Int $x, Int $y) { if $x % $y == 0 { return True; } else { return False; } } It is common to give Boolean functions names that sound like yes/no
questions; Here is an example: > is-divisible(6, 4); False > is-divisible(6, 3); True The result of the == operator is a Boolean value, so we can write the subroutine more concisely by returning it directly: sub is-divisible(Int $x, Int $y) { return $x % $y == 0 } If there is no return statement, a Perl subroutine returns the value of expression on the last code line of the subroutine (provided the last code line is an expression that gets evaluated), so that the return statement is not required here. In addition, since 0 is a false value and any other integer a true value, this could be further rewritten as follows: sub is-divisible(Int $x, Int $y) { not $x % $y } The Int type declarations in the subroutine signatures above are not necessary. The subroutine would work without them, but they can provide some form of protection against using this subroutine with faulty arguments. Boolean functions are often used in statement modifiers: say "$x is divisible by $y" if is-divisible($x, $y); It might be tempting to write something like: say "$x is divisible by $y" if is-divisible($x, $y) == True; But the extra comparison is unnecessary: is-divisible returns a Boolean value that can be interpreted directly by the if conditional.
As an exercise, write a function Solution: ??. 5.5 A Complete Programming LanguageWe’ve seen in the section above several ways of writing a subroutine to check the divisibility of two integers. In fact, as briefly mentioned earlier, Perl 6 has a
“is divisible” operator, > 9 %% 3 True > 9 %% 4 False So there was no need to write the is-divisible subroutine. But don’t worry, that’s all right if you did not remember that. Speakers of natural languages are allowed to have different skill levels, to learn as they go and to put the language to good use before they know the whole language. The same is true with Perl. You (and I) don’t know all about Perl 6 yet, just as we don’t know all of English. But it is in fact “Officially Okay in Perl Culture” to use the subset of the language that you know. You are in fact encouraged to use what is sometimes called “baby Perl” to write programs, even if they are somewhat clumsy at the beginning. That’s the best way of learning Perl, just as using “baby talk” is the right way for a child to learn English. The number of different ways of accomplishing a given task, such as checking whether one number is divisible by another, is an example of one of Perl’s mottos: there is more than one way to do it, oft abbreviated TIMTOWTDI. Some ways may be more concise or more efficient than others, but, in the Perl philosophy, you are perfectly entitled to do it your way, especially if you’re a beginner, provided you find the correct result. We have only covered a small subset of Perl 6 so far, but you might be interested to know that this subset is a complete programming language, which means that essentially anything that can be computed can be expressed in this language. Any program ever written could be rewritten using only the language features you have learned so far (actually, you would need a few commands to control devices like the mouse, disks, networks, etc., but that’s all). Proving that claim is a nontrivial exercise first accomplished by Alan Turing, one of the first computer scientists (some would argue that he was a mathematician, but a lot of early computer scientists started as mathematicians). Accordingly, it is known as the Turing Thesis. For a more complete (and accurate) discussion of the Turing Thesis, I recommend Michael Sipser’s book Introduction to the Theory of Computation. 5.6 More RecursionTo give you an idea of what you can do with the tools you have learned so far, we’ll evaluate a few recursively defined mathematical functions. A recursive definition is similar to a circular definition, in the sense that the definition contains a reference to the thing being defined. A truly circular definition is not very useful: If you saw that definition in the dictionary, you might be annoyed. On the other hand, if you looked up the definition of the factorial function, denoted with the symbol !, you might get something like this:
This definition says that the factorial of 0 is 1, and the factorial of any other (positive integer) value, n, is n multiplied by the factorial of n−1. So 3! is 3 times 2!, which is 2 times 1!, which is 1 times 0!. Putting it all together, 3! equals 3 times 2 times 1 times 1, which is 6. If you can write a recursive definition of something, you can write a Perl program to evaluate it. The first step is to decide what the parameters should be. In this case it should be clear that factorial takes a number1: sub factorial($n){ } If the argument happens to be 0, all we have to do is return 1: sub factorial($n){ if $n == 0 { return 1; } } Otherwise, and this is the interesting part, we have to make a recursive call to find the factorial of n−1 and then multiply it by n: sub factorial($n){ if $n == 0 { return 1; } else { my $recurse = factorial($n-1); my $result = $n * $recurse; return $result; } } The flow of execution for this program is similar to the flow of countdown in Section ??. If we call factorial with the value 3: Since 3 is not 0, we take the second branch and calculate the factorial of $n-1... Since 2 is not 0, we take the second branch and calculate the factorial of $n-1...Since 1 is not 0, we take the second branch and calculate the factorial of $n-1...Since 0 equals 0, we take the first branch and return 1 without making any more recursive calls. The return value, 2, is multiplied by Figure ?? shows what the stack diagram looks like for this sequence of function calls. The return values are shown being passed back up the stack. In each frame, the return value is the value of result, which is the product of n and recurse. In the last frame, the local variables recurse and result do not exist, because the branch that creates them does not run. A seasoned Perl programmer might write a more concise or more idiomatic subroutine2: sub factorial($n){ return 1 if $n == 0; return $n * factorial $n-1; } This is not better than our initial version, and will probably not run significantly faster, but this is arguably clearer, at least once you get used to this type of syntax. 5.7 Leap of FaithFollowing the flow of execution is one way to read programs, but it can quickly become overwhelming. An alternative is what may be called the “leap of faith.” When you come to a subroutine call, instead of following the flow of execution, you assume that the subroutine works correctly and returns the right result. In fact, you are already practicing this leap of faith when you use built-in functions. When you call math functions such as cos or sqrt, you don’t examine the bodies of those functions. You just assume that they work because the people who wrote the built-in functions were likely to be good programmers (and because you can safely assume that they have been thoroughly tested). The same is true when you call one of your own subroutines. For
example, in Section ??, we wrote a subroutine called
The same is true of recursive programs. When you get to the recursive
call, instead of following the flow of execution, you should assume
that the recursive call works (returns the correct result) and then ask
yourself, “Assuming that I can find the factorial of Of course, it’s a bit strange to assume that the subroutine works correctly when you haven’t finished writing it, but that’s why it’s called a leap of faith! 5.8 One More ExampleAfter factorial, the most common example of a recursively defined mathematical function is fibonacci, which has the following definition (see http://en.wikipedia.org/wiki/Fibonacci_number):
In plain English, a Fibonacci sequence is a sequence of numbers such as: 1, 1, 2, 3, 5, 8, 13, 21, ... where the two first terms are equal to 1 and any other term is the sum of the two preceding ones. We briefly covered the Fibonacci sequence in Exercise ?? of the previous chapter and implemented it with a for loop. Let’s now translate the recursive definition into Perl. It looks like this: sub fibonacci ($n) { return 1 if $n == 0 or $n == 1; return fibonacci($n-1) + fibonacci($n-2) } If you try to follow the flow of execution here, even for fairly
small values of 5.9 Checking TypesWhat happens if we call factorial and give it 1.5 as an argument? It seems that we get an infinite recursion. How can that be? The subroutine has a base case—when $n == 0. But if $n is not an integer, we can miss the base case and recurse forever. In the first recursive call, the value of $n is 0.5. In the next, it is -0.5. From there, it gets smaller (more negative), but it will never be 0. We have two choices. We can try to generalize the factorial function to work with noninteger numbers, or we can make factorial check its argument. The first option is called the gamma function and it’s a little beyond the scope of this book. So we’ll go for the second. We have already seen examples of subroutines using the signature to verify the type of the argument. So we can add the Int type to the parameter in the signature. While we’re at it, we can also make sure the argument is positive or zero: sub factorial(Int $n where $n >= 0){ return 1 if $n == 0; return $n * factorial $n-1; } The Int type checking in the signature handles nonintegers, this is not new. The where $n >= 0 part is a parameter constraint: if the parameter is negative, the subroutine should fail. Technically, the constraint is implemented here within the signature using a syntax feature called a trait, that is a property imposed on the parameter at compiletime. If the argument passed to the function is not an integer or if it is negative, the program prints an error message to indicate that something went wrong: > say factorial 1.5 Type check failed in binding $n; expected Int but got Rat in sub factorial at <unknown file> line 1 in block <unit> at <unknown file> line 1 > say factorial -3 Constraint type check failed for parameter '$n' > say factorial "Fred" Type check failed in binding $n; expected Int but got Str in sub factorial at <unknown file> line 1 in block <unit> at <unknown file> line 1 If we get past both checks, we know that Another way to achieve a similar result is to define your own subset of the built-in types. For example, you can create an Even-int subset of integers and then use it more or less as if it were a new type for declaring your variables or typing your subroutine parameters: subset Even-int of Int where { $_ %% 2 } # or : … where { $_ % 2 == 0 } # Even-int can now be used as a type my Even-int $x = 2; # OK my Even-int $y = 3; # Type mismatch error Similarly, in the case of the factorial subroutine, we can create a nonnegative integer subset and use it for checking the parameter passed to the subroutine: subset Non-neg-int of Int where { $_ >= 0} # ... sub factorial(Non-neg-int $n){ return 1 if $n == 0; return $n * factorial $n-1; } If we pass a negative integer to the subroutine, we get a similar error as before: Constraint type check failed for parameter '$n'... This program demonstrates a pattern sometimes called a guardian. The signature acts as a guardian, protecting the code that follows from values that might cause an error. The guardians make it possible to prove the correctness of the code. 5.10 Multi SubroutinesIt is possible to write multiple versions of a subroutine with the same name but with different signatures, for example a different arity (a fancy word for the number of arguments) or different argument types, using the multi keyword. In this case, the interpreter will pick the version of the subroutine whose signature matches (or best matches) the argument list. For example, we could rewrite the factorial function as follows: multi sub fact(0) { 1 }; multi sub fact(Int $n where $n > 0) { $n * fact $n - 1; } say fact 0; # -> 1 say fact 10; # -> 3628800 Here, we don’t enter into infinite recursion because, when the parameter passed to fact is 0, it is the first version of the multi subroutine that is called and it returns an integer value (1), and this ends the recursion. Similarly, the Fibonacci function can be rewritten with multi subroutines: multi fibonacci(0) { 1 } multi fibonacci(1) { 1 } multi fibonacci(Int $n where $n > 1) { fibonacci($n - 2) + fibonacci($n - 1) } say fibonacci 10; # -> 89 Many built-in functions and most operators of Perl 6 are written as multi subroutines. 5.11 DebuggingBreaking a large program into smaller functions or subroutines creates natural checkpoints for debugging. If a subroutine is not working, there are three possibilities to consider:
To rule out the first possibility, you can add a print statement at the beginning of the function and display the values of the parameters (and maybe their types). Or you can write code that checks the preconditions explicitly. For the purpose of debugging, it is often useful to print
the content of a variable or of a parameter within a string
with surrounding characters, so that you may visualize
characters that are otherwise invisible, such as spaces or
newlines. For example, you think that the if $var eq "two" { do-something() } But it fails and the do-something subroutine is never called. Perhaps you want to use a print statement that will ascertain
the content of say "[$var]"; if $var eq "two" { do-something() } This might print: [two ] or: [two ] Now, you know that the equality test fails because If the parameters look good, add a print statement before each return statement and display the return value. If possible, check the result by hand. Consider calling the function with values that make it easy to check the result (as in Section ??). If the function seems to be working, look at the function call to make sure the return value is being used correctly (or used at all!). Adding print statements at the beginning and end of a function can help make the flow of execution more visible. For example, here is a version of factorial with print statements: sub factorial(Int $n) { my $space = ' ' x (4 * $n); say $space, 'factorial ', $n; if $n == 0 { say $space, 'returning 1'; return 1; } else { my $result = $n * factorial $n-1; say $space, 'returning ', $result; return $result; } } The $space variable is a string of space characters that controls the indentation of the output. Here is the result of factorial(4) : factorial 4 factorial 3 factorial 2 factorial 1 factorial 0 returning 1 returning 1 returning 2 returning 6 returning 24 If you are confused about the flow of execution, this kind of output can be helpful. It takes some time to develop effective scaffolding, but a bit of scaffolding can save a lot of debugging. 5.12 Glossary
5.13 ExercisesExercise 1 Draw a stack diagram for the following program. What does the program print? Please try to answer these questions before trying to run the program. [fontshape=up] sub b(Int $z) { my $prod = a($z, $z); say $z, " ", $prod; return $prod; } sub a(Int $x is copy, Int $y) { $x++; return $x * $y; } sub c(Int $x, Int $y, Int $z) { my $total = $x + $y + $z; my $square = b($total) ** 2; return $square; } my $x = 1; my $y = $x + 1; say c($x, $y + 3, $x + $y); Exercise 2
The Ackermann function, A(m, n), is defined as follows:
See http://en.wikipedia.org/wiki/Ackermann_function. Write a subroutine named ack that evaluates the Ackermann function. Use your subroutine to evaluate ack(3, 4), which should be 125. What happens for larger values of m and n? Solution: ??. Exercise 3
A palindrome is a word that is spelled the same backward and forward, like “noon” and “redivider.” Recursively, a word is a palindrome if the first and last letters are the same and the middle is a palindrome. The following are subroutines that take a string argument and return the first, last, and middle letters: [fontshape=up] sub first_letter(Str $word){ return substr $word, 0, 1; } sub last_letter(Str $word){ return substr $word, *-1, 1; } sub middle_letter(Str $word){ return substr $word, 1, *-1; } Don’t worry about how they work for the time being; we will see that in Chapter ?? on strings. For now:
Solution: ??. Exercise 4
An integer number, a, is a power of b if it is divisible by b
and a/b is a power of b. Write a function called
Solution: ?? Exercise 5
The greatest common divisor (GCD) of a and b is the largest number that divides both of them with no remainder. One way to find the GCD of two numbers is based on the observation that if r is the remainder when a is divided by b, then gcd(a, b) = gcd(b, r). As a base case, we can use gcd(a, 0) = a. Write a function called
Credit: this exercise is based on an example from Abelson and Sussman’s Structure and Interpretation of Computer Programs. Solution: ?? |
Are you using one of our books in a class?We'd like to know about it. Please consider filling out this short survey.
|