Exploring ReasonML

## 18 Recursion

In this chapter, we take a look at an important technique in functional programming: recursion.

### 18.1 Required knowledge

It is best to read this chapter after you have read the chapter on lists (and all preceding chapters, on which it builds).

It also helps if you are familiar with the following imperative features (however, they should be relatively self-explanatory):

• Mutable state via `ref()`: Consult “Mutation” in the ReasonML docs (until the corresponding chapter in this book is finished).
• `for` loops: Consult “Imperative Loops” in the ReasonML docs (until the corresponding chapter in this book is finished).

### 18.2 Processing data via recursive function calls

Recursive function calls are a convenient tool for processing recursive data. Before we get into the function calls, let’s first review how recursive data is built.

#### 18.2.1 Building recursive data

As an example, consider the following recursive data type:

``````type intTree =
| Empty /* atomic */
| Node(int, intTree, intTree); /* compound */``````

To create an `intTree`:

• Atomic: We start by creating one or more atomic pieces. In general, there may be several kinds of atomic pieces and each kind may or may not contain data. In the case of `intTree`, all atomic pieces are `Empty`. For example:

``Empty``
• Compound: We continue by creating compound pieces, by assembling atomic pieces and/or other compound pieces. That is, a compound piece is one that contains one or more other pieces. Therefore, the corresponding part of the type definition recursively mentions the type. For example, the compound `Node` refers to `intTree`, twice. The following expression creates a tree with a single node that holds the integer `123`:

``Node(123, Empty, Empty)``

#### 18.2.2 Recursing over recursive data

The way we process recursive data is similar to how we build it. Take the following code:

``````let rec summarizeTree = (t: intTree) =>
switch t {
| Empty => 0 /* base case */
| Node(i, leftTree, rightTree) => /* recursive case */
i + summarizeTree(leftTree) + summarizeTree(rightTree)
};``````

There are two kinds of cases:

• One or more base cases process atomic data. This is where the recursion stops.
• One or more recursive cases process compound data. Processing continues via recursive function calls.

Aside for advanced readers: this example and several upcoming examples are not tail-recursive. We’ll learn later how to change that.

#### 18.2.3 Example: lists

Lists are a recursive data structure. Let’s examine code for lists through the lens of what we have just learned:

``````let rec len = (myList: list('a)) =>
switch myList {
| [] => 0 /* base case */
| [_, ...tail] =>  /* recursive case */
1 + len(tail)
};``````

Once again, we have:

• A base case that processes empty lists. Recursion stops at this point.
• A recursive case that processes non-empty lists. Recursion continues with the tail of the list.

#### 18.2.4 Example: natural numbers

Even when you work with natural numbers, you often encounter a similar pattern:

``````let rec factorial = (x) =>
if (x <= 0) { /* base case */
1;
} else { /* recursive case */
x * factorial(x - 1);
};``````

This function hints at the fact that you can define natural numbers via a recursive data type:

``type nat = Zero | Succ(nat);``

This type says: A natural number `nat` is either zero or the successor of a natural number (i.e., one plus that number). The conversion from `int` to `nat` can also be defined recursively:

``````let rec natFromInt = (x: int) =>
if (x <= 0) { /* base case */
Zero;
} else { /* recursive case */
Succ(natFromInt(x-1));
};

natFromInt(2);
/* Succ((Succ(Zero))) */``````

### 18.3 Looping via recursion

Another core functional technique is to use recursion instead of loops.

#### 18.3.1 Example: `repeat()`

Consider the following typical example of using a loop in an imperative language: repeating a string `str` zero or more times.

``````let repeat = (~times: int, str: string) => {
let result = ref("");
for (_count in 1 to times) {
result := result^ + str;
};
result^;
};
repeat(~times=3, "abc"); /* "abcabcabc" */``````

The same functionality can be implemented via recursion like this:

``````let rec repeat = (~times: int, str: string) =>
if (times <= 0) { /* base case */
"";
} else { /* recursive case */
str ++ repeat(~times = times - 1, str);
};``````

Once again, there is a base case and a recursive case. If you come from an imperative language, using recursion in this manner initially looks weird, but you eventually get used to it.

#### 18.3.2 Example: `summarizeList()`

Computing the sum of a list of ints is another typical example of looping in imperative languages:

``````let summarizeList = (l: list(int)) => {
let result = ref(0);
for (index in 0 to List.length(l)-1) {
let elem = List.nth(l, index);
result := result^ + elem;
};
result^;
};
summarizeList([1, 2, 3]); /* 6 */``````

Aside: Lists are not very efficient w.r.t. random-access operations such as `List.length()` and `List.nth()`. However, I only used lists, because I wanted the code to be similar to the more functional implementations. In real-world code, you have the option of using an array.

If, instead, you process the list recursively, you get to use pattern matching, via `switch`:

``````let rec summarizeList = (l: list(int)) =>
switch l {
| [] => 0
};``````

Read on for another way of implementing `summarizeList()` that does not involve recursion, yet is also functional.

### 18.4 Looping via higher-order helper functions

One key technique in functional programming are higher-order helper functions for processing data structures. They are an alternative to using recursion. We can, for example, use `fold_left()` to re-implement `summarizeList()` from the previous section:

``````let summarizeList = (theList: list(int)) =>
ListLabels.fold_left(
~f = (sum, elem) => sum + elem,
~init = 0,
theList
);``````

`~f` is invoked for each element `elem` of `theList`:

• `sum` is the current state (the sum computed thus far).
• The result of `~f` becomes the next state.

The final state is returned by `fold_left()` as its result. `~init` specifies the initial state.

Other higher-order functions for processing lists are (module `ListLabels`):

• `map: (~f: ('a) => 'b, list('a)) => list('b)`
Produces a new list by applying `~f` to each element of the list.
• `filter: (~f: ('a) => bool, list('a)) => list('a)`
Produces a new list that contains only those elements of the input list for which `~f` returns `true`.
• `for_all: (~f: ('a) => bool, list('a)) => bool`
Returns `true` if `~f` returns `true` for all elements of the list.
• `exists: (~f: ('a) => bool, list('a)) => bool`
Returns `true` if `~f` returns `true` for at least one element of the list.
• `iter: (~f: ('a) => unit, list('a)) => unit`
Invokes `~f` for each element of the list. `iter()` only makes sense if `~f` has side effects (output, mutations, etc.).

Except for `filter()`, all of these functions are also available for arrays.

### 18.5 Recursive technique: accumulator

Sometimes, (naive) recursion is not how one thinks about a problem. Take for example, the following problem: Given a list of ints, return the greatest element. If we only use recursion, we get the following code. Alas, it is neither intuitive nor efficient:

``````let rec maxList = (l: list(int)) =>
switch (l) {
| [] => min_int
| [_, ...tail] => maxList(tail) /* B */
};
maxList([-3, 12, 1]); /* 12 */``````

In this code, we used two `switch` cases at the end (line A, line B); one of which has a `when` clause. Instead, we could have used a single case and an `if` expression, but the flat code is easier to read.

So how does this code work? For each list, we compute the maximum of the tail and compare it with the head. The first time the recursive call in line A directly returns a value is at the end of the list (when `l` is `[]`). That, in turn, enables the caller to return a value. Then the caller’s caller. Etc.

In contrast, the imperative version of `maxList()` is more intuitive. It is also more efficient (apart from the random accesses):

``````let maxList = (l: list(int)) => {
let max = ref(min_int);
for (index in 0 to List.length(l)-1) {
let elem = List.nth(l, index);
if (elem > max^) {
max := elem;
}
};
max^;
};``````

Is it possible to write a functional version that is as intuitive? Yes it is, via a technique called accumulator. The following code uses this technique:

``````let maxList = (theList: list(int)) => {
let rec aux = (~max=min_int, l) =>
switch l {
| [] => max
| [_, ...tail] =>
aux(~max, tail);
};
aux(theList); /* B */
};``````

`~max` is the accumulator: a parameter that is continually updated during recursion and returned at the very end. Similar to the first parameter of the callback of `fold_left()`, it represents the current state. In line A, you can see that `~max` can be used almost as if it were mutable: It is set to a new value that is used from now on.

One downside of this technique is that you need to manage the additional parameter for the accumulator. There are two ways of doing so.

We have already seen the first way: Creating an internal recursive helper function called `aux` or `helper` or `maxListRec` (etc.). I like to set up the parameters of `aux` as follows:

• The initial value of the accumulator is provided via a parameter default value.
• Some parameters of `maxList` are passed on to `aux` and change there. I do that at the end of `maxList` (line B).
• Sometimes, outer functions have parameters that don’t change. Then `aux` can just refer to them.

Alternatively, you can introduce the accumulator right away as a parameter and give it a default value. Callers will normally ignore it, but it is used by recursive calls. That looks as follows:

``````let rec maxList = (~max=min_int, l: list(int)) => {
switch l {
| [] => max
| [_, ...tail] =>
maxList(~max, tail);
};
};``````

### 18.6 Tail call elimination

Given that one prefers recursion to loops in functional programming – won’t performance suffer due to all of those function calls? It turns out that tail calls, a specific kind of function calls, can be highly optimized.

#### 18.6.1 What are tail calls?

A function call is a tail call if it is the last thing that the surrounding function does. Before we look at an example, let’s examine a function that does not have tail calls:

``````let rec repeat = (~times: int, str: string) =>
if (times <= 0) { /* base case */
"";
} else { /* recursive case */
str ++ repeat(~times = times - 1, str); /* A */
};``````

The recursive call in line A is not a tail call: after it returns, its result is appended to `str` and returned. That is, the appending is the last thing that `repeat()` does. Therefore, when we call `repeat()`, we must save two kinds of information:

• The values of parameters and local variables, because we need `str` after `repeat()` returns.
• The return address, which indicates where to continue computation after the function call is finished.

The following call tree visualizes what happens when we make the function call `repeat(~times=2, "a")`:

``````repeat(~times=2, "a")
"a" ++ repeat(~times=1, "a")
"a" ++ repeat(~times=0, "a")
""``````

Each indentation indicates that our storage needs grow. Parameters, variables and return addresses are usually stored on a stack, which explains why you get a stack overflow if there are too many nested (non-tail) function calls.

How can we make `repeat()` more efficient? By introducing an accumulator (`~result`):

``````let repeat = (~times: int, str: string) => {
let rec aux = (~result="", n) =>
if (n <= 0) { /* base case */
result;
} else { /* recursive case */
aux(~result = result ++ str, n-1); /* A */
};
aux(times);
};``````

Now the function call in line A is a tail call: it is the result of the surrounding function. Because no information needs to be saved for the caller, the call in line A is effectively a jump and does not increase our storage needs.

The function call `repeat(~times=2, "a")` is executed as follows:

``````aux(~result="", 2)
aux(~result="a", 1)
aux(~result="aa", 0)
"aa"``````

The flatness indicates that the function calls are more like jumps and less like recursive calls.

Terminology:

• A function whose recursive calls are tail calls is called tail-recursive.
• A tail-recursive algorithm is also called iterative. Iterative algorithms can also be implemented via loops.

Accumulators are a common way of introducing tail calls. In the following sections, we make more functions tail-recursive. And each time, we use an accumulator to do so.

#### 18.6.2 Example: `summarizeList()`

The following implementation of `summarizeList()` calls itself recursively in line A and it is not a tail call:

``````let rec summarizeList = (l: list(int)) =>
switch l {
| [] => 0
head + summarizeList(tail) /* A */
};``````

If we introduce an accumulator, we get a tail call in line A:

``````let summarizeList = (theList: list(int)) => {
let aux = (~sum=0, l) =>
switch l {
| [] => sum
};
aux(theList);
};``````

#### 18.6.3 Example: `reverse()`

The following function is code we haven’t seem before – a recursive implementation of reversing a list `l`:

``````let rec reverse = (l: list('a)) =>
switch l {
| [] => l
};
reverse(["a", "b", "c"]);
/* ["c", "b", "a"] */``````

Line A says: a reversed list is the reversed tail followed by the head (`append()` concatenates lists).

What would an iterative version of `reverse()` look like? Consider the following intuitive algorithm for reversing a list `l = ["a", "b", "c"]`:

• The initialization value is the empty list:

``let result0 = [];``
• We start with the head of `["a", "b", "c"]`:

``let result1 = ["a", ...result0]; /* ["a"] */``
• We continue with the head of `["b", "c"]`:

``let result2 = ["b", ...result1]; /* ["b", "a"] */``
• We continue with the head of `["c"]`:

``let result3 = ["c", ...result2]; /* ["c", "b", "a"] */``
• Final result: `result3`

As it turns out, this algorithm is easy to implement via an accumulator:

``````let rec reverse = (theList: list('a)) => {
let rec aux = (~result=[], l) =>
switch l {
| [] => result
};
aux(theList);
};``````

When we invoke `reverse(["a", "b", "c"])`, the following steps are taken:

``````aux(~result=[], ["a", "b", "c"])
aux(~result=["a"], ["b", "c"])
aux(~result=["b", "a"], ["c"])
aux(~result=["c", "b", "a"], [])
["c", "b", "a"]``````

It is interesting to observe how both implementations process data in a different order:

• The recursive `reverse()` works with:
• Current data: `head`
• Future data: `reverse(tail)`
• The iterative `reverse()` works with:
• Past data: `result`
• Current data: `head`

In this case, the order in which the accumulator version processes data is a better fit for our intuitive understanding of the algorithm.

#### 18.6.4 Example: `append()`

If, however, we implement `append()`, then a recursive algorithm is simpler:

``````let rec append = (l1: list('a), l2: list('a)) =>
switch l1 {
| [] => l2
[head, ...append(tail, l2)] /* A */
};
append(["a", "b"], ["c", "d"]);
/* ["a", "b", "c", "d"] */``````

Line A says: the concatenation of two lists `l1` and `l2` is the head of `l1` followed by the concatenation of the tail of `l1` and `l2`.

If we think about appending iteratively, we get the following algorithm:

• Start with `l2`.
• Prepend the last element of `l1` in front of the previous result.
• Prepend the second last element of `l1` in front of the previous result.
• Etc.

That is, we must reverse `l1` via `List.rev()` (line A), before we can work with it iteratively:

``````let append = (l1: list('a), l2: list('a)) => {
let rec aux = (~result, l) =>
switch l {
| [] => result
};
aux(~result=l2, List.rev(l1)); /* A */
};``````

#### 18.6.5 Example: `contains()`

The following function checks whether `theList` contains a given `value`:

``````let rec contains = (~value: 'a, theList: list('a)) =>
switch theList {
Conveniently, this recursive implementation is also iterative: all recursive calls are tail calls. Because we are doing the recursion ourselves, we can stop as soon as we have found `value` (line A).