Exploring ReasonML
Please support this book: buy it or donate
(Ad, please don’t block.)

14 Lists

In this chapter, we look at the ReasonML data structure lists.

Lists are an immutable data structure for sequences of elements that all have the same type. It is especially well-suited for being processed via pattern matching.

14.1 Brief excursion: type signatures

In this chapter, you’ll see many type signatures such as this one:

let map: ('a => 'b, list('a)) => list('b);

Type signatures may still seem cryptic to you. But you’ll profit from learning how to read them. Let’s explore what the type signature tells us about map – without us knowing what map is or does (which will be explained later). Insights:

14.2 Standard library functionality for lists and arrays

The original standard library module for lists is List. It has functions such as:

let map: ('a => 'b, list('a)) => list('b);

Module List is also available in a labeled version, ListLabels. It has functions such as:

let map: (~f: 'a => 'b, list('a)) => list('b);

You can use a trick and “replace” several normal modules with their labeled versions, by opening the module StdLabels:

open StdLabels;

This module has the submodules Array, Bytes, List and String which are aliases for the global ArrayLabels etc. Therefore, if you open it, you override the unlabeled implementations of Array etc.

14.3 The structure of lists

14.3.1 Lists via a self-recursive parameterized variant

list is a self-recursive parameterized variant. If you were to define it yourself, this is what it would look like:

type mylist('a) =
  | Nil;
  | Cons('a, mylist('a))

The names Nil and Cons (“construct”) are historical and originated with the Lisp programming language. You nest Cons to create lists:

# let abc = Cons("a", Cons("b", Cons("c", Nil)));
let abc: mylist(string) = Cons("a", Cons("b", Cons("c", Nil)));

mylist has the type parameter 'a. It is passed on to Cons and its recursive use of mylist. That means two things: First, the elements of mylist can have any type. Second, they must all have the same type. In the previous interaction with ReasonML, you can see that it automatically inferred that for abc, 'a is string.

abc is a singly-linked list. In memory, it may look like this:

The two parts of a cons pair are called:

14.3.2 Creating and pattern-matching ReasonML’s lists

ReasonML has special syntax for lists. The two constructors are:

Therefore, pattern matching works like this:

switch myList {
| [] => ···
| [head, ...tail] => ···
}

This is one way of recreating the list abc from the previous section:

# let abc = ["a", ...["b", ...["c", ...[]]]];
let abc: list(string) = ["a", "b", "c"];

You can see that rtop suggests the following, more compact, syntax, which is equivalent:

# let abc = ["a", "b", "c"];
let abc: list(string) = ["a", "b", "c"];

Let’s use pattern matching to compute the length of any list myList:

let rec len = (myList: list('a)) =>
  switch myList {
  | [] => 0
  | [_, ...tail] => 1 + len(tail)
  };

We recurse over the two constructors:

The type parameter 'a makes function len generic. But we are never interested in the type of the elements, only in the structure of the list. The following interaction uses len with various lists:

# len([]);
- : int = 0
# len(["a", "b"]);
- : int = 2
# len([1, 2, 3, 4]);
- : int = 4

14.3.3 Printing lists

ReasonML has no built-in support for printing complex data structures. But BuckleScript lets you use JavaScript’s console.log(). It’s best to use this function like this:

Js.log(Array.of_list(myList));

Before we print the list myList, we convert it to an array. But why? It leads to nicer output, because ReasonML lists are represented as nested 2-element arrays in JavaScript (an encoding of cons pairs).

14.3.4 More ways of creating lists

The triple dot constructor is also called spread operator. This operator lets you prepend zero or more elements before an existing list:

# [...["a", "b"]];
- : list(string) = ["a", "b"]
# ["a", ...["b", "c"]];
- : list(string) = ["a", "b", "c"]
# ["a", "b", ...["c", "d"]];
- : list(string) = ["a", "b", "c", "d"]

Alas, this operator only works at the end. In JavaScript, you can use it anywhere, but in ReasonML, you can’t:

# [...["a", "b"], ...["c", "d"]];
Error: Syntax error

ReasonML has its own operator for concatenating lists:

# ["a", "b"] @ ["c", "d"];
- : list(string) = ["a", "b", "c", "d"]

Note that concatenating lists is comparatively slow, because you must prepend each element of the first operand to the second operand:

let rec append = (l1: list('a), l2: list('a)) =>
  switch l1 {
  | [] => l2
  | [head, ...tail] => [head, ...append(tail, l2)]
  };

This is how you use append:

# append([1,2,3], [4,5]);
- : list(int) = [1, 2, 3, 4, 5]

This is, once again, type inference at work:

14.4 Examples: creating lists

range() creates a list of ints:

/**
 * Compute a list of integers starting with `start`,
 * up to and excluding `end_`.
 */
let rec range = (start: int, end_: int) =>
  if (start >= end_) {
    [];
  } else {
    [start, ...range(start + 1, end_)];
  };

end is a keyword in ReasonML and therefore not a legal variable name. That’s why the parameter end_ has the underscore in its name. Let’s try out range():

# range(0, 0);
- : list(int) = []
# range(0, 1);
- : list(int) = [0]
# range(0, 5);
- : list(int) = [0, 1, 2, 3, 4]

fill() creates a list filled with the value ~element:

/**
 * Create a list of length `~length` where each
 * element is `~element`.
 */
let rec fill = (~element: 'a, ~length: int) =>
  if (length <= 0) {
    [];
  } else {
    [element, ...fill(~element, ~length=length-1)];
  };

ReasonML uses the type of ~element to infer the type of the result:

# fill("x", 4);
- : list(string) = ["x", "x", "x", "x"]
# fill(0, 3);
- : list(int) = [0, 0, 0]

14.5 Examples: reading lists

14.5.1 Computing the sum of a list

summarize() computes the total of all the ints in a list:

/**
 * Compute the sum of all the ints in the list `l`.
 */
let rec summarize = (l: list(int)) =>
  switch l {
  | [] => 0
  | [head, ...tail] => head + summarize(tail)
  };

summarize([]); /* 0 */
summarize([3]); /* 3 */
summarize([1, 2, 3]); /* 6 */

14.5.2 Accessing the n-th list element

getElementAt() retrieves a list element by index:

/**
 * Get the list element at index `~index`.
 * The head of a list has the index 0,
 * the head of its tail the index 1, etc.
 */
let rec getElementAt = (~index: int, l: list('a)) =>
  switch l {
  | [] => None
  | [head, ...tail] =>
    if (index <= 0) {
      Some(head);
    } else {
      getElementAt(~index=index-1, tail);
    }
  };

We can eliminate the if-then-else expression if we use a when clause and an additional case for switch. The resulting flat structure is slightly easier to read:

let rec getElementAt = (~index: int, l: list('a)) =>
  switch l {
  | [] => None
  | [head, ..._] when index <= 0 => Some(head)
  | [head, ...tail] => getElementAt(~index=index-1, tail)
  };

A few things are noteworthy in this code:

The standard library has ListLabels.nth() that works like getElementAt(), but it throws an exception for illegal indices, it does not use option.

14.6 Examples: changing lists

Given that lists are immutable – how do you change them? To find an answer, consider that, so far, we have seen two kinds of algorithms:

To change a list, we combine both approaches: We create a completely new list and include data from the existing list or derive data from it, as needed.

14.6.1 removeAll()

The following is a first example of a function that changes an existing list.

/**
 * Remove all elements from the list `l` that are
 * equal to `~value`.
 */
let rec removeAll = (~value: 'a, l: list('a)) =>
  switch l {
  | [] => []
  | [head, ...tail] when head == value => removeAll(~value, tail)
  | [head, ...tail] => [head, ...removeAll(~value, tail)]
  };

The first case means that we are done. The third case makes an exact copy of the existing list. The second case removes elements that are equal to ~value.

This is removeAll() in action:

# removeAll(~value=9, [1,9,2,9,3]);
- : list(int) = [1, 2, 3]

14.6.2 replaceAll()

replaceAll() replaces values:

/**
 * Inside the list `l`, remove all occurrences of the value `~value`
 * with the value `~with_`.
 */
let rec replaceAll = (~value: 'a, ~with_: 'a, l: list('a)) =>
  switch l {
  | [] => []
  | [head, ...tail] when head == value =>
    [with_, ...replaceAll(~value, ~with_, tail)]
  | [head, ...tail] =>
    [head, ...replaceAll(~value, ~with_, tail)]
  };

The first case means we are finished. The third case makes an exact copy. The second case makes a replacement.

We can make replaceAll() more compact via an internal helper function replaceOne():

let rec replaceAll = (~value: 'a, ~with_: 'a, l: list('a)) => {
  let replaceOne = (x) => if (x == value) with_ else x;
  switch l {
  | [] => []
  | [head, ...tail] =>
    [replaceOne(head), ...replaceAll(~value, ~with_, tail)]
  };
};

This is replaceAll() in action:

# replaceAll(~value=1, ~with_=11, [1, 2, 1, 3]);
- : list(int) = [11, 2, 11, 3]

14.6.3 drop()

drop() removes list elements:

/**
 * Remove the first `~count` elements of `theList`.
 */
let rec drop = (~count, theList: list('a)) =>
  switch theList {
  | [] => []
  | l when count <= 0 => l
  | [_, ...tail] => drop(~count=count-1, tail)
  };

Let’s use drop():

# drop(~count=0, ["a", "b", "c", "d"]);
- : list(string) = ["a", "b", "c", "d"]
# drop(~count=2, ["a", "b", "c", "d"]);
- : list(string) = ["c", "d"]
# drop(~count=2, ["a", "b"]);
- : list(string) = []
# drop(~count=2, ["a"]);
- : list(string) = []
# drop(~count=2, []);
- : list('a) = []

For the last result of drop(), ReasonML can’t infer the element type and leaves the type parameter 'a unbound.

14.7 Standard library functions for lists

ReasonML’s standard library is still in flux. Therefore, We are only looking at a few highlights here. You can read up on the rest (as it currently exists) via the documentation for ListLabels.

14.7.1 ListLabels.map()

Signature:

let map: (~f: 'a => 'b, list('a)) => list('b);

map() takes a list with elements of type 'a, applies the function ~f to each element and returns the results in another list.

# ListLabels.map(~f=x => int_of_string(x), ["7", "15", "6"]);
- : list(int) = [7, 15, 6]

This function is a classic tool for processing lists of data.

mapi() is a version of map() that passes both the current element and the element’s index to the callback ~f. We can use mapi() to non-destructively update lists:

/**
 * Create a copy of `theList` whose element at index `~index`
 * is `~value`.
 */
let setElementAt = (~index: int, ~value: 'a, theList: list('a)) =>
  ListLabels.mapi(
    ~f=(i,x) => if (i==index) value else x,
    theList
  );

The parameter ~f passes on all elements unchanged, except for the element at index ~index.

This is setElementAt() in use:

# setElementAt(~index=1, ~value="|", ["a", "b", "c"]);
- : list(string) = ["a", "|", "c"]

14.7.2 ListLabels.filter()

This is the function’s signature:

let filter: (~f: 'a => bool, list('a)) => list('a);

filter() applies the function ~f to each element of its positional parameter. If it returns true, the element is included in the result. If it returns false, it isn’t. It is used as follows.

# ListLabels.filter(~f=x=>x>5, [8, 4, 9, 7, 2]);
- : list(int) = [8, 9, 7]

14.7.3 ListLabels.for_all()

Signature:

let for_all: (~f: 'a => bool, list('a)) => bool;

for_all() returns true if ~f returns true for each element of the list. For example:

# ListLabels.for_all(~f=x=>x>3, [4,5,6]);
- : bool = true
# ListLabels.for_all(~f=x=>x>3, [3,4,5,6]);
- : bool = false

for_all stops processing as soon as ~f returns false. Then the result is guaranteed to be false. for_all is named after the mathematical operator ∀.

ListLabels.exists() is related to for_all(): It returns true if its callbacks returns true for at least one of the elements of its list. exists is named after the mathematical operator ∃.

14.7.4 ListLabels.flatten()

Signature:

let flatten: list(list('a)) => list('a);

flatten() converts a list of lists l, to a list, by concatenating the elements of l. That is, the following three expressions are equivalent:

flatten([l1, l2, l3])
ListLabels.append(l1, ListLabels.append(l2, l3))
l1 @ l2 @ l3

This is how flatten() is used:

# ListLabels.flatten([[1,2], [], [3,4,5]]);
- : list(int) = [1, 2, 3, 4, 5]

If you are wondering about arbitrarily nested lists, recall that, in ReasonML, all elements must have the same type. Therefore, if one list element is itself a list then all elements must be lists:

# ListLabels.flatten([[1,[2]], [], [3]]);
Error: This expression has type list('a)
but an expression was expected of type int
# ListLabels.flatten([[[1],[2]], [], [[3]]]);
- : list(list(int)) = [[1], [2], [3]]

Let’s continue by looking at use cases for flatten().

14.7.4.1 Use case: filtering and mapping at the same time

flatten() lets you filter and map at the same time. As an example, consider the trying to extract the first elements of several lists stored in a list. You could:

Or you could use flatten and do both at the same time:

module L = ListLabels;
let listFromHead = (l: list('a)) =>
  switch (l) {
  | [] => []
  | [head, ..._] => [head]
  };
let heads = (l: list(list('a))) =>
  L.flatten(L.map(~f=listFromHead, l));

First, we map each non-empty list to a list with its head and each empty list to an empty list. Then we flatten the result. This looks as follows:

# let l0 = [[1, 2], [], [3,4,5]];
let l0: list(list(int)) = [[1, 2], [], [3, 4, 5]];
# L.map(~f=listFromHead, l0);
- : list(list(int)) = [[1], [], [3]]
# let l1 = L.map(~f=listFromHead, l0);
let l1: list(list(int)) = [[1], [], [3]];
# L.flatten(l1);
- : list(int) = [1, 3]

These steps are equivalent to:

# heads([[1, 2], [], [3,4,5]]);
- : list(int) = [1, 3]

It is instructive to compare listFromHead to a function getHead that uses option to express failure:

let getHead = (l: list('a)) =>
  switch (l) {
  | [] => None
  | [head, ..._] => Some(head)
  };

That is, None expresses “l does not have a head”:

# getHead(["a", "b"]);
- : option(string) = Some("a")
# getHead([1, 2, 3]);
- : option(int) = Some(1)
# getHead([]);
- : option('a) = None

With listFromHead, we used the empty list instead of None and a one-element list instead of Some.

14.7.4.2 Use case: mapping to multiple values

Let’s assume that we have created a list of people and their children:

type person = Person(string, list(string));
let persons = [
  Person("Daisy", []),
  Person("Della", ["Huey", "Dewey", "Louie"]),
  Person("Marcus", ["Minnie"])
];

If we want to collect the children in a list, ListLabels.map() almost gives us what we want, but not quite:

# ListLabels.map(~f=(Person(_, ch)) => ch, persons);
- : list(list(string)) = [[], ["Huey", "Dewey", "Louie"], ["Minnie"]]

Alas, this is a list of lists of strings, not a list of strings. We can fix this by applying ListLabels.flatten() to this nested list:

let collectChildren = (persons: list(person)) =>
  ListLabels.flatten(
    ListLabels.map(
        ~f=(Person(_, children)) => children,
        persons));

collectChildren(persons);
  /* ["Huey", "Dewey", "Louie", "Minnie"] */

Now we get the desired result:

# collectChildren(persons);
- : list(string) = ["Huey", "Dewey", "Louie", "Minnie"]
14.7.4.3 Use case: conditionally inserting values

Sometimes, you create lists where some elements are added or omitted depending on a condition (cond in the following example):

let func = (cond: bool) => ListLabels.flatten([
  if (cond) ["a"] else [],
  [
    "b",
    "c"
  ]
]);

This is how func() is used:

# func(true);
- : list(string) = ["a", "b", "c"]
# func(false);
- : list(string) = ["b", "c"]

14.7.5 ListLabels.fold_left()

Signature:

let fold_left: (~f: ('a, 'b) => 'a, ~init: 'a, list('b)) => 'a;

fold_left() works as follows:

To compute the result, fold_left() depends on its parameter, the function ~f. It calls ~f for each element elem of its input list:

let nextIntermediateResult = f(intermediateResult, elem);

intermediateResult is what has already been computed. The first intermediate result is ~init. The last nextIntermediateResult is the result of fold_left().

Let’s look at a concrete example.

14.7.5.1 fold_left() by example: summarize()

We have already encountered the function summarize() which computes the total of a list of ints. Let’s implement that function via fold_left():

let rec summarize = (l: list(int)) =>
  ListLabels.fold_left(~f=(r, elem) => r + elem, ~init=0, l);

To understand how summarize() works, consider the following expression:

summarize([1,2,3]) /* 6 */

To compute the result 6, the following steps are taken:

/* Parameter */
let f = (r, elem) => r + elem;
let init = 0;

/* Steps */
let result0 = f(init, 1); /* 1 */
let result1 = f(result0, 2); /* 3 */
let result2 = f(result1, 3); /* 6 */

result2 is the result of fold_left().

14.7.5.2 Another way of looking at fold_left()

Another way of looking at fold_left() is that takes a binary operator ~f and turns it into an n-ary operator for lists. An example in mathematics for going from binary to n-ary is the binary operator + also existing as an n-ary version (the operator Σ). summarize() went from + to Σ. It could also be written like this:

# ListLabels.fold_left(~f=(+), ~init=0, [1, 2, 3]);
- : int = 6

I find fold_left easiest to understand if it works in this mode – with an ~f that is commutative (order of parameters doesn’t matter). But there is much you can do with it – read on for an example.

14.7.5.3 A more complex example: finding values

The function contains() uses it to find a value in a list:

let contains = (~value: 'a, theList: list('a)) => {
  let f = (found, elem) => found || elem == value;
  fold_left(~f, ~init=false, theList);
};

14.7.6 Converting lists to arrays via iteri()

Signature:

let iteri: (~f: (int, 'a) => unit, list('a)) => unit;

iteri() call ~f for every element of its list. The arguments are the index of the element and the element. It returns unit, which means that anything useful that iteri does, it does via side effects.

The following function uses iteri() to fill an array. It does so as a side effect, by writing to an array arr:

let arrayFromList = (~init: 'a, l: list('a)) => {
  let arr = ArrayLabels.make(ListLabels.length(l), init);
  ListLabels.iteri(~f=(i, x) => arr[i]=x, l);
  arr;
};

~init is a required parameter, because make() needs it (why is explained later).

arrayFromList() in action:

# arrayFromList(~init=0, [1,2,3]);
- : array(int) = [|1, 2, 3|]

Now is a good time to read the chapter on recursion in this book. It will give you a deeper understanding of recursion and tell you how you can make the code in this chapter more efficient (via tail calls).