# Composing traversals

· ·

This post assumes basic knowledge on Haskell, including what `Functor` and `Monad` are.

## The problem

A few months ago, I was at a functional programming conference where the following problem1 was posed at a talk

Given a container `Tree`, write a function `update :: Tree a -> [a] -> Maybe (Tree a)` that, when called as `update t as`, updates the leaves of `t` by the new tags in `as` (listed in preorder).

If there are not enough elements on `as`, just return `Nothing`

This post attempts to explain a concise solution to that problem. You might want to try it on your own at first and then come back and read my solution. If you think your solution is better, please send it to me!

We will need the following modules:

``````import Data.Tree
import Data.Maybe
import Data.Functor.Compose
``````

## Multi-way `Tree`s

In order to solve the problem, we will make use of multi-way trees, also known as rose trees, which are defined as

``````data Tree a = Node a [Tree a]
``````

That is, a `Tree` is made from the value of the root node (of type `a`) and a (possibly empty) list of children trees. To warm-up we will need a simple function2 that creates a singleton tree from a value.

``````singleton :: a -> Tree a
singleton x = Node x []
``````

That was easy! Now, to simplify the problem a little, let’s first try writing an

``````unsafeUpdate :: Tree a -> [a] -> Tree a
``````

function. If you don’t know how to approach this problem, you can think of it on imperative terms.

The following is a plausible solution on Python, assuming some tree type with methods `tag` and `children` and a predicate `isLeaf` that tells you whether a tree is a leaf.

``````def unsafeUpdate(t, as):
if isLeaf(t):
t.tag = as.pop(0)
else:
for child in t.children:
unsafeUpdate(t,as)
``````

As we shall see, the solution present in this post is strikingly similar in appearance to that one, yet it remains purely functional. Of course, other possibilities exist, yet I believe this one is concise, efficient and it is easily transformed into a safe version.

## Purely functional `State`

This section introduces the `State` monad. If you already know about it, you may skip it.

The `Control.Monad.State` module can be scary-looking at first, yet we will make simple use of it. It defines a monad3 `State`, that can be thought of as having this definition:

``````newtype State s a = State {runState :: s -> (a,s) }
``````

A value `m :: State s a` encapsulates a transformation that makes use of a state `s`, it (possibly) transforms it and obtains a value of type `a`. For example, a function from `System.Random` such as `random` `:: StdGen -> (a, StdGen)` could be encapsulated inside a `randomState :: State StdGen a` value that produces a random value and has as an internal state the seed of type `StdGen`.

By encapsulating state transformations in this way, we can remain pure while simulating computations that use some sort of state.

Unsurprisingly, for any possible type `s`, `State s` is a functor, with mapping given by

``````f <\$> m = State \$ \s -> let (a,s) = runState m s in (f a, s)
``````

That is, we run the state modifying computation `m` and then apply `f` to its result, leaving the state unchanged.

It also is an `Applicative` and a `Monad`. For simplicity, we only show the definitions of `pure` and `>>=`, leaving `<*>` as an exercise4.

`pure x` is the computation that returns `x`, leaving the state unchanged.

``````pure x = State \$ \s -> (x,s)
``````

The bind operator is a bit more complicated; on this setting its type would be

``````>>= :: State s a -> (a -> State s b) -> State s b
``````

that is, it takes a computation that would produce a value of type `a`, runs it and applies the given function.

Its definition is equivalent to

``````m >>= f = State \$ \s ->
let (a, s') = runState m s
in runState (f a) s'
``````

State also defines a set of functions that allow us to get and manipulate the internal state, and to get the final value of the computation. We will use (with simplified type signatures):

• `get`, that gets the state,
``````get :: State s s
get = State \$ \s -> (s,s)
``````
• `modify`, that applies a function to the internal state,
``````modify :: (s -> s) -> State s ()
modify f = State \$ \s -> ((), f s)
``````
• `evalState`, that gets the final value of the computation,
``````evalState :: State s a -> s -> a
evalState = fst . runState
``````

If you feel like you need further examples on `State` before moving on, you can check the Wikibooks tutorial.

## The base case

We are now ready to write a skeleton of the `unsafeUpdate` function, and to write a the base case.

As you might expect, the idea is to write it in terms of a function using `State`. The internal state will be the list of remaining new leaves, and thus we want to write a function with type

``````unsafeUpdateSt :: Tree a -> State [a] (Tree a)
``````

By using what we learned in the last section, we could then write:

``````unsafeUpdate = evalState . unsafeUpdateSt
``````

But let’s not get ahead of ourselves. If we look at the base case on the Python version we want to simulate the line `t.tag = as.pop(0)`, that is, extracting the first value of the `as` list and returning it.

Let’s try and write a version of `pop` (named `unsafePop`) using the `State` monad. What would its type be? We don’t really need to specify the index here, since we will only take the first element, hence, it is just a `State` value. The internal state would be the list of tags `[a]` and the result of the computation would be the first element of the list, with type `a`. Therefore:

``````unsafePop :: State [a] a
``````

Now, using `do` notation we can write the function easily: we first get the internal state, then we delete the first element from it and lastly we return this element:

``````unsafePop = do
xs <- get
modify tail
``````

That almost looks like imperative code!

Now, for tackling the base case, we just need to get a `State [a] (Tree a)` from that `State [a] a` value, therefore:

``````unsafeUpdateSt (Node _ []) = singleton <\$> unsafePop
``````

The inductive case will need further machinery.

## List traversals

If we look at the Python code, we can see that the gist of the inductive case is to apply the function `unsafeUpdateSt` recursively to the list of children,

``````for child in t.children:
unsafeUpdate(t,as)
``````

and pass the list of remaining new tags around.

Of course, there are no for loops on Haskell, (at least not exactly) so we need to resort to (implicit) recursion. Now, the usual way to apply a function to a list is by treating the list as a functor and using the `map` function, whose definition (with `:` written in prefix notation) is

``````map :: (a -> b) -> [a] -> [b]
map _     []  = []
map f  (x:xs) = (:) (f x) (map f xs)
``````

yet in this case or function `unsafeUpdateSt :: Tree a -> State [a] (Tree a)` has a type that looks more like `a -> f b` for a certain container `f` (namely, `State [a]`) and thus `map` is not useful in this situation5. A generalization is needed.

The `Traversable` class provides just that generalization (and much more). We will use it only on lists, but it can be used on any `Functor` that supports an iterator-like interface (for example, most data structures on the `containers` library). Its type (specialized to lists) is

``````traverse :: (Applicative f) => (a -> f b) -> [a] -> f [b]
``````

Its definition6 is pretty straightforward, namely,

``````traverse -     [] = pure []
traverse f (x:xs) = (:) <\$> f x <*> traverse f xs
``````

which (excluding the use of `<*>` and `<\$>`) looks very much like the `map` definition. It is a generalization since we can use the `Identity` functor to get back

``````map f = runIdentity . traverse (Identity . f)
``````

Using this function we can apply `unsafeUpdateSt` recursively to the list of children and then reconstruct the tree by mapping the `Node` function appropriately. That is,

``````unsafeUpdateSt (Node x children) = Node x <\$> traverse unsafeUpdateSt children
``````

This feels very similar to the imperative code! In fact, the `Data.Traversable` module defines `for = flip traversable` so we could define

``````unsafeUpdateSt (Node x children) = Node x <\$> for children unsafeUpdateSt
``````

So, to recall, we can define `unsafeUpdate` in terms of `unsafePop` as:

``````unsafeUpdateSt :: Tree a -> State [a] (Tree a)
unsafeUpdateSt (Node _ []) = singleton <\$> unsafePop

unsafeUpdate :: Tree a -> [a] -> Tree a
``````

Now on to the actual problem!

## Composing functors

The `Data.Functor.Compose` module defines the `Compose` newtype,

``````newtype Compose f g a = Compose { getCompose :: f (g a) }
``````

This type allows us to compose two type constructors (with kind `f, g :: Type -> Type`) into a new one.

The interesting feature that `Compose` allows is to define instances for `Functor`, `Applicative` or `Traversable` for the composition of two `Functor`s/`Applicative`s/`Traversable`s.

As an example, suppose `f, g :: Type -> Type` are both functors. Then `Compose f g` is also a functor, where the `fmap` is given by

``````fmap :: (a -> b) -> Compose f g a -> Compose f g b
fmap f (Compose x) = Compose (fmap (fmap f) x)
``````

where the first `fmap` is the one for `f` and the second the one for `g`.

Similar instances can be defined for `Applicative` and `Traversable` (check the source code if you are curious!).

How can we take advantage of `Compose` for our problem? Let’s start by refining the pop function. Recall from the previous sections that we defined a function `unsafePop :: State [a] a`.

The problem with this function is that it is unsafe: if the state is the empty list, the function will throw an error. To solve this, we change the type to `State [a] (Maybe a)`, returning `Nothing` if there are no remaining elements in the list.

We wrap it up with the `Compose` constructor, having

``````pop :: Compose (State [a]) Maybe a
pop = Compose \$ do
xs <- get
modify' (drop 1)
pure (listToMaybe xs)
``````

Now, our `unsafeUpdateSt` function can be used as is for the safe version of the function. We only need to replace the `unsafePop` function by `pop`:

``````updateSt :: Tree a -> Compose (State [a]) Maybe (Tree a)
updateSt (Node _ []) = singleton <\$> pop
``````

Notice that both `<\$>` and `traverse` here are using the instance of `Functor` and `Traversable` for the type `Compose (State [a]) Maybe`.

Finally, the solution to our problem will be to unwrap the one provided by `updateSt`.

``````update :: Tree a -> [a] -> Maybe (Tree a)
update = evalState . getCompose . updateSt
``````

As you can see, once we had the unsafe version, creating the safe one was just a matter of updating the `pop` function appropriately and dealing with the wrapping and unwrapping of `Compose`.

## Conclusions

In this post we have seen a possible solution to a problem dealing with trees using Haskell.

The use of `Compose`, `Traversable` and `Functor` allowed us to translate an imperative-looking solution into purely functional code, producing a concise yet complete solution. Furthermore, the solution could be built step-by-step, by considering a simplified problem first.

This might not be the best most efficient solution for this problem7, but it is an excellent way to present such type-classes’ usefulness.

As I said at the beginning, any improvements or alternative solutions are welcome. Until next time!

1. That was not the exact statement; the point of the talk was to show how to use dependent types to statically avoid the possibility of not having the correct number of tags on the list, but this version is better suited for the purpose of this post. It also used binary trees while we will use multi-way trees.

2. The `Data.Tree` module already includes such function on the `Applicative` instance (i.e., the `pure` function), but I think that makes the code slightly more confusing (since `Tree`’s behavior as an `Applicative` is far from trivial).

3. Well, technically a class of monads, so as to allow for the use of “monad transformers”, but we will not make use of transformers on this post.

4. Although at present time it might be more beginner-friendly to restrict the talk to Monads and use `return` instead of `pure`, the “Monad of no return” proposal seems likely to be accepted, thus in the long term sticking to `pure` is the right choice. `return` is also discouraged in other Haskell-inspired languages such as Idris.

5. If you have used monads (specially if you did so before the Burning Bridges proposal of GHC 7.10 was put in place) it is likely that you have met with `mapM`. Although `mapM` can be used for defining `unsafeUpdateSt`, it will not be applicable (directly) in the safe version of update, because of the `Monad` restriction. Furthermore, I believe it should be discouraged in favor of `traverse`, since in current versions of GHC `mapM` is just `traverse` with a restricted type.

6. One should use `foldr` here and that is what the Prelude does but for didactic purposes I chose this definition.

7. In particular, although I have not checked it, it seems likely that you would need to deal with laziness to avoid a space leak.