Today I ran a workshop on functional programming in object-oriented languages at a London Software Craftsmanship Community meetup. I’ll post up the problem I asked people to solve in the next couple of days, but as I’m currently physically incapable of sleeping, I thought I’d take some time and talk to you about lists.
Lists are one of the most powerful and versatile constructs we have in our day-to-day toolkits. We use them everywhere. I can probably count the number of times I’ve implemented anything even slightly complex without using a list on the fingers of one hand. Today, I want to explain why we use them wrongly.
First, let’s identify the most common operations we perform on lists. We loop over them, inspecting and using each element of the list. We add items to them, and remove others. We don’t often retrieve arbitrary items from them (how many times have you asked specifically for element 12 of a list?), but we do often need just the first or last element.
Most mainstream programming languages—Java, Ruby, Python and C# among
them—use an array to implement the default list type. In memory, arrays
are continuous chunks of memory, allocated all at once. The array object
itself is represented by the memory address of the 0th element, its
length, and the size of each individual element (which must be the same
across the entire array). Arrays, and therefore lists, are optimised for
random access. Element 34 lives at location
start + 34 * element_size.
While retrieving elements is a simple process, adding a new one is relatively complicated. If I have a hundred items in my array and I want to add another, I need to create a new array of 101 elements, copy everything over, and add the last one. As you can imagine, this is pretty time-consuming. The normal way for “ArrayLists” (that’s what Java calls ’em) to scale is to multiply the size by a constant factor (normally 1.5 or 2) using the above method every time the user exceeds its capacity, and store the actual number of elements in addition to the length of the internal array. That means they don’t have to duplicate the array every time the user adds another item—just every so often, and it happens less frequently as you add more elements.
Enter the linked list. If you, like me, studied programming at school or university, you may remember the linked list as an interesting concept, but with no real practical use. I want to change that.
The wonderful thing about linked lists is that if you treat them right, they’re entirely immutable. And treating them right is quite simple: only add values to the beginning of the list.
Say I have the numbers from 1 to 10:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
I then want to prefix the number 0, because all good programmers count from 0.
numbers_from_zero = 0 : numbers
That colon (
:) is the cons operator. Odd name, I know, but it’s a
simple enough concept. It takes an object on the left and a list on the
right, and returns a new list which is the old one prefixed with the
object on the left. The lovely thing about this is that I can keep the
pointer to the old list and it hasn’t changed at all. It still points to
the right place. In this case,
numbers still points to the first
element of my linked list,
1, and goes on from there.
numbers_from_zero points to the
0, which points to the
1. I can
use these lists independently even though they share most of their
Say I want to decompose the list. Let’s give it two methods:
head returns the first item of the list, and
the rest. This is the exact opposite of the cons operation, which took a
head and a tail and stuck them together. Now we’re breaking them back
So let’s do something simple. How about multiplying everything in our list by two? Let’s define a function that does exactly that.
multiply_by_two (list) = (list.head * 2) : multiply_by_two(list.tail)
Well, there’s one problem here. That will recurse forever. Let’s fix that.
multiply_by_two (list) = if list.isEmpty then  else (list.head * 2) : multiply_by_two(list.tail)
We’ve introduced a new operation here. As well as pulling the head and
tail out of the list, we need to be able to test it to see whether it’s
empty or not. The
isEmpty function here does exactly that. We then
return nil, the empty list, if it is. This means that when we finally
get to the end of the list, we stop.
Let’s expand this one.
multiply_by_two([1, 2, 3, 4, 5]) = (1 * 2) : multiply_by_two([2, 3, 4, 5]) = 2 : (2 * 2) : multiply_by_two([3, 4, 5]) = 2 : 4 : (3 * 2) : multiply_by_two([4, 5]) = 2 : 4 : 6 : (4 * 2) : multiply_by_two() = 2 : 4 : 6 : 8 : (5 * 2) : multiply_by_two() = 2 : 4 : 6 : 8 : 10 :  = [2, 4, 6, 8, 10]
Done. We have an entirely new list which contains the same numbers as
the first, multiplied by two. And we did this without ever mutating the
original list. We just popped stuff off and pushed new things on. The
important thing to remember is that the
: operator creates new lists,
it doesn’t change the existing one.
We can generalise this a little. We often want to transform every item in the list. So let’s define a way of mapping a function—any function—to each item in the list. That way we can apply any function we like across an entire list without having to rewrite it for lists. This is the cornerstone of good functional programming: reusable, composable functions.
Let’s call it
map, and make it a method on the List class. It takes a
List.map (f) = if this.isEmpty then  else f(this.head) : this.map(f, this.tail)
I can then reimplement my
multiply_by_two function pretty easily.
multiply_by_two (list) = list.map((x) => x * 2)
Bam. Functional programming. Ain’t it tidy?
Here’s an exercise for you. How would you implement a method on List
that removes all elements that don’t match a predicate? Call it
filter. Here’s your test case:
isEven (x) = x % 2 == 0 [1, 2, 3, 4, 5, 6].filter(isEven) == [2, 4, 6]
I’ve written all these examples in a fictional language, so if you want to code it, you’ll have to reimplement them. That said, you don’t have to try and code it. Just think about it. When you’ve got it, post it in the comments.
If you enjoyed this post, you can subscribe to this blog using RSS. I personally use Feedly; you can subscribe here.
Maybe you have something to say. You can comment below, email me, or toot at me. I love feedback. I also love gigantic compliments, so please send those too.