Introduction
If you have already used functional languages, you are probably familiar with
fold, a high order function used to iterate on a collection of values to
combine them and return a result. You may be surprised that Common Lisp does
not have a fold function, but provides REDUCE
which works a bit differently.
Let us see how they differ.
Understanding REDUCE
In its simplest form, REDUCE
accepts a function and a sequence (meaning
either a list or a vector). It then applies the function to successive pairs
of sequence elements.
You can easily check what happens by tracing the function:
CL-USER> (trace +)
CL-USER> (reduce #'+ '(1 2 3 4 5))
0: (+ 1 2)
0: + returned 3
0: (+ 3 3)
0: + returned 6
0: (+ 6 4)
0: + returned 10
0: (+ 10 5)
0: + returned 15
15
In this example, the call to REDUCE
evaluates (+ (+ (+ (+ 1 2) 3) 4) 5)
.
You can reverse the order using the :from-end
keyword argument:
CL-USER> (trace +)
CL-USER> (reduce #'+ '(1 2 3 4 5) :from-end t)
0: (+ 4 5)
0: + returned 9
0: (+ 3 9)
0: + returned 12
0: (+ 2 12)
0: + returned 14
0: (+ 1 14)
0: + returned 15
15
In which case you will evaluate (+ 1 (+ 2 (+ 3 (+ 4 5))))
. The result is of
course the same since the +
function is associative.
You can of course provide an initial value, in which case REDUCE
will behave
as if this value has been present at the beginning (or the end with
:from-end
) of the sequence.
The surprising aspect of REDUCE
is its behaviour when called on a
sequence with less than two elements:
- If the sequence contains a single element:
- if there is no initial value, the function is not called and the element is returned directly;
- if there is one, the function is called on both the initial value and the single element.
- If the sequence is empty:
- if there is no initial value, the function is called without any argument;
- if there is one, the function is not called and the initial value is returned directly.
As a result, any function passed to REDUCE
must be able to handle being
called with zero, one or two arguments. Most examples found on the Internet
use +
or append
, and these functions happen to handle it (e.g. (+)
returns the identity element of the addition, zero). If you write your own
functions, you will have to deal with it using the &OPTIONAL
lambda list
keyword.
This can lead to unexpected behaviours. If you compute the sum of a sequence
of floats using (reduce #'+ floats)
, you may find it logical to obtain a
float. But if FLOATS
is an empty sequence, you will get 0
which is not a
float. Something to keep in mind.
Differences with fold
The fold function is traditionally defined as accepting three arguments: a function, an initial value — or accumulator — and a list. The function is called repeatedly with both the accumulator and a list element, using the value returned by the function as next accumulator.
For example in Erlang:
lists:foldl(fun(X, Sum) -> Sum + X end, 0, [1, 2, 3, 4, 5]).
An interesting consequence is that fold functions are always called with the
same type of arguments (the list value and the accumulator), while REDUCE
functions can be called with zero or two list values. This makes it
harder to write functions when the accumulated value has a different type from
sequence values.
Fold is also simpler than REDUCE
since it does not have any special case,
making it easier to reason about its behaviour.
It would be interesting to know why a function as fundamental as fold was not included in the Common Lisp standard.
Implementing FOLDL
We can of course implement a fold function in Common Lisp. We will concentrate on the most common (and most efficient) left-to-right version. Let us start by a simple implementation for lists:
(defun foldl/list (function value list)
(declare (type (or function symbol) function)
(type list list))
(if list
(foldl/list function (funcall function value (car list)) (cdr list))
value))
As clearly visible, the recursive call to FOLDL/LIST
is in tail position and
SBCL will happily perform tail-call elimination.
For vectors we use an iterative approach:
(defun foldl/vector (function value vector)
(declare (type (or function symbol) function)
(type vector vector))
(do ((i 0 (1+ i))
(accumulator value))
((>= i (length vector))
accumulator)
(setf accumulator (funcall function accumulator (aref vector i)))))
Finally we write the main FOLDL
function which operates on any sequence:
(defun foldl (function value sequence)
(declare (type (or function symbol) function)
(type sequence sequence))
(etypecase sequence
(list (foldl/list function value sequence))
(vector (foldl/vector function value sequence))))
At the point we can already use FOLDL
for various operations. We could of
course improve it with the addition of the usual :START
, :END
and :KEY
keyword arguments for more flexibility.