Nicolas Martyanoff – Brain dump About

Reduce vs fold in Common Lisp

Introduction

If you have already used functional languages, you are probably familiar with fold, a high order function used to iterate on a collection of values to combine them and return a result. You may be surprised that Common Lisp does not have a fold function, but provides REDUCE which works a bit differently. Let us see how they differ.

Understanding REDUCE

In its simplest form, REDUCE accepts a function and a sequence (meaning either a list or a vector). It then applies the function to successive pairs of sequence elements.

You can easily check what happens by tracing the function:

CL-USER> (trace +)
CL-USER> (reduce #'+ '(1 2 3 4 5))
  0: (+ 1 2)
  0: + returned 3
  0: (+ 3 3)
  0: + returned 6
  0: (+ 6 4)
  0: + returned 10
  0: (+ 10 5)
  0: + returned 15
15

In this example, the call to REDUCE evaluates (+ (+ (+ (+ 1 2) 3) 4) 5).

You can reverse the order using the :from-end keyword argument:

CL-USER> (trace +)
CL-USER> (reduce #'+ '(1 2 3 4 5) :from-end t)
  0: (+ 4 5)
  0: + returned 9
  0: (+ 3 9)
  0: + returned 12
  0: (+ 2 12)
  0: + returned 14
  0: (+ 1 14)
  0: + returned 15
15

In which case you will evaluate (+ 1 (+ 2 (+ 3 (+ 4 5)))). The result is of course the same since the + function is associative.

You can of course provide an initial value, in which case REDUCE will behave as if this value has been present at the beginning (or the end with :from-end) of the sequence.

The surprising aspect of REDUCE is its behaviour when called on a sequence with less than two elements:

  • If the sequence contains a single element:
    • if there is no initial value, the function is not called and the element is returned directly;
    • if there is one, the function is called on both the initial value and the single element.
  • If the sequence is empty:
    • if there is no initial value, the function is called without any argument;
    • if there is one, the function is not called and the initial value is returned directly.

As a result, any function passed to REDUCE must be able to handle being called with zero, one or two arguments. Most examples found on the Internet use + or append, and these functions happen to handle it (e.g. (+) returns the identity element of the addition, zero). If you write your own functions, you will have to deal with it using the &OPTIONAL lambda list keyword.

This can lead to unexpected behaviours. If you compute the sum of a sequence of floats using (reduce #'+ floats), you may find it logical to obtain a float. But if FLOATS is an empty sequence, you will get 0 which is not a float. Something to keep in mind.

Differences with fold

The fold function is traditionally defined as accepting three arguments: a function, an initial value — or accumulator — and a list. The function is called repeatedly with both the accumulator and a list element, using the value returned by the function as next accumulator.

For example in Erlang:

lists:foldl(fun(X, Sum) -> Sum + X end, 0, [1, 2, 3, 4, 5]).

An interesting consequence is that fold functions are always called with the same type of arguments (the list value and the accumulator), while REDUCE functions can be called with zero or two list values. This makes it harder to write functions when the accumulated value has a different type from sequence values.

Fold is also simpler than REDUCE since it does not have any special case, making it easier to reason about its behaviour.

It would be interesting to know why a function as fundamental as fold was not included in the Common Lisp standard.

Implementing FOLDL

We can of course implement a fold function in Common Lisp. We will concentrate on the most common (and most efficient) left-to-right version. Let us start by a simple implementation for lists:

(defun foldl/list (function value list)
  (declare (type (or function symbol) function)
           (type list list))
  (if list
      (foldl/list function (funcall function value (car list)) (cdr list))
      value))

As clearly visible, the recursive call to FOLDL/LIST is in tail position and SBCL will happily perform tail-call elimination.

For vectors we use an iterative approach:

(defun foldl/vector (function value vector)
  (declare (type (or function symbol) function)
           (type vector vector))
  (do ((i 0 (1+ i))
       (accumulator value))
      ((>= i (length vector))
       accumulator)
    (setf accumulator (funcall function accumulator (aref vector i)))))

Finally we write the main FOLDL function which operates on any sequence:

(defun foldl (function value sequence)
  (declare (type (or function symbol) function)
           (type sequence sequence))
  (etypecase sequence
    (list (foldl/list function value sequence))
    (vector (foldl/vector function value sequence))))

At the point we can already use FOLDL for various operations. We could of course improve it with the addition of the usual :START, :END and :KEY keyword arguments for more flexibility.

Share the word!

Liked my article? Follow me on Twitter or on Mastodon to see what I'm up to.