Martin’s Atelier: Church Numerals

The basic idea

Church numerals are representations of the natural numbers as functions. The web is blessed with many articles about this, including a fine one at Wikipedia,¹ but I wanted to write something to clarify matters in my own mind.

Caveat: Most of the stuff I found on the web is couched in the language of lambda calculus which I find less readable than Haskell² It's possible that I'll stray a bit from what's normally accepted as Church Numerals as a consequence.

Counting

Most people probably first encounter the natural numbers when learning to count: “no blocks, one block, two blocks” and so on. The central idea of Church Numerals is to count how many times a function is applied. More specifically, given some arbitrary function, f, and a value z, the Church Numeral for two is a function which will apply f twice to z.

For example:

two f z = f ( f z )

Now, it might not be immediately obvious that this is a reasonable representation of 2, but that's really because we don't have any good way to visualize what's happening. As is often the case, things become clearer when they get less abstract. Instead of arbitrary f and z, let's choose some specific examples. After some thought the following are useful:

f = (1 +)
z = 0

The odd looking definition for f is just a section³. It's the same as saying

f x = (1 + x)

To understand why we choose f and z thus, recall that the Church Numeral for two is supposed to apply f (which we're defining to just increment) to z (here 0) twice. Explicitly:

  two f z = f (f z)     -- definition of two
          = f (f 0)     -- definition of z
          = f (1)       -- applying f
          = 2           -- applying f

In other words these choices give us a pretty printer:

pp ch = ch f z
   where f x = x + 1
         z   = 0

pp two
> 2

There's nothing particularly special about these f and z though: they're just ways of visualizing the Church Numeral. For example:

two (1 +) 0
> 2
two ('*':) ""
> "**"

Small integers

It's not really conventional two start the integers with two, so let's go back to zero and one.

Recall that our guiding principle is that the Church Numeral for n should apply some other function n times. Now it's not too hard to find these:

zero f z = z
one  f z = f z
two  f z = f (f z)

And if you substitute f and z from above, you'll see they work as expected.

Most of the time it will be neater to omit z:

zero  f z = z
one   f = f
two   f = f . f
three f = f . f . f

which emphasizes that function composition is at the heart of our new method of counting. With our favourite choice for f:

three (1 +) = (1 +) . (1 +) . (1 +)
            = (3 +)

Obviously it will get a bit boring if we have to keep on generating the numerals by hand. However we could recursively define the numeral for any n:

church 0 f z = zero
church n f z = f ( church (n - 1) f z)

twelve = church 12
pp twelve
> 12

Arithmetic

Having mastered counting, most people move on to arithmetic. So, it's natural to ask if, given a couple of Church Numerals, can we make a new Church Numeral which corresponds to their sum.

It's probably worth reminding ourselves at this stage that the Church Numerals are functions: so the result of the addition will be another function. All things considered, it's a good thing we're using a functional language!

In the following examples, we'll use ci to denote the Church Numeral for i. Often we'll check that things work with the particular choice of f = (1 +). This isn't necessary of course, and you could prove all the results for general f if you so wanted.

These results will be handy and hopefully intuitively obvious:

ci (1 +) = (i +)
ci (j +) = (i * j +)

We'll also define operators like <+> to represent the addition of Church Numerals. Haskell's happy to let us do that.

Addition

Let's begin with the answer:

(<+>) ci cj f = (ci f) . (cj f)

The key idea here is that to add ci and cj, we need to apply f j times, then apply it i times (or vice versa):

With f = (1 +):

(ci <+> cj) (1 +) = (ci (1 +)) . (cj (1 +))
                  = (i +) . (j +)
                  = (i + j +)

Of if you're skeptical:

pp $ three <+> two
> 5

Multiplication

Again we'll begin with the answer:

(<*>) ci cj = ci . cj

This time, instead of applying f i times then j times, we want to apply f j times, then do that whole operation i times.

(ci <*> cj) f     = (ci . cj) f
                  = ci (cj f)

or with f = (1 +)

(ci <*> cj) (1 +) = ci (cj (1 + ))
                  = ci (j + )
                  = (i * j +)

Again for the skeptics:

pp $ three <*> two
> 6

Exponentiation

Following the usual pattern:

(<^>) ci cj = cj ci

For addition we applied each Church Numeral to its own copy of f. When multiplying we had one f which we applied both Church Numerals to in succession. Finally when exponentiating we apply one Church Numeral the other.

To see why this works recall that (cj g) applies g j times. So,

(ci <^> cj) = cj ci
            = ci . ((c(j-1)) ci)
            = ci . ci ((c(j-2)) ci)
            = ci . ci ... ci         -- j times

But ci applies f i times, so in total f gets applied i^j times.

Or to give an example:

pp $ three <^> two
> 9

Subtraction

Subtraction of Church Numerals is hard. There's no really good way to do it, but there is a hack.

Recall that adding one to a Church Numeral involves applying a function: accordingly subtracting one would involve unapplying the function, or finding the function's inverse. In general that's a tad tricky!

There is a hack though, which goes roughly as follows. Given a Church Numeral we can always construct the next one. So, we could construct an infinite list of all Church Numerals: then to work out the result of subtracting one from N we could look in the list until we find N, then return the preceding Church Numeral.

Having worked out how to implement the inverse function, we can handle subtraction by just applying it the relevant number of times.

In one final twist, it's worth remembering that our numbers 'start' at zero, i.e. we're dealing with the natural numbers. That's not a problem for addition, but we'll need to handle (0 - 1) as a special case. Canonically setting 0 - 1 to 0 seems popular.

One could do something analogous for division, but we'd need to store all possible products which would be even less efficient. We'd also have to decide how to handle e.g. 4 / 3 which doesn't have a solution in the naturals.

If you're interested in this, there's lots more about it on the Internet.

Operator precedence

In principle one should define the precedence of these operators to respect the normal rules of arithmetic, but I've not bothered here. However, it's worth noting that one can use parentheses to specify the evaluation order.

Example code

If you want to play with this you may find this toy example⁴ helpful. It defines operators for addition, multiplication and exponentiation, and zero, one, and two.

There's also a small demonstration which constructs Church Numerals for 0 to 9, and checks they evaluate properly:

demo_data = zip [zero, one, two, three, four,
                     five, six, seven, eight, nine ]
                [0 .. ]
    where four  = two   <*> two
          five  = two   <+> three
          six   = two   <*> three
          seven = six   <+> one
          eight = two   <^> three
          nine  = three <^> two

demo = concatMap test demo_data

test (ch, n) = "N: " ++ (show n )
               ++ " PP: " ++ (show n')
               ++ ok (n == n')
    where n' = pp ch
          ok True  = " OK\n"
          ok False = " FAIL\n"

You can check it thus:

$ ghci basic.hs
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling Main             ( basic.hs, interpreted )
Ok, modules loaded: Main.
*Main> putStr demo
N: 0 PP: 0 OK
N: 1 PP: 1 OK
N: 2 PP: 2 OK
N: 3 PP: 3 OK
N: 4 PP: 4 OK
N: 5 PP: 5 OK
N: 6 PP: 6 OK
N: 7 PP: 7 OK
N: 8 PP: 8 OK
N: 9 PP: 9 OK

A different view

Up until now, we've been playing with expressions and using a pretty printer which turns the Church Numeral into an integer. Perhaps though, it would be nice to actually see the source code of the function itself.

In principle I suppose one could ask the Haskell compiler to generate this, but I don't know how. It's probably also the case then even if one could get a function definition, it might not be the one we want. For example, the compiler might have optimized the expression.

If we want to have control, I think it's better that we assemble the code ourselves. In essence we'll work with pairs of a Church Numeral and a string representation of it.

When we make a Church Numeral we'll have to make a string version of it as well, and we'll need to modify our arithmetic operators to combine the string representations too.

In type terms:

basic_church  ::   (t -> t) -> t -> t
paired_church :: ( (t -> t) -> t -> t, String )

Although conceptually straightforward, it's a bit fiddly to get right. Here are a couple of examples:

three = (\f -> f . f . f, "\\s z -> (s . s . s) z")

(<^>) (ff,fs) (gf,gs) = (gf ff,
                         "\\s z -> ((" ++ gs ++ ") (" ++ fs ++ ")) s z")

You'll see that for consistency's sake all the string versions are lambda functions from s z.

If you're keen to play along at home, you can download code which implements this.⁵ The only new function is pq which returns the stringy version of the Numeral:

$ ghci paired.hs
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling Main             ( paired.hs, interpreted )
Ok, modules loaded: Main.
*Main> pp three
3
*Main> pq three
"\\s z -> (s . s . s) z"

The demo function also shows the expressions:

*Main> putStr  demo
N: 0 PP: 0 OK PQ: \s z -> z
N: 1 PP: 1 OK PQ: \s z -> s z
N: 2 PP: 2 OK PQ: \s z -> (s . s) z
N: 3 PP: 3 OK PQ: \s z -> (s . s . s) z
N: 4 PP: 4 OK PQ: \s z -> ((\s z -> (s . s) z) . (\s z -> (s . s) z)) s z
N: 5 PP: 5 OK PQ: \s z -> (((\s z -> (s . s) z) s) . ((\s z -> (s . s . s) z) s)) z
N: 6 PP: 6 OK PQ: \s z -> ((\s z -> (s . s) z) . (\s z -> (s . s . s) z)) s z
N: 7 PP: 7 OK PQ: \s z -> (((\s z -> ((\s z -> (s . s) z) .
   (\s z -> (s . s . s) z)) s z) s) . ((\s z -> s z) s)) z
N: 8 PP: 8 OK PQ: \s z -> ((\s z -> (s . s . s) z) (\s z -> (s . s) z)) s z
N: 9 PP: 9 OK PQ: \s z -> ((\s z -> (s . s) z) (\s z -> (s . s . s) z)) s z

So if one felt like obfuscating constants, one might replace

i = 7

with

i = (\s z -> (((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s . s)
      z)) s z) s) . ((\s z -> s z) s)) z) (1+) 0

There's no reason to limit ourselves to small numbers either (well unless we care about execution speed!). For example:

x = seven <*> thirteen <*> nineteen <*> twentythree <*> seventynine
    where seven        = two <*> three <+> one
          thirteen     = two <*> two <*> three <+> one
          sixteen      = (two <*> two) <^> two
          nineteen     = sixteen <+> three
          twentythree  = sixteen <+> seven
          seventynine  = two <*> three <*> thirteen <+> one

lets us represent 3141593 as:

\s z -> ((\s z -> ((\s z -> ((\s z -> ((\s z -> (((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s . s) z)) s z) s) . ((\s z -> s z) s)) z) . (\s z -> (((\s z -> ((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s) z)) s z) . (\s z -> (s . s . s) z)) s z) s) . ((\s z -> s z) s)) z)) s z) . (\s z -> (((\s z -> ((\s z -> (s . s) z) (\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s) z)) s z)) s z) s) . ((\s z -> (s . s . s) z) s)) z)) s z) . (\s z -> (((\s z -> ((\s z -> (s . s) z) (\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s) z)) s z)) s z) s) . ((\s z -> (((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s . s) z)) s z) s) . ((\s z -> s z) s)) z) s)) z)) s z) . (\s z -> (((\s z -> ((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s . s) z)) s z) . (\s z -> (((\s z -> ((\s z -> ((\s z -> (s . s) z) . (\s z -> (s . s) z)) s z) . (\s z -> (s . s . s) z)) s z) s) . ((\s z -> s z) s)) z)) s z) s) . ((\s z -> s z) s)) z)) s z

Try it for yourself in ghci:

let x = monstrous function from above (cut and paste it).
x (1+) 0
Be glad that all your arithmetic isn't this slow!

Conclusions

I can't say that this is particularly useful, though people who think about the basis of computation might disagree. On the other hand, I think it's indisputably fun, which is justification enough for me.

Although I'm sure I've encountered this before in the past, I was reminded of it by a fine geocaching puzzle. I'd like to thank the setter of the cache for this, but sadly can't really link to the relevant work without spoiling it for other people.

References

1. http://en.wikipedia.org/wiki/Church_numeral
2. http://en.wikipedia.org/wiki/Haskell_%28programming_language%29
3. http://www.haskell.org/tutorial/functions.html
4. http://www.mjoldfield.com/atelier/2011/01/church/basic.hs
5. http://www.mjoldfield.com/atelier/2011/01/church/paired.hs

Church Numerals