The upcoming C++ standard has added lambda expressions to the language. I thought it would an interesting (and arguably pointless) exercise to try and represent church numerals using lambda expressions. I only partly succeeded in the amount of time I spent thinking about it. Let’s refresh the definition of Church numerals. Church numeral C(N) for integer N is a function with the following properties :

- It takes as input, another function f.
- Its result is a function g.
- The property of g is such that g(x) = f(f(… N times …(x))

The definition is intentionally silent on the argument and return types of f and g. In fact, we don’t need to know the types for the definition. For a non-strict language like Haskell, Church numerals comes naturally. Unfortunately, C++0x lambda expressions are monomorphic, i.e., you have to say upfront what the argument and return types are. (I know, I know, you can put it inside a template class and use the template parameters as the types, but that’s not the same thing). So, as a first step, I started off defining the functions in a simple manner. They take one integer, and return one integer.

#include<iostream>

using namespace std;

typedef function<int(int)> F;

Now, I define church numeral to be a function that takes an argument of type ‘F’ and returns a result of type ‘F’. Here is the code for a function called ‘church’ that takes an unsigned integer and returns the church numeral representation.

function<F(F)> church(unsigned int i)

{

if(i == 0) {

return [=] (F f) { return id; };

}

return [=] (F f) {

F tmp = [=](int x) { return f(church(i-1)(f)(x)); };

return tmp;

};

}

When ‘i’ is zero, ‘church’ simply returns the identity (id) function. For other values, church(i) uses church(i-1) recursively. It passes the incoming function ‘f’ to church(i-1). The resulting function is applied on ‘x’ and the incoming function ‘f’ is applied on this result. This entire operation is wrapped inside a lambda expression which is the return value of church(i). Here is the reverse operation, namely unchurch, that takes a church numeral and return an unsigned int.

unsigned int unchurch(function<F(F)> f)

{

return f(plusone)(0);

}

plusone just returns the incoming argument incremented by one. unchurch just passes plusone to the incoming church numeral and applies the resulting function on 0. Now let us define Plus, that takes 2 church numerals and returns a church numeral that represents the sum of the incoming 2.

{

return [=] (F f) {

F tmp = [=](int x) { return M(f)(N(f)(x)); };

return tmp;

};

}

Similarly Mult, that returns the product.

{

return [=] (F f) {

return N(M(f));

};

}

Go ahead try out the following (works with gcc 4.5):

{

cout << unchurch(church(42)) << "\n";

cout << unchurch(Plus(church(41), church(1))) << "\n";

cout << unchurch(Mult(church(6), church(7))) << "\n";

return 0;

}

So far so good. The difficulty arises when we try to define exponentiation. In Haskell it is quite simple :

What I want to say in C++ is :

function<F(F)> Exp(function <F(F)> M, function <F(F)> N)

{

return N(M);

}

Of course, I can’t do that because ‘F’ is defined to be function<int(int)>. It really needs to be function<anything(anything)>. The monomorphic nature of lambda expressions is really limiting the expression of exponentiation here. I would really be interested in seeing a different way to define ‘F’ such that exponentiation can be expressed as naturally as in Haskell.

Are C++ lambdas real lambdas?

Pray tell, what are “real lambdas”?

This also works with Microsoft’s Visual C++ 2010 with very few edits. The only type of edit that I had to make in about 3 places was to add “-> RETURN_TYPE” to any lambda definition that contained more than a single return statement. Apparently their compiler has trouble inferring types if there are multiple steps involved.

For example (notice that I had to add a return type to the outer lambda here, but not the inner one):

return [=] (F f) -> F {

F tmp = [=](int x) { return f(church(i-1)(f)(x)); };

return tmp;

};

I tend to be more of a gcc bigot over VC, but I think VC might be more compliant in this case. At least during one early reading draft of the lambda proposal, I though that the only time you weren’t required to give a return type was if the lambda was a single return statement. I got the impression that it was be easier for compilers to implicitly turn ‘[](int x, y){return x+y;}’ into ‘[](int x, y) -> decltype(x+y) {return x+y;}’ by taking whatever expression was given to return and also giving it to decltype.

I’d like it if the compiler could be expected to be a little smarter and at least handle functions with a single return point. But I don’t want something that compilers are all going to accidentally implement differently.

Edd

function Exp(function M, function N)

{

int n = unchurch(N);

return n==1?

[=](F f) { return M(f); }:

Mult(M, Exp(M, church(n-1)));

}

Manjunath asked : Pray tell, what are “real lambdas”?

oh, for example do C++ lambdas allow partial application ? or as another example is it possible to bind the lhs in the lambda body

e.g. f = [=](…) {… f …}; // the f on the rhs is recursive?

I dont know the answer so I thought I’ll ask; anyway these do look interesting!

@c

@@c :

Partial application :

function<int(int,int)> Plus = [](int x, int y) { return x+y; };

function<int(int)> Plus2 = [](int x) { return Plus(x,2); };

Recursion :

function<int(int)> Fib = [](int i) { return i<2?i:(Fib(i-1)+Fib(i-2)); };

@Scott,@Eddward : I am a fan of conciseness. I believe the programmer should say as little as possible, and say more only when there is ambiguity. (There are readability and maintainability implications, but that’s another debate). Even if there are multiple return points, if the compiler can statically ascertain that all return points return the same typed value, then the ->return_type should not be required.

@Michah: What you have is logically correct (except for the missing N==0 case). But that defeats the whole point of Church numerals, which is to have an alternative encoding for numbers starting from first principles. Going forth and back to another encoding (in this case, the “regular” integers) is not allowed in the operations. We can as well do church(pow(unchurch(M), unchurch(N))) if that is allowed.

What about unary encoding of variables and values, with a runtime interpretation instead of compile-time expansion? This would give a uniform dat type (i.e., string) and support dynamic binding. An unary encoding clearly works for the baseline program, see code below. BTW: I handle the N=0 case just like N<0; namely its an invalid input. Here is the code, which obviously can be improved by use of references (although that isn't the point).

#include

#include

#include

using namespace std;

typedef function F;

static const F id = [=](string x) { return x; };

static const F plusone = [=](string x) { return x+”*”; };

static const F subone = [=](string x) { return x.substr(1); };

function church(string i)

{

if(i.size()==0) {

return [=](F f) { return id; };

}

return [=](F f) {

F tmp = [=](string x) { return f(church(subone(i))(f)(x)); };

return tmp;

};

}

string unchurch(function f)

{

return f(plusone)(“”);

}

function Plus(function M, function N)

{

return [=](F f) {

F tmp = [=](string x){ return M(f)(N(f)(x)); };

return tmp;

};

}

function Mult(function M, function N)

{

return [=](F f) {

return N(M(f));

};

}

int main(int argc, char *argv[])

{

cout << unchurch(church("*********************************************")) << "\n";

cout << unchurch(Plus(church("****************************************"),church("*****"))) << "\n";

cout << unchurch(Mult(church("*********"),church("*****"))) << endl;

return 0;

}

@Michah : Interesting. Is it possible to edit the comment with < and >? I started doing it but wasn’t sure what you intended in some places.

You are asking whether the c++0x syntax can support symbol-directed parsing of arbitrary functional value sequences, without resorting to templates. That would be an inline template, but you already said you don’t want templates. Still, there is another answer; namely, to use an unary encoding which can be evaluated at runtime in a manner similar to your baseline code (which is, after all, an interpreter for Church expressions, with the constraint that it not use induction to integers except for the final output.)

Yes, we can achievec this goal, even without compile-time expansion of templates. However, the encoding will be very strange, although I suggest it can be mechanically generated from the nice Haskell forms.

As you noted in the earlier blog, it is based on a recursive expression language such as Lisp. The expressiona can be rewritten, or compiled and run. So addition in unary is simply

(defunf add(x y)

(cond ((null y) x)

(t (add (cons x ‘1) (cdr y)))

)

)

Next, define 00 as the “end of input” symbol and encode items in an entity-prefix code, for example “101^n0″ denotes the Nth variable, and “1101^n0″ denotes the integer larger than zero. That is, the prefix 10 introduces a variable, whereas the prefix 110 introduces a number. Encoding of “end” “variable 2″, “value 2″, “variable 3″, “value 5″, “value 2″ “end”

00101101101101011101101111101101100

which is read more easily if we break it up as

10 110

110 110

10 1110

110 111110

110 1100

In this Turing-style manner can write the Church systems as a unary-encoded string which encodes any expression. In fact, I recall a couple ofvtxtbools tjstvdid exactly this, and went on to develop lemmas (ie pumping lemma, etc.) for the study of complexity.

which is exactly

begin

<

On above the phrae “the integer larger than zero” should read “the Nth integer larger than zero”.

/>

end