computer science: Introduction To Lambda Calculus

1. Introduction

   The Lambda Calculus was developed by Alonzo Church in the 1930s and published in 1941 as "The Calculi Of Lambda Conversion". It became important, along with Turing machines, in the development of computation theory, and is the theoretical basis of all functional programming languages, such as Lisp, Haskell and ML.

   Lambda calculus is a formal, abstract language where all functions are defined without giving a name. We can understand the foundations of functional programming by studying the properties of this formal language.

   It should be noted that the lambda notation is not frequently used in practical Lisp programming. The reason is simple: the use of a name is more convenient to refer to a function. Reference to a function without name is difficult for practical programming if you want the function to be called many times. However, as we have said, our interest lies in the role that lambda calculus plays as a foundation of functional programming. Once we understand an abstract model of computation, we can always develop a language based on it.

   It does not have any complicated formulae or operations. All it ever does is taking a line of letters (or symbols), and performing a little cut and paste operation on it. As you will see, the Lambda Calculus can compute everything that can be computed, just with a very simple cut and paste.

2. Language

Syntax of lambda calculus

The language of lambda calculus is simply defined as:

[function]      := (lambda (x) [expression])
[application]   := ([expression] [expression])
[expression]    := [identifiers] | [application] | [function]
[identifiers]   := a | b | ...

A line of symbols is called an expression. It might look like this: (λx.xy) (ab)
We only have the following symbols:

Single letters (like a, b, c, d...), which are called variables. An expression can be a single letter, or several letters in a row. More generally, we can write any two or more expressions together to get another expression.
Parentheses: ( ). Parentheses can be used to indicate that some part of an expression belongs together (just as the braces around this part of the sentence make it belong together). Where we don't have parentheses, we look at expressions simply from left to right.
The greek letter λ (pronounced, of course: Lambda), and the dot: . With λ and the dot, we can write functions. A function starts always with the λ and a variable, followed by a dot, and then comes an expression. The λ does not have any complicated meaning: it just says that a function starts here. The λ-variable-. part of a function is called its head, and the remainder (the expression) is called the body.

Q: What is the value or meaning of a variable?

A: None. Variables do not stand for anything. They are just empty names. Even the name is unimportant. The only thing that matters is: when two variables have the same name, they are the same. You can rename variables all you want, without changing the expression.

Q: What does a function calculate?

A: Nothing, really. It is just a kind of expression, with a head and a body. It just stands there. The only thing we can do with it is to resolve it.

Q: Why "λ"?

A: An accident, perhaps. Initially, Alonzo Church just drew a little roof to mark the head variable, like this: (ŷ xy) ab. In the typed manuscript, he put the roof in front of the head, so it became (⋀y.xy) ab. The typesetter turned it into (λy.xy) ab, which is visually close enough.

Slightly more formally, we can say: All variables are lambda terms (a valid expression in the lambda calculus). If x and y are lambda terms, then (x y) is a lambda term, and (λx.y) is a lambda term. From these three rules, we can construct all valid expressions. If we also agree to read all lambda expressions from left to right, we can omit a few of the parenthesis: (λy.xy) ab is the simplified version of (((λy.(x y)) a) b).

Cut & Paste

Functions can be resolved if they are followed by another expression. The resolution works by taking the variable mentioned in the head, and replacing all of its occurrences within the body with the expression after the function. We cut the expression after the function, and paste it into the body, in every place indicated by the head. Having done that, we throw the head away, because it has served its purpose: telling us which variable to replace.

   The resolution of functions is the only thing we can ever do in the Lambda Calculus. Once we have gotten rid of all the lambdas, or if there are no more expressions after the remaining functions, we cannot replace anything any more. We can go home now.

      Q: Can functions contain other functions?

      A: Absolutely. Functions are expressions, and expressions can contain other expressions, so functions can be parts of the bodies of other functions, or be part of the replacing expression. In fact, we have expressions like λx.λy.xzy so often that we like to abbreviate them as λxy.xzy. This means that we will try to replace the first variable in the head (x) with the first expression after the body (xzy), the second variable (y) with the next one after that, and so on. The variables mentioned in the head (the one tagged for replacement) are called bound variables. Unmentioned variables are free variables. Because functions can be part of other functions, a variable may be both bound and free in the same expression.

Church Numerals

So if we want to have numbers, we have to encode them as functions. Thankfully, Alonzo Church already came up with such an encoding, where the value of a numeral is equivalent to the number of times a function is applied to an argument. This can be written as λs.λz. sⁿ z, where n is the natural number represented and sⁿ means function s composed with itself n times (we’ll say “applied n times” for short).
Thus the first few natural numbers are encoded as follows:

0 = λs.λz. z the function s is applied to the argument z zero times
1 = λs.λz. s z the function s is applied once
2 = λs.λz. s (s z) the function s is applied twice

(the names s and z are short for successor and zero)

Lets also quickly look at how we can write simple functions with numbers.

succ = λn.λs.λz. s (n s z) (n + 1)

succ adds 1 to a number. Since a number n is defined as a function that applies the 1st argument to the 2nd argument n times, the result should be a function that applies the 1st argument one more time. When you substitute the variable n in succ with a value x, that’s what you get: a function that applies s one more time than x would:

(λn.λs.λz. s (n s z)) x → λs.λz. s (x s z)

Similarly, we define a + b so that it returns a function applying s b times and then a times

add = λa.λb.λs.λz. a s (b s z) (a + b)

To understand this more clearly, lets substitute values x and y for a and b:

(λa.λb.λs.λz. a s (b s z)) x y → (λb.λs.λz. x s (b s z)) y → λs.λz. x s (y s z)

As we can see, after applying x and y, the result is still a function that looks similar in shape to our original definition of numbers. It applies s to z y times, then applies s to the result x more times. Multiplication can be defined in a way that looks even simpler than the addition above:

mul = λa.λb.λs. a b s (a * b)

Substituting x and y gives us:

(λa.λb.λs. a b s) x y → (λb.λs. x b s) y → λs. x y s

It looks simpler, but is perhaps harder to grasp immediately: we used a trick and left out the z argument. Now the shape of the resulting function is different — it applies y to s x times, but what does it mean to apply a number y to just one argument? Remember that if y is a number, it must perform a computation of the shape λs.λz. s^y z. If we apply this to the s in the multiplication x times, we get the following (renaming the outer s to s’ to distinguish from the inner s):

λs’. (λs.λz. s^y z)^x s’

So in this case we see that the 2nd argument z is there in y, but we apply y only to s. This is similar to partial application in many languages. Lets see what happens if x = y = 1:

λs’. (λs.λz. s¹ z)¹ s’ = λs’. (λs.λz. s z) s’ → λs’.λz. s’ z
(this is equivalent to the definition of 1)

Lets do 2 * 2 as well, where the substitutions become more complex and we rename some variables to distinguish similarly-named ones:

λs’. (λs.λz. s² z)² s’ = λs’. (λs.λz. s (s z))² s’ = λs’. (λs.λz. s (s z)) ((λs’’.λz’’. s’’ (s’’ z’’)) s’) → λs’. (λs.λz. s (s z)) (λz’’. s’ (s’ z’’)) → λs’. λz. (λz’’. s’ (s’ z’’)) ((λz’’. s’ (s’ z’’)) z) → λs’. λz. (λz’’. s’ (s’ z’’)) (s’ (s’ z)) → λs’. λz. s’ (s’ (s’ (s’ z))))
(this is equivalent to definition of 4)

To recap, we defined multiplication using partial application of lambdas, without mentioning the second argument z that we usually had in numbers, but the end result still has the correct shape. We could also have defined multiplication in a more verbose way that includes the z argument. Give it a try, or try to come up with an alternative definition of multiplication that uses succ or add.

Church Booleans

Booleans can also be encoded as functions:

true = λt.λf.t 2-arg function returning the 1st arg
false = λt.λf.f 2-arg function returning the 2nd arg

Basically, the booleans represent if-then-else expressions. They both take two arguments, true returns the first (then) and false returns the second (else). We’ll take a look at some example programs with booleans in the next posts.

computer science

C

About Me

Thursday, 31 December 2015

Introduction To Lambda Calculus