Programming Language Design 2

The Simply Typed Lambda Calculus

Learning Objectives

You know what base types are.
You know of the hypothetical typing judgment.
You understand the typing rules of simply typed lambda calculus.
You know how the simply typed lambda calculus behaves with only one base type: .

Base Types

So far, we haven’t seen any base types, such as integers or booleans in the lambda calculus. That is on purpose, the theory of lambda calculus is mostly independent of base types, and even if we had integers or booleans, those could always be translated to abstractions with Church encodings. One way to say this is that integers and booleans are admissible, i.e. all properties of lambda calculus still hold even if we add more base types.

Not having integers or booleans is not a property…

The Simply Typed Lambda Calculus

The simply typed lambda calculus (STLC) is an extension of the untyped lambda calculus where each valid term has a type¹ and abstractions must specify the type of their quantified variable.

The highlighted in red is the only difference between simply typed and untyped lambda calculus.

However, in order for the simply typed lambda calculus to be useful it needs to have at least one base type, such as integers or booleans.

Booleans

Next, we will introduce booleans as part of the untyped lambda calculus. Then we can begin to analyze and define behavior for booleans from an untyped, and later from the typed point of view.

The syntax for writing an is .

Because the set of terms have changed, we have to extend the -reduction, otherwise all the if-then-else term is stuck, i.e. a non-reducible non-value. One of the benefits of expressing relations using inference rules is that we can just extend the rules ad-hoc. So let’s add the following evaluation rules to the new boolean lambda calculus.

Note that and can be non-value terms. Now, the complete set of evaluation rules for the untyped lambda calculus with let expressions and booleans is .

Make sure you know the details of how evaluation of if-then-else works by asking these questions from yourself:

What happens if the condition of an if-then-else is not a value?
If the condition evaluates to false, is the true branch reduced?
If the condition is a value but not a boolean, what happens?
What happens if the branches are the same term?

Loading Exercise...

Church Booleans

For example, the Church encoding of booleans is done by representing them with abstractions of abstractions:

is
with

Then, we can represent the if-then-else construction for three church booleans with

To analyze how this works, let’s assume is a Church encoded boolean and analyze both values:

. Now, we can see that the term reduces to :
. Now, we can see that the term reduces to :

Notice that the and don’t even need to be booleans.

It would be easy to define the following type for a lambda calculus with booleans

#[derive(Debug, Clone)]
pub enum TermSugar {
    // Booleans
    True,
    False,
    Ite(Box<TermSugar>, Box<TermSugar>, Box<TermSugar>),

    Var(String),
    Abs { var: String, body: Box<TermSugar> },
    App(Box<TermSugar>, Box<TermSugar>),
}

Then we could simply translate the sugared terms back to plain lambda terms using the Church encoding.

With understanding of booleans in the untyped calculus, we are ready to add types on top. The reason we added a base type is that plain STLC has no finite types¹, so it would be impossible to write a well-typed term with a finite type.

Infinite terms and types are outside the scope of this course.

This lambda term has type (or formally ). The arrow () type is used to denote the type of abstractions.

The following function would also be the identity function, but for functions:

The type of this identity function is

We also refer to free and bound variables in STLC by considering the definitions 6.3.1-3 after erasing types from STLC terms.

Type Erasure (6.5.2-3)

Types play no significance during evaluation of statically typed programming languages, like those based on STLC, so we can lower e.g. to its untyped variant. This step can be done to ensure that types are not used during evaluation.

One way to enforce the use of the type checker before, and only before evaluating a term would be to have the following architecture:

Where the is defined as

And the type is used as an existential/opaque type, which is not visible to the user.

This way evaluation can not externally be used before type checking and reduction doesn’t have access to type information.

Here are some example terms of (STLC with booleans):

The identity function for boolean endofunctions (endo meaning the domain and codomain have the same type.)

The type of this term is .
The negation function mapping each boolean to its negation. The type of this term is .

Let’s refer to it as .
The identity function for boolean endofunctions applied to the endofunction.

The type of this term is .
The identity function for boolean endofunctions applied to the endofunction which is then applied to the boolean term .

The type of this term is .
This function applies its argument to the value.

The type of this term is ².

Loading Exercise...

Typing

In part 4 chapter 7 type checking was simply a process of determining the type of subterms until they hit integer or boolean values. However, in STLC, type checking is not as simple because we have to deal with variables.

The term has type as applying the abstraction to some boolean argument reduces to the term which has type .

What we have just stated certainly makes sense, but how would we turn that into an algorithm? Especially the “some boolean ” part…

The logic for type checking terms in STLC is the most interesting property of the simply typed lambda calculus. The whole idea is that abstractions know the type of their parameter, so type checking the body of the abstraction should use that knowledge of the variable’s type.

The typing rules should therefore look something like this:

This “knowledge” is stored in a context.

Context

Typing context, typing environment, or just context, refers to the data structure which associates variable names to their types. Symbols are used to denote contexts.

Here are a couple examples of contexts:

The properties we need from a context, are that we can look up types by variable name, and extend them with new variables.

Hypothetical Typing Judgment

Unfortunately the typing relation (), which has gotten us this far, is no longer suitable for type checking, as it’s not aware of the context. We need to define a new relation, the hypothetical typing judgment which also has the context in it.

Note: When reading about lambda calculus and type theory, try to glance over the context and focus on the terms and types on the first reading.

Typing Rules for

Let’s start spelling out the typing rules for . First, for simplicity, let’s consider just the boolean fragment of the language.

Reminder, here are the boolean typing rules from part 4 of the course:

The rules for type checking terms named respectively, are as follows.

Note: the context is completely arbitrary, as are the terms and type . Notably, this rule correctly type checks function types: .

Now, let’s write down the rules for type checking the core or STLC.

Variables

Type checking a variable happens with the following logic:

If the typing association appears in the context, then we know that .

This can be written in a style without hypotheses as follows:

The rule is taken to mean that if appears in the context³, then .

Here’s an example of a typing derivation involving a variable inside an if-then-else:

Abstractions

The rule for type checking abstractions is the part where the context is extended with a new variable before recursively continuing. Type checking an abstraction happens with the following logic:

If under the additional hypothesis that , then .

Extending the context into the hypothesis is crucial for the potentially free parameter inside the abstractions to type check.

Example typing derivation:

Let’s name this derivation .

Applications

Type checking applications simply involves ensuring that the argument is in the domain of the arrow type.

Example typing derivation:

Here, the typing derivation is substituted in from the previous example as it precisely matches the expected format: and .

Let-expressions

The rule for type let-expressions is very similar to abstractions:

If under the additional hypothesis that , and then .

Even though the let expression knows the , we avoid using substitution in the rules. Substitution is an implementation detail of evaluation, and evaluation details should not be mixed with type checking, as type checking is a prerequisite for correctness of evaluation.

If we used substitution directly, we could write both of the assumptions as , but this makes reasoning about the rule much more difficult.

Loading Exercise...

Evaluation

The evaluation rules are inherited from the untyped lambda calculus, just with the added type in abstractions.

Implementation

Before diving into the exercise, here are some notes (and hints) for the implementation:

We define the context as a HashMap of variable names to types.

pub type Context = std::collections::HashMap<String, Ty>;

infer_type results in a Result<Ty, TypeError> where TypeError encodes the following set of possible type errors.

pub enum TypeError {
    UndefinedVariable(String),
    WrongAppTypeRight(Ty),
    WrongAppTypeLeft(Ty),
    Fail,
}

It is highly recommended to write unit tests for the implementations, although they are not graded.

Loading Exercise...

Properties of the Simply Typed Lambda Calculus

For proofs of these theorems, refer to the optional chapter 10 which covers formalization of programming language theory.

Footnotes

Well, the lack of base types makes the system less expressive and only interesting from a theory point of view. Without base types, the simply typed lambda calculus is “degenerate”. That is, there are no (finite) well-typed terms. This is why we need base types to make the system useful. However, the typing rules and other properties are still valid in this degenerate system. ↩ ↩²
When writing arrow types, it is useful to associate them to the right so that we don’t need parentheses to indicate that the return type is a function. This means that the type of “multi-argument” functions is just a chain of arrows without parentheses (unless the parameters have types with arrows).

which is not the same type as . ↩
Depending on the formal definition of the context, we might use a function and write in the assumptions and in the conclusion. This is closer to the Rust implementation where a HashMap is used as the context. ↩

← Reduction and Evaluation

STLC With Integers And Booleans →