Programming Language Design

From BASIC to Lambda Calculus


Learning Objectives

  • You remember the key parts of building an interpreter for a line-based execution model.
  • You know the limitations of the line-based execution model.
  • You understand how a call-stack execution model works.
  • You know the basics of lambda calculus.

Simple Line-Based Execution Model

In the part Back to BASICS, we implemented a simple BASIC interpreter that executed code line by line, keeping track of the current line number to determine the next statement to execute. The high-level flow of the interpreter is shown in Figure 1 below.

Fig 1. — Flow of the BASIC interpreter.

The key learnings from the part were the concepts of tokenizer, parser, and interpreter. The tokenizer breaks a code string into tokens, the parser converts the tokens into a structured representation of the code, and the interpreter executes that representation. In our BASIC interpreter, execution proceeded line by line, updating the current line number as it went.

Because execution proceeded line by line, the interpreter only needed to keep track of the current line number to determine the next statement to execute. On a high level, the execution model was simple and easy to understand.

The line-based execution model is, however, limited. For example, implementing a for loop in a line-based interpreter requires careful management of line numbers and jump statements (FOR, NEXT, etc.). While the interpreter can handle such loops, supporting more complex control flows can make the interpreter logic increasingly complex and harder to maintain.
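To make this concrete, below is a minimal sketch, in Gleam, of what such a line-based dispatch loop could look like. The Statement type, the run function, and the assumption that line numbers advance in steps of 10 are simplifications made up for this illustration; this is not the interpreter from the Back to BASICS part, and it uses a plain GOTO-style jump instead of FOR and NEXT.

import gleam/dict.{type Dict}
import gleam/io

// A toy statement type: print a string, jump to a line, or stop.
pub type Statement {
  Print(String)
  Goto(Int)
  End
}

// Execute the statement at the current line, then decide which line runs next.
fn run(program: Dict(Int, Statement), line: Int) -> Nil {
  case dict.get(program, line) {
    Error(_) -> Nil
    Ok(End) -> Nil
    Ok(Goto(target)) -> run(program, target)
    Ok(Print(text)) -> {
      io.println(text)
      // For simplicity, assume line numbers always advance in steps of 10.
      run(program, line + 10)
    }
  }
}

pub fn main() {
  let program =
    dict.from_list([
      #(10, Print("HELLO")),
      #(20, Goto(40)),
      #(30, Print("SKIPPED")),
      #(40, End),
    ])
  run(program, 10)
}

Even in this toy version, every piece of control flow has to be expressed as a target line number, which hints at why loops and other richer structures quickly become awkward to manage.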

This approach does not scale. To handle more complex control flow structures, such as function calls, recursion, and loops, we need a more sophisticated execution model. One such model is the call-stack execution model.

Call-Stack Execution Model

The call-stack execution model is based on the idea of a call-stack, which is a data structure used by a program’s runtime to keep track of active function calls. The stack is a last-in-first-out (LIFO) structure, meaning that the last function call that was made is the first one to be completed.

Each entry in the stack is called a stack frame and typically contains information such as the function arguments, local variables, and the return address. When a function is called, a new stack frame is created and pushed onto the stack. When the function completes, the stack frame is popped off the stack, and control returns to the calling function.
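As a rough sketch, the call stack can be pictured as a list of frames, as in the following Gleam snippet. The Frame type and its fields are hypothetical simplifications made for illustration; real runtimes keep this information in a much lower-level form.

// A frame records, in simplified form, what a runtime needs to remember
// about one active function call.
pub type Frame {
  Frame(function: String, arguments: List(Int), return_to: String)
}

// The call stack itself is a last-in-first-out list of frames.
pub type CallStack =
  List(Frame)

// Calling a function pushes a new frame on top of the stack...
pub fn push(stack: CallStack, frame: Frame) -> CallStack {
  [frame, ..stack]
}

// ...and returning from a function pops the topmost frame off again.
pub fn pop(stack: CallStack) -> CallStack {
  case stack {
    [] -> []
    [_, ..rest] -> rest
  }
}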

Code execution in the call-stack model

Let’s briefly look at code execution in the call-stack model. As an example, consider the following Gleam program; when the program is run, Gleam calls the main function.

fn sum(a, b, c) {
  a + b + c
}

pub fn main() {
  sum(1, 2, 3)
}

When the main function is called, a stack frame is created for the main function. The main function then calls the sum function with arguments 1, 2, and 3. A new stack frame is created for the sum function, and the arguments 1, 2, and 3 are stored in the stack frame. The sum function computes the sum of the arguments and returns the result. The stack frame for the sum function is then popped off the stack, and control returns to the main function.
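The sequence of stack states for this program can be pictured roughly as follows, with the top of the stack on the left:

[main()]                 main is called and its frame is pushed
[sum(1, 2, 3), main()]   main calls sum; a frame for sum is pushed
[main()]                 sum returns 6; its frame is popped
[]                       main returns; its frame is popped and the program ends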

Similarly, consider the following program. The key difference between the program above and the program below is that now, the value that the function sum returns is stored in a variable x and then printed.

import gleam/int
import gleam/io

fn sum(a, b, c) {
  a + b + c
}

pub fn main() {
  let x = sum(1, 2, 3)
  io.println(int.to_string(x))
}

At the end of the call to sum, the value that the function returns is passed back to the caller: conceptually, it is copied into the variable x in the stack frame of the main function (in practice, runtimes typically hand return values over in a register or a designated stack slot rather than through the return address, which only records where execution should continue). The stack frame of the sum function is then popped off the stack, and control returns to the main function; at this point, the variable x contains the value that the function sum returned. The value of x is then converted to a string and printed to the console.

The calls to int.to_string and io.println are also function calls, and each of them would also create a new stack frame. However, for simplicity, we will not go into the details of those calls in this example.

Recursive function calls

One of the key features of the call-stack execution model is its ability to handle recursive function calls. When a function calls itself, a new stack frame is created for each call. The frames are kept separate, ensuring that each call has its own arguments and local variables and does not interfere with the other calls.

As an example, let’s consider the following Gleam program that calculates the Fibonacci number of a given input n.

import gleam/int
import gleam/io

fn fibonacci(n) {
  case n {
    0 -> 0
    1 -> 1
    _ -> fibonacci(n - 1) + fibonacci(n - 2)
  }
}

pub fn main() {
  let x = fibonacci(3)
  io.println(int.to_string(x))
}

When the main function is called, a frame for it is pushed onto the call stack. The main function then calls the fibonacci function with the argument 3, which pushes a new frame onto the stack. The function then calls itself recursively to calculate the Fibonacci number of 3, and each recursive call creates a stack frame of its own. The frames are popped off the stack as the calls complete, and the final result is returned to the main function.
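One possible order of calls for fibonacci(3) is sketched below. Each line is a call that pushes a new frame, and the indentation shows how deep the call stack is at that point:

fibonacci(3)
  fibonacci(2)
    fibonacci(1)   returns 1
    fibonacci(0)   returns 0
  (fibonacci(2) returns 1 + 0 = 1)
  fibonacci(1)     returns 1
(fibonacci(3) returns 1 + 1 = 2)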

The key problem with the call-stack model is that the stack has a limited size, and if the recursion depth exceeds the stack size, a stack overflow error occurs. This can happen if the recursion is too deep or if the stack frames are too large.
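As a sketch of how this can happen, consider the following function, which sums the integers from n down to 0. The name sum_up_to is made up for this example. Because the addition happens after the recursive call returns, the call is not a tail call, every call needs its own frame, and on a runtime with a fixed-size stack a large enough n can trigger a stack overflow.

// Sums the integers from n down to 0; assumes n is not negative.
// The addition happens only after the recursive call returns, so each
// call keeps its frame alive until all the calls below it have finished.
pub fn sum_up_to(n) {
  case n {
    0 -> 0
    _ -> n + sum_up_to(n - 1)
  }
}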

It is not a surprise that the popular programming help site Stack Overflow is named after the stack overflow error.

Tail-call optimization

One way to mitigate the risk of stack overflow in recursive functions is tail-call optimization. A tail call is a function call that is the last operation in a function. Tail-call optimization is a technique in which the compiler or runtime notices that the last operation of a function is a function call and reuses the current stack frame for that call instead of creating a new one.

Tail-call optimization effectively prevents the stack from growing indefinitely and can help avoid stack overflow errors. It is a common optimization technique used in functional programming languages to optimize recursive functions.

For tail-call optimization to work, the recursive call must be in the tail position, meaning that the result of the recursive call is directly returned from the function without any additional computation. As an example, for our above Fibonacci function, the recursive calls fibonacci(n - 1) and fibonacci(n - 2) are not in tail position because their results are used in an addition operation, and tail-call optimization cannot be applied here.

To enable tail-call optimization, the recursive calls must be in tail position. This can be achieved by carrying the intermediate result in accumulator parameters that are passed along to the next recursive call. This way, the recursive call is the last operation in the function, and tail-call optimization can be applied.

import gleam/int
import gleam/io

fn fibonacci_helper(n, a, b) {
  case n {
    0 -> a
    1 -> b
    _ -> fibonacci_helper(n - 1, b, a + b)
  }
}

fn fibonacci(n) {
  fibonacci_helper(n, 0, 1)
}

pub fn main() {
  let x = fibonacci(6)
  io.println(int.to_string(x))
}
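With the accumulator version, each recursive call is the last thing the function does, so the runtime can reuse the same stack frame for every call. For fibonacci(6), the calls proceed as follows, and the program prints 8:

fibonacci_helper(6, 0, 1)
fibonacci_helper(5, 1, 1)
fibonacci_helper(4, 1, 2)
fibonacci_helper(3, 2, 3)
fibonacci_helper(2, 3, 5)
fibonacci_helper(1, 5, 8)   returns 8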

Actor model and Gleam

As Gleam uses Erlang’s actor model for concurrency, the call-stack execution model is not the only model used in Gleam. In the actor model, each actor has its own call stack, and messages are passed between actors to communicate. This model allows for concurrent and distributed programming, where actors can run independently and communicate asynchronously.


Lambda Calculus and Call Stack

As we’ve looked into functional programming, it is worthwhile to also discuss lambda calculus. Lambda calculus is a formal system in mathematical logic and computer science that forms the theoretical basis of functional programming. Lambda calculus is an abstract framework and thus does not directly use a “call stack” as programming languages do. However, when lambda calculus is interpreted in terms of program execution models, we can map its behavior to a call-stack-based model.

Basics of Lambda Calculus

Lambda calculus consists of expressions, which are built from variables, abstractions (functions), and applications.

  • A variable is a symbol, such as x or y, that represents a value.
  • An abstraction is a function defined as λx. M, where x is a parameter and M is the body of the function.
  • An application is the act of applying a function M to an argument N, denoted as (M N).

The key concepts in lambda calculus are reduction and evaluation order. Reduction means simplifying expressions by substituting arguments into functions, a step known as β-reduction. The evaluation order can be normal order or applicative order: normal order reduces the outermost (leftmost) function application first, while applicative order reduces the arguments before applying the function.

Numerals and arithmetic

The way functions are represented and applied to arguments in lambda calculus differs from traditional programming languages, but the underlying concepts are similar. For example, numbers are often represented using Church numerals, and arithmetic operations are defined using lambda expressions.

The number 0 is represented as λf. λx. x, the number 1 as λf. λx. f x, and the number 2 as λf. λx. f (f x). A Church numeral can be read as a function that takes a function f and an argument x and applies f to x that many times: for the number 3, f would be applied three times, giving f (f (f x)), and so on.

Arithmetic operations like addition and multiplication can also be defined using lambda expressions. For example, addition can be defined as λm. λn. λf. λx. m f (n f x), where m and n are the Church numerals to be added and f is the function that each numeral applies repeatedly.
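Because Gleam has first-class functions, Church numerals and their addition can also be written directly as Gleam functions. The names zero, one, add, and to_int below are made up for this illustration; they are not part of any library.

import gleam/int
import gleam/io

// Church numeral 0: λf. λx. x, which applies f to x zero times.
fn zero(_f) {
  fn(x) { x }
}

// Church numeral 1: λf. λx. f x, which applies f to x once.
fn one(f) {
  fn(x) { f(x) }
}

// Addition: λm. λn. λf. λx. m f (n f x)
fn add(m, n) {
  fn(f) { fn(x) { m(f)(n(f)(x)) } }
}

// Turn a Church numeral back into an Int by counting how many times it
// applies the successor function to 0.
fn to_int(numeral) {
  numeral(fn(k) { k + 1 })(0)
}

pub fn main() {
  io.println(int.to_string(to_int(zero)))
  // prints 0
  io.println(int.to_string(to_int(add(one, one))))
  // prints 2
}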

Adding two numbers

Let’s look concretely at an example of adding 1 and 1. This is represented as (λm. λn. λf. λx. m f (n f x)) (λf. λx. f x) (λf. λx. f x), and the reduction involves substituting the numerals into the addition function to get the result.

  • First, we substitute m with λf. λx. f x. This results in λn. λf. λx. (λf. λx. f x) f (n f x).
  • Then, we substitute n with λf. λx. f x, which results in λf. λx. (λf. λx. f x) f ((λf. λx. f x) f x).

Reducing the leftmost application first, (λf. λx. f x) applied to f gives λx. f x, and applying that to the remaining argument simplifies the expression into λf. λx. f ((λf. λx. f x) f x).

Now, the inner expression (λf. λx. f x) f x starts with the lambda abstraction λf. λx. f x, a function that takes a function f and an argument x. Application is left-associative, so we can also write the expression as ((λf. λx. f x) f) x. We first apply the argument f, resulting in λx. f x. Then, we apply the argument x to the function λx. f x, resulting in f x. The final result of the inner expression is therefore f x.

Now, we would have λf. λx. f (f x), which corresponds to Church numeral 2.

Thus, the result of adding 1 and 1 in lambda calculus is 2.

Lambda Calculus Execution in Terms of Call Stack

Applications in lambda calculus can be viewed as function calls in programming languages. When lambda calculus is implemented in a programming language, the call stack is used to manage the reduction process and the application of functions. For the lambda expression used above to add 1 and 1, the reduction involves multiple steps, each of which corresponds to a stack frame in the call-stack-based model.
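To make this concrete, here is a small sketch of lambda-calculus terms and a naive reducer written in Gleam. The Term type and the substitute and normalise functions are names made up for this illustration. The substitution skips capture-avoiding renaming, which is safe here only because we substitute closed terms such as Church numerals, and the reducer may not terminate for terms that have no normal form. The point to notice is that every reduction step is an ordinary recursive function call, so the bookkeeping that lambda calculus leaves abstract ends up on the host language's call stack.

import gleam/io

// Lambda-calculus terms: variables, abstractions, and applications.
pub type Term {
  Var(String)
  Lam(String, Term)
  App(Term, Term)
}

// Substitute value for the free variable name in term. This naive version
// does no capture-avoiding renaming.
fn substitute(term: Term, name: String, value: Term) -> Term {
  case term {
    Var(x) if x == name -> value
    Var(_) -> term
    Lam(x, _) if x == name -> term
    Lam(x, body) -> Lam(x, substitute(body, name, value))
    App(f, a) -> App(substitute(f, name, value), substitute(a, name, value))
  }
}

// Repeatedly β-reduce, roughly in leftmost-outermost order. Every call here
// pushes a frame onto the host language's call stack.
fn normalise(term: Term) -> Term {
  case term {
    Var(_) -> term
    Lam(x, body) -> Lam(x, normalise(body))
    App(f, a) ->
      case normalise(f) {
        Lam(x, body) -> normalise(substitute(body, x, a))
        g -> App(g, normalise(a))
      }
  }
}

pub fn main() {
  // Church numeral 1: λf. λx. f x
  let one = Lam("f", Lam("x", App(Var("f"), Var("x"))))
  // Addition: λm. λn. λf. λx. m f (n f x)
  let add =
    Lam("m", Lam("n", Lam("f", Lam("x",
      App(
        App(Var("m"), Var("f")),
        App(App(Var("n"), Var("f")), Var("x")),
      ),
    ))))
  // Reduces to λf. λx. f (f x), the Church numeral 2.
  io.debug(normalise(App(App(add, one), one)))
}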

It is important to note that lambda calculus is an abstract formal system, and the call stack-based model is a way to interpret its behavior in terms of program execution. The actual reductions in lambda calculus are a form of term rewriting that do not directly map to a call stack.
