Peeking into Compilers

Code Generation

Learning Objectives

You know what code generation is and understand it’s position in the compilation pipeline.
You know of tasks that are involved when translating high-level language abstractions to machine code.

In modern compiler design, code generation is the step where the compiler translates an optimized intermediate representation (IR) into target-specific machine code or bytecode. This is the final phase of a compilation pipeline. The translation is nontrivial because it must account for architectural details, performance optimizations, and resource limitations such as registers and memory.

Here, we take a high-level view into code generation: transforming IR to machine code, translating high-level language constructs, instruction selection and scheduling, register allocation, stack frame management, memory organization, and exception handling.

In the code examples, we use a simplified assembly-like language to illustrate the concepts. The goal is to provide an overview of the code generation process without getting bogged down in the specifics of any particular architecture or language.

IR to machine code

The transformation from IR to machine code is often performed in multiple stages, involving several layers of lowering. Each stage converts the IR to a lower-level abstraction until the final machine instructions are produced. As an example, consider the following IR operation for addition t1 = a + b. A possible translation into assembly could be as follows.

    mov eax, [a]      ; Load variable a into register eax
    add eax, [b]      ; Add variable b to eax
    mov [t1], eax     ; Store the result in t1

Above, eax denotes a CPU register, and content within square brackets like [a] are memory locations.

The above sequence of instructions loads the value of a into the register eax, adds the value of b to the value in eax, and stores the result in t1. Alternatively, the compiler could load the variable b into a different register before the addition.

    mov eax, [a]      ; Load variable a into register eax
    mov ebx, [b]      ; Load variable b into register ebx
    add eax, ebx      ; Add ebx to eax
    mov [t1], eax     ; Store the result in t1

Already, the above examples illustrate how the compiler can generate different machine code sequences for the same high-level operation.

Loading Exercise...

Translating high-level language abstractions

High-level languages have constructs like loops, conditionals, and function calls. During code generation, these are broken down into lower-level operations, typically involving jump instructions and stack manipulation.

Jumps and stack manipulation

Jump instructions in assembly are used to change the flow of execution. For example, jmp is an unconditional jump, je is a jump if equal, and jne is a jump if not equal. The jumps can target labels, which are markers in the code that define a location. As an example, the following showcases an infinite loop in assembly.

loop_start:
    ; Loop body
    jmp loop_start

Stack manipulation, on the other hand, involves pushing and popping values onto the stack. The stack is a region of memory used to store function call information, local variables, and other data. As an example, the following code snippet pushes two values onto the stack and then pops them off.

    push eax
    push ebx
    pop ebx
    pop eax

Translating loops

As an example, a loop in a high-level language can be translated into a sequence of jump instructions that repeatedly execute the loop body until a termination condition is met.

for (var i = 0; i < 10; i++) {
    // Loop body
}

The above loop from 0 to 9 can be translated into the following assembly-like code. At first, we initialize the loop counter i to 0, placed in registry eax, which is followed by a definition of the loop start label. Within the loop, the value in the registry eax is compared with 10. If the value is greater or equal to 10, the jump instruction jge is used to jump to the loop end label. Otherwise, the loop body is executed, the counter is incremented, and the loop jumps back to the loop start.

    mov eax, 0         ; Initialize loop counter i = 0
loop_start:
    cmp eax, 10        ; Compare i with 10
    jge loop_end       ; Exit loop if i >= 10
    ; ... loop body ...
    inc eax            ; Increment i
    jmp loop_start     ; Repeat loop
loop_end:

This translation demonstrates how a high-level loop construct is transformed into a sequence of instructions that manage the loop counter, compare it with a threshold, and jump back to the loop body until the termination condition is met.

Conditionals

Conditionals such as if-else statements are also translated into conditional jump instructions that determine which block of code to execute based on a condition.

if (x > 0) {
    // Positive branch
} else {
    // Negative branch
}

For example, for the above code, we can translate it into the following assembly-like code. The code first loads the value of x into the register eax, compares it with 0, and jumps to the else_block if x is less than or equal to 0. Otherwise, the positive branch is executed, followed by a jump to the end_if label.

    mov eax, [x]
    cmp eax, 0
    jle else_block     ; Jump if x <= 0
    ; ... code for positive case ...
    jmp end_if
else_block:
    ; ... code for non-positive case ...
end_if:

Function calls

Function calls require the creation of a new stack frame to handle parameters, local variables, and the return address.

int sum(int a, int b) {
    return a + b;
}

void main() {
    int result = sum(3, 4);
    print(result);
}

The creation of a new stack frame involves saving the caller’s base pointer, establishing a new base pointer, and allocating space for local variables. The base pointer points to a fixed location within the stack frame (usually where the previous frame’s base pointer was saved).

The following assembly-like code demonstrates the function call and stack frame setup.

; Function: sum
sum:
    push ebp             ; Save caller's base pointer
    mov ebp, esp         ; Establish new stack frame
    mov eax, [ebp+8]     ; Load first parameter (a)
    add eax, [ebp+12]    ; Add second parameter (b)
    pop ebp              ; Restore base pointer
    ret                  ; Return (result is in eax)

; Function: main
main:
    push 4               ; Push second argument
    push 3               ; Push first argument
    call sum             ; Call function sum
    add esp, 8           ; Clean up the stack (2 parameters * 4 bytes)
    ; Continue with main...

Above, the main function pushes the values 4 and 3 onto the stack before calling the sum function. The call instruction further pushes the return address onto the stack. After calling the function, from the caller’s perspective, the stack would look like this:

    +-----------------+
    | Parameter 2: 4  | <- ebp + 12
    +-----------------+
    | Parameter 1: 3  | <- ebp + 8
    +-----------------+
    | Return Address  | <- ebp + 4
    +-----------------+
    | Caller's EBP    | <- ebp
    +-----------------+

The assumption is that both parameters are four-byte integers. Function calls typically follow a calling convention from right to left order (parameters are pushed onto the stack in reverse order).

In the sum function, the parameters are accessed relative to the base pointer (ebp), where the first parameter is at ebp+8 and the second at ebp+12. After the calculation, the result is stored in the eax register, and the ret instruction pops the return address from the stack and jumps back to the caller. The add esp, 8 instruction cleans up the stack by removing the parameters pushed earlier.

Note that the stack grows downwards in most architectures, meaning that the stack pointer (esp) decreases as items are pushed onto the stack. The stack is a LIFO (Last In, First Out) data structure — the last item pushed is the first to be popped. The base pointer (ebp) is used as a reference point to access parameters and local variables within the stack frame. That is, although 4 is pushed first, it ends up at a higher offset from ebp because the stack grows downwards.

Hardware-specific concerns

Much of the complexity in code generation comes from the need to generate efficient code that takes advantage of the target architecture. This includes instruction selection, scheduling, register allocation, stack frame management, memory organization, and exception handling.

As an example, if the target hardware has four general-purpose registers, the compiler must allocate these registers efficiently to minimize spills (storing values temporarily in memory). Similarly, the compiler must generate code that respects the memory layout of the target system, such as the stack, heap, and data segments.

Loading Exercise...

Exception handling and stack unwinding

Many modern languages also support exception handling, which requires additional metadata and runtime support. As an example, the try-catch construct is often used for exception handling.

try {
    // Code that might throw an exception
} catch (e) {
    // Exception handler
}

When using exception handling, the compiler generates additional structures to manage exceptions, such as exception tables that map code regions to their corresponding handlers. Then, when an exception is thrown, the call stack must be unwound to find the appropriate exception handler. Stack unwinding is used to traverse the call stack and locate the appropriate handler when an exception occurs.

Stack unwinding is also used to clean up resources allocated within a function before the exception occurred. This includes freeing memory, closing files, and releasing other resources. The unwinding process ensures that resources are properly released even if an exception occurs.

Programming languages can also provide functionality that allows mapping the exceptions back to the original source code, which can be useful for debugging and error reporting.

Metadata

In addition to the machine code instructions, the compiler often generates metadata that provides additional information about the program. This metadata can include symbol tables, debugging information, type information, and other details that are useful for debugging, profiling, and optimization.

Loading Exercise...

← Compiler Optimization Techniques

Linking, Loading, and Runtime Environment →