Fun and Functions with Gleam

Algebraic Data Types and Data Constructors


Learning Objectives

  • You know what algebraic data types are and how to create them in Gleam
  • You know how new elements of some type are created
  • You know how to match which data constructor an element was constructed with

Data Collections

You may recall data collection types like tuples and lists from the basics chapter. These are something we assume most programming languages to have, but have you ever considered how they are implemented?

In many programming languages, we think of data and data collections as a chunk of bytes in memory that we then interpret in different ways. This approach is very useful, when we actually need to be mindful about the underlying data representation, however, in the context of functional programming, it is often not our job to deal with low-level representations of data.

Algebraic Data Types

Just like in the case of functions, functional programming takes a “more mathematical” approach to representing data. Algebraic data types, first introduced in the 1970s1, are grouped into two fragments: sum types and product types.

Lists and enums are considered a sum type and tuples and records a product type. Sum types, product types and function types represent pretty much every construction the programmer can use to create functional programs with.

Consider the following type declarations:

AOrB𝐴(Int)|𝐵(Int)AAndB(𝐴:Int,𝐵:Int)

is read “is defined as” and is usually used to make abbreviations.

AOrB is a sum type, because its valid values are either of the variant 𝐴(Int) or then of the variant 𝐵(Int) making the total amount of possible values the amount of possible values an Int can have, the 𝐴 variant, plus the amount of possible values an Int can have, the 𝐵 variant.

This can be sum-marized (🥁) as |Int|+|Int|, where the vertical bars |𝑇| are used to denote the size of the type within.

The other type, AAndB is a product type, because the amount of possible values it can have is the product of the possible values for the 𝐴 field and the 𝐵 field, or with the bar syntax: |Int||Int|

When declaring an ADT, one automatically declares a set of special functions called data constructors, one for each variant of that ADT. For example, the AOrB type has two data constructors:

𝐴:IntAOrB𝐵:IntAOrB

Data constructors, unlike the constructor functions from object-oriented programming, are always automatically created based on the type declaration and cannot be defined or overloaded by the programmer. Crucially, the data constructors are the only way to create new values of the data type.

ADTs in Gleam

Official documentation.

Consider the following ADT TrafficLight.

TrafficLightRed|Yellow|Green

It consists of 3 variants, Red, Yellow and Green. This special case of an ADT contains only singleton variants (variants that don’t store any data, called enums in many imperative programming languages). It can be translated to Gleam by typing the type keyword, followed by the name of the type and its variants in curly braces.

type TrafficLight {
  Red
  Yellow
  Green
}

The variants can be created with a data constructor with the same name.

Run the program to see the output

As we can see in the example, custom ADTs can be printed to the output with io.debug.

Deconstructing ADTs with The case Syntax

The main (and pretty much only way) to form conditional expressions in Gleam is using the, very powerful, case expression. You might be familiar with similar concepts from other languages, usually called match or switch. In the example below, case is used to implement a simple function that returns the string “go” whenever a Green TrafficLight is matched and “wait” otherwise.

case t {
  Green -> "go"
  _ -> "wait"
}

The “match arms” which consist of a pattern, -> and target expression, are checked from top to bottom. The case expression evaluates to the target expression of the first match. The underscore (called a discard pattern) matches anything, which means it’s the “default” option and therefore should come last.

Note that because ADTs can only be constructed using their data constructors, pattern matching can be made exhaustive by matching each constructor. A default case in general is therefore not necessary.

Loading Exercise...

ADTs with Values

Enumerators are powerful for listing out singleton variants but the true power of algebraic data types lies in the fact that each variant can also store one or many values. Imagine a fancy traffic light, which when green shows the number of seconds until it turns red. This could be modelled by adding an integer to the Green variant:

FancyTrafficLightRed|Green(Int)

Here’s another example of the same concept, which in this case that stores either an Int or a String inside the different variants.

pub type NumOrText {
  Num(Int)
  Text(String)
}

Similarly to the traffic light example, one could implement a function for checking if a value of type NumOrText is a Num:

fn is_num(n: NumOrText) -> Bool {
  case NumOrText {
    Num(_) -> true
    _ -> false
  }
}

In this example, the underscore is again used to first match anything inside a Num, and then as the default arm.

Constructing a variant with a value inside, requires calling its data constructor function. The data constructor is a function when the variant holds values inside. For instance, the Num data constructor has the type signature Int -> NumOrText

Run the program to see the output

As shown in the example above, io.debug is a convenient function for debugging types and values in Gleam code. It prints out the value of a variable like it’s a function call to its data constructors.

Functions as Data

The fact that functions can be stored inside types should come as no surprise, so here’s a demo of that:

Run the program to see the output

The underscore _ there is called the function capture syntax and is used to quickly create new anonymous functions. Unfortunately it is very restricted and anonymous functions are usually required for anything more complex.

In gleam, any ADT variant can also be a product type. These are called records

Run the program to see the output

Extracting Values with case

To get back the value used when constructing a value of an ADT, we type a variable name inside the pattern. For example Num(n) -> ... matches any NumOrText which was created with the Num constructor and binds the variable n so that it is accessible in the target expression ...

The following function behaves similarly to the io.debug for the NumOrText type.

Run the program to see the output

The following example demonstrates how to pattern match an ADT with exact values.

Run the program to see the output

The match arm Green(0) -> Red only matches a green light with a 0 in it.

Destructuring lists and tuples

Another common use case for the case syntax and pattern matching in general is destructuring lists and tuples. The function in the first example below returns all elements of a list except for the first element (or an empty list for an empty list). Pay close attention to the pattern [head, ..tail] used to destructure a Gleam list into a head and a tail.

Run the program to see the output

Similarly, a tuple (created with #(a, b, c, ...)) can be pattern matched as follows:

Run the program to see the output

To Panic, or not to Panic, That is the Question

Notice that the tail function returns an empty list for both an empty list and a singleton list. This is useful if we want the tail function to do something remotely useful when given an empty list, but we have another option: to panic!

This is done using the panic keyword in gleam, which optionally accepts a message by writing panic as "message".

Run the program to see the output

Usually, in functional programming, we would avoid the panic by instead returning an Option, but those are covered in a later chapter.

In Gleam it is also possible to capture multiple values with destructuring. The example below returns the sum of the first three elements of the list (or fewer, if the list is short).

fn sum_of_first_three(l: List(Int)) -> Int {
    case l {
        [a, b, c, _] -> a+b+c
        [a, b, _] -> a+b
        [a, _] -> a
        _ -> 0
    }
}

Think about why the patterns are in this specific order.

In addition to capturing, case syntax can also be used for checking if an element of the list matches a specific value.

fn second_value_is_2(l: List(Int)) -> Bool {
    case l {
        [_, 2, _] -> True
        _ -> False
    }
}

In the final example, we match on multiple subjects simultaneously in a case expression. This is quite useful if you need/want to flatten multiple nested case expressions.

fn both_green(l1: TrafficLight, l2: TrafficLight) -> Bool {
    case l1, l2 {
        Green, Green -> True
        _ -> False
    }
}
Loading Exercise...

Loading Exercise...

Loading Exercise...

Footnotes

Footnotes

  1. See A Theory of Type Polymorphism in Programming by Robin Milner.