Algebraic Data Types and Data Constructors
Learning Objectives
- You know what algebraic data types are and how to create them in Gleam
- You know how new elements of some type are created
- You know how to match which data constructor an element was constructed with
Data Collections
You may recall data collection types like tuples and lists from the basics chapter. These are something we assume most programming languages to have, but have you ever considered how they are implemented?
In many programming languages, we think of data and data collections as a chunk of bytes in memory that we then interpret in different ways. This approach is very useful, when we actually need to be mindful about the underlying data representation, however, in the context of functional programming, it is often not our job to deal with low-level representations of data.
Algebraic Data Types
Just like in the case of functions, functional programming takes a “more mathematical” approach to representing data. Algebraic data types, first introduced in the 1970s1, are grouped into two fragments: sum types and product types.
Lists and enums are considered a sum type and tuples and records a product type. Sum types, product types and function types represent pretty much every construction the programmer can use to create functional programs with.
Consider the following type declarations:
is read “is defined as” and is usually used to make abbreviations.
is a sum type, because its valid values are either of the variant or then of the variant making the total amount of possible values the amount of possible values an can have, the variant, plus the amount of possible values an can have, the variant.
This can be sum-marized (🥁) as , where the vertical bars are used to denote the size of the type within.
The other type, is a product type, because the amount of possible values it can have is the product of the possible values for the field and the field, or with the bar syntax:
When declaring an ADT, one automatically declares a set of special functions called data constructors, one for each variant of that ADT. For example, the type has two data constructors:
Data constructors, unlike the constructor functions from object-oriented programming, are always automatically created based on the type declaration and cannot be defined or overloaded by the programmer. Crucially, the data constructors are the only way to create new values of the data type.
ADTs in Gleam
Consider the following ADT TrafficLight
.
It consists of 3 variants, Red
, Yellow
and Green
.
This special case of an ADT contains only singleton variants (variants that don’t store any data, called enum
s in many imperative programming languages).
It can be translated to Gleam by typing the type
keyword, followed by the name of the type and its variants in curly braces.
type TrafficLight {
Red
Yellow
Green
}
The variants can be created with a data constructor with the same name.
As we can see in the example, custom ADTs can be printed to the output with io.debug
.
Deconstructing ADTs with The case
Syntax
The main (and pretty much only way) to form conditional expressions in Gleam
is using the, very powerful, case
expression.
You might be familiar with similar concepts from other languages, usually called match
or switch
.
In the example below, case
is used to implement a simple function that returns the string “go”
whenever a Green
TrafficLight
is matched and “wait” otherwise.
case t {
Green -> "go"
_ -> "wait"
}
The “match arms” which consist of a pattern, ->
and target expression, are checked from top to bottom.
The case
expression evaluates to the target expression of the first match.
The underscore (called a discard pattern) matches anything, which means it’s the “default” option and therefore should come last.
Note that because ADTs can only be constructed using their data constructors, pattern matching can be made exhaustive by matching each constructor. A default case in general is therefore not necessary.
ADTs with Values
Enumerators are powerful for listing out singleton variants but the true power of algebraic data types lies in the fact that each variant can also store one or many values. Imagine a fancy traffic light, which when green shows the number of seconds until it turns red. This could be modelled by adding an integer to the variant:
Here’s another example of the same concept, which in this case that stores either an Int
or a String
inside the different variants.
pub type NumOrText {
Num(Int)
Text(String)
}
Similarly to the traffic light example, one could implement
a function for checking if a value of type NumOrText
is a Num
:
fn is_num(n: NumOrText) -> Bool {
case NumOrText {
Num(_) -> true
_ -> false
}
}
In this example, the underscore is again used to first match anything inside a Num
, and then as the default arm.
Constructing a variant with a value inside, requires calling its data constructor function.
The data constructor is a function when the variant holds values inside.
For instance, the Num
data constructor has the type signature Int -> NumOrText
As shown in the example above, io.debug
is a convenient function
for debugging types and values in Gleam code.
It prints out the value of a variable like it’s a function call to its data constructors.
Functions as Data
The fact that functions can be stored inside types should come as no surprise, so here’s a demo of that:
The underscore _
there is called the function capture syntax and is used to quickly create new anonymous functions.
Unfortunately it is very restricted and anonymous functions are usually required for anything more complex.
In gleam, any ADT variant can also be a product type. These are called records
Extracting Values with case
To get back the value used when constructing a value of an ADT, we type a variable name inside the pattern.
For example Num(n) -> ...
matches any NumOrText
which was created with the Num
constructor and binds the variable n
so that it is accessible in the target expression ...
The following function behaves similarly to the io.debug
for the NumOrText
type.
The following example demonstrates how to pattern match an ADT with exact values.
The match arm Green(0) -> Red
only matches a green light with a 0 in it.
Destructuring lists and tuples
Another common use case for the case
syntax and pattern matching in general is destructuring lists and tuples.
The function in the first example below returns all elements of a list except for the first element
(or an empty list for an empty list). Pay close attention to the pattern [head, ..tail]
used to destructure a Gleam list into a head
and a tail
.
Similarly, a tuple (created with #(a, b, c, ...)
) can be pattern matched as follows:
To Panic, or not to Panic, That is the Question
Notice that the tail
function returns an empty list for both an empty list and a singleton list.
This is useful if we want the tail
function to do something remotely useful when given an empty list, but we have another option: to panic!
This is done using the panic
keyword in gleam, which optionally accepts a message by writing panic as "message"
.
Usually, in functional programming, we would avoid the panic by instead returning an Option
, but those are covered in a later chapter.
In Gleam it is also possible to capture multiple values with destructuring. The example below returns the sum of the first three elements of the list (or fewer, if the list is short).
fn sum_of_first_three(l: List(Int)) -> Int {
case l {
[a, b, c, _] -> a+b+c
[a, b, _] -> a+b
[a, _] -> a
_ -> 0
}
}
Think about why the patterns are in this specific order.
In addition to capturing, case
syntax can also be used for checking if an element of the list matches a specific value.
fn second_value_is_2(l: List(Int)) -> Bool {
case l {
[_, 2, _] -> True
_ -> False
}
}
In the final example, we match on multiple subjects simultaneously in a case
expression.
This is quite useful if you need/want to flatten multiple nested case
expressions.
fn both_green(l1: TrafficLight, l2: TrafficLight) -> Bool {
case l1, l2 {
Green, Green -> True
_ -> False
}
}
Footnotes
Footnotes
-
See A Theory of Type Polymorphism in Programming by Robin Milner. ↩