Programming Language Design 2

Extensions to STLC


Learning Objectives

  • You know how STLC can be extended with the implementation of:
    • pairs (product types)
    • lists
    • sums
  • You understand the evaluation and typing rules related to these extensions

Pairs

You may recall that there are two kinds of algebraic data types, product types and sum types. The simplest interesting product type is a product type of two elements, a pair.

Let’s use the syntax (𝑑1,𝑑2)::𝑇1×𝑇2 to denote a pair. Here 𝑑1 and 𝑑2 are the Terms stored within the pair and 𝑇1 and 𝑇2 are their types. We express the product type of the entire pair with 𝑇1×𝑇2.

Some example of valid pairs could include:

  • (1,2)::β„€Γ—β„€
  • (1,True)::β„€Γ—πŸš
  • (1,(True,2))::β„€Γ—(πŸšΓ—β„€)

Working with pairs

Being able to define pairs with the (a, b) notation is nice but it’s not very useful as we’ve not yet defined a way to extract the values from the pair. Let’s define the projection functions (surjections) fst and snd that extract the first and second value of the pair respectively.

Using them in our language would look like this:

  • fst(1,2)⟢1
  • snd(1,False)⟢False
  • fst(snd(1,(2,True)))⟢2

Now that we’ve sketched out the features of pairs, we are ready to introduce their evaluation rules.

Dynamics of pairs

A pair is considered to be a value if both of its members are values, i.e. is_value(π‘Ž,𝑏)=is_valueπ‘Žβˆ§is_value𝑏. If a pair is not a value, it can be reduced.

The pair can be reduced in two ways; either by reducing its first element or then by reducing its second element. Therefore we can represent the evaluation rule for pairs as follows:

𝑑1βŸΆπ‘‘β€²1Pair1(𝑑1,𝑑2)⟢(𝑑′1,𝑑2)𝑑2βŸΆπ‘‘β€²2Pair2(𝑑1,𝑑2)⟢(𝑑1,𝑑′2)

Reduction of fst and snd is a bit more complicated but not unreasonable. The important part about their rules is that the rule for extracting an element from the pair requires the pair to be a value.

𝑣1,𝑣2are valuesFstPairfst(𝑣1,𝑣2)βŸΆπ‘£1𝑣1,𝑣2are valuesSndPairsnd(𝑣1,𝑣2)βŸΆπ‘£2𝑑1βŸΆπ‘‘β€²1Fstfst𝑑1⟢fst𝑑′1𝑑1βŸΆπ‘‘β€²1Sndsnd𝑑1⟢snd𝑑′1

Statics of pairs

In STLC, the statics of data types generally consist of two rule sets. The rule sets are the introduction and elimination rules. Such is the case also for the pair.

Introduction rules

In the case of pairs, we only have one introduction rule, since there’s only one way to construct the data type.

Ξ“βŠ’π‘‘1::𝑇1Ξ“βŠ’π‘‘2::𝑇2Ξ“βŠ’(𝑑1,𝑑2)::𝑇1×𝑇2

Elimination rules

There are two ways to deconstruct a pair and therefore also two elimination rules

Ξ“βŠ’π‘‘::𝑇1×𝑇2Ξ“βŠ’fst𝑑::𝑇1Ξ“βŠ’π‘‘::𝑇1×𝑇2Ξ“βŠ’snd𝑑::𝑇2

Now that we’ve formally defined how a pair would look like in our STLC, let’s try implementing it.

Loading Exercise...

Lists

Another, very interesting, special case of an ADT is the list. Unlike the pair, the list has two data constructors cons and nil. To make the life of our type checker (and our own life implementing it easier) we’ll have the nil constructor take a type as a parameter. In this way, typing nil, and therefore the entire list, becomes a trivial exercise in recursion. Let’s define the behavior for our list constructors as follows:

  • cons𝑇 (List𝑇)::List𝑇
  • nil𝑇::List𝑇

Above, we use the letter 𝑇 to refer to any type internal to the language, i.e. number, boolean, arrow, pair or list type. As expected, in cons𝑇 (List𝑇)::List𝑇 the 𝑇 refers to the same, although arbitrary, type.

Examples of list declarations:

  • cons1(cons2(nilβ„€))::Listβ„€
  • nil𝟚::List𝟚
  • consTrue(nil𝟚)::List𝟚

Similarly to pairs, lists also need deconstructors in order to be useful. In order to be distinguishable from more general ADTs, we call our list deconstruction expression lcase (short for list-case). We’ll define the syntax for it as follows: lcase(List𝑇)of|nil⇒𝑑1|consβ„Žπ‘‘β‡’π‘‘2

An example usage of the lcase in the form of an implementation for head_or_zero: lcasexsof|nilβ‡’0|consβ„Žπ‘‘β‡’β„Ž

Dynamics of Lists

Now that we’ve introduces the syntax, let’s take a look at the evaluation rules for lists.

𝑑1βŸΆπ‘‘β€²1Cons1cons𝑑1𝑑2⟢cons𝑑′1𝑑2𝑑2βŸΆπ‘‘β€²2Cons2cons𝑑1𝑑2⟢cons𝑑1𝑑′2

A List is considered a value if it is nil or if both its head and tail is a value, i.e. is_valuenil𝑇=1is_valuecons𝑑1𝑑2=is_value𝑑1∧is_value𝑑2

Similarly to pairs, we only deconstruct lists that are values, therefore the evaluation rules for lcase are:

𝑑1βŸΆπ‘‘β€²1LCase1(lcase𝑑1ofnil⇒𝑑2|consxhxt⇒𝑑3)⟢(lcase𝑑′1ofnil⇒𝑑2|consxhxt⇒𝑑3)LCaseNil(lcasenil𝑇1ofnil⇒𝑑2|consxhxt⇒𝑑3)βŸΆπ‘‘2vh,vtare valuesLCaseCons(lcase(consvhvt)ofnil⇒𝑑2|consxhxt⇒𝑑3)⟢[xh↦vh,xt↦vt]𝑑3

The most interesting one, is LCaseCons, where we substitute in the values vh and vt to the term 𝑑3.

Statics of Lists

Similarly to pairs, the typing rules for lists are also split into introduction and elimination rules. The List is in a sense the opposite of a pair because it has two introduction rules and only one elimination rule.

Introduction rules

NilΞ“βŠ’nil𝑇1::List𝑇1Ξ“βŠ’π‘‘1::𝑇1Ξ“βŠ’π‘‘2::List𝑇1ConsΞ“βŠ’cons𝑑1 𝑑2::List𝑇1

As mentioned earlier the nil constructor takes the type 𝑇1 as an argument in order to make typing it as easy as possible. If this weren’t the case, figuring out the type of nil would be difficult but not impossible1.

Elimination rules

One key consideration of the lists evaluation rule is the fact that both 𝑑2 and 𝑑3 have to be of the same type. This is because of the same reason that both branches of an if-then-else have to be of the same type.

Ξ“βŠ’π‘‡1::List𝑇1Ξ“βŠ’π‘‘2::𝑇2Ξ“,β„Ž::𝑇1,𝑑::List𝑇1βŠ’π‘‘3::𝑇2LCaseΞ“βŠ’(lcase𝑑1of|nil⇒𝑑2|consβ„Ž 𝑑⇒𝑑3)::𝑇2
Loading Exercise...

Sums

Sums (aka. coproducts) are types that can hold one of two types of values. We used Γ— to denote the products so let’s use + for sums.

We’ll define the following syntax for constructing sums: inl𝑑1𝑇2 and inr𝑑2𝑇1. Notice that similarly to nil, the sum constructors require the type of the other element in the sum. Some examples of sums:

  • inl1𝟚::β„€+𝟚
  • inr2𝟚::𝟚+β„€
  • inlTrue𝟚::𝟚+𝟚

Similarly to lists, deconstructing sums is also done with a case expression:

case𝑑of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2

The names inl and inr are short for injection left/right. In fact, like fst and snd for pairs are surjections, these functions are injections.

Notice that the syntax for matching sums is very similar to lists. In fact, we could even change lcase to case without ambiguity.

Algebraic data types can share a common syntax for pattern matching, as the constructors in the patterns uniquely determine the type of the subject. This is a common pattern in functional programming languages, such as Haskell.

case𝑑of|nil⇒…|consβ„Žtl⇒…

Here it’s clear that 𝑑 must be a list.

case𝑑of|inlπ‘₯1⇒…|inrπ‘₯2⇒…

Here it’s clear that it must be a sum type

case𝑑of|red⇒…|yellow⇒…|green⇒…

If there’s a data type called TrafficLight with constructors red,yellow,green, here 𝑑 must be a traffic light.

Dynamics of Sums

As one might expect, a sum is considered a value if it contains a value, therefore:

is_value(inl𝑑𝑇)=is_value𝑑is_value(inr𝑑𝑇)=is_value𝑑

The evaluation rules for sums are very similar to lists and pairs

π‘‘βŸΆπ‘‘β€²Inlinlπ‘‘βŸΆinlπ‘‘β€²π‘‘βŸΆπ‘‘β€²Inrinrπ‘‘βŸΆinrπ‘‘β€²π‘‘βŸΆπ‘‘β€²Casecase𝑑of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2⟢case𝑑′of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2

We reduce the sum until its a value and after that we evaluate the case and substitute the deconstructed value into body of the case.

𝑣is a valueCaseInlcase(inl𝑣 𝑇)of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2⟢[π‘₯1↦𝑣]𝑑1𝑣is a valueCaseInrcase(inr𝑣 𝑇)of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2⟢[π‘₯2↦𝑣]𝑑2

Statics for Sums

Similarly to Lists, Sums also have two introduction rules and only one elimination rule. Notably, providing the other summands type as an argument to the constructors helps us out a lot.

Introduction rules

Ξ“βŠ’π‘‘::𝑇1InlΞ“βŠ’inl𝑑 𝑇2::𝑇1+𝑇2Ξ“βŠ’π‘‘::𝑇2InrΞ“βŠ’inr𝑑 𝑇1::𝑇1+𝑇2

Elimination rules

Similarly to the lcase, the bodies of both branches have to be of the same type.

Ξ“βŠ’π‘‘::𝑇1+𝑇2Ξ“,π‘₯1::𝑇1βŠ’π‘‘1::𝑇Γ,π‘₯2::𝑇2βŠ’π‘‘2::𝑇CaseΞ“βŠ’(case𝑑of|inlπ‘₯1⇒𝑑1|inrπ‘₯2⇒𝑑2)::𝑇
Loading Exercise...

Footnotes

Footnotes

  1. In polymorphic type systems nil could be typed as βˆ€π‘‡,List𝑇. Type inference for this polymorphic type is always decidable as the type βˆ€π‘‡,List𝑇 is of rank 2. Types of higher-rank, e.g. ((βˆ€π‘‡,𝑇→𝑇)β†’π‘ˆ)β†’π‘ˆ don’t have a decidable type inference algorithm for generic type π‘ˆ.

    Read more in this Wikipedia article on polymorphism. ↩