Structs and Enums
Learning objectives
- You know how to define and use custom structs.
- You know how to define and use custom enums.
- You know how to implement associated functions and methods for structs and enums.
Rust has two kinds of custom data types: structs and enums. These should be somewhat familiar already, since we've been using them extensively, starting with the String struct that we used when looking at Rust's ownership system, and the Option and Result enums for error handling. Now, we'll go through how to define and use our own structs and enums, and how to add custom functionality to them.
Structs
A struct (short for structure) is a custom data type that offers a practical approach for organizing related values together.
While Rust already has tuples that can be used to group values together, structs have some additional benefits. First, structs are named and can have named fields. For example, a tuple with two values of type f32 could represent a point in a 2D space, but it would be difficult to tell what the values represent. A struct named Point with two fields, x: f32 and y: f32, would be much clearer. Named fields also make structs more flexible, as defining and accessing the values does not depend on their order. Second, and perhaps more importantly, a struct is its own type. This means a struct can have custom functionality associated with it through associated functions and methods.
Structs in Rust come in three forms: C-like structs, tuple structs, and unit structs. The first form is the most common and useful one, and it is usually what is meant when talking about structs in general. We'll cover that one first.
Defining structs
Declaring a new struct is done using the struct
keyword followed by the struct name and possible fields.
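A minimal sketch of such a definition, written to match the description that follows. The values used in main are placeholders chosen for illustration.

```rust
// A C-like struct with two named fields, both owned strings.
struct FullName {
    first_name: String,
    last_name: String,
}

fn main() {
    // Placeholder values, only here so that the sketch runs.
    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    println!("{} {}", name.first_name, name.last_name);
}
```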
The above example creates a C-like struct FullName with two fields, first_name and last_name, both of type String. By convention, the struct name is written in PascalCase and the field names in snake_case. The compiler will emit a warning if the convention is not followed.
Since a struct is a type, nothing prevents us from using our structs as types in other structs.
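As a sketch of this kind of nesting, the hypothetical Person struct below uses FullName as the type of one of its fields. Person and its age field are assumptions made for this illustration.

```rust
struct FullName {
    first_name: String,
    last_name: String,
}

// Hypothetical struct that uses another struct as a field type.
struct Person {
    name: FullName,
    age: u8,
}

fn main() {
    let person = Person {
        name: FullName {
            first_name: String::from("Ada"),
            last_name: String::from("Lovelace"),
        },
        age: 36,
    };
    // Nested fields are reached by chaining field names.
    println!(
        "{} {} ({})",
        person.name.first_name, person.name.last_name, person.age
    );
}
```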
Instantiating structs
Struct instances are created by writing the name of a struct and the values for the fields.
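A short sketch of instantiation, assuming a Point struct with two f32 fields like the one described earlier. Note that the fields may be listed in any order.

```rust
struct Point {
    x: f32,
    y: f32,
}

fn main() {
    // The field order in the initializer does not need to match the definition.
    let p = Point { y: 2.5, x: 1.0 };
    println!("x = {}, y = {}", p.x, p.y);
}
```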
Printing a struct instance won't work out of the box, even when using debug formatting. The following code will cause a compiler error.
The error says that the struct does not implement Debug
. Debug
is a trait that is required for printing values with the debug formatting and custom types don't implement any functionality by default. We'll discuss traits in the next part of the course, but for now, we can add the opt-in functionality for debug printing using the #[derive(Debug)]
attribute as the compiler hints (we'll discuss attributes later too).
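A sketch of the fix, assuming the FullName struct from before: with the attribute in place, both the compact ({:?}) and the pretty ({:#?}) debug formats work.

```rust
// Deriving Debug opts the struct in to debug formatting.
#[derive(Debug)]
struct FullName {
    first_name: String,
    last_name: String,
}

fn main() {
    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    println!("{:?}", name); // compact, single-line form
    println!("{:#?}", name); // pretty-printed, multi-line form
}
```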
Accessing struct fields
Accessing the values of a struct instance is done by using the dot operator (.
) followed by the field name.
Like tuple and array values, struct fields can be destructured into variables using the let
keyword.
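A sketch of struct destructuring, assuming a Point struct with fields x and y. The helper function coordinates and the variable names horizontal and vertical are made up for this illustration.

```rust
struct Point {
    x: f32,
    y: f32,
}

// Hypothetical helper: binds both fields to variables of the same name.
fn coordinates(p: Point) -> (f32, f32) {
    let Point { x, y } = p;
    (x, y)
}

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    // Fields can also be bound to differently named variables.
    let Point { x: horizontal, y: vertical } = p;
    println!("{} {}", horizontal, vertical);

    let (x, y) = coordinates(Point { x: 3.0, y: 4.0 });
    println!("{} {}", x, y);
}
```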
The syntax can be slightly cumbersome though, especially when the struct name is long.
Mutating field values
Struct mutability works the same as with arrays and tuples. Fields of a struct are mutable if and only if the struct itself is mutable. We cannot make the values of individual fields mutable.
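As a small sketch of this rule, assuming the same kind of Point struct: the whole binding is declared with mut, after which any field can be reassigned. The helper shift_right is hypothetical.

```rust
struct Point {
    x: f32,
    y: f32,
}

// Hypothetical helper: takes ownership of a point through a mutable binding.
fn shift_right(mut p: Point, amount: f32) -> Point {
    p.x += amount;
    p
}

fn main() {
    let mut p = Point { x: 1.0, y: 2.0 };
    p.x = 5.0; // allowed because `p` as a whole is mutable
    p.y += 1.0;

    let fixed = Point { x: 0.0, y: 0.0 };
    // fixed.x = 1.0; // would not compile: `fixed` is not declared mutable
    println!("{} {} {} {}", p.x, p.y, fixed.x, fixed.y);

    let moved = shift_right(fixed, 2.0);
    println!("{} {}", moved.x, moved.y);
}
```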
Tuple structs
Tuple structs are structs with unnamed and ordered fields. They are defined with syntax similar to tuples and can be thought of as either named tuples or structs with unnamed values.
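A minimal sketch of a tuple struct, assuming a Point made of two f32 values:

```rust
// A tuple struct: the fields have types and positions, but no names.
struct Point(f32, f32);

fn main() {
    let p = Point(1.0, 2.0);
    // Fields are accessed by position, as with tuples.
    println!("{} {}", p.0, p.1);

    // Tuple structs can be destructured as well.
    let Point(x, y) = p;
    println!("{} {}", x, y);
}
```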
As can be seen in the example above, the fields of a tuple struct are accessed with dot notation and an index, just like those of regular tuples. The benefit of tuple structs over mere tuples is the added meaning that a struct carries through its type.
C-like structs are more popular than tuple structs because having explicit field names is often preferred for code clarity. Tuple structs can still be a better option at times, for instance if field names would cause more work than they are worth. Would you rather have the following as a C-like struct or a tuple struct?
Another use for tuple structs is to create a new wrapper type around a single value. This pattern is called the newtype pattern.
This way, the compiler can, for instance, distinguish between different units of measurement that share the same numeric type and reject code that mixes them up.
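A sketch of the newtype pattern under assumed names: Meters, Feet, and the two helper functions are hypothetical and only illustrate the idea.

```rust
// Two distinct wrapper types around the same primitive type.
struct Meters(f64);
struct Feet(f64);

// Hypothetical conversion between the two units.
fn feet_to_meters(feet: Feet) -> Meters {
    Meters(feet.0 * 0.3048)
}

// Only accepts a distance that is known to be in meters.
fn describe(distance: Meters) -> String {
    format!("{} m", distance.0)
}

fn main() {
    let height = Feet(10.0);
    // describe(height); // would not compile: expected `Meters`, found `Feet`
    println!("{}", describe(feet_to_meters(height)));
}
```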
Unit structs
Unit structs are structs that do not contain any data. They behave similarly to the unit type ().
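A minimal sketch of a unit struct. The name Marker is hypothetical, and the derives are only there so the value can be printed and compared.

```rust
// A unit struct: no fields, so no curly braces or parentheses are needed.
#[derive(Debug, PartialEq)]
struct Marker;

fn main() {
    let marker = Marker;
    println!("{:?}", marker);
    // A unit struct carries no data, so it occupies no memory.
    println!("{} bytes", std::mem::size_of::<Marker>());
}
```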
These structs may seem useless on the surface, but they can be useful for methods or traits that don't require stored values. For example, the struct chrono::Utc in the chrono crate is a unit struct that can be used to obtain the current Coordinated Universal Time (UTC).
In case you are more familiar with object-oriented programming, unit structs can be thought of as classes that have no value fields but implement methods or interfaces.
Struct ownership
Using a variable to initialize a struct field is similar to assigning the variable value to a new variable. If an owned value in a variable is assigned to a struct field value, the value in the variable is either implicitly copied or moved.
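A sketch of both cases, assuming the Point and FullName structs from earlier; the helper make_name is hypothetical. The variable x holds an f32, which is copied, while first_name holds a String, which is moved. The initializers use a shorthand form for fields that share a name with a variable.

```rust
struct Point {
    x: f32,
    y: f32,
}

struct FullName {
    first_name: String,
    last_name: String,
}

// Hypothetical helper: both parameters are moved into the new struct.
fn make_name(first_name: String, last_name: String) -> FullName {
    FullName { first_name, last_name }
}

fn main() {
    let x = 1.0;
    let first_name = String::from("Ada");

    let point = Point { x, y: 2.0 }; // f32 is Copy: `x` is copied
    let name = FullName {
        first_name, // String is not Copy: `first_name` is moved
        last_name: String::from("Lovelace"),
    };

    println!("{}", x); // still usable after the copy
    // println!("{}", first_name); // would not compile: value was moved
    println!("{} {}", point.x, point.y);
    println!("{} {}", name.first_name, name.last_name);

    let other = make_name(String::from("Grace"), String::from("Hopper"));
    println!("{} {}", other.first_name, other.last_name);
}
```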
If a field has the same name as a local variable, as is the case in our example with x and first_name, we can omit the colon and write the shared variable/field name just once.
Moving field values
Assigning struct field values to variables works the same as assigning tuple values to variables. Moving a value out of a struct prevents that field from being used afterwards. The rest of the values are still owned by the struct and can be accessed normally, but the struct as a whole is no longer usable.
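A sketch of such a partial move, assuming the FullName struct; the function take_first is a hypothetical helper.

```rust
struct FullName {
    first_name: String,
    last_name: String,
}

// Hypothetical helper: moves one field out and keeps using the other.
fn take_first(name: FullName) -> (String, usize) {
    let first = name.first_name; // `first_name` is moved out of the struct
    let last_len = name.last_name.len(); // the other field is still accessible
    // let whole = name; // would not compile: `name` is partially moved
    (first, last_len)
}

fn main() {
    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    let (first, last_len) = take_first(name);
    println!("{} {}", first, last_len);
}
```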
A value borrowed from a struct has its lifetime bound to the lifetime of the struct. The struct that owns the value must live longer than all references to that value.
References in a struct
It is possible for a struct to contain references instead of owned values. However, to be able to use references in a struct, we need to specify the reference lifetimes explicitly.
The following code won't compile; the compiler will complain about missing lifetime specifiers.
We will discuss lifetimes later in the course when looking closer into Rust memory management and related advanced options. We'll then look at how to fix these errors. For now, we'll stick to using only owned values in struct fields.
Borrowing and cloning structs
Borrowed structs are implicitly dereferenced when accessing struct fields.
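A sketch of field access through a reference, assuming the FullName struct and a mutable reference named borrowed_name; the helper rename is hypothetical.

```rust
struct FullName {
    first_name: String,
    last_name: String,
}

// Hypothetical helper: assigns to a field through a mutable reference.
fn rename(borrowed_name: &mut FullName, new_first: &str) {
    // No explicit dereference is needed before the dot.
    borrowed_name.first_name = String::from(new_first);
}

fn main() {
    let mut name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };

    let borrowed_name = &mut name;
    borrowed_name.first_name = String::from("Augusta");

    rename(&mut name, "Augusta Ada");
    println!("{} {}", name.first_name, name.last_name);
}
```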
Otherwise, we would need to write (*borrowed_name).first_name = ...
to access the field first_name
of borrowed_name
in the above example. This automatic dereferencing behavior works for arbitrarily deep references, such as &&&name
.
Just as structs cannot be debug printed by default, we cannot copy or clone a struct without first implementing such behavior. We can add the functionality easily by extending the #[derive(Debug)] attribute to include Clone, which allows cloning a struct (or we can derive just Clone if we don't care about printing). For structs where all fields are themselves Copy, we can also add Copy to allow implicit copying of a struct.
Note that Copy
cannot be implemented without Clone
.
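A sketch of both derives, assuming the Point and FullName structs. PartialEq is derived here only so that the values can be compared in the illustration.

```rust
// All fields are Copy, so the struct can derive Copy (together with Clone).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

// String is not Copy, so this struct can only derive Clone.
#[derive(Debug, Clone, PartialEq)]
struct FullName {
    first_name: String,
    last_name: String,
}

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    let q = p; // implicit copy: `p` stays usable
    println!("{:?} {:?}", p, q);

    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    let copy = name.clone(); // explicit clone: `name` stays usable
    println!("{:?} {:?}", name, copy);
}
```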
Associated functions and methods
Structs in Rust contain only data, but it is possible to associate functions with the struct. Such associated functions can be either methods, i.e. functions that have access to the struct values, or simply functions that are defined as belonging to the struct. Associated functions are not limited to structs, though: any type in Rust can have associated functions. For instance, round and from_bits of the primitive type f64 are associated functions.
In many object-oriented programming languages, which have classes and objects for organizing related data and functionality together, such functions are defined within the curly braces of the class. In Rust, these functions are not defined within the type definition, e.g. inside the curly braces of the struct, but in separate implementation blocks.
Defining associated functions for a struct is done using the impl
keyword followed by the struct name.
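A minimal sketch of an implementation block, assuming a Point struct and a constructor-like associated function named origin:

```rust
struct Point {
    x: f32,
    y: f32,
}

impl Point {
    // An associated function: it has no `self` parameter.
    fn origin() -> Point {
        Point { x: 0.0, y: 0.0 }
    }
}

fn main() {
    // Called on the type itself, using the `::` operator.
    let p = Point::origin();
    println!("{} {}", p.x, p.y);
}
```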
Associated functions can be called using the ::
operator (the same as item path separator) after the struct name as can be seen in the above example.
The Vec::new() function we have been using for instantiating a vector is also an associated function. Associated functions that return a new instance of the struct are often called constructors. By convention, the name for a "default" constructor should be new. We could just as well have named the origin constructor function for Point as new in the above example, although origin can be considered clearer.
When defining associated functions that refer to the struct type, e.g. constructors, we can use a special Self
type as an alias for the actual struct type. This can be especially convenient with long type names.
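A sketch of the same kind of constructors written with Self, assuming the Point struct; the new function is added here only for illustration.

```rust
struct Point {
    x: f32,
    y: f32,
}

impl Point {
    // Inside the impl block, `Self` is an alias for `Point`.
    fn new(x: f32, y: f32) -> Self {
        Self { x, y }
    }

    fn origin() -> Self {
        Self::new(0.0, 0.0)
    }
}

fn main() {
    let a = Point::new(1.5, -2.0);
    let b = Point::origin();
    println!("{} {} {} {}", a.x, a.y, b.x, b.y);
}
```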
Methods
Methods are associated functions that are defined for a struct's instance. They are very similar to object methods in object-oriented programming. Methods have access to the inner data of a struct instance and can be called using the dot syntax.
To define a method, we specify an associated function with self
as the first parameter.
The receiver parameter self is a special parameter whose type (Self) can be omitted.
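A sketch of a method, assuming the FullName struct and a method named to_string that takes self by value. The exact formatting of the returned string is an assumption.

```rust
struct FullName {
    first_name: String,
    last_name: String,
}

impl FullName {
    // `self` is shorthand for `self: Self`; the instance is passed by value.
    fn to_string(self) -> String {
        format!("{} {}", self.first_name, self.last_name)
    }
}

fn main() {
    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    // Methods are called with the dot syntax on an instance.
    let text = name.to_string();
    println!("{}", text);
    // println!("{}", name.first_name); // would not compile: `name` was consumed
}
```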
The to_string
in the example takes self
as an owned parameter. This means that the method takes ownership of the struct instance and consumes it. If we don't want to consume the struct instance, we can use a reference to the struct instance &self
as the receiver parameter.
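A sketch of the borrowing variant, under the same assumptions: with &self as the receiver, the instance stays usable after the call.

```rust
struct FullName {
    first_name: String,
    last_name: String,
}

impl FullName {
    // `&self` is shorthand for `self: &Self`; the instance is only borrowed.
    fn to_string(&self) -> String {
        format!("{} {}", self.first_name, self.last_name)
    }
}

fn main() {
    let name = FullName {
        first_name: String::from("Ada"),
        last_name: String::from("Lovelace"),
    };
    println!("{}", name.to_string());
    // The instance was not consumed, so it can be used again.
    println!("{}", name.to_string());
}
```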
Modifying a struct instance using methods
A method can modify a struct instance by taking a mutable reference to it (&mut self
) as the first parameter.
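A sketch of a modifying method, assuming the Point struct; the method name translate is hypothetical.

```rust
struct Point {
    x: f32,
    y: f32,
}

impl Point {
    // `&mut self` borrows the instance mutably, so its fields can be changed.
    fn translate(&mut self, dx: f32, dy: f32) {
        self.x += dx;
        self.y += dy;
    }
}

fn main() {
    // The instance itself must be declared mutable to call the method.
    let mut p = Point { x: 1.0, y: 2.0 };
    p.translate(2.0, -1.0);
    println!("{} {}", p.x, p.y);
}
```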
The self
parameter can also be mutable (mut self
) like any other parameter but this is more restrictive than taking a mutable reference as the method would take ownership of the passed struct instance.
Field and associated function visibility
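Fields and associated functions in structs can be public or private (which is the default). Private ones can only be accessed within the module where the struct is defined (and its submodules), while public ones can also be read, modified, or called from outside that module.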
This is useful for preventing misuse of a struct's fields or methods. For example, the Vec struct has a field len, but it is not public. If it were, the length could be modified freely, making it possible for it to differ from the actual number of values.
Like path items, struct fields and functions are marked public by prefixing them with the keyword pub
. The next example shows how we can create a type Probability
that can take only values between 0 and 1 (f64
).
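A possible sketch of such a type. The module name probability, the value accessor, and the choice to return an Option from new are assumptions made for this illustration.

```rust
mod probability {
    pub struct Probability {
        // Private field: it cannot be set directly from outside this module.
        value: f64,
    }

    impl Probability {
        // Public constructor that only accepts values between 0 and 1.
        pub fn new(value: f64) -> Option<Probability> {
            if (0.0..=1.0).contains(&value) {
                Some(Probability { value })
            } else {
                None
            }
        }

        // Public read-only access to the private field.
        pub fn value(&self) -> f64 {
            self.value
        }
    }
}

use probability::Probability;

fn main() {
    match Probability::new(0.25) {
        Some(p) => println!("valid: {}", p.value()),
        None => println!("invalid"),
    }
    // let p = Probability { value: 2.0 }; // would not compile: `value` is private
}
```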
To be able to modify the value of the Probability
type without making it public, we need a new public method that updates the value. We'll throw in an additional struct for possible errors and a private helper function for validating the value.
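One way this could look is sketched below. The module name, the rejected field of ProbabilityError, the helper name is_valid, and the value accessor are assumptions made for this illustration.

```rust
mod probability {
    // Hypothetical error type that records the rejected value.
    #[derive(Debug)]
    pub struct ProbabilityError {
        pub rejected: f64,
    }

    pub struct Probability {
        value: f64,
    }

    impl Probability {
        // Private helper: not callable from outside this module.
        fn is_valid(value: f64) -> bool {
            (0.0..=1.0).contains(&value)
        }

        pub fn new(value: f64) -> Result<Self, ProbabilityError> {
            if Self::is_valid(value) {
                Ok(Self { value })
            } else {
                Err(ProbabilityError { rejected: value })
            }
        }

        // Updates the value only if the new value is valid.
        pub fn set(&mut self, value: f64) -> Result<(), ProbabilityError> {
            if Self::is_valid(value) {
                self.value = value;
                Ok(())
            } else {
                Err(ProbabilityError { rejected: value })
            }
        }

        pub fn value(&self) -> f64 {
            self.value
        }
    }
}

use probability::Probability;

fn main() {
    let mut p = Probability::new(0.5).expect("0.5 is a valid probability");
    if let Err(error) = p.set(1.5) {
        println!("rejected {}", error.rejected);
    }
    println!("current value: {}", p.value());
}
```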
Now we have created a type that we can be sure is always between 0 and 1. Creating a new instance is possible only through the new
associated function and updating the value is only possible through the set
method, which both ensure the set value is valid.
Enums
An enum, which stands for enumeration, is a custom type which consists of multiple exclusive variants. A variant can contain zero or more values in it.
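A minimal sketch of an enum whose variants carry no values, using the Switch enum described next; the helper is_on is hypothetical.

```rust
// An enum with two variants that carry no values.
enum Switch {
    On,
    Off,
}

// Hypothetical helper that checks which variant a value is.
fn is_on(switch: &Switch) -> bool {
    matches!(switch, Switch::On)
}

fn main() {
    let switch = Switch::On;
    match switch {
        Switch::On => println!("the switch is on"),
        Switch::Off => println!("the switch is off"),
    }
    println!("{}", is_on(&Switch::Off));
}
```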
The Switch
enum has two possible exclusive states: On
and Off
. No switch can be in any other state.
Variants with no values are similar to unit structs as can be observed from the example. Variants with values are defined in the same way as structs, but without the struct
keyword. Like with structs, we have two options: C-like and tuple-like variants.
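A sketch showing the variant forms side by side. The Shape enum and the area function are hypothetical and only illustrate the syntax.

```rust
enum Shape {
    // C-like variant with a named field.
    Circle { radius: f64 },
    // Tuple-like variant with unnamed, ordered fields.
    Rectangle(f64, f64),
    // Variant with no values.
    Empty,
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Empty => 0.0,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle(2.0, 3.0),
        Shape::Empty,
    ];
    for shape in &shapes {
        println!("{}", area(shape));
    }
}
```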
Unlike struct fields, the variants of an enum and their fields cannot be made private individually: they are always as visible as the enum itself.
Creating an instance of an enum is the same as creating an instance of its variant. Enum variants can be accessed with the same syntax as with items from modules or associated functions from structs.
To be able to debug print an enum instance we need to derive Debug
with #[derive(Debug)]
, just like with structs.
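A short sketch that creates enum instances through the path syntax and debug prints them, assuming the Switch enum:

```rust
#[derive(Debug)]
enum Switch {
    On,
    Off,
}

fn main() {
    // A variant is referred to through its enum, using the `::` path syntax.
    let a = Switch::On;
    let b = Switch::Off;
    println!("{:?} {:?}", a, b);
}
```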
Collections in Rust, such as arrays and vectors, can hold values of only a single type. An enum is a single type that can hold values of different types. Thus, we can leverage them to create collections which can hold multiple types.
For example, we can create a Numeric
enum which can potentially hold any of the primitive types.
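A sketch of this idea with only three variants. The variant names and the to_f64 helper are assumptions, and a fuller version could add one variant per primitive numeric type.

```rust
#[derive(Debug)]
enum Numeric {
    Int(i64),
    Unsigned(u64),
    Float(f64),
}

// Hypothetical helper that converts any variant to an f64.
fn to_f64(number: &Numeric) -> f64 {
    match number {
        Numeric::Int(value) => *value as f64,
        Numeric::Unsigned(value) => *value as f64,
        Numeric::Float(value) => *value,
    }
}

fn main() {
    // One vector holding values that wrap three different primitive types.
    let values = vec![Numeric::Int(-3), Numeric::Float(2.5), Numeric::Unsigned(7)];
    let total: f64 = values.iter().map(to_f64).sum();
    println!("{:?} sums to {}", values, total);
}
```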
Handling enum variants
Although similar to structs, enum variants are not types. The enum is the type and the variants are the possible values of that type.
We should already be familiar with handling enum variants with match
and if let
from the control flow and error handling part of the course. Anyhow, let's refresh our memory by handling custom enums. We'll expand on the Probability
struct example by converting the ProbabilityError
to be an enum instead, following the example of the IntErrorKind
enum in the standard library.
Then, we can easily display custom error messages based on the error variants whenever needed.
The whole expanded example would then look like
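A sketch of how the expanded example could be put together. The variant names Negative, AboveOne, and NotANumber, the module name, the helper names validate and describe, and the message texts are all assumptions made for this illustration.

```rust
mod probability {
    // Hypothetical error variants, one per reason a value can be rejected.
    #[derive(Debug, PartialEq)]
    pub enum ProbabilityError {
        Negative,
        AboveOne,
        NotANumber,
    }

    pub struct Probability {
        value: f64,
    }

    impl Probability {
        // Private helper that maps an invalid value to an error variant.
        fn validate(value: f64) -> Result<f64, ProbabilityError> {
            if value.is_nan() {
                Err(ProbabilityError::NotANumber)
            } else if value < 0.0 {
                Err(ProbabilityError::Negative)
            } else if value > 1.0 {
                Err(ProbabilityError::AboveOne)
            } else {
                Ok(value)
            }
        }

        pub fn new(value: f64) -> Result<Self, ProbabilityError> {
            Ok(Self { value: Self::validate(value)? })
        }

        pub fn set(&mut self, value: f64) -> Result<(), ProbabilityError> {
            self.value = Self::validate(value)?;
            Ok(())
        }

        pub fn value(&self) -> f64 {
            self.value
        }
    }
}

use probability::{Probability, ProbabilityError};

// Picks a custom message based on the error variant.
fn describe(error: &ProbabilityError) -> &'static str {
    match error {
        ProbabilityError::Negative => "probability cannot be negative",
        ProbabilityError::AboveOne => "probability cannot be greater than one",
        ProbabilityError::NotANumber => "probability must be a number",
    }
}

fn main() {
    let mut p = Probability::new(0.3).expect("0.3 is a valid probability");
    for candidate in [0.8, -1.0, 2.0] {
        match p.set(candidate) {
            Ok(()) => println!("updated to {}", p.value()),
            Err(error) => println!("rejected {}: {}", candidate, describe(&error)),
        }
    }
}
```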
Methods on enums
Enums can have associated functions and methods just like structs. The enum variants on the other hand are not types and therefore cannot have their own methods. This causes some verbosity in the methods as we must always pattern match the variants for any meaningful processing.
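A sketch of methods on an enum, assuming the Switch enum; the method names toggled and is_on are hypothetical. Each method pattern matches on self to find out which variant it was called on.

```rust
#[derive(Debug, PartialEq)]
enum Switch {
    On,
    Off,
}

impl Switch {
    // Returns the opposite state by matching on the current variant.
    fn toggled(&self) -> Switch {
        match self {
            Switch::On => Switch::Off,
            Switch::Off => Switch::On,
        }
    }

    fn is_on(&self) -> bool {
        matches!(self, Switch::On)
    }
}

fn main() {
    let switch = Switch::Off;
    let flipped = switch.toggled();
    println!("{:?} -> {:?} (on: {})", switch, flipped, flipped.is_on());
}
```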
Summary of symbols
Symbol | Description |
---|---|
struct | defines a new struct |
impl | defines associated functions and methods for a type |
self | a special parameter name that refers to the instance a method is called on |
Struct { ..old } | shorthand for initializing a struct with values from an existing struct instance |
enum | defines a new enum |
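The Struct { ..old } shorthand from the table can be sketched as follows. The Settings struct and the with_width helper are hypothetical and only illustrate the syntax.

```rust
// Hypothetical struct used only to illustrate the shorthand.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    width: u32,
    height: u32,
    title: String,
}

// Builds a new instance: `width` is given explicitly,
// and every other field is taken from `old`.
fn with_width(old: Settings, width: u32) -> Settings {
    Settings { width, ..old }
}

fn main() {
    let old = Settings {
        width: 800,
        height: 600,
        title: String::from("demo"),
    };
    // `old` is cloned here so that it can still be printed afterwards.
    let wide = with_width(old.clone(), 1024);
    println!("{:?}", old);
    println!("{:?}", wide);
}
```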