Loops and iteration
Learning objectives
- You know the various ways to loop over statements in Rust.
- You know how to use Rust's iterators, iterator adapters, and consumers.
- You know how to use closures together with iterator methods.
Looping
We often want to execute one or more statements in our code more than once. Language constructs that allow us to repeatedly execute parts of code are usually called loops. Rust provides us with three types of loops: loop, while, and for.
Loop
The simplest form of a loop in Rust is an infinite loop. By placing the keyword loop in front of a block expression that serves as the body of the loop, we can repeat the block indefinitely.
In many cases we don't want to loop forever, but stop at some point. Breaking out of a loop can be done with a break statement in the body of the loop. Using return would also work, but that would exit the enclosing function too, not just the loop.
Like if statements, loops too can be used as expressions. The value of the loop expression is the value specified with break <value>; (using break works just like using return, only it returns from a surrounding loop expression and not the surrounding function).
We may also want to skip the rest of the loop body and jump to the beginning of the loop. This can be done with the continue statement.
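As a minimal sketch of loop, break with a value, and continue working together (the two helper functions are hypothetical names made up for this illustration):

```rust
// Hypothetical helper: smallest power of two that is greater than `limit`.
fn next_power_of_two_above(limit: u32) -> u32 {
    let mut power = 1;
    // `loop` is an expression; `break <value>` becomes its value.
    loop {
        if power > limit {
            break power;
        }
        power *= 2;
    }
}

// Hypothetical helper: sum of the odd numbers from 1 to `n`.
fn sum_of_odds_up_to(n: u32) -> u32 {
    let mut sum = 0;
    let mut i = 0;
    loop {
        i += 1;
        if i > n {
            break; // leave the loop
        }
        if i % 2 == 0 {
            continue; // skip even numbers and start the next iteration
        }
        sum += i;
    }
    sum
}

fn main() {
    println!("{}", next_power_of_two_above(100)); // 128
    println!("{}", sum_of_odds_up_to(7)); // 16
}
```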
While
The while loop, while very similar to loop, requires an additional stopping condition and takes the form while <boolean expression> {<body of the loop>}. In this sense, loop behaves like while true. The body of the loop is executed as long as the condition evaluates to true.
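A small sketch of a while loop with a stopping condition; the helper name halvings is an assumption made for this illustration:

```rust
// Hypothetical helper: how many times `n` can be halved before it reaches zero.
fn halvings(mut n: u32) -> u32 {
    let mut steps = 0;
    // The condition is checked before every iteration.
    while n > 0 {
        n /= 2;
        steps += 1;
    }
    steps
}

fn main() {
    println!("{}", halvings(8)); // 4 (8 -> 4 -> 2 -> 1 -> 0)
}
```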
The condition of a while-loop can be any expression (including one with side effects such as mutating variables) as long as the expression evaluates to a boolean value. Blocks are expressions in Rust, which means that we can do a dirty trick to turn Rust's "while do" into a "do while" loop.
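The trick could look roughly like the following sketch, where the loop body is moved into the condition block and the actual body is left empty (double_until is a hypothetical helper name):

```rust
// Hypothetical helper: doubles `n` at least once, then keeps doubling
// for as long as the result is below `limit`.
fn double_until(mut n: u32, limit: u32) -> u32 {
    // The block is the loop condition: its statements run first, and its
    // final expression decides whether to go around again.
    while {
        n *= 2;
        n < limit
    } {}
    n
}

fn main() {
    println!("{}", double_until(1, 10)); // 16
    println!("{}", double_until(20, 10)); // 40, the "body" ran once
}
```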
While this works, it is heavily frowned upon because it is much less readable than Rust's while when used normally. Please refrain from using such hacks.
And then there is this, in the Rust compiler's test suite even.
The break and continue statements work in while loops just like with loop.
For
The third type of loop in Rust is the for loop, which can be used to iterate over a set of values. The for loop can be much more concise than loop or while, for instance when we want to loop over a range of values.
Just as we used ..= for a range pattern previously, we can use the same syntax to create a range of values to iterate over with for.
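A minimal sketch of a for loop over an inclusive range (sum_up_to is a hypothetical helper name):

```rust
// Hypothetical helper: sums the integers from 1 to `n`, both ends included.
fn sum_up_to(n: u32) -> u32 {
    let mut sum = 0;
    for i in 1..=n {
        sum += i;
    }
    sum
}

fn main() {
    println!("{}", sum_up_to(3)); // 6
}
```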
The same iteration written with a while loop needs an explicit counter variable that is incremented in the loop body.
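For comparison, here is a sketch of summing the integers from 1 to n with a while loop and a manually managed counter (sum_up_to_with_while is a hypothetical helper name):

```rust
// Hypothetical helper: sums the integers from 1 to `n` using `while`.
fn sum_up_to_with_while(n: u32) -> u32 {
    let mut sum = 0;
    let mut i = 1; // the counter that `for` would manage for us
    while i <= n {
        sum += i;
        i += 1;
    }
    sum
}

fn main() {
    println!("{}", sum_up_to_with_while(3)); // 6
}
```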
Leaving out the = in the range expression would create a range that does not include the last value, so 0..3 would iterate over the values 0, 1, and 2.
The ..= syntax is used for both ranges and patterns. In patterns, however, the = was long required because of a parsing overlap with slice pattern syntax. Exclusive range patterns such as 0..3 were stabilized only in Rust 1.80, so older compilers reject them with an error.
Besides ranges, we can use a for loop to iterate over other sets of values like vectors, arrays, and hash maps.
The next example iterates over an array of degrees Celsius to create a new vector of the degrees converted to Fahrenheit.
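A sketch of such a conversion, assuming a helper named celsius_to_fahrenheit that takes and returns an f64 (the exact signature and the convert wrapper are assumptions for this illustration):

```rust
// Assumed signature: converts one temperature and returns the result.
fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

// Hypothetical wrapper so the loop can be reused and checked.
fn convert(celsius: [f64; 3]) -> Vec<f64> {
    let mut fahrenheit = Vec::new();
    // Iterating over the array by value yields each `f64` in turn.
    for c in celsius {
        fahrenheit.push(celsius_to_fahrenheit(c));
    }
    fahrenheit
}

fn main() {
    println!("{:?}", convert([0.0, 25.0, 100.0])); // [32.0, 77.0, 212.0]
}
```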
Using for to iterate over a hash map gives a tuple of the key and value of each entry.
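A minimal sketch with a made-up fruit inventory (total_stock and the map contents are hypothetical). Note that a HashMap does not guarantee any particular iteration order:

```rust
use std::collections::HashMap;

// Hypothetical helper: prints every entry and adds up the values.
fn total_stock(stock: &HashMap<&str, u32>) -> u32 {
    let mut total = 0;
    // Iterating over `&HashMap` yields `(&key, &value)` tuples.
    for (name, amount) in stock {
        println!("{name}: {amount}");
        total += *amount;
    }
    total
}

fn main() {
    let mut stock = HashMap::new();
    stock.insert("apples", 3);
    stock.insert("pears", 5);
    println!("total: {}", total_stock(&stock)); // total: 8
}
```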
Modifying through iteration
Using a for loop on an iterable collection implicitly calls the into_iter method of the collection to get an iterator of immutable values over the collection. The for loop then iterates over the values in the iterator.
To be able to modify values in a mutable collection using a for loop, we can call the iter_mut method to get an iterator over mutable references to the collection values.
In the next example, the array of degrees Celsius is converted in-place to degrees Fahrenheit.
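A sketch of the in-place conversion, again assuming a celsius_to_fahrenheit helper that returns the converted value (convert_in_place is a hypothetical wrapper):

```rust
// Assumed signature: converts one temperature and returns the result.
fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

// Hypothetical wrapper: overwrites every element with its converted value.
fn convert_in_place(degrees: &mut [f64]) {
    for degree in degrees.iter_mut() {
        // `degree` is a `&mut f64`, so we dereference it to read and assign.
        *degree = celsius_to_fahrenheit(*degree);
    }
}

fn main() {
    let mut degrees = [0.0, 25.0, 100.0];
    convert_in_place(&mut degrees);
    println!("{:?}", degrees); // [32.0, 77.0, 212.0]
}
```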
The HashMap collection has some more options that allow us to easily iterate over just its values or keys (check out the documentation). For instance, we can loop over mutable references to the values of the map with the help of the values_mut method.
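For instance, a sketch that doubles every value in a map while leaving the keys alone (double_values and the inventory data are hypothetical):

```rust
use std::collections::HashMap;

// Hypothetical helper: doubles each value through a mutable reference.
fn double_values(map: &mut HashMap<&str, i32>) {
    for value in map.values_mut() {
        *value *= 2;
    }
}

fn main() {
    let mut stock = HashMap::new();
    stock.insert("apples", 3);
    stock.insert("pears", 5);
    double_values(&mut stock);
    println!("{:?}", stock.get("apples")); // Some(6)
}
```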
The iter_mut method also works on HashMap, but it gives us tuples of an immutable reference to the key and a mutable reference to the value instead of just the values.
Remember that when passing a mutable value to a function as a reference, the reference needs to be explicitly marked as mutable with &mut. Otherwise the function will receive an immutable reference and the values cannot be modified therein. We also need to dereference the references with the * operator.
Iterators and adapters
We already familiarized ourselves with iterators when working with for to iterate over collections, but let's now dive a bit deeper into the topic.
Iterators are types that provide useful methods for inspecting and processing iterable data structures like arrays and vectors. The iterator methods come in two flavours: iterator adapters are methods that return a new (possibly modified) iterator, and iterator consumers are methods that consume the iterator. With the help of iterator adapters and consumers, iterators can often be used to do the same things as for loops but more expressively and concisely using functional syntax.
Iterators in Rust are an example of zero-cost abstractions. They offer convenient functionality while imposing no additional computational overhead compared to operating directly on the collection.
Let's look at how we can convert a list of degrees Celsius to degrees Fahrenheit, like we did before, but this time using the iterator consumer for_each.
The for_each method of Iterator takes in a function and applies it on each value in the iterator. Notice that we used the same iter_mut as with the for loop. However, we needed to modify the celsius_to_fahrenheit function to take in a mutable reference to the value.
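A sketch of how this could look, assuming the modified celsius_to_fahrenheit takes a &mut f64 and writes the result back through it:

```rust
// Assumed variant: mutates the value through a mutable reference.
fn celsius_to_fahrenheit(degrees: &mut f64) {
    *degrees = *degrees * 9.0 / 5.0 + 32.0;
}

fn main() {
    let mut degrees = [0.0, 25.0, 100.0];
    // `iter_mut` yields `&mut f64`, which is exactly what the function expects.
    degrees.iter_mut().for_each(celsius_to_fahrenheit);
    println!("{:?}", degrees); // [32.0, 77.0, 212.0]
}
```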
Consuming iterators
Let's now take a look at how we can use the adapter map to transform the values from an iterator.
By running the code, we see that calling map didn't actually apply the given function: the values in the iterator appear to be the same as the originals. This is because iterators are lazy and do not compute their values until they are consumed. Iterator laziness has multiple benefits in terms of performance, especially when chaining adapters or when needing only some of the values from an iterator.
To get the values from the iterator, we can use the collect method of the iterator. This method consumes the iterator and collects the values into a collection. The solution to the above example shows how to collect the values into a Vec<f64>.
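A sketch of map followed by collect, assuming a celsius_to_fahrenheit variant that takes a reference because iter yields &f64 (convert_all is a hypothetical wrapper):

```rust
// Assumed variant: takes a reference, since `iter` yields `&f64`.
fn celsius_to_fahrenheit(celsius: &f64) -> f64 {
    *celsius * 9.0 / 5.0 + 32.0
}

// Hypothetical wrapper around the two steps.
fn convert_all(celsius: &[f64]) -> Vec<f64> {
    // `map` alone computes nothing yet: it only wraps the iterator.
    let lazy = celsius.iter().map(celsius_to_fahrenheit);
    // `collect` consumes the iterator and runs the function on each value.
    let fahrenheit: Vec<f64> = lazy.collect();
    fahrenheit
}

fn main() {
    println!("{:?}", convert_all(&[0.0, 25.0, 100.0])); // [32.0, 77.0, 212.0]
}
```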
When we want to iterate over owned values, we can use the into_iter method. This moves or copies the values of a collection into the returned iterator, depending on whether the collection is copyable. If the collection owns the values (i.e. it is not, for example, a slice), the iterator values are also owned.
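A sketch with a vector, which owns its values, so into_iter yields owned f64 values (convert_owned is a hypothetical wrapper and the celsius_to_fahrenheit signature is assumed):

```rust
// Assumed signature: converts one temperature and returns the result.
fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

// Hypothetical wrapper: consumes the input vector.
fn convert_owned(celsius: Vec<f64>) -> Vec<f64> {
    // No type annotation on the variable: the turbofish names the target type.
    let fahrenheits = celsius
        .into_iter()
        .map(celsius_to_fahrenheit)
        .collect::<Vec<f64>>();
    fahrenheits
}

fn main() {
    println!("{:?}", convert_owned(vec![0.0, 25.0, 100.0])); // [32.0, 77.0, 212.0]
}
```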
Here, instead of specifying the type of the fahrenheits variable, we used the "turbofish" syntax ::<> to specify the output type of the collect method.
Some methods, like String's parse and Iterator's collect, have generic output types. Whenever they are used, the compiler needs to know the target type of the output. This can be specified either by explicitly annotating the type of the output variable or by using the "turbofish" syntax ::<>.
In some cases, the type information can also be inferred from the context, like when returning a value from a function.
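The three options can be sketched with parse as follows (the helper names are hypothetical, and unwrap is used only for brevity: it panics if the text is not a valid number):

```rust
// Option 1: annotate the type of the output variable.
fn parse_with_annotation(text: &str) -> i32 {
    let number: i32 = text.parse().unwrap();
    number
}

// Option 2: name the type at the call site with the turbofish.
fn parse_with_turbofish(text: &str) -> i32 {
    let number = text.parse::<i32>().unwrap();
    number
}

// Option 3: let the function's return type drive the inference.
fn parse_inferred(text: &str) -> i32 {
    text.parse().unwrap()
}

fn main() {
    println!("{}", parse_with_annotation("42")); // 42
    println!("{}", parse_with_turbofish("42")); // 42
    println!("{}", parse_inferred("42")); // 42
}
```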
Sometimes some of the type information can be inferred, but not all of it. In such cases, we can omit writing the whole type explicitly by using the _ symbol for the inferrable part(s).
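A small sketch of partial inference, where only the container is named and the element type is left to the compiler (count_collected is a hypothetical helper):

```rust
// Hypothetical helper: collects the items and reports how many there are.
fn count_collected(numbers: &[i32]) -> usize {
    // We name the container; `_` is filled in from the iterator's item type,
    // which here is `&i32`.
    let collected: Vec<_> = numbers.iter().collect();
    collected.len()
}

fn main() {
    let numbers = [1, 2, 3];
    println!("{}", count_collected(&numbers)); // 3
}
```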
In the example above, the iterator is collected to a Vec and the type of the elements is inferred from the type of the iterator, which in turn is inferred from the type of the array numbers.
In case we need owned values, but into_iter returns references, we can use the copied or cloned method (depending on the type) to get owned copies of the values. Alternatively, we could of course use map with a function that calls to_owned() on its parameter.
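A sketch of both methods (the helper names are hypothetical): copied works for Copy types such as i32, and cloned for Clone types such as String.

```rust
// Hypothetical helper: `iter` yields `&i32`, `copied` turns each into an `i32`.
fn owned_numbers(numbers: &[i32]) -> Vec<i32> {
    numbers.iter().copied().collect()
}

// Hypothetical helper: `String` is not `Copy`, so we clone each value instead.
fn owned_names(names: &[String]) -> Vec<String> {
    names.iter().cloned().collect()
}

fn main() {
    println!("{:?}", owned_numbers(&[1, 2, 3])); // [1, 2, 3]
    println!("{:?}", owned_names(&[String::from("Ferris")])); // ["Ferris"]
}
```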
Besides collect, other consuming methods include for_each, which works like a for loop, nth, which gets the nth value, and count, which computes the number of items in the iterator. We used nth previously to get the nth character in a string (in the "Indexing strings?" info snippet in the slices section). Rust also provides specialized consuming methods for numerical iterators, such as sum and product.
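A sketch of these consumers applied to the same data (summarize is a hypothetical helper). Each call creates a fresh iterator with iter:

```rust
// Hypothetical helper: returns (third value, count, sum, product).
fn summarize(numbers: &[i32]) -> (Option<i32>, usize, i32, i32) {
    let third = numbers.iter().nth(2).copied(); // indices start at zero
    let count = numbers.iter().count();
    let sum: i32 = numbers.iter().sum();
    let product: i32 = numbers.iter().product();
    (third, count, sum, product)
}

fn main() {
    println!("{:?}", summarize(&[1, 2, 3, 4])); // (Some(3), 4, 10, 24)
}
```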
The following example shows how we get an error if we try to use an iterator after it has been consumed with count. To fix the example, we can use iter() again to create a new iterator when one is needed.
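A sketch of the fixed version, with the offending line left as a comment so that the code compiles (count_and_sum is a hypothetical helper):

```rust
// Hypothetical helper: counts the items, then sums them.
fn count_and_sum(numbers: &[i32]) -> (usize, i32) {
    let iter = numbers.iter();
    let count = iter.count(); // `count` takes ownership of `iter`
    // let sum: i32 = iter.sum(); // error[E0382]: use of moved value: `iter`
    let sum: i32 = numbers.iter().sum(); // a fresh iterator works
    (count, sum)
}

fn main() {
    println!("{:?}", count_and_sum(&[1, 2, 3])); // (3, 6)
}
```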
The nth method does not consume the iterator fully, but only until the nth value. This means that we can use the iterator again after calling nth. We need to be careful though, as the iterator will continue from where we left off.
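A sketch of calling nth twice on the same iterator (second_then_following is a hypothetical helper):

```rust
// Hypothetical helper: takes the second value, then whatever comes right after.
fn second_then_following(numbers: &[i32]) -> (Option<i32>, Option<i32>) {
    let mut iter = numbers.iter();
    let second = iter.nth(1).copied(); // skips index 0, returns index 1
    let following = iter.nth(0).copied(); // continues after the previous result
    (second, following)
}

fn main() {
    println!("{:?}", second_then_following(&[10, 20, 30, 40])); // (Some(20), Some(30))
}
```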
Iterating one at a time
All iterators have a method next which may or may not give back a value. Let's create an iterator to inspect an array using the next method.
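A sketch with an array of two elements and three calls to next (take_three is a hypothetical helper; arr_iter is the iterator variable):

```rust
// Hypothetical helper: calls `next` three times on a two-element array.
fn take_three(arr: [i32; 2]) -> [Option<i32>; 3] {
    // The iterator must be mutable because `next` advances it.
    let mut arr_iter = arr.iter();
    [
        arr_iter.next().copied(),
        arr_iter.next().copied(),
        arr_iter.next().copied(),
    ]
}

fn main() {
    println!("{:?}", take_three([1, 2])); // [Some(1), Some(2), None]
}
```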
The next method yields values of type Option. After all, there might not be a next value in the iterator. We can see this when calling next() thrice for an array of size 2 in the above example.
Notice that we defined the arr_iter variable as mutable. This is required even though the array itself doesn't change, because the iterator does change. The iterator "keeps getting smaller" each time we take the next value from it. The next method immediately consumes a value from the iterator, meaning that it is not lazy. The consuming iterator methods like collect and nth work non-lazily, i.e. eagerly, by calling next repeatedly.
Nothing stops us from creating a range until the end of the universe (or at least until the maximum value of the numerical type) and iterating over it. However, for loops (like while loops) cannot yield a value with break <value> the way loop can, because a for loop expression always evaluates to the unit value ().
The reason for this is that the for loop is actually syntax sugar for working with iterators. The expanded form of a for loop is shown in the Rust documentation for the for keyword.
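As a rough sketch of the idea (not the exact expansion the compiler produces), a for loop that sums the values of a vector can be written by hand with into_iter, loop, and next (sum_desugared is a hypothetical helper):

```rust
// Hypothetical helper: roughly what
// `for value in values { sum += value; }` does behind the scenes.
fn sum_desugared(values: Vec<i32>) -> i32 {
    let mut sum = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(value) => sum += value,
            None => break, // the iterator is exhausted
        }
    }
    sum
}

fn main() {
    println!("{}", sum_desugared(vec![1, 2, 3])); // 6
}
```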
Closures
Closures can be thought of as anonymous functions with concise syntax. A closure is an expression that we can call like a regular function. To create a closure, we write parameters inside pipes | and follow them with the body of the closure.
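A minimal sketch of a closure with two parameters, a and b, stored in a variable and then called (add_with_closure is a hypothetical wrapper):

```rust
// Hypothetical wrapper: defines a closure and calls it.
fn add_with_closure(x: i32, y: i32) -> i32 {
    // Parameters go between the pipes, followed by the body.
    let add = |a, b| a + b;
    add(x, y)
}

fn main() {
    println!("{}", add_with_closure(1, 2)); // 3
}
```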
The key difference to functions is that closures have access to variables in the scope they are defined in. In other words, closures capture their enclosing scope, hence the name closure. This gives closures a neat advantage over ordinary functions. Notice how in the previous example we didn't have to specify the types of the parameters a and b or the return type of the closure. The compiler is able to infer the types from the context and the body of the closure, so we don't have to specify them explicitly.
Capturing the enclosing scope also means that we can use variables from the enclosing scope in the body of the closure.
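A sketch of a closure that reads a variable from its enclosing scope (scale_by is a hypothetical wrapper):

```rust
// Hypothetical wrapper: `factor` lives outside the closure.
fn scale_by(factor: i32, value: i32) -> i32 {
    // The closure body uses `factor` without taking it as a parameter.
    let scale = |v: i32| v * factor;
    scale(value)
}

fn main() {
    println!("{}", scale_by(3, 5)); // 15
}
```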
Closures and ownership
When capturing variables in closures, we need to take care to adhere to Rust's ownership rules like we do when defining and using variables.
A closure can capture variables in three ways: immutable borrow, mutable borrow, or by move (take ownership). The compiler infers which one to use based on the closure body. Imagine a function that decides automatically whether it should take a mutable or immutable reference to a variable based on the body of the function. That's a closure for you. It won't solve all problems though.
In the example below, we double the value of variable i from the enclosing scope in the body of the closure. It doesn't compile however. Check the compiler error message to see an error about immutable and mutable borrows that is similar to what we saw when first looking into ownership and borrowing.
As a comparison, what we did here with the closure is equivalent to the following with just variables.
Note also that we needed the mut keyword for the closure to allow the closure to modify the variable i from the enclosing scope. Like variables, closures are immutable by default and won't take mutable references implicitly. We can also force a closure to take ownership instead of borrowing with the move keyword. It is mostly used in parallel computing with multiple execution threads so we won't be needing move for this course.
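A sketch of a compiling variant of doubling i through a closure, next to a comparable version that uses a plain mutable reference. The function names are hypothetical, and the lines that would trigger the borrow error are left as comments:

```rust
// Hypothetical helper: doubles `i` twice through a closure.
fn double_twice_with_closure(mut i: i32) -> i32 {
    // The closure mutates `i`, so it captures it by mutable borrow
    // and must itself be declared `mut` to be callable.
    let mut double = || i *= 2;
    double();
    // println!("{i}"); // would not compile: `i` is still mutably borrowed
    double();
    // The closure is not used after this point, so the borrow has ended.
    i
}

// Hypothetical helper: the same idea with a plain mutable reference.
fn double_twice_with_reference(mut i: i32) -> i32 {
    let r = &mut i;
    *r *= 2;
    // println!("{i}"); // would not compile while `r` is still used below
    *r *= 2;
    i
}

fn main() {
    println!("{}", double_twice_with_closure(3)); // 12
    println!("{}", double_twice_with_reference(3)); // 12
}
```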
If the closure ownership handling feels like a lot to handle, no worries! We'll mainly be using closures without capturing variables from the enclosing scope, so there goes that problem.
Iterator methods and closures
Previously, we looked at an example where degrees Celsius were converted into degrees Fahrenheit using for_each. In the example, we had to use a new version of the function that modified the floating point numbers instead of returning the result of the computation. With closures, we can still use the old non-mutating celsius_to_fahrenheit in the call to for_each to mutate the values in a collection.
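A sketch of this, assuming the non-mutating celsius_to_fahrenheit takes and returns an f64 (convert_in_place is a hypothetical wrapper):

```rust
// Assumed signature of the non-mutating version.
fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

// Hypothetical wrapper: the closure does the dereferencing and assignment.
fn convert_in_place(degrees: &mut [f64]) {
    degrees
        .iter_mut()
        .for_each(|degree| *degree = celsius_to_fahrenheit(*degree));
}

fn main() {
    let mut degrees = [0.0, 25.0, 100.0];
    convert_in_place(&mut degrees);
    println!("{:?}", degrees); // [32.0, 77.0, 212.0]
}
```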
We haven't used the for_each method before for printing, but always resorted to the for loop. This is because the println! macro is not a function, so we cannot pass it as an argument to for_each. However, we can easily wrap the macro in a closure and pass it to for_each.
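A minimal sketch of wrapping println! in a closure (print_all is a hypothetical helper):

```rust
// Hypothetical helper: prints every value on its own line.
fn print_all(values: &[i32]) {
    // Passing `println!` directly is not possible because a macro is not a function.
    values.iter().for_each(|value| println!("{value}"));
}

fn main() {
    print_all(&[1, 2, 3]);
}
```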
Likewise, we can use the map method together with a closure without having to specify extra functions with parameter types and all.
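A sketch that combines zip and map with a closure to add two slices element by element (pairwise_sums is a hypothetical helper):

```rust
// Hypothetical helper: adds the values of two slices pairwise.
fn pairwise_sums(left: &[i32], right: &[i32]) -> Vec<i32> {
    left.iter()
        .zip(right.iter())
        // Each item is a tuple `(&i32, &i32)`, destructured inside the pipes.
        .map(|(a, b)| a + b)
        .collect()
}

fn main() {
    println!("{:?}", pairwise_sums(&[1, 2, 3], &[10, 20, 30])); // [11, 22, 33]
}
```

The resulting iterator stops as soon as either of the two inputs runs out of values.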
Here, we also used the zip method to iterate over two collections at the same time. The zip adapter method takes in another iterator and returns a new iterator that yields tuples of the values from the two iterators.
Filtering and finding
Yet another useful iterator adapter, filter, can be used effectively in conjunction with closures. The filter method takes in a function that returns true if the value should remain and false if it should be skipped.
Let's filter an array using a simple comparison predicate.
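A sketch of filtering with a comparison predicate (positives is a hypothetical helper):

```rust
// Hypothetical helper: keeps only the values greater than zero.
fn positives(values: &[f64]) -> Vec<f64> {
    values
        .iter()
        // `iter` yields `&f64` and `filter` borrows each item,
        // so `value` has type `&&f64` here.
        .filter(|value| **value > 0.0)
        .copied()
        .collect()
}

fn main() {
    println!("{:?}", positives(&[-1.5, 2.0, 0.0, 3.5])); // [2.0, 3.5]
}
```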
In the example above, the closure takes a double reference, &&f64, so we need to dereference it twice with ** to get the actual value. This is because, unlike the map adapter, the filter adapter does not take ownership of the values it is filtering. It only borrows each item, and in this case the items themselves already have type &f64. If filter took ownership of the values, it would need to return them back too...
Let's filter a map of country-population pairs next.
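A sketch with made-up countries and populations (count_above, the country names, and the numbers are all hypothetical):

```rust
use std::collections::HashMap;

// Hypothetical helper: counts the entries whose population exceeds `threshold`.
fn count_above(populations: &HashMap<&str, u64>, threshold: u64) -> usize {
    populations
        .iter()
        // Items are `(&key, &value)` tuples; `_` ignores the key.
        .filter(|(_, population)| **population > threshold)
        .count()
}

fn main() {
    let mut populations = HashMap::new();
    populations.insert("Northland", 5_000_000);
    populations.insert("Southland", 12_000_000);
    populations.insert("Eastland", 80_000_000);
    println!("{}", count_above(&populations, 10_000_000)); // 2
}
```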
Because the iterator yields tuples, we can conveniently destructure them inside the pipes without needing to specify the types of the components. The underscore variable _ is used to ignore the first unused component of the tuple. The compiler warns about unused variables by default, so we need to explicitly tell it to ignore the variable with the underscore when we don't want to see such warnings.
To find a value which matches a given predicate, we can use the find method on an iterator.
The find method returns the first match, but the rest of the iterator is left intact like with the nth method.
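A sketch showing both the match and the fact that the iterator can still be used afterwards (first_even_then_next is a hypothetical helper):

```rust
// Hypothetical helper: finds the first even number, then takes the item after it.
fn first_even_then_next(numbers: &[i32]) -> (Option<i32>, Option<i32>) {
    let mut iter = numbers.iter();
    // `find` advances the iterator and stops at the first match.
    let first_even = iter.find(|n| **n % 2 == 0).copied();
    // The iterator continues right after the match.
    let next = iter.next().copied();
    (first_even, next)
}

fn main() {
    println!("{:?}", first_even_then_next(&[1, 3, 4, 5, 6])); // (Some(4), Some(5))
}
```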
Remember that we could not remove from a vector directly by value but only by index? Well, actually we can, but we need a function or a closure. Vec has a retain method, which is very similar to the filter method. It retains, i.e. keeps, all the values from the vector that match a given predicate. If we want to remove only a single value, we can use Iterator's position method to get the index of the first matching value and then remove the value with Vec's remove based on the index.
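Both approaches can be sketched as follows (remove_all and remove_first are hypothetical helper names):

```rust
// Hypothetical helper: removes every occurrence of `target`.
fn remove_all(values: &mut Vec<i32>, target: i32) {
    // Keep every element that is not equal to `target`.
    values.retain(|value| *value != target);
}

// Hypothetical helper: removes only the first occurrence of `target`.
fn remove_first(values: &mut Vec<i32>, target: i32) {
    // `position` returns the index of the first match, if there is one.
    if let Some(index) = values.iter().position(|value| *value == target) {
        values.remove(index);
    }
}

fn main() {
    let mut values = vec![1, 2, 3, 2];
    remove_first(&mut values, 2);
    println!("{:?}", values); // [1, 3, 2]
    remove_all(&mut values, 2);
    println!("{:?}", values); // [1, 3]
}
```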
The Rust standard library provides us with plenty of other iterator methods, for example reduce, fold, take_while, flatten, and max_by_key. Here are some examples of how to use them.
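A sketch with one small helper per method (all helper names are hypothetical, chosen only for this illustration):

```rust
// fold: start from an initial value and combine it with every item.
fn sum_with_fold(numbers: &[i32]) -> i32 {
    numbers.iter().fold(0, |acc, n| acc + n)
}

// reduce: like fold, but the first item is the initial value (None if empty).
fn max_with_reduce(numbers: &[i32]) -> Option<i32> {
    numbers.iter().copied().reduce(|a, b| a.max(b))
}

// take_while: yields items until the predicate fails for the first time.
fn leading_below(numbers: &[i32], limit: i32) -> Vec<i32> {
    numbers.iter().copied().take_while(|n| *n < limit).collect()
}

// flatten: turns an iterator of collections into an iterator of their items.
fn flatten_all(nested: Vec<Vec<i32>>) -> Vec<i32> {
    nested.into_iter().flatten().collect()
}

// max_by_key: the item for which the key function gives the largest value.
fn longest(words: &[String]) -> Option<&String> {
    words.iter().max_by_key(|word| word.len())
}

fn main() {
    let numbers = [3, 1, 4, 1, 5];
    println!("{}", sum_with_fold(&numbers)); // 14
    println!("{:?}", max_with_reduce(&numbers)); // Some(5)
    println!("{:?}", leading_below(&numbers, 4)); // [3, 1]
    println!("{:?}", flatten_all(vec![vec![1, 2], vec![3]])); // [1, 2, 3]
    let words = vec![String::from("pear"), String::from("banana")];
    println!("{:?}", longest(&words)); // Some("banana")
}
```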
Enumerating iterators
Sometimes we would like to access an element's index in addition to the element itself. Calling enumerate on an iterator returns a new iterator which adds an incrementing counter to each item in the form of a pair.
Next, we'll use a for loop to iterate over the enumerated iterator and print the indices and values stored in each pair.
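A sketch of enumerating in a for loop. The hypothetical helper numbered builds the lines as strings so that they can be both printed and checked:

```rust
// Hypothetical helper: formats each item together with its index.
fn numbered(items: &[&str]) -> Vec<String> {
    let mut lines = Vec::new();
    // `enumerate` yields `(index, item)` pairs, with the index starting at 0.
    for (index, item) in items.iter().enumerate() {
        lines.push(format!("{index}: {item}"));
    }
    lines
}

fn main() {
    for line in numbered(&["apple", "banana"]) {
        println!("{line}");
    }
}
```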
For a slightly more practical example, let's use enumeration to split a vector into two vectors, pushing every other element to the same vector.
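A sketch of such a split_adjacent function. The signature used here, a slice of string slices in and a pair of Vec<String> out, is an assumption made for this illustration:

```rust
// Assumed signature: items at even indices go to the first vector,
// items at odd indices to the second.
fn split_adjacent(items: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut even = Vec::new();
    let mut odd = Vec::new();
    for (index, item) in items.iter().enumerate() {
        if index % 2 == 0 {
            // Create a new owned `String` from the string slice.
            even.push(item.to_string());
        } else {
            odd.push(item.to_string());
        }
    }
    (even, odd)
}

fn main() {
    println!("{:?}", split_adjacent(&["a", "b", "c"])); // (["a", "c"], ["b"])
}
```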
Here in the split_adjacent function, instead of returning back references, we push new Strings created from the given string slice references to the corresponding vectors. If we wanted to directly push the &str references to the vectors (returning a pair of Vec<&str>), we would have to deal with the advanced concept of generic lifetimes. We will cover that option later in the course.
When collecting the results of enumerated string references, we cannot use the cloned method to get owned copies from the references. This is because cloned works only on iterators that yield references but the iterator yields tuples (usize, &String), which are not references but owned values (only the second value of the tuple is a reference). Instead, we can clone the &Strings inside the tuple with the map method.
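A sketch of cloning inside the tuple with map (enumerate_owned is a hypothetical helper):

```rust
// Hypothetical helper: pairs each string with its index, cloning the string.
fn enumerate_owned(names: &[String]) -> Vec<(usize, String)> {
    names
        .iter()
        .enumerate()
        // `.cloned()` would not compile here: the items are tuples, not references.
        .map(|(index, name)| (index, name.clone()))
        .collect()
}

fn main() {
    let names = vec![String::from("Ferris"), String::from("Corro")];
    println!("{:?}", enumerate_owned(&names)); // [(0, "Ferris"), (1, "Corro")]
}
```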
In this example, we could also opt to clone the string references using cloned prior to enumerating with enumerate, or use into_iter to iterate over owned values instead of references. However, such options are not always available, e.g. when working with collections or iterators as function parameters.
Summary of symbols
| Symbol | Description |
|---|---|
| loop | Loops until stopped by explicitly breaking the loop. |
| while | Loops as long as the predicate evaluates to true. |
| for | Loops over an iterator by consuming the iterator values and binding them to variables. |
| break | Breaks out of a loop. |
| continue | Skips the rest of the current loop iteration and continues with the next iteration. |