Interaction: Input, Command Line and Operating System

Learning objectives

You know how to read and parse text input in a Rust application.
You know how to read command line arguments in a Rust application.
You know how to read environment variables in a Rust application.
You know how to read and write files in a Rust application.

Rust for the command line

Rust is a great language for writing command line applications, not least because of its speed, safety and multi-platform support. Many standard and time-honored tools have been rewritten in Rust. For example, fd and ripgrep are popular (and faster) alternatives to the standard find and grep tools that offer basic (yet powerful) directory searching capabilities.

In this part, we will be making our own command line applications using Rust. We will make interactive programs by reading and parsing user input from the command line, reading command line arguments, reading and writing files, and recursing into directories.

Reading user input

A simple way to create an interactive program is to have the program read user input from the command line. To read user input in Rust we will be needing the io (input and output) module in the std library. Calling io::stdin() returns a Stdin struct which can be used to handle input from stdin. Stdin, short for standard input stream, handles the input text written on the command line.
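A minimal sketch of reading one line could look like the following. The describe_line helper is a hypothetical name introduced here; it is written against the BufRead trait (hence the lock() call in main) only so that the same code can be tried with in-memory input as well as with stdin. Calling io::stdin().read_line(&mut input) directly works the same way.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: reads one line from any buffered reader
// and describes the outcome as text.
fn describe_line(mut reader: impl BufRead) -> String {
    // The string that read_line appends the input to.
    let mut input = String::new();
    match reader.read_line(&mut input) {
        Ok(bytes_read) => format!("Read {bytes_read} bytes: {input}"),
        Err(error) => format!("Could not read input: {error}"),
    }
}

fn main() {
    // Locking stdin gives a handle that implements BufRead.
    println!("{}", describe_line(io::stdin().lock()));
}
```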

In the above example, we use read_line(&mut input) to read one line from the input stream and append it to the previously defined input string. The read_line method returns a Result type, where the Ok variant contains the number of bytes read from the stream and the Err variant contains an io::Error describing what went wrong.

Note: there is no interactive input implemented for the embedded code editor. Instead, input can be specified in the "Inputs" text field above the code editor. The input is then passed to the program as if it was typed in the command line terminal. Try out modifying the input to see the output change.

For an authentic experience, you can run the program on your own computer in a terminal and type the input therein.

Reading Input in Rust with VS Code

To read input in Rust with VS Code, you need to open an integrated terminal (command line). Instructions for opening a terminal in VS Code can be found here. Alas, if you run a Rust program in VS Code via the default run button, you won't see an input prompt.

On your computer, you can also cause an error with read_line by passing invalid UTF-8 as input from e.g. an image file. File contents can be redirected to the standard input of a program process with the syntax command < file. For instance, the following example redirects the contents of my-image.png file into the standard input of the program launched with the command cargo run (when run in a Rust project directory that contains a my-image.png file).

cargo run < my-image.png

When we don't mind panicking on errors, we may opt to simply unwrap the result.

The read_line method that we used to read a line from stdin is a blocking function. It will read from the underlying input stream until it encounters a newline \n (pressing Enter when inserting input in an interactive command line environment) or an EOF i.e. end of file marker. In other words, calling read_line will wait until a new line appears, which is the case when we press Enter in an interactive command line program, or the input stream ends.

In the online embedded environment, we can see printed input even when there is no newline in the input. This is because the input in the embedded editor is not an open stream but ends with EOF.

Let's look at another example.

Here, one line of input is read to the string name and then printed out. We can also notice (after testing out the code) that the ! ends up on the next line. This happens because the read_line method pushes the newline character to the string. We can solve this minor annoyance by using the trim method of str.
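As a sketch of the trimmed version, assuming a hypothetical greeting helper that accepts any buffered reader so it can be tried without a terminal:

```rust
use std::io::{self, BufRead};

// Hypothetical helper: greets whoever is named on the first input line.
fn greeting(mut reader: impl BufRead) -> String {
    let mut name = String::new();
    // unwrap panics if reading fails, e.g. on invalid UTF-8.
    reader.read_line(&mut name).unwrap();
    // trim removes surrounding whitespace, including the trailing newline,
    // so the exclamation mark stays on the same line.
    format!("Hello, {}!", name.trim())
}

fn main() {
    println!("{}", greeting(io::stdin().lock()));
}
```

Without the call to trim, the newline stored in name would be printed before the exclamation mark.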

When one line (at a time) is not enough

We can get an iterator over the lines in the standard input stream by using the lines method of Stdin.

We use the iterator's take method to stop the iterator after going through three lines. Otherwise the program would keep waiting for more lines when executed in a command line. In the embedded editor, the program ends nicely even with fewer lines because there the input stream is finite. Just as read_line returns a Result in case the input can't be converted to UTF-8, the lines method returns an iterator of such Results.
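The idea of taking three lines could be sketched as follows. The first_three_lines helper is a hypothetical name, and it gathers the lines into a vector before printing, which is a simplification made here so the result is easy to inspect.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: returns at most the first three lines of the input.
fn first_three_lines(reader: impl BufRead) -> Vec<String> {
    let mut lines = Vec::new();
    // take(3) stops the iterator after three items, or earlier at EOF.
    for line in reader.lines().take(3) {
        // Each item is a Result, because reading a line may fail.
        lines.push(line.unwrap());
    }
    lines
}

fn main() {
    for line in first_three_lines(io::stdin().lock()) {
        println!("{line}");
    }
}
```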

As an alternative to read_line, we may take one line from the iterator given by calling lines.
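A short sketch of this approach, with first_line as a hypothetical helper name:

```rust
use std::io::{self, BufRead};

// Hypothetical helper: returns the first line of the input without its newline.
fn first_line(reader: impl BufRead) -> String {
    // next() gives an Option (there may be no line at all),
    // which wraps a Result (reading the line may fail).
    reader.lines().next().unwrap().unwrap()
}

fn main() {
    println!("Hello, {}!", first_line(io::stdin().lock()));
}
```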

Here we see something that can be a bit unpleasant to the eye: two unwraps in a row. The first unwrap is on the Option returned by next (the next line might not exist) that needs to be handled for any iterator. The second unwrap is on the Result for handling invalid input.

With the read_line method, we needed only one unwrap. With this approach, we don't need to trim the string and we don't need to define a mutable string to store the input. Choose your poison.

In the previous chapter we looked at various ways to iterate over finite collections. Iterators are not only useful for processing finite sequences of values, but also for processing infinite streams of values.

The collect method on iterators is useful for converting the iterator to a vector or a hash map. What happens if we collect the iterator from io::stdin().lines() to a vector?
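One way to try this out is sketched below, with collect_lines as a hypothetical helper name. The collected vector holds the Results as they come from the iterator.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: collects every line of the input into a vector.
// Collecting only finishes once the input stream has ended.
fn collect_lines(reader: impl BufRead) -> Vec<io::Result<String>> {
    reader.lines().collect()
}

fn main() {
    let lines = collect_lines(io::stdin().lock());
    println!("{lines:?}");
}
```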

In the embedded editor this is a bit of an anticlimax, since it seems to work just fine. But this is just because the embedded editor receives a finite input ending in an EOF. If we run the program in a terminal, it will keep waiting for more input and will not end until the input stream is closed, for example by pressing Ctrl+D on Unix-like systems.

Notice also that the type of each collected item is io::Result<String>, which does not have an error type specified. This is because io::Result<T> is an alias for Result<T, io::Error>, so io::Result<String> means Result<String, io::Error>.

Let's try another infinite iterator. The repeat function in std::iter creates an iterator that repeats the same value over and over again. To collect from it, we need to specify how many values we want to collect. Otherwise we'll just be collecting till the end of time or space.
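A small sketch of limiting an infinite iterator before collecting it, with repeated as a hypothetical helper name:

```rust
use std::iter;

// Hypothetical helper: repeats a word a given number of times.
fn repeated(word: &str, count: usize) -> Vec<&str> {
    // Without take, collect would never finish.
    iter::repeat(word).take(count).collect()
}

fn main() {
    println!("{:?}", repeated("Rust", 3));
}
```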

Parsing input into numbers

Next, we have a slightly more complex example than just reading input. We will read two numbers from the standard input and print out their sum.

To parse a string into a number, we can use the parse method on the str type.

Parsing the string into i32 returns a Result, so we need to handle that too in addition to all the possible errors from reading input. The resulting code is a bit verbose, but it is necessary to keep the compiler happy.
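The following sketch shows one way to read and sum two numbers while handling every failure explicitly. The sum_of_two helper and its String error messages are assumptions made for illustration.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: reads two lines, parses each into an i32 and sums them.
fn sum_of_two(reader: impl BufRead) -> Result<i32, String> {
    let mut lines = reader.lines();
    let mut sum = 0;
    for _ in 0..2 {
        // The line may be missing (None) or unreadable (Err).
        let line = match lines.next() {
            Some(Ok(line)) => line,
            Some(Err(error)) => return Err(format!("could not read input: {error}")),
            None => return Err(String::from("expected two lines of input")),
        };
        // parse returns a Result as well, so it needs handling too.
        match line.trim().parse::<i32>() {
            Ok(number) => sum += number,
            Err(error) => return Err(format!("could not parse {line:?}: {error}")),
        }
    }
    Ok(sum)
}

fn main() {
    match sum_of_two(io::stdin().lock()) {
        Ok(sum) => println!("The sum is {sum}"),
        Err(message) => println!("Error: {message}"),
    }
}
```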

Figure: Reading and parsing numeric input in Python vs Rust (the `use std::io::BufRead;` and `.lock()` are no longer necessary, as `std::io::Stdin` nowadays has a `lines` method that can be called directly). Speed and safety can come with a cost.

The error type of the Result returned by the parse method is ParseIntError, which represents multiple different error kinds that are defined in the enum IntErrorKind. For example, parsing a string that contains invalid characters will result in an IntErrorKind::InvalidDigit.

We can handle the different error kinds by first getting the kind enum from the error with kind(), and then using the match expression to handle the enum variants.
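Matching on the error kind could be sketched like this. The explain_parse helper and its messages are hypothetical.

```rust
use std::num::IntErrorKind;

// Hypothetical helper: tries to parse an i32 and explains the outcome.
fn explain_parse(text: &str) -> String {
    match text.parse::<i32>() {
        Ok(number) => format!("parsed {number}"),
        Err(error) => match error.kind() {
            IntErrorKind::Empty => String::from("the input was empty"),
            IntErrorKind::InvalidDigit => String::from("the input contains an invalid digit"),
            IntErrorKind::PosOverflow => String::from("the number is too large for i32"),
            IntErrorKind::NegOverflow => String::from("the number is too small for i32"),
            // IntErrorKind is marked non-exhaustive, so a catch-all arm is needed.
            _ => String::from("the input could not be parsed"),
        },
    }
}

fn main() {
    for text in ["42", "", "4x", "99999999999"] {
        println!("{text:?}: {}", explain_parse(text));
    }
}
```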

Command line arguments and environment variables

When running a program from a command line, we can provide arguments to the program after giving the program name, like run for cargo in cargo run. Passing arguments to a program is not that different from passing arguments (i.e. values) to a parameterized function.

Executing the command echo Hello prints out Hello, because the echo program just prints out all of its arguments. Notice that shells, also known as command line interpreters, use a space character to separate the program name from its arguments and the arguments from each other.

The following example works like the echo program: it prints out the arguments it is given.
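A sketch of such a program, with describe_args as a hypothetical helper that takes the argument iterator as a parameter so it can also be tried with made-up arguments:

```rust
use std::env;

// Hypothetical helper: collects the arguments and formats them for printing.
fn describe_args(args: impl Iterator<Item = String>) -> String {
    let args: Vec<String> = args.collect();
    format!("{args:?}")
}

fn main() {
    // env::args returns an iterator over the program's arguments.
    println!("{}", describe_args(env::args()));
}
```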

In the example, we first import the std::env module, which contains various functions for getting information about the environment of the process (the instance of a computer program being executed). We then get an iterator of the program arguments with the std::env::args function and collect the iterator values before printing them out.

Running the above example with cargo run -- Hello World in a terminal prints out ["target/debug/echo", "Hello", "World"] (assuming the project name is echo). The first argument (at index 0) is the path to the program. The rest of the arguments are the arguments passed to the program. We need to include the -- argument to tell cargo that the arguments after the -- are not for cargo run, but for the program that is run.

Since we collect the arguments into a Vec, we can use the normal operations available on vectors to process the arguments. We should still take care that our program is prepared for common user mistakes, such as forgetting to add all the required arguments. The length of the vector depends on how many arguments were passed to the program, and the compiler cannot know that number beforehand.

Below, we have a program that reads two arguments and multiplies them together. It gets the arguments by using the indices 1 and 2, and then parses them into f64s. When the user provides fewer than two arguments, indexing does not handle the situation very nicely: the program panics with an obscure index out of bounds message. With the get method of Vec, which returns an Option, we can provide a better error explanation or a default value to use when the index is out of bounds.
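The get-based variant could be sketched as follows. The number_at helper and its error messages are assumptions made for illustration.

```rust
use std::env;

// Hypothetical helper: parses the argument at the given index into an f64.
fn number_at(args: &[String], index: usize) -> Result<f64, String> {
    // get returns None instead of panicking when the index is out of bounds.
    match args.get(index) {
        Some(text) => match text.parse::<f64>() {
            Ok(number) => Ok(number),
            Err(error) => Err(format!("argument {index} ({text:?}) is not a number: {error}")),
        },
        None => Err(format!("argument {index} is missing")),
    }
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let first = number_at(&args, 1);
    let second = number_at(&args, 2);
    match (first, second) {
        (Ok(a), Ok(b)) => println!("{}", a * b),
        (Err(message), _) => println!("Error: {message}"),
        (_, Err(message)) => println!("Error: {message}"),
    }
}
```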

With this sort of an application that simply multiplies its values, we could easily do much better than multiplying only two values. We can handle an arbitrary number of arguments by using the product method directly on the args iterator.

We'll want to ignore the filename argument at the beginning of the iterator though for our multiplication. For this, the skip method of the iterator comes in handy. The product method returns 1.0 if the iterator is empty, which makes this approach safe to use also when providing no arguments.
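As a sketch, assuming a hypothetical product_of_args helper that panics on arguments that are not numbers:

```rust
use std::env;

// Hypothetical helper: multiplies together all arguments after the program path.
fn product_of_args(args: impl Iterator<Item = String>) -> f64 {
    args.skip(1) // ignore the path to the program
        .map(|arg| arg.parse::<f64>().expect("every argument should be a number"))
        .product() // 1.0 for an empty iterator
}

fn main() {
    println!("{}", product_of_args(env::args()));
}
```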

Environment variables

Environment variables are variables defined in a shell's environment that programs inherit when they are run in the shell. Environment variables are often used to configure a program. For example, Rust programs check the RUST_BACKTRACE environment variable to decide whether to print a backtrace for Rust runtime errors.

In Rust, we can access environment variables with the env::vars function. It returns an iterator of environment variable names and values as tuples, which we can collect into a hash map for further use.

Let's see what environment variables are available to us in our program.
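A minimal sketch of listing the variables, with environment as a hypothetical helper name. Note that env::vars panics if a variable name or value is not valid Unicode.

```rust
use std::collections::HashMap;
use std::env;

// Hypothetical helper: collects the environment variables into a hash map.
fn environment() -> HashMap<String, String> {
    // Each item is a (name, value) tuple, which collect turns into map entries.
    env::vars().collect()
}

fn main() {
    for (name, value) in &environment() {
        println!("{name}={value}");
    }
}
```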

When running the code in the embedded editor, we get to see the used RUST_VERSION of the embedded editor (among a plethora of other variables). This is the same version the automatic exercise grader uses when grading exercises.

We can define or update existing environment variables in a shell by exporting them. To see the backtraces for runtime errors in Rust, we can set the RUST_BACKTRACE environment variable to 1 with

# sh (posix shell / unix-like systems)
export RUST_BACKTRACE=1
# Windows CMD
set RUST_BACKTRACE=1
# Windows PowerShell
$env:RUST_BACKTRACE="1"

In Unix-like systems (e.g. Linux, Mac), we can overwrite or set new environment variables for just a single command in a shell. This is done by prefixing the command with VAR=value. As an example, to enable Rust backtrace for only one cargo run, we can run

RUST_BACKTRACE=1 cargo run

Managing files and directories

An operating system (OS) manages the resources our computer can use: memory, disks, networking, filesystem and drawing to the screen. Next, we will look at how to interact with the operating system by reading and writing files within Rust code.

An operating system should not be taken for granted, because not every programmable piece of machinery has one. Rust provides low-level access for working with hardware, so it can be used in an environment which has no operating system, like on an embedded microcontroller. There we don't have access to input (stdin), output (println) or the Internet.

Writing code for an embedded device is an advanced topic however, and we will not cover it in this course.

If you are interested in the topic and feel comfortable using Rust already (or after completing this course), you can read more about it in the Rust embedded discovery book (for those new to embedded programming) or the Rust embedded book (more advanced, for those with some experience in embedded programming).

Reading files

Reading a file requires knowing the path to it. In Unix-like operating systems, like Linux, the directories and files of the directory structure are separated by slashes / in the path. In Windows, the directories and files are separated by backslashes \. We use Unix-like paths in this course material.

A path can start with ./ to indicate that it is relative to the directory the program is being run from. Let's say we are running the following program from the path /home/user/project/. We can use the std::fs::read function to read the contents of a file into a vector of bytes (Vec<u8>). We can then convert the bytes into a string with the String::from_utf8 function.
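A sketch of reading a file this way, with read_as_text as a hypothetical helper name:

```rust
use std::fs;

// Hypothetical helper: reads a file as bytes and converts the bytes to a String.
fn read_as_text(path: &str) -> String {
    // Panics if the file is missing or can't be read.
    let bytes = fs::read(path).unwrap();
    // Panics if the bytes are not valid UTF-8.
    String::from_utf8(bytes).unwrap()
}

fn main() {
    println!("{}", read_as_text("./src/main.rs"));
}
```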

Calling fs::read("./src/main.rs") will try to read the /home/user/project/src/main.rs file. If the file exists, and the user's permissions are sufficient, the contents of that file will be saved in the bytes variable. Try to modify the path in the above example to a file that does not exist, e.g. /src/main.rs, to see a runtime error.

In the usual case, we want to read a file and convert its contents to a string, like we did with fs::read and String::from_utf8. Because this is such a common operation, fs has a function for just that: fs::read_to_string.

Writing to a file

We can use the standard library function fs::write to write a string to a file at the specified path. The fs::write function first creates a file (if it doesn't already exist) and then writes to the file by combining the fs::File::create and io::Write::write_all functions into one convenient function.

We can also pass a byte vector to fs::write to write any binary data to a file, like the contents of an image or a video.

Note that the fs::write function will overwrite the file if it already exists. To avoid overwriting an existing file, we can check its existence before writing to it with the path::Path struct and its exists method.
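The existence check could be sketched as follows. The write_if_missing helper and the notes.txt file name are hypothetical.

```rust
use std::fs;
use std::path::Path;

// Hypothetical helper: writes the contents only when nothing exists at the path.
// Returns true when the file was written.
fn write_if_missing(path: &Path, contents: &str) -> bool {
    if path.exists() {
        return false;
    }
    // Panics if writing fails, e.g. due to insufficient permissions.
    fs::write(path, contents).unwrap();
    true
}

fn main() {
    let written = write_if_missing(Path::new("notes.txt"), "Hello, file!");
    println!("written: {written}");
}
```

Another program could create the file between the check and the write, so this is a convenience rather than a guarantee.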

Appending to a file

Rust does not provide a convenience function for appending to a file, but we can use the fs::OpenOptions struct to open the file in append mode. We can then append text to the file using the writeln! macro, which is a convenience macro for writing a string and a newline to a buffer (there is also write! when we don't want a new line at the end). Using the macro requires an additional method on the opened File though, which can be added by importing the trait std::io::Write (the compiler kindly hints us to do so in case we forget).
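A sketch of appending a line, with append_line and log.txt as hypothetical names. The create(true) option is an addition made here so that the file is created when it does not exist yet.

```rust
use std::fs::OpenOptions;
use std::io::Write; // needed for writeln! to work on a File
use std::path::Path;

// Hypothetical helper: appends one line to the end of a file.
fn append_line(path: &Path, line: &str) {
    let mut file = OpenOptions::new()
        .append(true) // write at the end instead of overwriting
        .create(true) // create the file if it is missing
        .open(path)
        .unwrap();
    writeln!(file, "{line}").unwrap();
}

fn main() {
    append_line(Path::new("log.txt"), "program started");
}
```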

Removing a file

Removing a file with the fs::remove_file function is as straightforward as creating or overwriting one with fs::write. The function will return an error if the given path doesn't exist, the path is a directory, or the user doesn't have permission to remove the file.
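A minimal sketch, with create_and_remove and temporary.txt as hypothetical names:

```rust
use std::fs;
use std::path::Path;

// Hypothetical helper: creates a file and removes it right away.
fn create_and_remove(path: &Path) {
    fs::write(path, "short-lived").unwrap();
    // Panics if the file does not exist or can't be removed.
    fs::remove_file(path).unwrap();
}

fn main() {
    create_and_remove(Path::new("temporary.txt"));
}
```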

Try removing or commenting out the fs::write line to see the runtime error of trying to remove a non-existent file.

We can also read files at compile time with the include_str! macro. The include_str! macro will read the file at compile time and include the contents of the file as a string. The path of the read file is located relative to the file where the macro is called.

An invalid path will cause a compile time error. On the other hand, the file will not be read at runtime so the file does not need to exist when the program is run.

Listing directories

Listing directories can be a bit more complicated than reading and writing files because we have more possible errors to deal with. With the help of the ? (try) operator, however, we can deal with most of them concisely by propagating the errors back to the caller.

Rust can often be verbose, but it doesn't have to be always. Let's have a look at a simple backup function that leverages the fs::read_to_string function along with the fs::write to create a backup copy of a file.
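A sketch of such a function with explicit error handling is shown below. The .bak suffix for the copy and the important.txt file name are assumptions made for illustration.

```rust
use std::fs;
use std::io;

// Copies the contents of a file into a new file with a .bak suffix.
fn backup(path: &str) -> Result<(), io::Error> {
    // Return the error to the caller if reading fails.
    let contents = match fs::read_to_string(path) {
        Ok(contents) => contents,
        Err(error) => return Err(error),
    };
    // Return the error to the caller if writing fails.
    match fs::write(format!("{path}.bak"), contents) {
        Ok(()) => Ok(()),
        Err(error) => Err(error),
    }
}

fn main() {
    match backup("important.txt") {
        Ok(()) => println!("Backup created"),
        Err(error) => println!("Backup failed: {error}"),
    }
}
```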

Even though the function does not do that much, it contains quite a lot of code. We could of course use the more concise expect or unwrap functions to handle the error by causing a runtime panic, but often we want to propagate the error back to the caller instead. This way the caller can choose how to handle the error, and that is also the way most programming languages work implicitly.

To make error handling simpler, Rust provides a way to propagate errors by using the ? (pronounced try) operator. With it, our backup function can look rather nice and concise.
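The same idea with the ? operator could be sketched like this (again assuming a .bak suffix and a hypothetical important.txt file):

```rust
use std::fs;
use std::io;

// Copies the contents of a file into a new file with a .bak suffix.
fn backup(path: &str) -> Result<(), io::Error> {
    // On Err, ? returns the error from backup; on Ok, it unwraps the value.
    let contents = fs::read_to_string(path)?;
    fs::write(format!("{path}.bak"), contents)?;
    Ok(())
}

fn main() {
    match backup("important.txt") {
        Ok(()) => println!("Backup created"),
        Err(error) => println!("Backup failed: {error}"),
    }
}
```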

The ? operator works for both Options and Results by checking if the value in front of it is None or Err and, if so, returning that value early from the function. If the value is Some or Ok, ? unwraps the value.

Note that using ? requires the function to return either an Option or a Result, and the propagated value needs to match the return type.
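The ? operator on an Option can be sketched with a small function; first_char_uppercase is a hypothetical example name.

```rust
// Hypothetical helper: returns the first character of the text in uppercase,
// or None when the text is empty.
fn first_char_uppercase(text: &str) -> Option<char> {
    // next() returns None for an empty string, and ? returns that None early.
    let first = text.chars().next()?;
    Some(first.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", first_char_uppercase("rust"));
    println!("{:?}", first_char_uppercase(""));
}
```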

The ? operator can also be used to propagate errors from the main function by giving it a return type of Result.
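A sketch of a main function that returns a Result, with count_lines as a hypothetical helper:

```rust
use std::fs;
use std::io;

// Hypothetical helper: counts the lines in a file.
fn count_lines(path: &str) -> Result<usize, io::Error> {
    let contents = fs::read_to_string(path)?;
    Ok(contents.lines().count())
}

// When main returns an Err, the program prints the error and exits
// with a non-zero exit code.
fn main() -> Result<(), io::Error> {
    let count = count_lines("./src/main.rs")?;
    println!("{count} lines");
    Ok(())
}
```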

Using std::fs::read_dir we can get an iterator over all the files and directories at the path provided as argument.

The read_dir function returns an io::Result<ReadDir>, which we can iterate over, but iterating over a Result only gives the wrapped value if it is Ok. We want to iterate over the ReadDir instead to get individual DirEntrys, which contain information about the entry, like whether it is a directory or a file.

Here is a good place to try the ? operator to get the value inside the result and propagate the error to the caller if it is an Err. Note that we need to give the function a return type of Result or Option to be able to use ?.

The ReadDir iterator gives us io::Result<DirEntry>s, which is interesting because we have just handled the errors from read_dir. The reason is that the ReadDir iterator doesn't contain the contents of the directory in any way. When the for loop calls next() during each iteration, the program gets the next DirEntry from the operating system. As with anything that interacts with the operating system, this may also fail.
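The two levels of error handling could be sketched as follows, with entry_paths as a hypothetical helper that gathers the path of each entry:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical helper: returns the paths of the entries in a directory.
fn entry_paths(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    let mut paths = Vec::new();
    // The first ? handles a failure to open the directory.
    for entry in fs::read_dir(dir)? {
        // The second ? handles a failure to read an individual entry.
        let entry = entry?;
        paths.push(entry.path());
    }
    Ok(paths)
}

fn main() -> Result<(), io::Error> {
    for path in entry_paths(Path::new("."))? {
        println!("{}", path.display());
    }
    Ok(())
}
```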

But now we finally have access to the DirEntrys which have many useful methods, like file_name, path and metadata. We can use the metadata method on a DirEntry to get more information about the file or directory. metadata also interacts with the operating system, thus requiring us to handle potential errors.

We can see for example, which entries are directories and how big each file is.
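A sketch of such a listing, where list_with_metadata is a hypothetical helper that returns each entry's path, whether it is a directory, and its size in bytes:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical helper: gathers (path, is directory, size in bytes) per entry.
fn list_with_metadata(dir: &Path) -> io::Result<Vec<(PathBuf, bool, u64)>> {
    let mut rows = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // metadata asks the operating system, so it may fail too.
        let metadata = entry.metadata()?;
        rows.push((entry.path(), metadata.is_dir(), metadata.len()));
    }
    Ok(rows)
}

fn main() -> io::Result<()> {
    for (path, is_dir, size) in list_with_metadata(Path::new("."))? {
        println!("{} directory: {is_dir}, size: {size} bytes", path.display());
    }
    Ok(())
}
```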

Here we also use the Result type from the io module as the return type, which works with the ? operator because it is just a regular Result with the error type already set to io::Error.

The metadata for a file or directory can be accessed also by using the fs::metadata function, which takes a path as argument. It too returns a Result in case the path doesn't exist or the program doesn't have permission to access it.

When we need to create a new directory, the Rust standard library provides the functions fs::create_dir and fs::create_dir_all. The create_dir function will return an error if a directory with the same name already exists or if one of its parent directories doesn't exist. The create_dir_all function will create all the parent directories if they don't exist and will return Ok even when all directories in a given path already exist.

For removing directories, Rust standard library provides the functions fs::remove_dir and fs::remove_dir_all. The remove_dir function only works for empty directories, while remove_dir_all recursively removes all the files and directories inside the directory before removing the directory itself.

Like the file modification and removal functions, these all return an error on failure due to e.g. insufficient permissions.
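The differences between these functions can be tried out with a sketch like the following. The directory_demo helper and the directory names demo, inner and innermost are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: creates and removes nested directories under base.
fn directory_demo(base: &Path) -> io::Result<()> {
    let top = base.join("demo");
    let nested = top.join("inner").join("innermost");

    // create_dir only creates the last directory of the path,
    // so it fails while the parent directories are missing.
    println!("create_dir failed: {}", fs::create_dir(&nested).is_err());

    // create_dir_all creates the parents too, and is fine with existing ones.
    fs::create_dir_all(&nested)?;
    fs::create_dir_all(&nested)?;

    // remove_dir refuses to remove a directory that is not empty.
    println!("remove_dir failed: {}", fs::remove_dir(&top).is_err());

    // remove_dir_all removes the directory with everything inside it.
    fs::remove_dir_all(&top)?;
    Ok(())
}

fn main() -> io::Result<()> {
    directory_demo(Path::new("."))
}
```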

OsString and pesky temporary values

The file_name method of DirEntry doesn't return a String or a &str which are already familiar to us, but rather an std::ffi::OsString. This OsString is a compatibility feature in Rust which can store data in the different encodings different operating systems use. Unlike a String, an OsString may contain invalid UTF-8.

Let's say we want to format our file metadata listing from previous example with padding (:>20) for more pleasant reading. An OsString can't be displayed without debug format (:?) and padding doesn't work on debug format, so we need to get a String or &str from the OsString.

The simplest way to convert an OsString to a &str is to use the to_string_lossy method, which returns a string where invalid Unicode sequences are replaced with �. This method technically returns a clone-on-write smart pointer Cow<str>, but we don't need to worry about that yet. For our current purposes, we can use it like a normal &str. We'll cover smart pointers later when looking closer into memory and lifetimes.

With this information, we should now know for instance how to format the prints in our previous metadata listing example for prettier output. The following code won't compile however, because in it a temporary value is dropped while it is still being borrowed. To fix this, we need to follow the compiler's advice and store the result of calling entry.file_name() in a separate variable.

This mistake is very common, and can be quite surprising to new Rustaceans. The problem here is that entry.file_name() returns a new OsString, which is not a reference to entry. Then calling to_string_lossy on the OsString returns a value that references the OsString. But the referenced OsString gets dropped because no variable is going to be its owner in the current scope. To fix this, we can add a variable for the temporary owned value OsString. Later in the course when discussing lifetimes this behaviour will hopefully become clearer.
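The fix could be sketched as follows, with padded_names as a hypothetical helper. The failing line is kept as a comment for comparison.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: returns the entry names right-aligned to 20 characters.
fn padded_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // This would not compile, because the OsString returned by
        // file_name() is dropped at the end of the statement:
        // let name = entry.file_name().to_string_lossy();

        // Storing the OsString in a variable keeps it alive
        // for as long as name refers to it.
        let file_name = entry.file_name();
        let name = file_name.to_string_lossy();
        lines.push(format!("{:>20}", name));
    }
    Ok(lines)
}

fn main() -> io::Result<()> {
    for line in padded_names(Path::new("."))? {
        println!("{line}");
    }
    Ok(())
}
```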
