Crates and Modules
Learning objectives
- You know how the Rust module system works
- You know how to create multi-file Rust projects
- You know how to separate your code into library and binary crates
- You know how to use external crates in a Rust application
Thus far, we have been writing Rust code in a single file. This is fine for small projects, but as the project grows, it is better to organize the code into multiple files. Here, Rust's module system comes into play.
The Rust module system is a collection of features that allow us to manage and organize Rust code. These features include packages for building, testing, and publishing crates; crates that produce an executable program or a library; and modules that define the hierarchy and visibility of code items such as functions and structs.
In this part, we will briefly cover the Rust module system, creating multi-file Rust projects, the differences between library and binary crates, and adding external dependencies to a Rust project. We will also look into a couple of useful external libraries.
Packages and crates
Some of the content in this section should be familiar from the Cargo book section that we covered to get started with course tooling. We will cover a bit more about Cargo and packages here, but not much. If you wish to learn more about Cargo and Rust package management, feel free to read the Cargo book further. Chapter 14 of the Rust book also provides a quick summary of some of the more useful advanced Cargo features.
In the Rust module system, a package is a Cargo feature that allows building and managing crates with the cargo command line tool. A crate defines code that is either (1) compiled into an executable binary file or (2) reusable code that can be imported into other crates, i.e. a library.
When we create a new package with cargo new package-demo, we get the default directory structure that contains the binary crate src/main.rs.
The binary crate is binary in the sense that it compiles to an executable binary file, which can be executed using the cargo run command.
Package manifest
A package is configured by the manifest file Cargo.toml. The manifest file defines the properties of the package and what is included in the package. The Cargo.toml package manifest contains the following configuration by default.
- name is an identifier for the package and determines the default name of the compiled target, such as the executable compiled from main.rs inside the target/debug directory. Some restrictions apply to what the name can be.
- version indicates the current version of the software using semantic versioning.
- edition indicates which set of opt-in features should be used to compile the source code. Source code that compiles with an older edition, e.g. 2018, may not compile with a newer edition such as 2021. The compiler supports all editions, and crates using different editions can depend on each other, so the latest edition is a safe choice (unless a project needs to be compiled with an older edition).
We can declare external dependencies under the [dependencies] section of the manifest, but we won't go into that just yet. We will first look at Rust's crates and modules and how to import code from local files. And in case you are wondering (and didn't bother to check out the link in the Cargo.toml file), the format of the Cargo.toml file is TOML (Tom's Obvious, Minimal Language).
Library crates
Library crates produce reusable code libraries that can be imported and utilized by other crates. In contrast to binary crates, library crates do not compile to an executable binary file, nor do they need a main function. Any dependencies that we will be adding to our Rust projects are library crates. In the Rust community, the word crate often refers to a library crate or the general concept of a library.
To create a library crate, we can add a lib.rs file to the src directory. Cargo knows to look for main.rs and lib.rs as the default crate roots for binary and library crates. If src contains both, then the package contains two crates, one binary and one library. A package may contain at most one library crate.
A package with only a library crate can be initialized using cargo new --lib <name>. What it does differently is that instead of a main.rs, it creates a lib.rs in the src directory with the following content.
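As a rough sketch, the generated lib.rs looks something like the following. The exact template differs between Cargo versions (for instance in the integer type and parameter names), so treat the details as an assumption rather than the exact output of your toolchain.

```rust
// src/lib.rs (approximate shape of the template generated by `cargo new --lib`)

/// Adds two numbers together.
pub fn add(left: usize, right: usize) -> usize {
    left + right
}

#[cfg(test)]
mod tests {
    // Bring the items of the parent module (here: `add`) into scope.
    use super::*;

    #[test]
    fn it_works() {
        let result = add(2, 2);
        assert_eq!(result, 4);
    }
}
```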
The library is equipped with a single test, which checks that the add function works correctly. No need to worry about the tests or the new syntax for now; we will discuss them later. We can run the tests in a library (or binary) crate using cargo test.
The code in a library crate can be imported into other crates, whether binary or library, as long as the code is marked public with the keyword pub, like the pub fn add function in the example. More on this in the upcoming Modules section.
Multiple binary crates
A package can contain at most one library crate, but it can contain multiple binary crates. For our package to have more binary crates than just the default main.rs, we can create a src/bin directory and put additional binary crate root files there.
For example, we can create the file src/bin/greet.rs with the following content:
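A minimal sketch of what such a file could contain. The greeting helper and its message are hypothetical, chosen only for illustration; any program with a main function works as an additional binary crate root.

```rust
// src/bin/greet.rs (hypothetical content for illustration)

/// Builds the greeting text; kept separate from `main` so it is easy to test.
fn greeting(name: &str) -> String {
    format!("Hello, {}! This is the greet binary.", name)
}

fn main() {
    println!("{}", greeting("world"));
}
```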
With multiple binary crates, cargo run gives an error saying that it doesn't know which binary crate to run.
To execute the code from main.rs, we need to specify the binary name, which by default is the package name: cargo run --bin package-demo. Similarly, we can use cargo run --bin greet for the greet.rs binary crate.
The exercises in this part of the course will be zip-submission only since that is currently the only way to submit files with custom names on this platform. The submission zip file should contain the content of the src directory. The zip file should not contain the src directory itself, only its contents. The grader uses a default Cargo.toml for all submissions.
Modules
Modules are reusable code segments that enable splitting code into separate locations, e.g. different files. Modules allow us to create complex software while keeping related chunks of code together in an intuitive way, as opposed to having thousands of lines in a single code file (oh, the scrolling). Each file in Rust, even main.rs, is its own module. Modules also enable us to control which parts of the code are accessible to external code and which are accessible only within a module.
Item visibility and inline modules
An item in Rust refers to any component that can be imported from a module. In essence, this incorporates all components that can be defined globally in a module, e.g. a module, a function or a constant among various other things.
All items are private by default, meaning they are only visible in the module they are defined in. Visibility here means whether an item can be accessed or not. In order to make items public, i.e. visible to other modules, items in a module need to be prefixed with the keyword pub.
An inline module can be defined inside a Rust file using the mod keyword followed by the name of the module and curly braces. We can think of the code inside the curly braces like a separate source code file with its own use statements, functions, and so forth.
As an example, we define a new module called io inside our main.rs file that has code responsible for various input and output operations.
In the above example, the greet function is private, and thus only visible in the io module. The compiler does not allow us to use io::greet in the main function even though it is in the same source code file. Add the pub keyword to the correct item to fix the example.
Notice that items visible in a parent module are not implicitly visible inside its child module. Modules do not inherit visible items from their parents, which is why we need to import std::path::Path inside the module as well.
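For reference, a sketch of what a fixed version of such an example might look like. The function bodies and the file_name helper are assumptions made for illustration, not the original exercise code.

```rust
use std::path::Path;

mod io {
    // A child module does not inherit the parent's imports,
    // so `Path` has to be imported again here.
    use std::path::Path;

    // `pub` makes the function visible outside the `io` module.
    // Removing it makes the `io::greet` call in `main` fail to compile.
    pub fn greet(name: &str) -> String {
        format!("Hello, {}!", name)
    }

    // Returns the final component of a path, if there is one.
    pub fn file_name(path: &str) -> Option<String> {
        Path::new(path)
            .file_name()
            .map(|name| name.to_string_lossy().into_owned())
    }
}

fn main() {
    println!("{}", io::greet("world"));

    // `Path` is usable here thanks to the import at the top of the file.
    let extension = Path::new("src/main.rs").extension();
    println!("{:?} {:?}", io::file_name("src/main.rs"), extension);
}
```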
When designing code to be used in other modules or projects, the visibility of items is important because it limits how they can be depended upon. A public item can potentially be used anywhere, so modifying it can break code anywhere. On the other hand, we can safely modify or even remove private items at will as long as the public items that depend on them function as intended. Declaring items public only when they are needed outside of a module makes updating code much more manageable and less error-prone.
The import statement use library::module::function; brings the function item into the current scope from the module module inside the module library.
We can also use curly braces in the path to import a group of items at once. Such a grouped import expands to one separate use statement per listed item.
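As a small illustration of grouped imports, the sketch below imports two standard library collections with one use statement and shows the equivalent expanded form in comments. The word_counts and unique_words helpers are hypothetical.

```rust
// Grouped form: one `use` statement with curly braces.
use std::collections::{HashMap, HashSet};

// The line above is shorthand for these two statements
// (left as comments so the names are not imported twice):
// use std::collections::HashMap;
// use std::collections::HashSet;

/// Counts how many times each word occurs.
fn word_counts<'a>(words: &[&'a str]) -> HashMap<&'a str, usize> {
    let mut counts = HashMap::new();
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Collects the distinct words.
fn unique_words<'a>(words: &[&'a str]) -> HashSet<&'a str> {
    words.iter().copied().collect()
}

fn main() {
    let words = ["a", "b", "a"];
    println!("{:?}", word_counts(&words).get("a"));
    println!("{}", unique_words(&words).len());
}
```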
There are sometimes special cases where we need a more absolute way to refer to a library, like std. Suppose our code shadows the standard library with a new module mod std; accessing the real standard library then needs an extra :: at the front to make the path absolute, which means that it doesn't depend on the scope: use ::std::io;.
Another absolute path we will see later is the special crate root module.
Splitting code into multiple files
It is often a good idea to split code into separate logical parts across multiple files. Keeping segments of code that are responsible for different things right next to each other easily becomes a burden, as they break the programmer's concentration. Further, a hierarchical structure of modules makes it easier to understand the code and to find the right place to make changes.
The main.rs in a binary crate and lib.rs in a library crate are called the root modules of their respective crates. For source code files other than the root modules to be part of the module hierarchy, they need to be declared in a parent module.
Let's consider a binary crate with two modules: main.rs (root module) and io.rs.
Declaring a module which is defined in a separate file is done using the mod keyword followed by the name of the file without the .rs extension. Rather than writing an inline definition with curly braces, like mod io {}, we leave the definition empty and cap it off with a semicolon: mod io;.
The as keyword that we have previously used to cast primitives serves more than one purpose. It can also be used to rename imports.
It is illegal to import two items with the same name. Unlike variable bindings, explicit imports don't support shadowing.
Importing both std::fmt::Error and std::io::Error under the name Error results in a compiler error.
To avoid this, we can rename the conflicting items with the as keyword.
However, it is often more convenient to just import the parent module, in this case std::io, and spell out io::Error rather than renaming the import.
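A minimal sketch of renaming with as. The aliases FmtError and IoError and the two helper functions are hypothetical names picked for this illustration.

```rust
// Both items are called `Error`, so each gets a distinct local name.
use std::fmt::Error as FmtError;
use std::io::{Error as IoError, ErrorKind};

/// Fails with a formatting error when the input is empty.
fn format_name(name: &str) -> Result<String, FmtError> {
    if name.is_empty() {
        Err(FmtError)
    } else {
        Ok(format!("<{}>", name))
    }
}

/// Fails with an I/O error when the input is empty.
fn read_name(name: &str) -> Result<String, IoError> {
    if name.is_empty() {
        Err(IoError::new(ErrorKind::InvalidInput, "empty name"))
    } else {
        Ok(name.to_string())
    }
}

fn main() {
    println!("{:?}", format_name("rust"));
    println!("{:?}", read_name("").map_err(|e| e.kind()));
}
```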
The crate module
The identifier of the root module in the module hierarchy is aptly named crate, indicating it is the crate (not the package) we are working on. Defining an io module like before brings it into scope in the parent module, in this case the root module crate.
If the io module was in a separate source code file, it would still need to be explicitly declared; importing with use is not enough, unlike in many other programming languages.
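The following sketch shows crate used as the start of an absolute path. It assumes the code is the root file of a crate (for example main.rs); the app module and the run function are hypothetical.

```rust
mod io {
    pub fn greet(name: &str) -> String {
        format!("Hello, {}!", name)
    }
}

mod app {
    // An absolute path that starts from the crate root,
    // no matter how deeply `app` is nested.
    use crate::io;

    pub fn run() -> String {
        io::greet("crate")
    }
}

fn main() {
    // From the root module, `io::greet` and `crate::io::greet`
    // name the same function.
    println!("{}", crate::io::greet("root"));
    println!("{}", app::run());
}
```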
From library to binary
As we know, a package may contain multiple crates, at most one of which may be a library crate. Rust's module system is convenient when we need to separate which code goes into the library crate and which goes into a binary crate. In fact, for binary projects, the Rust Book recommends having a minimal binary crate which depends on modules defined in a library crate.
Suppose we have a package named application. As there can only be one library, we can refer to its root module using the library crate name (by default derived from the package name) in the binary crates' source code. For example, the binary crate main.rs can call application::io::greet without needing to declare any modules. Calling crate::io::greet won't work, because the library crate is separate from the binary crate.
When we import a library, there is no ambiguity on which crate to import because there can only be one library crate. Binary crates cannot be used as a dependency by any other crate. The items within a binary crate are visible only to the crate itself.
Let's look at an example with three files, main.rs, lib.rs and io.rs.
Looking at lib.rs we can see the declaration of the io module with pub mod io;. It looks for the io.rs file and makes all public items in it visible to external users of the library.
We can use the asterisk character * to import all public items from a module into the scope.
In many cases, however, this is inadvisable and can be considered a code smell because it makes it unnecessarily difficult to find what exactly is being imported. The compiler will only complain about ambiguous imports, i.e. when two or more glob imports bring in items with the same name and that name is actually used.
In the example above, the default Result type was shadowed with the standard io one: std::io::Result.
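A minimal sketch of this kind of shadowing, assuming a hypothetical checked_len function. After the glob import, the single-parameter Result refers to std::io::Result rather than the prelude's two-parameter type.

```rust
// The glob import brings in every public item of std::io,
// including its one-parameter `Result<T>` alias.
use std::io::*;

// `Result<usize>` here is std::io::Result<usize>,
// i.e. std::result::Result<usize, std::io::Error>.
fn checked_len(input: &str) -> Result<usize> {
    if input.is_empty() {
        // `Error` and `ErrorKind` also came in through the glob.
        Err(Error::new(ErrorKind::InvalidInput, "empty input"))
    } else {
        Ok(input.len())
    }
}

fn main() {
    println!("{:?}", checked_len("abc"));
}
```

Writing use std::io; and spelling out io::Result would make the origin of the type visible at the use site, which is why the glob form is usually avoided.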
The default Result type comes from the prelude, defined in std::prelude. It is a special module containing the most common types and traits which are available in every scope without importing. We can think of there being a use std::prelude::* line in every Rust source file.
The items in the standard prelude are not defined in the prelude module itself. It uses re-exports, which means importing an item and exporting the same item as is. The syntax for re-exports is pub use. Here is an example with a re-export:
Libraries often re-export the primary types and traits they define in the top-level lib.rs file.
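A small sketch of a re-export, using hypothetical shapes and circle modules. The inner module stays private, yet its Circle type is reachable through the parent thanks to pub use.

```rust
mod shapes {
    // Private implementation module: not reachable from outside `shapes`.
    mod circle {
        use std::f64::consts::PI;

        pub struct Circle {
            pub radius: f64,
        }

        impl Circle {
            pub fn area(&self) -> f64 {
                PI * self.radius * self.radius
            }
        }
    }

    // Re-export: users can write `shapes::Circle`
    // without knowing about the inner `circle` module.
    pub use self::circle::Circle;
}

use shapes::Circle;

fn main() {
    let unit = Circle { radius: 1.0 };
    println!("{}", unit.area());
}
```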
Nested modules
Let's finally look at a more complex module hierarchy in an example package named nested. The directory structure is the following.
The contents of the files are as follows.
In the example above, the library crate consists of a submodule io and its submodule io::utils. The important part to note here (besides the file structure) is the contents of src/io.rs. The module declaration pub mod utils; in src/io.rs does not look for a utils.rs file next to io.rs; it looks for src/io/utils.rs (or src/io/utils/mod.rs), i.e. inside a directory named after the parent module.
While this structure may look odd at first, this is the recommended way to split submodules into subdirectories. You can read more about this in The Book.
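The same hierarchy can be sketched in a single file with inline modules, which makes the nesting easier to see. The function names are hypothetical, and the comments mark where each module body would live if it were split into files as described above.

```rust
// Equivalent of `pub mod io;` in src/lib.rs,
// whose body would live in src/io.rs.
pub mod io {
    // Equivalent of `pub mod utils;` in src/io.rs,
    // whose body would live in src/io/utils.rs.
    pub mod utils {
        pub fn shout(text: &str) -> String {
            text.to_uppercase()
        }
    }

    pub fn greet(name: &str) -> String {
        // A path relative to the current module (`io`).
        utils::shout(&format!("hello, {}!", name))
    }
}

fn main() {
    println!("{}", io::greet("nested"));
    println!("{}", io::utils::shout("done"));
}
```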
Cargo has a handy feature for defining executable example binaries that showcase example usage of a library. Rather than needing to write those as ordinary binaries in the src directory, we can create a new examples directory in the package root directory.
Let's create a library package named add with the following directory structure.
Now, similarly to working with multiple binaries, we can specify which example to run using cargo run --example oneplusone.
It is highly recommended to write well-commented example binaries in the libraries you write.
External libraries as dependencies
Besides Rust's standard library std, there are a vast number of useful Rust libraries available online that we may want to leverage in our applications. These non-built-in libraries are called external crates, as they are external to our package.
We can add external crates as dependencies to our project with Cargo by adding them under the [dependencies] section in Cargo.toml.
As an example, we'll add the rand crate, which provides random number generation functionality, to our application package.
The string "0.8.5" is the version requirement for the rand library we want to use. The version is required. By default, Cargo treats it as a caret requirement, so any semver-compatible release from 0.8.5 up to (but not including) 0.9.0 may be used. We could also specify 0.8 to let Cargo know it can use any version of rand as long as it starts with 0.8. More options and information can be found in The Cargo Book.
When building the package with the rand crate added as a dependency to our Cargo.toml, Cargo looks for the rand crate on crates.io, the official Rust package registry, and downloads and compiles the dependency if found.
Once we have an external crate as a dependency, we can import it in our code with the use keyword, just like the std library or our package's own library crate in lib.rs.
Note that the embedded editor code executions are cached. To see a different random value output, something in the code needs to be changed, like adding a comment for instance.
As another example, we'll add the clap crate to our application dependencies. This crate allows us to easily handle different kinds of command line arguments, although using it can be a little complex since it can do a lot of things related to command line argument parsing.
We can now easily define named command line arguments. Let's add optional maximum and minimum values arguments to our random number generator program.
Now, we can run our program with the --min and --max arguments to set the minimum and maximum values of the random number generator, e.g.
cargo run -- --max 100 --min 0
The clap crate adds helpful usage info for the program automatically based on the specified arguments. We can see it by running cargo run -- -h or cargo run -- --help.
We can spice things up a bit more by adding a --type argument to our program. The argument will allow us to specify whether we want a random integer or a float. This time we'll have to parse the --min and --max value types a bit later in the program since we won't know what those types should be before checking the --type argument value.
The example uses get_matches_from so that we don't have to specify the arguments in the command line to test the application with different arguments.
To see details on the argument options or more options, check out the Clap documentation for the Arg and Command structs.
The target directories of our packages can easily grow quite large. With multiple packages that have external dependencies, this becomes especially apparent. By default, compiled dependencies are stored in the target directory separately for each package. They are not shared across packages.
One option to share compiled dependencies between different packages is to explicitly set the default target directory to be the same for all packages. This can be done by setting the environment variable CARGO_TARGET_DIR to, for example, $HOME/.cargo/target. The default target directory can also be specified in a global Cargo config file ($HOME/.cargo/config.toml on Unix) using the TOML syntax.
If we just want to clean up the target directory of a package, we can run cargo clean in the root directory of the project.
Summary of symbols
Symbol | Description |
---|---|
mod | Defines a module in place or as a separate file. |
use | Brings an item into scope. Requires that item to be public. |
pub | Declares an item you defined public. |
pub use | Imports an item and makes it public. |
crate | A path to the root module of the current crate. |
super | A path to the parent module of the code. |
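To tie the table together, here is a small sketch that uses mod, pub, pub use, self, and super in one file. The shop and cart modules and the tax rate are hypothetical.

```rust
mod shop {
    pub fn tax_rate() -> f64 {
        0.25
    }

    pub mod cart {
        // `super` refers to the parent module, here `shop`.
        pub fn total(net: f64) -> f64 {
            net * (1.0 + super::tax_rate())
        }
    }

    // `pub use` re-exports the nested function as `shop::total`.
    pub use self::cart::total;
}

fn main() {
    println!("{}", shop::total(100.0));
}
```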