Rustlings Part 2
This week we continue with another Rustlings video tutorial! We'll tackle some more advanced concepts like move semantics, traits, and generics! Next week, we'll start considering how we might build a similar program to teach beginners about Haskell!
Rustlings Video Blog!
We're doing something very new this week. Instead of doing a code writeup, I've actually made a video! In keeping with the last couple months of content, this first one is still Rust related. We'll walk through the Rustlings tool, which is an interactive program that teaches you the basics of the Rust language! Soon, we'll start exploring how we might do this in Haskell!
You can also watch this video on our YouTube Channel! Subscribe there or sign up for our mailing list!
Rust Web Series Complete!
We're taking a quick breather this week from new content for an announcement. Our recently concluded Rust Web series now has a permanent spot on the advanced page of our website. You can take a look at the series page here! Here's a quick summary of the series:
- Part 1: Postgres - In the first part, we learn about a basic library to enable integration with a Postgresql Database.
- Part 2: Diesel - Next up, we get a little more formal with our database mechanics. We use the Diesel library to provide a schema for our database application.
- Part 3: Rocket - In part 3, we take the next step and start making a web server! We'll learn the basics of the Rocket server library!
- Part 4: CRUD Server - What do we do once we have a database and server library? Combine them of course! In this part, we'll make a CRUD server that can access our database elements using Diesel and Rocket.
- Part 5: Authentication - If your server will actually serve real users, you'll need authentication at some point. We'll see the different mechanisms we can use with Rocket for securing our endpoints.
- Part 6: Front-end Templating - If you're serving a full front-end web app, you'll need some way to customize the HTML. In the last part of the series, we'll see how Rocket makes this easy!
The best part is that you can find all the code for the series on our Github Repo! So be sure to take a look there. And if you're still new to Rust, you can also get your feet wet first with our Beginners Series.
In other exciting news, we'll be trying a completely new kind of content in the next couple weeks. I've written a bit in the past about using different IDEs like Atom and IntelliJ to write Haskell. I'd like to revisit these ideas to give a clearer idea of how to make our lives easier when writing code. But instead of writing articles, I'll be making a few videos to showcase how these work! I hope that a visual display of the IDEs will help make the content more clear.
Unit Tests and Benchmarks in Rust
For a couple months now, we've focused on some specific libraries you can use in Rust for web development. But we shouldn't lose sight of some other core language skills and mechanics. Whenever you write code, you should be able to show first that it works, and second that it works efficiently. If you're going to build a larger Rust app, you should also know a bit about unit testing and benchmarking. This week, we'll take a couple simple sorting algorithms as our examples to learn these skills.
As always, you can take a look at the code for this article on our Github Repo for the series. You can find this week's code specifically in sorters.rs! For a more basic introduction to Rust, be sure to check out our Rust Beginners Series!
Insertion Sort
We'll start out this article by implementing insertion sort. This is one of the simpler sorting algorithms, but it's rather inefficient. We'll perform this sort "in place". This means our function won't return a value. Rather, we'll pass a mutable reference to our vector so we can manipulate its items. To help out, we'll also define a swap function to exchange two elements through that same reference:
pub fn swap(numbers: &mut Vec<i32>, i: usize, j: usize) {
let temp = numbers[i];
numbers[i] = numbers[j];
numbers[j] = temp;
}
pub fn insertion_sorter(numbers: &mut Vec<i32>) {
...
}
At its core, insertion sort is a pretty simple algorithm. We maintain the invariant that the "left" part of the array is always sorted. (At the start, with only 1 element, this is clearly true). Then we loop through the array and "absorb" the next element into our sorted part. To absorb the element, we'll loop backwards through our sorted portion. Each time we find a larger element, we switch their places. When we finally encounter a smaller element, we know the left side is once again sorted.
pub fn insertion_sorter(numbers: &mut Vec<i32>) {
for i in 1..numbers.len() {
let mut j = i;
while j > 0 && numbers[j-1] > numbers[j] {
swap(numbers, j, j - 1);
j = j - 1;
}
}
}
Testing
Our algorithm is simple enough. But how do we know it works? The obvious answer is to write some unit tests for it. Rust is actually a bit different from Haskell and most other languages in the canonical approach to unit tests. Most of the time, you'll make a separate test directory. But Rust encourages you to write unit tests in the same file as the function definition. We do this by having a section at the bottom of our file specifically for tests. We delineate a test function with the test macro:
#[test]
fn test_insertion_sort() {
...
}
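Judging by the test output below (sorter::test_insertion_sort), the article keeps its tests directly in the same module as the sorting functions. A common alternative layout, shown here purely as a sketch, wraps them in a module guarded by cfg(test) so they only compile during testing:

// A minimal sketch of a dedicated test module in sorters.rs (convention, not from the article).
#[cfg(test)]
mod tests {
    // Bring insertion_sorter, swap, etc. into scope for the tests.
    use super::*;

    #[test]
    fn test_swap() {
        let mut numbers = vec![1, 2, 3];
        swap(&mut numbers, 0, 2);
        assert_eq!(numbers, vec![3, 2, 1]);
    }
}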
To keep things simple, we'll define a random vector of 100 integers and pass it to our function. We'll use assert to verify that each number is no bigger than the one after it.
#[test]
fn test_insertion_sort() {
let mut numbers: Vec<i32> = random_vector(100);
insertion_sorter(&mut numbers);
for i in 0..(numbers.len() - 1) {
assert!(numbers[i] <= numbers[i + 1]);
}
}
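The random_vector helper isn't shown in the article. A minimal version might look like this, assuming the same two-argument rand API (rng.gen_range(start, end)) that the quicksort helper uses later:

use rand::{thread_rng, Rng};

// Hypothetical helper: build a vector of `size` random integers.
fn random_vector(size: usize) -> Vec<i32> {
    let mut rng = thread_rng();
    (0..size).map(|_| rng.gen_range(0, 1_000_000)).collect()
}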
When we run the cargo test command, Cargo will automatically detect that we have a test suite in this file and run it.
running 1 test...
test sorter::test_insertion_sort ... ok
Benchmarking
So we know our code works, but how quickly does it work? When you want to check the performance of your code, you need to establish benchmarks. These are like test suites, except that they're meant to report the average time it takes to perform a task.
Just as we had a test macro for making test suites, we can use the bench macro for benchmarks. Each of these takes a mutable Bencher object as an argument. To time some code, we'll call iter on that object and pass a closure that will run our function.
#[bench]
fn bench_insertion_sort_100_ints(b: &mut Bencher) {
b.iter(|| {
let mut numbers: Vec<i32> = random_vector(100);
insertion_sorter(&mut numbers)
});
}
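One assumption worth making explicit: the bench attribute and the Bencher type come from Rust's unstable test crate, so (at least when this series was written) benchmarks needed a nightly toolchain and declarations roughly like these:

// At the crate root (requires a nightly toolchain, since `test` is unstable):
#![feature(test)]
extern crate test;

// In the module containing the benchmarks:
use test::Bencher;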
We can then run the benchmark with cargo bench.
running 2 tests
test sorter::test_insertion_sort ... ignored
test sorter::bench_insertion_sort_100_ints ... bench: 6,537 ns/iter (+/- 1,541)
So on average, it took about 6,500 nanoseconds (roughly 6.5 microseconds) to sort 100 numbers. On its own, this number doesn't tell us much. But we can get a clearer idea of the runtime of our algorithm by looking at benchmarks of different sizes. Suppose we make lists of 1000 and 10000:
#[bench]
fn bench_insertion_sort_1000_ints(b: &mut Bencher) {
b.iter(|| {
let mut numbers: Vec<i32> = random_vector(1000);
insertion_sorter(&mut numbers)
});
}
#[bench]
fn bench_insertion_sort_10000_ints(b: &mut Bencher) {
b.iter(|| {
let mut numbers: Vec<i32> = random_vector(10000);
insertion_sorter(&mut numbers)
});
}
Now when we run the benchmark, we can compare the results of these different runs:
running 4 tests
test sorter::test_insertion_sort ... ignored
test sorter::bench_insertion_sort_10000_ints ... bench: 65,716,130 ns/iter (+/- 11,193,188)
test sorter::bench_insertion_sort_1000_ints ... bench: 612,373 ns/iter (+/- 124,732)
test sorter::bench_insertion_sort_100_ints ... bench: 12,032 ns/iter (+/- 904)
We see that when we increase the problem size by a factor of 10, the runtime goes up by a factor of roughly 50 to 100! This is consistent with our simple insertion sort having an asymptotic runtime of O(n^2), which is not very good.
Quick Sort
There are many ways to sort more efficiently! Let's try our hand at quicksort. For this algorithm, we first "partition" our array. We'll choose a pivot value, and then move all the numbers smaller than the pivot to the left of the array, and all the greater numbers to the right. The upshot is that we know our pivot element is now in the correct final spot!
Here's what the partition algorithm looks like. It works on a specific sub-segment of our vector, indicated by start and end. We initially move the pivot element to the back, and then loop through the other elements of the array. The i index tracks where our pivot will end up. Each time we encounter a smaller number, we increment it. At the very end, we swap our pivot element back into its place and return its final index.
pub fn partition(
numbers: &mut Vec<i32>,
start: usize,
end: usize,
partition: usize)
-> usize {
let pivot_element = numbers[partition];
swap(numbers, partition, end - 1);
let mut i = start;
for j in start..(end - 1) {
if numbers[j] < pivot_element {
swap(numbers, i, j);
i = i + 1;
}
}
swap(numbers, i, end - 1);
i
}
So to finish sorting, we'll set up a recursive helper that, again, functions on a sub-segment of the array. We'll choose a random element and partition by it:
pub fn quick_sorter_helper(
numbers: &mut Vec<i32>, start: usize, end: usize) {
if start >= end {
return;
}
let mut rng = thread_rng();
let initial_partition = rng.gen_range(start, end);
let partition_index =
partition(numbers, start, end, initial_partition);
...
}
Now that we've partitioned, all that's left to do is recursively sort each side of the partition! Our main API function will call this helper with the full size of the array.
pub fn quick_sorter_helper(
numbers: &mut Vec<i32>, start: usize, end: usize) {
if start >= end {
return;
}
let mut rng = thread_rng();
let initial_partition = rng.gen_range(start, end);
let partition_index =
partition(numbers, start, end, initial_partition);
quick_sorter_helper(numbers, start, partition_index);
quick_sorter_helper(numbers, partition_index + 1, end);
}
pub fn quick_sorter(numbers: &mut Vec<i32>) {
quick_sorter_helper(numbers, 0, numbers.len());
}
Now that we've got this function, let's add tests and benchmarks for it:
#[test]
fn test_quick_sort() {
let mut numbers: Vec<i32> = random_vector(100);
quick_sorter(&mut numbers);
for i in 0..(numbers.len() - 1) {
assert!(numbers[i] <= numbers[i + 1]);
}
}
#[bench]
fn bench_quick_sort_100_ints(b: &mut Bencher) {
b.iter(|| {
let mut numbers: Vec<i32> = random_vector(100);
quick_sorter(&mut numbers)
});
}
// Same kind of benchmarks for 1000, 10000, 100000
Then we can run our benchmarks and see our results:
running 9 tests
test sorter::test_insertion_sort ... ignored
test sorter::test_quick_sort ... ignored
test sorter::bench_insertion_sort_10000_ints ... bench: 65,130,880 ns/iter (+/- 49,548,187)
test sorter::bench_insertion_sort_1000_ints ... bench: 312,300 ns/iter (+/- 243,337)
test sorter::bench_insertion_sort_100_ints ... bench: 6,159 ns/iter (+/- 4,139)
test sorter::bench_quick_sort_100000_ints ... bench: 14,292,660 ns/iter (+/- 5,815,870)
test sorter::bench_quick_sort_10000_ints ... bench: 1,263,985 ns/iter (+/- 622,788)
test sorter::bench_quick_sort_1000_ints ... bench: 105,443 ns/iter (+/- 65,812)
test sorter::bench_quick_sort_100_ints ... bench: 9,259 ns/iter (+/- 3,882)
Quicksort does much better on the larger values, as expected! We can see that each time the input grows by a factor of 10, the time goes up by only a little more than a factor of 10. From these numbers alone it's difficult to confirm that the true runtime is O(n log n). But we can clearly see that we're much closer to linear time!
Conclusion
That's all for this intermediate series on Rust! Next week, we'll summarize the skills we learned over the course of these couple months in Rust. Then we'll look ahead to our next series of topics, including some totally new kinds of content!
Don't forget! If you've never programmed in Rust before, our Rust Video Tutorial provides an in-depth introduction to the basics!
Cleaning our Rust with Monadic Functions
A couple weeks ago we explored how to add authentication to a Rocket Rust server. This involved writing a from_request function that was very messy. You can see the original version of that function as an appendix at the bottom. But this week, we're going to try to improve that function! We'll explore functions like map and and_then in Rust. These can help us write cleaner code using similar ideas to functors and monads in Haskell.
For more details on this code, take a look at our Github Repo! For this article, you should look at rocket_auth_monads.rs. For a simpler introduction to Rust, take a look at our Rust Beginners Series!
Closures and Mapping
First, let's talk a bit about Rust's equivalent to fmap and functors. Suppose we have a simple option wrapper and a "doubling" function:
fn double(x: f64) -> f64 {
2.0 * x
}
fn main() -> () {
let x: Option<f64> = Some(5.0);
...
}
We'd like to pass our x value to the double function, but it's wrapped in the Option type. A logical thing to do would be to return None if the input is None, and otherwise apply the function and re-wrap the result in Some. In Haskell, we describe this behavior with the Functor class. Rust's approach has some similarities and some differences.
Instead of Functor, Rust has the Iterator trait. An iterator contains any number of items of its wrapped type. And map is one of the functions we can call on iterators (Option provides its own map method with the same flavor). As in Haskell, we provide a function that transforms the underlying items. Here's how we can apply our simple example with an Option:
fn main() -> () {
let x: Option<f64> = Some(5.0);
let y: Option<f64> = x.map(double);
}
One notable difference from Haskell is that map is a member function on the type itself. In Haskell of course, there's no such thing as member functions, so fmap exists on its own.
In Haskell, we can use lambda expressions as arguments to higher order functions. In Rust, it's the same, but they're referred to as closures instead. The syntax is rather different as well. We capture the particular parameters within bars, and then provide a brace-delimited code-block. Here's a simple example:
fn main() -> () {
let x: Option<f64> = Some(5.0);
let y: Option<f64> = x.map(|x| {2.0 * x});
}
Type annotations are also possible (and sometimes necessary) when specifying the closure. Unlike Haskell, we provide these on the same line as the definition:
fn main() -> () {
let x: Option<f64> = Some(5.0);
let y: Option<f64> = x.map(|x: f64| -> f64 {2.0 * x});
}
And Then…
Now using map is all well and good, but our authentication example involved using the result of one effectful call in the next effect. As most Haskellers can tell you, this is a job for monads and not merely functors. We can capture some of the same effects of monads with the and_then function in Rust. This works a lot like the bind operator (>>=) in Haskell. It also takes an input function. And this function takes a pure input but produces an effectful output.
Here's how we apply it with Option. We start with a safe_square_root function that produces None when its input is negative. Then we can take our original Option and use and_then to apply the square root function.
fn safe_square_root(x: f64) -> Option<f64> {
if x < 0.0 {
None
} else {
Some(x.sqrt())
}
}
fn main() -> () {
let x: Option<f64> = Some(5.0);
x.and_then(safe_square_root);
}
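Just to illustrate (this example isn't in the original article), the two combinators chain together much like fmap and >>= would in Haskell, using the double and safe_square_root functions from above:

fn main() -> () {
    // map, then and_then: Some(16.0) -> Some(32.0) -> Some(sqrt(32.0))
    let z: Option<f64> = Some(16.0)
        .map(double)
        .and_then(safe_square_root);
    println!("{:?}", z); // prints Some(5.656854...)
}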
Converting to Outcomes
Now let's switch gears to our authentication example. Our final result type wasn't Option. Some intermediate results used this. But in the end, we wanted an Outcome. So to help us on our way, let's write a simple function to convert our options into outcomes. We'll have to provide the extra information of what the failure result should be. This is the status_error parameter.
fn option_to_outcome<R>(
result: Option<R>,
status_error: (Status, LoginError))
-> Outcome<R, LoginError> {
match result {
Some(r) => Outcome::Success(r),
None => Outcome::Failure(status_error)
}
}
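Here's a quick hypothetical usage, just to show the shape of the conversion (the value and error pairing are made up for illustration):

// A missing value becomes a Failure outcome carrying the provided pair.
let missing: Option<String> = None;
let outcome: Outcome<String, LoginError> = option_to_outcome(
    missing,
    (Status::BadRequest, LoginError::InvalidData));
// `outcome` is now Outcome::Failure((Status::BadRequest, LoginError::InvalidData))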
Now let's start our refactoring process. To begin, let's examine the retrieval of our username and password from the headers. We'll make a separate function for this. This should return an Outcome, where the success value is a tuple of two strings. We'll start by defining our failure outcome, a tuple of a status and our LoginError.
fn read_auth_from_headers(headers: &HeaderMap)
-> Outcome<(String, String), LoginError> {
let fail = (Status::BadRequest, LoginError::InvalidData);
...
}
We'll first retrieve the username out of the headers. Recall that this operation returns an Option. So we can convert it to an Outcome using our function. We can then use and_then with a closure taking the unwrapped username.
fn read_auth_from_headers(headers: &HeaderMap)
-> Outcome<(String, String), LoginError> {
let fail = (Status::BadRequest, LoginError::InvalidData);
option_to_outcome(headers.get_one("username"), fail.clone())
.and_then(|u| -> Outcome<(String, String), LoginError> {
...
})
}
We can then do the same thing with the password field. When we've successfully unwrapped both fields, we can return our final Success outcome.
fn read_auth_from_headers(headers: &HeaderMap)
-> Outcome<(String, String), LoginError> {
let fail = (Status::BadRequest, LoginError::InvalidData);
option_to_outcome(headers.get_one("username"), fail.clone())
.and_then(|u| {
option_to_outcome(
headers.get_one("password"), fail.clone())
.and_then(|p| {
Outcome::Success(
(String::from(u), String::from(p)))
})
})
}
Re-Organizing
Armed with this function, we can start re-tooling our from_request function. We'll start by gathering the header results and invoking and_then. This unwraps the username and password:
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>)
-> Outcome<AuthenticatedUser, LoginError> {
let headers_result =
read_auth_from_headers(&request.headers());
headers_result.and_then(|(u, p)| {
...
}
...
}
}
Now for the next step, we'll make a couple database calls. Both of our normal functions return Option values. So for each, we'll create a failure Outcome and invoke option_to_outcome. We'll follow this up with a call to and_then. First we get the user based on the username. Then we find their AuthInfo using the ID.
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>)
-> Outcome<AuthenticatedUser, LoginError> {
let headers_result =
read_auth_from_headers(&request.headers());
headers_result.and_then(|(u, p)| {
let conn_str = local_conn_string();
let maybe_user =
fetch_user_by_email(&conn_str, &String::from(u));
let fail1 =
(Status::NotFound, LoginError::UsernameDoesNotExist);
option_to_outcome(maybe_user, fail1)
.and_then(|user: UserEntity| {
let fail2 = (Status::MovedPermanently,
LoginError::WrongPassword);
option_to_outcome(
fetch_auth_info_by_user_id(
&conn_str, user.id), fail2)
})
.and_then(|auth_info: AuthInfoEntity| {
...
})
})
}
}
This gives us unwrapped authentication info. We can use this to compare the hash of the original password and return our final Outcome!
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>)
-> Outcome<AuthenticatedUser, LoginError> {
let headers_result =
read_auth_from_headers(&request.headers());
headers_result.and_then(|(u, p)| {
let conn_str = local_conn_string();
let maybe_user =
fetch_user_by_email(&conn_str, &String::from(u));
let fail1 =
(Status::NotFound, LoginError::UsernameDoesNotExist);
option_to_outcome(maybe_user, fail1)
.and_then(|user: UserEntity| {
let fail2 = (Status::MovedPermanently,
LoginError::WrongPassword);
option_to_outcome(
fetch_auth_info_by_user_id(
&conn_str, user.id), fail2)
})
.and_then(|auth_info: AuthInfoEntity| {
let hash = hash_password(&String::from(p));
if hash == auth_info.password_hash {
Outcome::Success( AuthenticatedUser{
user_id: auth_info.user_id})
} else {
Outcome::Failure(
(Status::Forbidden,
LoginError::WrongPassword))
}
})
})
}
}
Conclusion
Is this new solution that much better than our original? Well, it avoids the "triangle of death" pattern in our code. But it's not necessarily that much shorter. Perhaps it's a little cleaner on the whole, though. Ultimately these code choices are up to you! Next time, we'll wrap up our current exploration of Rust by seeing how to profile our code in Rust.
This series has covered some more advanced topics in Rust. For a more in-depth introduction, check out our Rust Video Tutorial!
Appendix: Original Function
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>) -> Outcome<AuthenticatedUser, LoginError> {
let username = request.headers().get_one("username");
let password = request.headers().get_one("password");
match (username, password) {
(Some(u), Some(p)) => {
let conn_str = local_conn_string();
let maybe_user = fetch_user_by_email(&conn_str, &String::from(u));
match maybe_user {
Some(user) => {
let maybe_auth_info = fetch_auth_info_by_user_id(&conn_str, user.id);
match maybe_auth_info {
Some(auth_info) => {
let hash = hash_password(&String::from(p));
if hash == auth_info.password_hash {
Outcome::Success(AuthenticatedUser{user_id: 1})
} else {
Outcome::Failure((Status::Forbidden, LoginError::WrongPassword))
}
}
None => {
Outcome::Failure((Status::MovedPermanently, LoginError::WrongPassword))
}
}
}
None => Outcome::Failure((Status::NotFound, LoginError::UsernameDoesNotExist))
}
},
_ => Outcome::Failure((Status::BadRequest, LoginError::InvalidData))
}
}
}
Rocket Frontend: Templates and Static Assets
In the last few articles, we've been exploring the Rocket library for Rust web servers. Last time out, we tried a couple ways to add authentication to our web server. In this last Rocket-specific post, we'll explore some ideas around frontend templating. This will make it easy for you to serve HTML content to your users!
To explore the code for this article, head over to the "rocket_template" file on our Github repo! If you're still new to Rust, you might want to start with some simpler material. Take a look at our Rust Beginners Series as well!
Templating Basics
First, let's understand the basics of HTML templating. When our server serves out a webpage, we return HTML to the user for the browser to render. Consider this simple index page:
<html>
<head></head>
<body>
<p> Welcome to the site!</p>
</body>
</html>
But of course, each user should see some kind of custom content. For example, in our greeting, we might want to give the user's name. In an HTML template, we'll create a variable of sorts in our HTML, delineated by braces:
<html>
<head></head>
<body>
<p> Welcome to the site {{name}}!</p>
</body>
</html>
Now before we return the HTML to the user, we want to perform a substitution. Where we find the variable {{name}}, we should replace it with the user's name, which our server should know.
There are many different libraries that do this, often through Javascript. But in Rust, it turns out the Rocket library has a couple easy templating integrations. One option is Tera, which was specifically designed for Rust. Another option is Handlebars, which is more native to Javascript, but also has a Rocket integration. The substitutions in this article are simple, so there's not actually much of a difference for us.
Returning a Template
So how do we configure our server to return this HTML data? To start, we have to attach a "Fairing" to our server, specifically for the Template library. A Fairing is a server-wide piece of middleware. This is how we can allow our endpoints to return templates:
use rocket_contrib::templates::Template;
fn main() {
rocket::ignite()
.mount("/", routes![index, get_user])
.attach(Template::fairing())
.launch();
}
Now we can make our index endpoint. It has no inputs, and it will return Rocket's Template type.
#[get("/")]
fn index() -> Template {
...
}
We have two tasks now. First, we have to construct our context. This can be any "map-like" type with string information. We'll use a HashMap, populating the name value.
#[get("/")]
fn index() -> Template {
let context: HashMap<&str, &str> = [("name", "Jonathan")]
.iter().cloned().collect();
...
}
Now we have to render our template. Let's suppose we have a "templates" directory at the root of our project. We can put the template we wrote above in the "index.hbs" file. When we call the render function, we just give the name of our template and pass the context!
#[get("/")]
fn index() -> Template {
let context: HashMap<&str, &str> = [("name", "Jonathan")]
.iter().cloned().collect();
Template::render("index", &context)
}
Including Static Assets
Rocket also makes it quite easy to include static assets as part of our routing system. We just have to mount the static route to the desired prefix when launching our server:
fn main() {
rocket::ignite()
.mount("/static", StaticFiles::from("static"))
.mount("/", routes![index, get_user])
.attach(Template::fairing())
.launch();
}
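One small assumption here: the StaticFiles type comes from rocket_contrib's serve module, so you'd presumably need an import along these lines (with the corresponding rocket_contrib feature enabled):

// Assumed import for the static file handler (rocket_contrib 0.4's "serve" feature).
use rocket_contrib::serve::StaticFiles;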
Now any request to a /static/... endpoint will return the corresponding file in the "static" directory of our project. Suppose we have this styles.css file:
p {
color: red;
}
We can then link to this file in our index template:
<html>
<head>
<link rel="stylesheet" type="text/css" href="static/styles.css"/>
</head>
<body>
<p> Welcome to the site {{name}}!</p>
</body>
</html>
Now when we fetch our index, we'll see that the text on the page is red!
Looping in our Database
Now for one last piece of integration with our database. Let's make a page that will show a user their basic information. This starts with a simple template:
<!-- templates/user.hbs -->
<html>
<head></head>
<body>
<p> User name: {{name}}</p>
<br>
<p> User email: {{email}}</p>
<br>
<p> User age: {{age}}</p>
</body>
</html>
We'll compose an endpoint that takes the user's ID as an input and fetches the user from the database:
#[get("/users/<uid>")]
fn get_user(uid: i32) -> Template {
let maybe_user = fetch_user_by_id(&local_conn_string(), uid);
...
}
Now we need to build our context from the user information. This will require a match statement on the resulting user. We'll use Unknown for the fields if the user doesn't exist.
#[get("/users/<uid>")]
fn get_user(uid: i32) -> Template {
let maybe_user = fetch_user_by_id(&local_conn_string(), uid);
let context: HashMap<&str, String> = {
match maybe_user {
Some(u) =>
[ ("name", u.name.clone())
, ("email", u.email.clone())
, ("age", u.age.to_string())
].iter().cloned().collect(),
None =>
[ ("name", String::from("Unknown"))
, ("email", String::from("Unknown"))
, ("age", String::from("Unknown"))
].iter().cloned().collect()
}
};
Template::render("user", &context)
}
And to wrap it up, we'll render the "user" template! Now when users get directed to the page for their user ID, they'll see their information!
Conclusion
Next week, we'll go back to some of our authentication code. But we'll do so with the goal of exploring a more universal Rust idea. We'll see how functors and monads still find a home in Rust. We'll explore the functions that allow us to clean up heavy conditional code just as we could in Haskell.
For a more in-depth introduction to Rust basics, be sure to take a look at our Rust Video Tutorial!
Authentication in Rocket
Last week we enhanced our Rocket web server. We combined our server with our Diesel schema to enable a series of basic CRUD endpoints. This week, we'll continue this integration, but bring in some more cool Rocket features. We'll explore two different methods of authentication. First, we'll create a "Request Guard" to allow a form of Basic Authentication. Then we'll also explore Rocket's amazingly simple Cookies integration.
As always, you can explore the code for this series by heading to our Github repository. For this article specifically, you'll want to take a look at the rocket_auth.rs file.
If you're just starting your Rust journey, feel free to check out our Beginners Series as well!
New Data Types
To start off, let's make a few new types to help us. First, we'll need a new database table, auth_infos, based on this struct:
#[derive(Insertable)]
pub struct AuthInfo {
pub user_id: i32,
pub password_hash: String
}
When the user creates their account, they'll provide a password. We'll store a hash of that password in our database table. Of course, you'll want to run through all the normal steps we did with Diesel to create this table. This includes having the corresponding Entity type.
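That entity type isn't shown in the article, but following the pattern from the Diesel post it would presumably look something like this (field order assumed to match the migration):

// Assumed Queryable counterpart to AuthInfo, including the database key.
#[derive(Queryable)]
pub struct AuthInfoEntity {
    pub id: i32,
    pub user_id: i32,
    pub password_hash: String
}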
We'll also want a couple new form types to accept authentication information. First off, when we create a user, we'll now include the password in the form.
#[derive(FromForm, Deserialize)]
struct CreateInfo {
name: String,
email: String,
age: i32,
password: String
}
Second, when a user wants to login, they'll pass their username (email) and their password.
#[derive(FromForm, Deserialize)]
struct LoginInfo {
username: String,
password: String,
}
Both these types should derive FromForm and Deserialize so we can grab them out of "post" data. You might wonder: do we need another type to store the same information that already exists in User and UserEntity? It would be possible to write CreateInfo to have a User within it. But then we'd have to manually write the FromForm instance. This isn't difficult, but it might be more tedious than using a new type.
Creating a User
So in the first place, we have to create our user so they're matched up with their password. This requires taking the CreateInfo in our post request. We'll first unwrap the user fields and insert our User object. This follows the patterns we've seen so far in this series with Diesel.
#[post("/users/create", format="json", data="<create_info>")]
fn create(db: State<String>, create_info: Json<CreateInfo>)
-> Json<i32> {
let user: User = User
{ name: create_info.name.clone(),
email: create_info.email.clone(),
age: create_info.age};
let connection = ...;
let user_entity: UserEntity = diesel::insert_into(users::table)...
...
}
Now we'll want a function for hashing our password. We'll use the SHA3 algorithm, courtesy of the rust-crypto library:
fn hash_password(password: &String) -> String {
let mut hasher = Sha3::sha3_256();
hasher.input_str(password);
hasher.result_str()
}
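This snippet assumes the matching imports from the rust-crypto crate (whose library name is crypto):

// Digest provides input_str/result_str; Sha3 provides the sha3_256 constructor.
use crypto::digest::Digest;
use crypto::sha3::Sha3;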
We'll apply this function to the input password and attach the resulting hash to the created user's ID. Then we can insert the new AuthInfo and return the created ID.
#[post("/users/create", format="json", data="<create_info>")]
fn create(db: State<String>, create_info: Json<CreateInfo>)
-> Json<i32> {
...
let user_entity: UserEntity = diesel::insert_into(users::table)...
let password_hash = hash_password(&create_info.password);
let auth_info: AuthInfo = AuthInfo
{user_id: user_entity.id, password_hash: password_hash};
let auth_info_entity: AuthInfoEntity =
diesel::insert_into(auth_infos::table)...
Json(user_entity.id)
}
Now whenever we create our user, they'll have their password attached!
Gating an Endpoint
Now that our user has a password, how do we gate endpoints on authentication? Well the first approach we can try is something like "Basic Authentication". This means that every authenticated request contains the username and the password. In our example we'll get these directly out of header elements. But in a real application you would want to double check that the request is encrypted before doing this.
But it would be tiresome to apply the logic of reading the headers in every handler. So Rocket has a powerful feature called "Request Guards". Rocket has a special trait called FromRequest. Whenever a particular type is an input to a handler function, Rocket runs its from_request function. This determines how to derive the value from the request. In our case, we'll make a wrapper type AuthenticatedUser. This represents a user that has included their auth info in the request.
struct AuthenticatedUser {
user_id: i32
}
Now we can include this type in a handler signature. For this endpoint, we only allow a user to retrieve their data if they've logged in:
#[get("/users/my_data")]
fn login(db: State<String>, user: AuthenticatedUser)
-> Json<Option<UserEntity>> {
Json(fetch_user_by_id(&db, user.user_id))
}
Implementing the Request Trait
The trick of course is that we need to implement the FromRequest trait! This is more complicated than it sounds! Our handler will have the ability to short-circuit the request and return an error. So let's start by specifying a couple potential login errors we can throw.
#[derive(Debug)]
enum LoginError {
InvalidData,
UsernameDoesNotExist,
WrongPassword
}
The from_request function will take in a request and return an Outcome. The outcome will either provide our authentication type or an error. The last bit of adornment we need on this is lifetime specifiers for the request itself and the reference to it.
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>)
-> Outcome<AuthenticatedUser, LoginError> {
...
}
}
Now the actual function definition involves several layers of case matching! It consists of a few different operations that have to query the request or query our database. For example, let's consider the first layer. We insist on having two headers in our request: one for the username, and one for the password. We'll use request.headers() to check for these values. If either doesn't exist, we'll send a Failure outcome with invalid data. Here's what that looks like:
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>)
-> Outcome<AuthenticatedUser, LoginError> {
let username = request.headers().get_one("username");
let password = request.headers().get_one("password");
match (username, password) {
(Some(u), Some(p)) => {
...
}
_ => Outcome::Failure(
(Status::BadRequest,
LoginError::InvalidData))
}
}
}
In the main branch of the function, we'll do 3 steps:
- Find the user in our database based on their email address/username.
- Find their authentication information based on the ID
- Hash the input password and compare it to the database hash
If we are successful, then we'll return a successful outcome:
Outcome::Success(AuthenticatedUser{user_id: user.id})
The number of match levels required makes the function definition very verbose. So we've included it at the bottom as an appendix. We know how to take such a function and write it more cleanly in Haskell using monads. In a couple weeks, we'll use this function as a case study to explore Rust's monadic abilities.
Logging In with Cookies
In most applications though, we won't want to include the password in the request each time. In HTTP, "Cookies" provide a way to store information about a particular user that we can track on our server.
Rocket makes this very easy with the Cookies type! We can always include this mutable type in our requests. It works like a key-value store, where we can access certain information with a key like "user_id". Since we're storing auth information, we'll also want to make sure it's encrypted, or "private". So we'll use these functions:
add_private(...)
get_private(...)
remove_private(...)
Let's start with a "login" endpoint. This will take our LoginInfo object as its post data, but we'll also have the Cookies input:
#[post("/users/login", format="json", data="<login_info>")]
fn login_post(db: State<String>, login_info: Json<LoginInfo>, mut cookies: Cookies) -> Json<Option<i32>> {
...
}
First we have to make sure a user of that name exists in the database:
#[post("/users/login", format="json", data="<login_info>")]
fn login_post(
db: State<String>,
login_info: Json<LoginInfo>,
mut cookies: Cookies)
-> Json<Option<i32>> {
let maybe_user = fetch_user_by_email(&db, &login_info.username);
match maybe_user {
Some(user) => {
...
},
None => Json(None)
}
}
Then we have to get their auth info again. We'll hash the password and compare it. If we're successful, then we'll add the user's ID as a cookie. If not, we'll return None.
#[post("/users/login", format="json", data="<login_info>")]
fn login_post(
db: State<String>,
login_info: Json<LoginInfo>,
mut cookies: Cookies)
-> Json<Option<i32>> {
let maybe_user = fetch_user_by_email(&db, &login_info.username);
match maybe_user {
Some(user) => {
let maybe_auth = fetch_auth_info_by_user_id(&db, user.id);
match maybe_auth {
Some(auth_info) => {
let hash = hash_password(&login_info.password);
if hash == auth_info.password_hash {
cookies.add_private(Cookie::new(
"user_id", user.id.to_string()));
Json(Some(user.id))
} else {
Json(None)
}
}
None => Json(None)
}
}
None => Json(None)
}
}
A more robust solution of course would loop in some error behavior instead of returning None.
Using Cookies
Using our cookie now is pretty easy. Let's make a separate "fetch user" endpoint using our cookies. It will take the Cookies object and the user ID as inputs. The first order of business is to retrieve the user_id cookie and verify it exists.
#[get("/users/cookies/<uid>")]
fn fetch_special(db: State<String>, uid: i32, mut cookies: Cookies)
-> Json<Option<UserEntity>> {
let logged_in_user = cookies.get_private("user_id");
match logged_in_user {
Some(c) => {
...
},
None => Json(None)
}
}
Now we need to parse the string value as a user ID and compare it to the value from the endpoint. If they're a match, we just fetch the user's information from our database!
#[get("/users/cookies/<uid>")]
fn fetch_special(db: State<String>, uid: i32, mut cookies: Cookies)
-> Json<Option<UserEntity>> {
let logged_in_user = cookies.get_private("user_id");
match logged_in_user {
Some(c) => {
let logged_in_uid = c.value().parse::<i32>().unwrap();
if logged_in_uid == uid {
Json(fetch_user_by_id(&db, uid))
} else {
Json(None)
}
},
None => Json(None)
}
}
And when we're done, we can also post a "logout" request that will remove the cookie!
#[post("/users/logout", format="json")]
fn logout(mut cookies: Cookies) -> () {
cookies.remove_private(Cookie::named("user_id"));
}
Conclusion
We've got one more article on Rocket before checking out some different Rust concepts. So far, we've only dealt with the backend part of our API. Next week, we'll investigate how we can use Rocket to send templated HTML files and other static web content!
Maybe you're more experienced with Haskell but still need a bit of an introduction to Rust. We've got some other materials for you! Watch our Rust Video Tutorial for an in-depth look at the basics of the language!
Appendix: From Request Function
impl<'a, 'r> FromRequest<'a, 'r> for AuthenticatedUser {
type Error = LoginError;
fn from_request(request: &'a Request<'r>) -> Outcome<AuthenticatedUser, LoginError> {
let username = request.headers().get_one("username");
let password = request.headers().get_one("password");
match (username, password) {
(Some(u), Some(p)) => {
let conn_str = local_conn_string();
let maybe_user = fetch_user_by_email(&conn_str, &String::from(u));
match maybe_user {
Some(user) => {
let maybe_auth_info = fetch_auth_info_by_user_id(&conn_str, user.id);
match maybe_auth_info {
Some(auth_info) => {
let hash = hash_password(&String::from(p));
if hash == auth_info.password_hash {
Outcome::Success(AuthenticatedUser{user_id: 1})
} else {
Outcome::Failure((Status::Forbidden, LoginError::WrongPassword))
}
}
None => {
Outcome::Failure((Status::MovedPermanently, LoginError::WrongPassword))
}
}
}
None => Outcome::Failure((Status::NotFound, LoginError::UsernameDoesNotExist))
}
},
_ => Outcome::Failure((Status::BadRequest, LoginError::InvalidData))
}
}
}
Joining Forces: An Integrated Rust Web Server
We've now explored a couple different libraries for some production tasks in Rust. A couple weeks ago, we used Diesel to create an ORM for some database types. And then last week, we used Rocket to make a basic web server that responds to simple requests. This week, we'll put these two ideas together! We'll use some more advanced functionality from Rocket to make some CRUD endpoints for our database type. Take a look at the code on Github here!
If you've never written any Rust, you should start with the basics though! Take a look at our Rust Beginners Series!
Database State and Instances
Our first order of business is connecting to the database from our handler functions. There are some direct integrations you can check out between Rocket, Diesel, and other libraries. These can provide clever ways to add a connection argument to any handler.
But for now we're going to keep things simple. We'll re-generate the PgConnection within each endpoint. We'll maintain a "stateful" connection string to ensure they all use the same database.
Our Rocket server can "manage" different state elements. Suppose we have a function that gives us our database string. We can pass that to our server at initialization time.
fn local_conn_string() -> String {...}
fn main() {
rocket::ignite()
.mount("/", routes![...])
.manage(local_conn_string())
.launch();
}
Now we can access this String from any of our endpoints by giving an input of the State<String> type. This allows us to create our connection:
#[get(...)]
fn fetch_all_users(database_url: State<String>) -> ... {
let connection = PgConnection::establish(&database_url)
.expect("Error connecting to database!");
...
}
Note: We can't use the PgConnection itself because stateful types need to be thread safe.
So any of our other endpoints can now access the same database. Before we start writing these, we need a couple things first though. Let's recall that for our Diesel ORM we made a User type and a UserEntity type. The first is for inserting/creating, and the second is for querying. We need to add some instances to those types so they are compatible with our endpoints. We want to have JSON instances (Serialize, Deserialize), as well as FromForm for our User type:
#[derive(Insertable, Deserialize, Serialize, FromForm)]
#[table_name="users"]
pub struct User {
...
}
#[derive(Queryable, Serialize)]
pub struct UserEntity {
...
}
Now let's see how we get these types from our endpoints!
Retrieving Users
We'll start with a simple endpoint to fetch all the different users in our database. This will take no inputs, except our stateful database URL. It will return a vector of UserEntity objects, wrapped in Json.
#[get("/users/all")]
fn fetch_all_users(database_url: State<String>)
-> Json<Vec<UserEntity>> {
...
}
Now all we need to do is connect to our database and run the query function. We can make our users vector into a Json object by wrapping with Json(). The Serialize instance lets us satisfy the Responder trait for the return value.
#[get("/users/all")]
fn fetch_all_users(database_url: State<String>)
-> Json<Vec<UserEntity>> {
// The `users` DSL comes from the generated schema, as in the later snippets.
use rust_web::schema::users::dsl::*;
let connection = PgConnection::establish(&database_url)
.expect("Error connecting to database!");
Json(users.load::<UserEntity>(&connection)
.expect("Error loading users"))
}
Now for getting individual users. Once again, we'll wrap a response in JSON. But this time we'll return a single, optional user. We'll use a dynamic capture parameter in the URL for the User ID.
#[get("/users/<uid>")]
fn fetch_user(database_url: State<String>, uid: i32)
-> Option<Json<UserEntity>> {
let connection = ...;
...
}
We'll want to filter on the users table by the ID. This will give us a list of different results. We want to specify this vector as mutable. Why? In the end, we want to return the first user. But Rust's memory rules mean we must either copy or move this item. And we don't want to move a single item from the vector without moving the whole vector. So we'll remove the head from the vector entirely, which requires mutability.
#[get("/users/<uid>")]
fn fetch_user(database_url: State<String>, uid: i32)
-> Option<Json<UserEntity>> {
let connection = ...;
use rust_web::schema::users::dsl::*;
let mut users_by_id: Vec<UserEntity> =
users.filter(id.eq(uid))
.load::<UserEntity>(&connection)
.expect("Error loading users");
...
}
Now we can do our case analysis. If the list is empty, we return None. Otherwise, we'll remove the user from the vector and wrap it.
#[get("/users/<uid>")]
fn fetch_user(database_url: State<String>, uid: i32) -> Option<Json<UserEntity>> {
let connection = ...;
use rust_web::schema::users::dsl::*;
let mut users_by_id: Vec<UserEntity> =
users.filter(id.eq(uid))
.load::<UserEntity>(&connection)
.expect("Error loading users");
if users_by_id.len() == 0 {
None
} else {
let first_user = users_by_id.remove(0);
Some(Json(first_user))
}
}
Create/Update/Delete
Hopefully you can see the pattern now! Our queries are all pretty simple. So our endpoints all follow a similar pattern. Connect to the database, run the query and wrap the result. We can follow this process for the remaining three endpoints in a basic CRUD setup. Let's start with "Create":
#[post("/users/create", format="application/json", data = "<user>")]
fn create_user(database_url: State<String>, user: Json<User>)
-> Json<i32> {
let connection = ...;
let user_entity: UserEntity = diesel::insert_into(users::table)
.values(&*user)
.get_result(&connection).expect("Error saving user");
Json(user_entity.id)
}
As we discussed last week, we can use data together with Json to specify the form data in our post request. We de-reference the user with * to get it out of the JSON wrapper. Then we insert the user and wrap its ID to send back.
Deleting a user is simple as well. It has the same dynamic path as fetching a user. We just make a delete call on our database instead.
#[delete("/users/<uid>")]
fn delete_user(database_url: State<String>, uid: i32) -> Json<i32> {
let connection = ...;
use rust_web::schema::users::dsl::*;
diesel::delete(users.filter(id.eq(uid)))
.execute(&connection)
.expect("Error deleting user");
Json(uid)
}
Updating is the last endpoint, which takes a put request. The endpoint mechanics are just like our other endpoints. We use a dynamic path component to get the user's ID, and then provide a User body with the updated field values. The only trick is that we need to expand our Diesel knowledge a bit. We'll use update and set to change individual fields on an item.
#[put("/users/<uid>/update", format="json", data="<user>")]
fn update_user(
database_url: State<String>, uid: i32, user: Json<User>)
-> Json<UserEntity> {
let connection = ...;
use rust_web::schema::users::dsl::*;
let updated_user: UserEntity =
diesel::update(users.filter(id.eq(uid)))
.set((name.eq(&user.name),
email.eq(&user.email),
age.eq(user.age)))
.get_result::<UserEntity>(&connection)
.expect("Error updating user");
Json(updated_user)
}
The other gotcha is that we need to use references (&) for the string fields in the input user. But now we can add these routes to our server, and it will manipulate our database as desired!
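As a quick sketch (handler names assumed from the endpoints above), wiring everything into the server from the earlier setup might look like this:

fn main() {
    rocket::ignite()
        // Route names assumed from the handlers defined in this article.
        .mount("/", routes![fetch_all_users, fetch_user, create_user,
                            update_user, delete_user])
        .manage(local_conn_string())
        .launch();
}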
Conclusion
There are still lots of things we could improve here. For example, we're still using .expect in many places. From the perspective of a web server, we should be catching these issues and wrapping them with "Err 500". Rocket also provides some good mechanics for fixing that. Next week though, we'll pivot to another server problem that Rocket solves adeptly: authentication. We should restrict certain endpoints to particular users. Rocket provides an authentication scheme that is neatly encoded in the type system!
For a more in-depth introduction to Rust, watch our Rust Video Tutorial. It will take you through a lot of key skills like understanding memory and using Cargo!
Rocket: Web Servers in Rust!
Welcome back to our series on building simple apps in Rust. Last week, we explored the Diesel library which gave us an ORM for database interaction. For the next few weeks, we'll be trying out the Rocket library, which makes it quick and easy to build a web server in Rust! This is comparable to the Servant library in Haskell, which we've explored before.
This week, we'll be working on the basic building blocks of using this library. The reference code for this article is available here on Github!
Rust combines some of the neat functional ideas of Haskell with some more recognizable syntax from C++. To learn more of the basics, take a look at our Rust Beginners Series!
Our First Route
To begin, let's make a simple "hello world" endpoint for our server. We don't specify a full API definition all at once like we do with Servant. But we do use a special macro before the endpoint function. This macro describes the route's method and its path.
#[get("/hello")]
fn index() -> String {
String::from("Hello, world!")
}
So our macro tells us this is a "GET" endpoint and that the path is /hello. Then our function specifies a String as the return value. We can, of course, have different types of return values, which we'll explore more as the series goes on.
Launching Our Server
Now this endpoint is useless until we can actually run and launch our server. To do this, we start by creating an object of type Rocket with the ignite() function.
fn main() {
let server: Rocket = rocket::ignite();
...
}
We can then modify our server by "mounting" the routes we want. The mount function takes a base URL path and a list of routes, as generated by the routes macro. This function returns us a modified server:
fn main() {
let server: Rocket = rocket::ignite();
let server2: Rocket = server.mount("/", routes![index]);
}
Rather than create multiple server objects, we'll just compose these different functions. Then to launch our server, we use launch on the final object!
fn main() {
rocket::ignite().mount("/", routes![index]).launch();
}
And now our server will respond when we ping it at localhost:8000/hello! We could, of course, use a different base path. We could even assign different routes to different bases!
fn main() {
rocket::ignite().mount("/api", routes![index]).launch();
}
Now it will respond at /api/hello.
Query Parameters
Naturally, most endpoints need inputs to be useful. There are a few different ways we can do this. The first is to use path components. In Servant, we call these capture parameters. With Rocket, we'll format our URL to have brackets around the variables we want to capture. Then we can assign them to a basic type in our endpoint function:
#[get("/math/<name>")]
fn hello(name: &RawStr) -> String {
format!("Hello, {}!", name.as_str())
}
We can use any type that satisfies the FromParam trait, including a RawStr. This is a Rocket-specific type wrapping string-like data in a raw format. With these strings, we might want to apply some sanitization processes on our data. We can also use basic numeric types, like i32.
#[get("/math/<first>/<second>")]
fn add(first: i32, second: i32) -> String {
String::from(format!("{}", first + second))
}
This endpoint will now return "11" when we ping /math/5/6.
We can also use "query parameters", which all go at the end of the URL. These need the FromFormValue trait, rather than FromParam. But once again, RawStr and basic numbers work fine.
#[get("/math?<first>&<second>")]
fn multiply(first: i32, second: i32) -> String {
String::from(format!("{}", first * second))
}
Now we'll get "30" when we ping /math?first=5&second=6.
Post Requests
The last major input type we'll deal with is post request data. Suppose we have a basic user type:
struct User {
name: String,
email: String,
age: i32
}
We'll want to derive various classes for it so we can use it within endpoints. From the Rust "Serde" library we'll want Deserialize and Serialize so we can make JSON elements out of it. Then we'll also want FromForm to use it as post request data.
#[derive(FromForm, Deserialize, Serialize)]
struct User {
...
}
Now we can make our endpoint, but we'll have to specify the "format" as JSON and the "data" as using our "user" type.
#[post("/users/create", format="json", data="<user>")]
fn create_user(user: Json<User>) -> String {
...
}
We need to provide the Json wrapper for our input type, but we can use it as though it's a normal User. For now, we'll just return a string echoing the user's information back to us. Don't forget to add each new endpoint to the routes macro in your server definition!
#[post("/users/create", format="json", data="<user>")]
fn create_user(user: Json<User>) -> String {
String::from(format!(
"Created user: {} {} {}", user.name, user.email, user.age))
}
Conclusion
Next time, we'll explore making a more systematic CRUD server. We'll add database integration and see some other tricks for serializing data and maintaining state. Then we'll explore more advanced topics like authentication, static files, and templating!
If you're going to be building a web application in Rust, you'd better have a solid foundation! Watch our Rust Video Tutorial to get an in-depth introduction!
Diesel: A Rust-y ORM
Last week on Monday Morning Haskell we took our first step into some real world tasks with Rust. We explored the simple Rust Postgres library to connect to a database and run some queries. This week we're going to use Diesel, a library with some cool ORM capabilities. It's a bit like the Haskell library Persistent, which you can explore more in our Real World Haskell Series.
For a more in-depth look at the code for this article, you should take a look at our Github Repository! You'll want to look at the files referenced below and also at the executable here.
Diesel CLI
Our first step is to add Diesel as a dependency in our program. We briefly discussed Cargo "features" in last week's article. Diesel has separate features for each backend you might use. So we'll specify "postgres". Once again, we'll also use a special feature for the chrono library so we can use timestamps in our database.
[dependencies]
diesel={version="1.4.4", features=["postgres", "chrono"]}
But there's more! Diesel comes with a CLI that helps us manage our database migrations. It also will generate some of our schema code. Just as we can install binaries with Stack using stack install, we can do the same with Cargo. We only want to specify the features we want. Otherwise it will crash if we don't have the other databases installed.
>> cargo install diesel_cli --no-default-features --features postgres
Now we can start using the program to set up our project and generate our migrations. We begin with this command.
>> diesel setup
This creates a couple different items in our project directory. First, we have a "migrations" folder, where we'll put some SQL code. Then we also get a schema.rs file in our src directory. Diesel will automatically generate code for us in this file. Let's see how!
Migrations and Schemas
When using Persistent in Haskell, we defined our basic types in a single Schema file using a special template language. We could run migrations on our whole schema programmatically, without our own SQL. But it is difficult to track more complex migrations as your schema evolves.
Diesel is a bit different. Unfortunately, we have to write our own SQL. But, we'll do so in a way that it's easy to take more granular actions on our table. Diesel will then generate a schema file for us. But we'll still need some extra work to get the Rust types we'll need. To start though, let's use Diesel to generate our first migration. This migration will create our "users" table.
>> diesel migration generate create_users
This creates a new folder within our "migrations" directory for this "create_users" migration. It has two files, up.sql and down.sql. We start by populating the up.sql file to specify the SQL we need to run the migration.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL,
age INTEGER NOT NULL
)
Then we also want the down.sql file to contain SQL that reverses the migration.
DROP TABLE users CASCADE;
Once we've written these, we can run our migration!
>> diesel migration run
We can then revert and re-run the latest migration (executing down.sql and then up.sql again) with this command:
>> diesel migration redo
The result of running our migration is that Diesel populates the schema.rs file. This file uses the table macro that generates helpful types and trait instances for us. We'll use this a bit when incorporating the table into our code.
table! {
users (id) {
id -> Int4,
name -> Text,
email -> Text,
age -> Int4,
}
}
While we're at it, let's make one more migration to add an articles table.
-- migrations/create_articles/up.sql
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL,
published_at TIMESTAMP WITH TIME ZONE NOT NULL,
author_id INTEGER REFERENCES users(id) NOT NULL
)
-- migrations/create_articles/down.sql
DROP TABLE articles;
Then we can once again use diesel migration run.
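The article doesn't show the result, but after this second migration the generated schema.rs would presumably also gain something along these lines (treat the details as an approximation, since the exact output depends on the Diesel version):

table! {
    articles (id) {
        id -> Int4,
        title -> Text,
        body -> Text,
        published_at -> Timestamptz,
        author_id -> Int4,
    }
}

// Diesel's CLI also records the foreign key so the tables can be joined:
joinable!(articles -> users (author_id));
allow_tables_to_appear_in_same_query!(articles, users);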
Model Types
Now, while Diesel will generate a lot of useful code for us, we still need to do some work on our own. We have to create our own structs for the data types to take advantage of the instances we get. With Persistent, we got these for free. Persistent also used a wrapper Entity type, which attached a Key to our actual data.
Diesel doesn't have the notion of an entity. We have to manually make two different types, one with the database key and one without. For the "Entity" type which has the key, we'll derive the "Queryable" trait. Then we can use Diesel's functions to select items from the table.
#[derive(Queryable)]
pub struct UserEntity {
pub id: i32,
pub name: String,
pub email: String,
pub age: i32
}
We then have to declare a separate type that implements "Insertable". This doesn't have the database key, since we don't know the key before inserting the item. This should be a copy of our entity type, but without the key field. We use a second macro to tie it to the users
table.
#[derive(Insertable)]
#[table_name="users"]
pub struct User {
pub name: String,
pub email: String,
pub age: i32
}
Note that in the case of our foreign key type, we'll use a normal integer for our column reference. In Persistent we would have a special Key
type. We lose some of the semantic meaning of this field by doing this. But it can help keep more of our code separate from this specific library.
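For completeness, here's a sketch of the corresponding article models (assuming the schema above; with the chrono feature, Diesel maps the Timestamptz column to chrono's DateTime<Utc>, and the generated articles table from schema.rs must be in scope for the table_name attribute):
use chrono::{DateTime, Utc};
// assumes something like `use crate::schema::articles;` is in scope

#[derive(Queryable)]
pub struct ArticleEntity {
    pub id: i32,
    pub title: String,
    pub body: String,
    pub published_at: DateTime<Utc>,
    pub author_id: i32,
}

#[derive(Insertable)]
#[table_name="articles"]
pub struct Article {
    pub title: String,
    pub body: String,
    pub published_at: DateTime<Utc>,
    pub author_id: i32,
}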
Making Queries
Now that we have our models in place, we can start using them to write queries. First, we need to make a database connection using the establish
function. Rather than using the ?
syntax, we'll use .expect
to unwrap our results in this article. This is less safe, but a little easier to work with.
fn create_connection() -> PgConnection {
let database_url = "postgres://postgres:postgres@localhost/rust_db";
PgConnection::establish(&database_url)
.expect("Error Connecting to database")
}
fn main() {
let connection: PgConnection = create_connection();
...
}
Let's start now with insertion. Of course, we begin by creating one of our "Insertable" User
items. We can then start writing an insertion query with the Diesel function insert_into
.
Diesel's query functions are composable. We add different elements to the query until it is complete. With an insertion, we use values
combined with the item we want to insert. Then, we call get_result
with our connection. The result of an insertion is our "Entity" type.
fn create_user(conn: &PgConnection) -> UserEntity {
let u = User
{ name: "James".to_string()
, email: "james@test.com".to_string()
, age: 26};
diesel::insert_into(users::table).values(&u)
.get_result(conn).expect("Error creating user!")
}
Selecting Items
Selecting items is a bit more complicated. Diesel generates a dsl
module for each of our types. This allows us to use each field name as a value within "filters" and orderings. Let's suppose we want to fetch all the articles written by a particular user. We'll start our query on the articles
table and call filter
to start building our query. We can then add a constraint on the author_id
field.
fn fetch_articles(conn: &PgConnection, uid: i32) -> Vec<ArticleEntity> {
use rust_web::schema::articles::dsl::*;
articles.filter(author_id.eq(uid))
...
We can also add an ordering to our query. Notice again how these functions compose. We also have to specify the return type we want when using the load
function to complete our select query. The main case is to return the full entity. This is like SELECT * FROM
in SQL lingo. Applying load
will give us a vector of these items.
fn fetch_articles(conn: &PgConnection, uid: i32) -> Vec<ArticleEntity> {
use rust_web::schema::articles::dsl::*;
articles.filter(author_id.eq(uid))
.order(title)
.load::<ArticleEntity>(conn)
.expect("Error loading articles!")
}
But we can also specify particular fields that we want to return. We'll see this in the final example, where our result type is a vector of tuples. This last query will be a join between our two tables. We start with users
and apply the inner_join
function.
fn fetch_all_names_and_titles(conn: &PgConnection) -> Vec<(String, String)> {
use rust_web::schema::users::dsl::*;
use rust_web::schema::articles::dsl::*;
users.inner_join(...
}
Then we join it to the articles table on the particular ID field. Because both of our tables have id
fields, we have to namespace it to specify the user's ID field.
fn fetch_all_names_and_titles(conn: &PgConnection) -> Vec<(String, String)> {
use rust_web::schema::users::dsl::*;
use rust_web::schema::articles::dsl::*;
users.inner_join(
articles.on(author_id.eq(rust_web::schema::users::dsl::id)))...
}
Finally, we load
our query to get the results. But notice, we use select
and only ask for the name
of the User and the title
of the article. This gives us our final values, so that each element is a tuple of two strings.
fn fetch_all_names_and_titles(conn: &PgConnection) -> Vec<(String, String)> {
use rust_web::schema::users::dsl::*;
use rust_web::schema::articles::dsl::*;
users.inner_join(
articles.on(author_id.eq(rust_web::schema::users::dsl::id)))
.select((name, title)).load(conn).expect("Error on join query!")
}
Conclusion
For my part, I prefer the functionality provided by Persistent in Haskell. But Diesel's method of providing a separate CLI to handle migrations is very cool as well. And it's good to see more sophisticated functionality in this relatively new language.
If you're still new to Rust, we have some more beginner-related material. Read our Rust Beginners Series or better yet, watch our Rust Video Tutorial!
Basic Postgres Data in Rust
For our next few articles, we're going to be exploring some more advanced concepts in Rust. Specifically, we'll be looking at parallel ideas from our Real World Haskell Series. In these first couple weeks, we'll be exploring how to connect Rust and a Postgres database. To start, we'll use the Rust Postgres library. This will help us create a basic database connection so we can make simple queries. You can see all the code for this article in action by looking at our RustWeb repository. Specifically, you'll want to check out the file pg_basic.rs
.
If you're new to Rust, we have a couple beginner resources for you to start out with. You can read our Rust Beginners Series to get a basic introduction to the language. Or for some more in-depth explanations, you can watch our Rust Video Tutorial!
Creating Tables
We'll start off by making a client object to connect to our database. This uses a query string like we would with any Postgres library.
let conn_string = "host=localhost port=5432 user=postgres";
let mut client : Client = Client::connect(conn_string, NoTls)?;
Note that the connect
function generally returns a Result<Client, Error>
. In Haskell, we would write this as Either Error Client
. By using ?
at the end of our call, we can immediately unwrap the Client
. The caveat on this is that it only compiles if the whole function returns some kind of Result<..., Error>
. This is an interesting monadic behavior Rust gives us. Pretty much all our functions in this article will use this ?
behavior.
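As a minimal sketch (the wrapper function name here is just illustrative), the overall pattern for this article looks like this:
use postgres::{Client, Error, NoTls};

// `?` compiles because this function itself returns a Result
// with a compatible Error type.
fn run_queries() -> Result<(), Error> {
    let conn_string = "host=localhost port=5432 user=postgres";
    let mut client: Client = Client::connect(conn_string, NoTls)?;
    // ... all the queries from this article go here ...
    Ok(())
}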
Now that we have a client, we can use it to run queries. The catch is that we have to write the raw SQL ourselves. For example, here's how we can create a table to store some users:
client.batch_execute("\
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL,
age INTEGER NOT NULL
)
")?;
Inserting with Interpolation
A raw query like that with no real result is the simplest operation we can perform. But, any non-trivial program will require us to customize the queries programmatically. To do this we'll need to interpolate values into the middle of our queries. We can do this with execute
(as opposed to batch_execute
).
Let's try creating a user. As with batch_execute
, we need a query string. This time, the query string will contain values like $1
, $2
that we'll fill in with variables. We'll provide these variables with a list of references. Here's what it looks like with a user:
let name = "James";
let email = "james@test.com";
let age = 26;
client.execute(
"INSERT INTO users (name, email, age) VALUES ($1, $2, $3)",
&[&name, &email, &age],
)?;
Again, we're using a raw query string. All the values we interpolate must implement the specific trait postgres_types::ToSql
. We'll see this a bit later.
Fetching Results
The last main type of query we can perform is to fetch our results. We can use our client to call the query
function, which returns a vector of Row
objects:
for row in client.query("SELECT * FROM users", &[])? {
...
}
For more complicated SELECT
statements we would interpolate parameters, as with insertion above. The Row
has different Columns
for accessing the data. But in our case it's a little easier to use get
and the index to access the different fields. Like our raw SQL calls, this is unsafe in a couple ways. If we use an out-of-bounds index, we'll get a crash. And if we try to cast to the wrong data type, we'll also run into problems.
for row in client.query("SELECT * FROM users", &[])? {
let id: i32 = row.get(0);
let name: &str = row.get(1);
let email: &str = row.get(2);
let age: i32 = row.get(3);
...
}
We could then use these individual values to populate whatever data types we wanted on our end.
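As a sketch, we could map each Row into our own plain struct (the DbUser type here is hypothetical, not part of the library):
struct DbUser {
    id: i32,
    name: String,
    email: String,
    age: i32,
}

let users: Vec<DbUser> = client.query("SELECT * FROM users", &[])?
    .iter()
    .map(|row| DbUser {
        id: row.get(0),
        name: row.get(1),
        email: row.get(2),
        age: row.get(3),
    })
    .collect();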
Joining Tables
If we want to link two tables together, of course we'll also have to know how to do this with raw SQL. For example, we can make our articles table:
client.batch_execute("\
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL,
published_at TIMESTAMP WITH TIME ZONE NOT NULL,
author_id INTEGER REFERENCES users(id)
)
")?;
Then, after retrieving a user's ID, we can insert an article written by that user.
for row in client.query("SELECT * FROM users", &[])? {
let id: i32 = row.get(0);
let title: &str = "A Great Article!";
let body: &str = "You should share this with friends.";
let cur_time: DateTime<Utc> = Utc::now();
client.execute(
"INSERT INTO articles (title, body, published_at, author_id) VALUES ($1, $2, $3, $4)",
&[&title, &body, &cur_time, &id]
)?;
}
One of the tricky parts is that this won't compile if you only use the basic postgres
dependency in Rust! There isn't a native ToSql
instance for the DateTime<Utc>
type. However, Rust dependencies can have specific "features". This concept doesn't really exist in Haskell, except through extra packages. You'll need to specify the with-chrono
feature for the version of the chrono
library you use. This feature, or sub-dependency contains the necessary ToSql
instance. Here's what the structure looks like in our Cargo.toml
:
[dependencies]
chrono="0.4"
postgres={version="0.17.3", features=["with-chrono-0_4"]}
After this, our code will compile!
Runtime Problems
Now there are lots of reasons we wouldn't want to use a library like this in a formal project. One of the big principles of Rust (and Haskell) is catching errors at compile time. And writing out functions with lots of raw SQL like this makes our program very prone to runtime errors. I encountered several of these as I was writing this small program! At one point, I started writing the SELECT
query and absentmindedly forgot to complete it until I ran my program!
At another point, I couldn't decide what timestamp format to use in Postgres. I went back and forth between using a TIMESTAMP
or just an INTEGER
for the published_at
field. I needed to coordinate the SQL for both the table creation query and the fetching query. I often managed to change one but not the other, resulting in annoying runtime errors. I finally discovered I needed TIMESTAMP WITH TIME ZONE
and not merely TIMESTAMP
. This was a rather painful process with this setup.
Conclusion
Next week, we'll explore Diesel, a library that lets us use schemas to catch more of these issues at compile time. The framework is more comparable to Persistent in Haskell. It gives us an ORM (Object Relational Mapping) so that we don't have to write raw SQL. This approach is much more suited to languages like Haskell and Rust!
To try out tasks like this in Haskell, take a look at our Production Checklist! It includes a couple different libraries for interacting with databases using ORMs.
Preparing for Rust!
Next week, we're going to change gears a bit and start some interesting projects with Rust! Towards the end of last year, we dabbled a bit with Rust and explored some of the basics of the language. In our next series of blog articles, we're going to take a deep dive into some more advanced concepts.
We'll explore several different Rust libraries in various topics. We'll consider data serialization, web servers, and databases, among others. We'll build a couple small apps, and compare the results to our earlier work with Haskell.
To get ready for this series, you should brush up on your Rust basics! To help, we've wrapped up our Rust content into a permanent series on the Beginners page! Here's an overview of that series:
Part 1: Basic Syntax
We start out by learning about Rust's syntax. We'll see quite a few differences to Haskell. But there are also some similarities in unexpected places.
Part 2: Memory Management
One of the major things that sets Rust apart from other languages is how it manages memory. In the second part, we'll learn a bit about how Rust's memory system works.
Part 3: Data Types
In the third part of the series, we'll explore how to make our own data types in Rust. We'll see that Rust borrows some of Haskell's neat ideas!
Part 4: Cargo Package Manager
Cargo is Rust's equivalent of Stack and Cabal. It will be our package and dependency manager. In part 4, we see how to make basic Rust projects using Cargo.
Part 5: Lifetimes and Collections
In the final part, we'll look at some more advanced collection types in Rust. Because of Rust's memory model, we'll need some special rules for handling items in collections. This will lead us to the idea of lifetimes.
If you prefer video content, our Rust Video Tutorial also provides a solid foundation. It goes through all the topics in this series, starting from installation. Either way, stay tuned for new blog content, starting next week!
Collections and Lifetimes in Rust!
Last week we discussed how to use Cargo to create and manage Rust projects. This week we're going to finish up our discussion of Rust. We'll look at a couple common container types: vectors and hash maps. Rust has some interesting ideas for how to handle common operations, as we'll see. We'll also touch on the topic of lifetimes. This is another concept related to memory management, for trickier scenarios.
For a good summary of the main principles of Rust, take a look at our Rust Video Tutorial. It will give you a concise overview of the topics we've covered in our series and more!
Vectors
Vectors, like lists or arrays in Haskell, contain many elements of the same type. It's important to note that Rust vectors store items in memory like arrays. That is, all their elements are in a contiguous space in memory. We refer to them with the parameterized type Vec
in Rust. There are a couple ways to initialize them:
let v1: Vec<i32> = vec![1, 2, 3, 4];
let v2: Vec<u32> = Vec::new();
The first vector uses the vec!
macro to initialize. It will have four elements. The second will have no elements. As written, that second vector isn't much use to us: since it's immutable, it will never contain any elements! We can change this by making it mutable. Then we can use simple operations like push
and pop
to manipulate it.
let mut v2: Vec<u32> = Vec::new();
v2.push(5);
v2.push(6);
v2.push(7);
v2.push(8);
let x = v2.pop();
println!("{:?}", v2);
println!("{:?} was popped!", x);
Note that pop
will remove from the back of the vector and return an Option. So the printed vector will have 5, 6, and 7, and the second line will print Some(8).
There are a couple different ways of accessing vectors by index. The first way is to use traditional bracket syntax, like we would find in C++. This will throw an error and crash if you are out of bounds!
let v1: Vec<i32> = vec![1, 2, 3, 4];
let first: i32 = v1[0];
let second: i32 = v1[1];
You can also use the get
function. This returns the Option
type we discussed a couple weeks back. This allows us to handle the error gracefully instead of crashing. In the example below, we print a failure message, rather than crashing as we would with v1[5]
.
let v1: Vec<i32> = vec![1, 2, 3, 4];
match v1.get(5) {
Some(elem) => println!("Found an element: {}!", elem),
None => println!("No element!"),
}
Another neat trick we can do with vectors is loop through them. This loop will add 2 to each of the integers in our list. It does this by using the *
operator to de-reference the element, like in C++. For this to work, the vector itself must be declared mutable, and we loop over a mutable reference to it.
let mut v1: Vec<i32> = vec![1, 2, 3, 4];
for i in &mut v1 {
    *i += 2;
}
Ownership with Vectors
Everything works fine with the examples above because we're only using primitive numbers. But if we use non-primitive types, we need to remember to apply the rules of ownership. In general, a vector should own its contents. So when we push an element into a vector, we give up ownership. The following code will not compile because s1
is invalid after pushing!
let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);
println!("{}", s1); // << Doesn't work!
Likewise, we can't get a normal String
out of the vector. We can only get a reference to it. This will also cause problems:
let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);
// Bad!
let s2: String = v1[0];
We can fix these easily though, by adding ampersands to make it clear we want a reference:
let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);
let s2: &String = &v1[0];
But if we get an item out of the list, that reference gets invalidated if we then update the list again.
let mut v1: Vec<String> = Vec::new();
v1.push(String::from("Hello"));
let s2: &String = &v1[0];
v1.push(String::from("World"));
// Bad! s2 is invalidated by pushing to v1!
println!("{}", s2);
This can be very confusing. But again, the reason lies with the way memory works. The extra push
might re-allocate the entire vector. This would invalidate the memory s2
points to.
Hash Maps
Now let's discuss hash maps. At a basic level, these work much like their Haskell counterparts. They have two type parameters, a key and a value. We can initialize them and insert elements:
let mut phone_numbers: HashMap<String, String> = HashMap::new();
phone_numbers.insert(
String::from("Johnny"),
String::from("555-123-4567"));
phone_numbers.insert(
String::from("Katie"),
String::from("555-987-6543"));
As with vectors, hash maps take ownership of their elements, both keys and values. We access elements in hash maps with the get
function. As with vectors, this returns an Option
:
match phone_numbers.get("Johnny") {
Some(number) => println!("{}", number),
None => println!("No number"),
}
We can also loop through the elements of a hash map. We get a tuple, rather than individual elements:
for (name, number) in &phone_numbers {
println!("{}: {}", name, number);
}
Updating hash maps is interesting because of the entry
function. This allows us to insert a new value for a key, but only if that key doesn't already exist. We apply or_insert
on the result of entry
. In this example, we'll maintain the previous phone number for "Johnny" but add a new one for "Nicholas".
phone_numbers.entry(String::from("Johnny")).
or_insert(String::from("555-111-1111"));
phone_numbers.entry(String::from("Nicholas")).
or_insert(String::from("555-111-1111"));
If we want to overwrite the key though, we can use insert
. After this example, both keys will use the new phone number.
phone_numbers.insert(
String::from("Johnny"),
String::from("555-111-1111"));
phone_numbers.insert(
String::from("Nicholas"),
String::from("555-111-1111"));
Lifetimes
There's one more concept we should cover before putting Rust away. This is the idea of lifetimes. Ownership rules get even trickier as your programs get more complicated. Consider this simple function, returning the longer of two strings:
fn longest_string(s1: &String, s2: &String) -> &String {
if s1.len() > s2.len() {
s1
} else {
s2
}
}
This seems innocuous enough, but it won't compile! The reason is that Rust doesn't know at compile time which string will get returned. This prevents it from analyzing the ownership of the variables it gets. Consider this invocation of the function:
fn main() {
let s1 = String::from("A long Hello");
let result;
{
let s2 = String::from("Hello");
result = longest_string(&s1, &s2);
}
println!("{}", result);
}
With this particular set of parameters, things would work out. Since s1
is longer, result
would get that reference. And when we print it at the end, s1
is still in scope. But if we flip the strings, then result
would refer to s2
, which is no longer in scope!
But the longest_string
function doesn't know about the scopes of its inputs. And it doesn't know which value gets returned. So it complains at compile time. We can fix this by specifying the lifetimes of the inputs. Here's how we do that:
fn longest_string<'a>(s1: &'a String, s2: &'a String) -> &'a String {
if s1.len() > s2.len() {
s1
} else {
s2
}
}
The lifetime annotation 'a
is now a generic parameter of the function. Each of the types in that line should read "a reference with lifetime 'a to a String". Both inputs share the same lifetime parameter, which Rust takes to be the smaller of the two input lifetimes. It states that the return value must live no longer than that same (shorter) lifetime.
When we add this specification, our longest_string
function compiles. But the main
function we have above will not, since it violates the lifetime rules we gave it! By moving the print statement into the block, we can fix it:
fn main() {
let s1 = String::from("A long Hello");
let result;
{
let s2 = String::from("Hello");
result = longest_string(&s1, &s2);
println!("{}", result);
}
}
The shortest common lifetime is the time inside the block. And we don't use the result of the function outside the block. So everything works now!
It's a little difficult to keep all these rules straight. Luckily, Rust finds all these problems at compile time! So we won't shoot ourselves in the foot and have difficult problems to debug!
Conclusion
That's all for our Rust series! Rust is an interesting language. We'll definitely come back to it at some point on this blog. For more detailed coverage, watch our Rust Video Tutorial. You can also read the Rust Book, which has lots of great examples and covers all the important concepts!
Next week is New Year's, so we'll be doing a recap of all the exciting topics we've covered in 2019!
Making Rust Projects with Cargo!
In the last couple weeks, we've gotten a good starter look at Rust. We've considered different elements of basic syntax, including making data types. We've also looked at the important concepts of ownership and memory management.
This week, we're going to look at the more practical side of things. We'll explore how to actually build out a project using Cargo. Cargo is the Rust build system and package manager. This is Rust's counterpart to Stack and Cabal. We'll look at creating, building and testing projects. We'll also consider how to add dependencies and link our code together.
If you want to see Cargo in action, take a look at our Rust Video Tutorial. It demonstrates almost all of the material in this article and then some!
Cargo
As we mentioned above, Cargo is Rust's version of Stack. It exposes a small set of commands that allow us to build and test our code with ease. We can start out by creating a project with:
cargo new my_first_rust_project
This creates a bare-bones application with only a few files. The first of these is Cargo.toml
. This is our project description file, combining the roles of a .cabal
file and stack.yaml
. Its initial layout is actually quite simple! We have four lines describing our package, and then an empty dependencies section:
[package]
name = "my_first_rust_project"
version = "0.1.0"
authors = ["Your Name <you@email.com>"]
edition = "2018"
[dependencies]
Cargo's initialization assumes you use Git. It will pull your name and email from the global Git config. It also creates a .git
directory and .gitignore
file for you.
The only other file it creates is a src/main.rs
file, with a simple Hello World application:
fn main() {
println!("Hello World!");
}
Building and Running
Cargo can, of course, also build our code. We can run cargo build
, and this will compile all the code it needs to produce an executable for main.rs
. With Haskell, our build artifacts go into the .stack-work
directory. Cargo puts them in the target
directory. Our executable ends up in target/debug
, but we can run it with cargo run
.
There's also a simple command we can run if we only want to check that our code compiles. Using cargo check
will verify everything without creating any executables. This runs much faster than doing a normal build. You can do this with Stack by using GHCI and reloading your code with :r
.
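For reference, the typical loop of commands looks like this:
cargo build   # compile; build artifacts go into target/debug
cargo run     # build if needed, then run the executable
cargo check   # verify the code compiles without producing an executable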
Like most good build systems, Cargo can detect if any important files have changed. If we run cargo build
and no files have changed, then it won't re-compile our code.
Adding Dependencies
Now let's see an example of using an external dependency. We'll use the rand
crate to generate some random values. We can add it to our Cargo.toml
file by specifying a particular version:
[dependencies]
rand = "0.7"
Rust uses semantic versioning to ensure you get dependencies that do not conflict. It also uses a Cargo.lock
file to ensure that your builds are reproducible. But (to my knowledge at least), Rust does not yet have anything like Stackage. This means you have to specify the actual versions for all your dependencies. This seems to be one area where Stack has a distinct advantage.
Now, "rand" in this case is the name of the "crate". A crate is either an executable or a library. In this case, we'll use it as a library. A "package" is a collection of crates. This is somewhat like a Haskell package. We can specify different components in our .cabal
file. We can only have one library, but many executables.
We can now include the random functionality in our Rust executable with the use
keyword:
use rand::prelude::Rng;
fn main() {
let mut rng = rand::thread_rng();
let random_num: i32 = rng.gen_range(-100, 101);
println!("Here's a random number: {}", random_num);
}
When we specify the import, rand
is the name of the crate. Then prelude
is the name of the module, and Rng
is the name of the trait we'll be using.
Making a Library
Now let's enhance our project by adding a small library. We'll write this file in src/lib.rs
. By Cargo's convention, this file will get compiled into our project's library. We can delineate different "modules" within this file by using the mod
keyword and naming a block. We can expose the function within this block by declaring it with the pub
keyword. Here's a module with a simple doubling function:
pub mod doubling {
pub fn double_number(x: i32) -> i32 {
x * 2
}
}
We also have to make the module itself pub
to export and use it! To use this function in our main binary, we need to import our library. We refer to the library crate with the name of our project. Then we namespace the import by module, and pick out the specific function (or *
if we like).
use my_first_rust_project::doubling::double_number;
use rand::prelude::Rng;
fn main() {
let mut rng = rand::thread_rng();
let random_num: i32 = rng.gen_range(-100, 101);
println!("Here's a random number: {}", random_num);
println!("Here's twice the number: {}", double_number(random_num));
}
Adding Tests
Rust also allows testing, of course. Unlike most languages, Rust has the convention of putting unit tests in the same file as the source code. They go in a different module within that file. To make a test module, we put the cfg(test)
annotation before it. Then we mark any test function with a test
annotation.
// Still in lib.rs!
#[cfg(test)]
mod tests {
use crate::doubling::double_number;
#[test]
fn test_double() {
assert_eq!(double_number(4), 8);
assert_eq!(double_number(5), 10);
}
}
Notice that it must still import our other module, even though it's in the same file! Of course, integration tests would need to be in a separate file. Cargo still recognizes that if we create a tests
directory it should look for test code there.
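For example, a minimal integration test might look like this (a sketch; the file name is just a convention):
// tests/integration_test.rs
use my_first_rust_project::doubling::double_number;

#[test]
fn test_double_from_outside() {
    assert_eq!(double_number(21), 42);
}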
Now we can run our tests with cargo test
. Because of the annotations, Cargo won't waste time compiling our test code when we run cargo build
. This helps save time.
What's Next
We've done a very simple example here. We can see that a lot of Cargo's functionality relies on certain conventions. We may need to move beyond those conventions if our project demands it. You can see more details by watching our Rust Video Tutorial! Next time, we'll wrap up our study of Rust by looking at different container types!
Data Types in Rust: Borrowing from Both Worlds
Last time, we looked at the concept of ownership in Rust. This idea underpins how we manage memory in our Rust programs. It explains why we don't need garbage collection, and it helps a lot with ensuring our program runs efficiently.
This week, we'll study the basics of defining data types. As we've seen so far, Rust combines ideas from both object oriented languages and functional languages. We'll continue to see this trend with how we define data. There will be some ideas we know and love from Haskell. But we'll also see some ideas that come from C++.
For the quickest way to get up to speed with Rust, check out our Rust Video Tutorial! It will walk you through all the basics of Rust, including installation and making a project.
Defining Structs
Haskell has one primary way to declare a new data type: the data
keyword. We can also rename types in certain ways with type
and newtype
, but data
is the core of it all. Rust is a little different in that it uses a few different terms to refer to new data types. These all correspond to particular Haskell structures. The first of these terms is struct
.
The name struct
is a throwback to C and C++. But to start out we can actually think of it as a distinguished product type in Haskell. That is, a type with one constructor and many named fields. Suppose we have a User
type with name, email, and age. Here's how we could make this type a struct in Rust:
struct User {
name: String,
email: String,
age: u32,
}
This is very much like the following Haskell definition:
data User = User
{ name :: String
, email :: String
, age :: Int
}
When we initialize a user, we should use braces and name the fields. We access individual fields using the .
operator.
let user1 = User {
name: String::from("James"),
email: String::from("james@test.com"),
age: 25,
};
println!("{}", user1.name);
If we declare a struct instance to be mutable, we can also change the value of its fields if we want!
let mut user1 = User {
name: String::from("James"),
email: String::from("james@test.com"),
age: 25,
};
user1.age = 26;
When you're starting out, you shouldn't use references in your structs. Make them own all their data. It's possible to put references in a struct, but it makes things more complicated.
Tuple Structs
Rust also has the notion of a "tuple struct". These are like structs except they do not name their fields. The Haskell version would be an "undistinguished product type". This is a type with a single constructor, many fields, but no names. Consider these:
// Rust
struct User(String, String, u32);
-- Haskell
data User = User String String Int
We can destructure and pattern match on tuple structs. We can also use numbers as indices with the .
operator, in place of user field names.
struct User(String, String, u32);
let user1 = User("james", "james@test.com", 25);
// Prints "james@test.com"
println!("{}", user1.1);
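Here's a quick sketch of destructuring the same tuple struct with a let pattern:
// Pulls the fields out by position (the unused email gets an underscore)
let User(name, _email, age) = user1;
println!("{} is {} years old", name, age);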
Rust also has the idea of a "unit struct". This is a type that has no data attached to it. These seem a little weird, but they can be useful, as in Haskell:
// Rust
struct MyUnitType;
-- Haskell
data MyUnitType = MyUnitType
Enums
The last main way we can create a data type is with an "enum". In Haskell, we typically use this term to refer to a type that has many constructors with no arguments. But in Rust, an enum is the general term for a type with many constructors, no matter how much data each has. Thus it captures the full range of what we can do with data
in Haskell. Consider this example:
// Rust
struct Point(f32, f32);
enum Shape {
Triangle(Point, Point, Point),
Rectangle(Point, Point, Point, Point),
Circle(Point, f32),
}
-- Haskell
data Point = Point Float Float
data Shape =
Triangle Point Point Point |
Rectangle Point Point Point Point |
Circle Point Float
Pattern matching isn't quite as easy as in Haskell. We don't make multiple function definitions with different patterns. Instead, Rust uses the match
operator to allow us to sort through these. Each match must be exhaustive, though you can use _
as a wildcard, as in Haskell. Expressions in a match can use braces, or not.
fn get_area(shape: Shape) -> f32 {
match shape {
Shape::Triangle(pt1, pt2, pt3) => {
// Calculate 1/2 base * height
},
Shape::Rectangle(pt1, pt2, pt3, pt4) => {
// Calculate base * height
},
Shape::Circle(center, radius) => radius * radius * PI
}
}
Notice we have to namespace the names of the constructors! Namespacing is one element that feels more familiar from C++. Let's look at another.
Implementation Blocks
So far we've only looked at our new types as dumb data, like in Haskell. But unlike Haskell, Rust allows us to attach implementations to structs and enums. These definitions can contain instance methods and other functions. They act like class definitions from C++ or Python. We start off an implementation section with the impl
keyword.
As in Python, any "instance" method has a parameter self
. In Rust, this reference can be mutable or immutable. (In C++ it's called this
, but it's an implicit parameter of instance methods). We call these methods using the same syntax as C++, with the .
operator.
impl Shape {
fn area(&self) -> f32 {
match self {
// Implement areas
}
}
}
fn main() {
let shape1 = Shape::Circle(Point(0.0, 0.0), 5.0);
println!("{}", shape1.area());
}
We can also create "associated functions" for our structs and enums. These are functions that don't take self
as a parameter. They are like static functions in C++, or any function we would write for a type in Haskell.
impl Shape {
fn shapes_intersect(s1: &Shape, s2: &Shape) -> bool {
    // The intersection check would go here
    unimplemented!()
  }
}
fn main() {
let shape1 = Shape::Circle(Point(0.0, 0.0), 5.0);
let shape2 = Shape::Circle(Point(10.0, 0.0), 6.0);
if Shape::shapes_intersect(&shape1, &shape2) {
println!("They intersect!");
} else {
println!("No intersection!");
};
}
Notice we still need to namespace the function name when we use it!
Generic Types
As in Haskell, we can also use generic parameters for our types. Let's compare the Haskell definition of Maybe
with the Rust type Option
, which does the same thing.
// Rust
enum Option<T> {
Some(T),
None,
}
-- Haskell
data Maybe a =
Just a |
Nothing
Not too much is different here, except for the syntax.
We can also use generic types for functions:
fn compare<T>(t1: &T, t2: &T) -> bool
But, you won't be able to do much with generics unless you know some information about what the type does. This is where traits come in.
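For instance, here's a minimal sketch: once we constrain T with the standard PartialEq trait (more on traits below), the comparison compiles:
fn compare<T: PartialEq>(t1: &T, t2: &T) -> bool {
    // Compares the referenced values, since T supports equality
    t1 == t2
}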
Traits
For the final topic of this article, we'll discuss traits. These are like typeclasses in Haskell, or interfaces in other languages. They allow us to define a set of functions. Types can provide an implementation for those functions. Then we can use those types anywhere we need a generic type with that trait.
Let's reconsider our shape example and suppose we have a different type for each of our shapes.
struct Point(f32, f32);
struct Rectangle {
top_left: Point,
top_right: Point,
bottom_right: Point,
bottom_left: Point,
}
struct Triangle {
pt1: Point,
pt2: Point,
pt3: Point,
}
struct Circle {
center: Point,
radius: f32,
}
Now we can make a trait for calculating the area, and let each shape implement that trait! Here's how the syntax looks for defining it and then using it in a generic function. We can constrain what generics a function can use, as in Haskell:
pub trait HasArea {
fn calculate_area(&self) -> f32;
}
impl HasArea for Circle {
fn calculate_area(&self) -> f32 {
self.radius * self.radius * PI
}
}
fn double_area<T: HasArea>(element: &T) -> f32 {
2.0 * element.calculate_area()
}
Also as in Haskell, we can derive certain traits with one line! The Debug
trait works like Show
:
#[derive(Debug)]
struct Circle {
    center: Point,
    radius: f32,
}
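As a quick sketch of what this gives us (Point must also derive Debug here, since Circle contains one):
#[derive(Debug)]
struct Point(f32, f32);

fn main() {
    let c = Circle { center: Point(0.0, 0.0), radius: 5.0 };
    // Prints: Circle { center: Point(0.0, 0.0), radius: 5.0 }
    println!("{:?}", c);
}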
What's Next
This should give us a more complete understanding of how we can define data types in Rust. We see an interesting mix of concepts. Some ideas, like instance methods, come from the object oriented world of C++ or Python. Other ideas, like matchable enumerations, come from more functional languages like Haskell.
Next time, we'll start looking at making a project with Rust. We'll consider how we create a project, how to manage its dependencies, how to run it, and how to test it.
Ownership: Managing Memory in Rust
When we first discussed Rust we mentioned how it has a different memory model than Haskell. The suggestion was that Rust allows more control over memory usage, like C++. In C++, we explicitly allocate memory on the heap with new
and de-allocate it with delete
. In Rust, we also allocate and de-allocate memory at specific points in our program. Thus, unlike Haskell, it doesn't use garbage collection. But it doesn't work quite the same way as C++ either.
In this article, we'll discuss the notion of ownership. This is the main concept governing Rust's memory model. Heap memory always has one owner, and once that owner goes out of scope, the memory gets de-allocated. We'll see how this works; if anything, it's a bit easier than C++!
For a more detailed look at getting started with Rust, take a look at our Rust video tutorial!
Scope (with Primitives)
Before we get into ownership, there are a couple ideas we want to understand. First, let's go over the idea of scope. If you code in Java, C, or C++, this should be familiar. We declare variables within a certain scope, like a for-loop or a function definition. When that block of code ends, the variable is out of scope. We can no longer access it.
int main() {
for (int i = 0; i < 10; ++i) {
int j = 0;
// Do something with j...
}
// This doesn't work! j is out of scope!
std::cout << j << std::endl;
}
Rust works the same way. When we declare a variable within a block, we cannot access it after the block ends. (In a language like Python, this is actually not the case!)
fn main() {
let j: i32 = {
let i = 14;
i + 5
};
// i is out of scope. We can't use it anymore.
println!("{}", j);
}
Another important thing to understand about primitive types is that we can copy them. Since they have a fixed size, and live on the stack, copying should be inexpensive. Consider:
fn main() {
let mut i: i32 = 10;
let j = i;
i = 15;
// Prints 15, 10
println!("{}, {}", i, j);
}
The j
variable is a full copy. Changing the value of i
doesn't change the value of j
. Now for the first time, let's talk about a non-primitive type, String
.
The String Type
We've dealt with strings a little by using string literals. But string literals don't give us a complete string type. They have a fixed size. So even if we declare them as mutable, we can't do certain operations like append another string. This would change how much memory they use!
let mut my_string = "Hello";
my_string.append(" World!"); // << This doesn't exist for literals!
Instead, we can use the String
type. This is a non-primitive object type that will allocate memory on the heap. Here's how we can use it and append to one:
let mut my_string = String::from("Hello");
my_string.push_str(" World!");
Now let's consider some of the implications of scope with object types.
Scope with Strings
At a basic level, some of the same rules apply. If we declare a string within a block, we cannot access it after that block ends.
fn main() {
let str_length = {
let s = String::from("Hello");
s.len()
}; // s goes out of scope here
// Fails!
println!("{}", s);
}
What's cool is that once our string does go out of scope, Rust handles cleaning up the heap memory for it! We don't need to call delete
as we would in C++. We can define custom cleanup for a type by implementing the Drop trait's drop
function. We'll get into more details with this in a later article.
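Just to sketch the idea (we'll cover the details later), implementing the Drop trait lets us run custom code when a value goes out of scope:
struct Noisy {
    name: String,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("Cleaning up {}!", self.name);
    }
}

fn main() {
    let _n = Noisy { name: String::from("my value") };
    println!("End of main");
    // "Cleaning up my value!" prints here, as _n goes out of scope
}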
C++ doesn't automatically de-allocate for us! In this example, we must delete myObject
at the end of the for
loop block. We can't de-allocate it after the loop ends, so if we forget this delete, the memory will leak!
int main() {
for (int i = 0; i < 10; ++i) {
// Allocate myObject
MyType* myObject = new MyType(i);
// Do something with myObject …
// We MUST delete myObject here or it will leak memory!
delete myObject;
}
// Can't delete myObject here!
}
So it's neat that Rust handles deletion for us. But there are some interesting implications of this.
Copying Strings
What happens when we try to copy a string?
let len = {
let s1 = String::from("Hello");
let s2 = s1;
s2.len()
};
This first version works fine. But we have to think about what will happen in this case:
let len = {
let mut s1 = String::from("123");
let mut s2 = s1;
s1.push_str("456");
s1.len() + s2.len()
};
For people coming from C++ or Java, there seem to be two possibilities. If copying into s2
is a shallow copy, we would expect the sum length to be 12. If it's a deep copy, the sum should be 9.
But this code won't compile at all in Rust! The reason is ownership.
Ownership
Deep copies are often much more expensive than the programmer intends. So a performance-oriented language like Rust avoids using deep copying by default. But let's think about what will happen if the example above is a simple shallow copy. When s1
and s2
go out of scope, Rust will call drop
on both of them. And they will free the same memory! This kind of "double delete" is a big problem that can crash your program and cause security problems.
In Rust, here's what would happen with the above code. Using let s2 = s1
will do a shallow copy. So s2
will point to the same heap memory. But at the same time, it will invalidate the s1
variable. Thus when we try to push values to s1
, we'll be using an invalid reference. This causes the compiler error.
At first, s1
"owns" the heap memory. So when s1
goes out of scope, it will free the memory. But declaring s2
gives over ownership of that memory to the s2
reference. So s1
is now invalid. Memory can only have one owner. This is the main idea to get familiar with.
Here's an important implication of this. In general, passing variables to a function gives up ownership. In this example, after we pass s1
over to add_to_len
, we can no longer use it.
fn main() {
let s1 = String::from("Hello");
let length = add_to_length(s1);
// This is invalid! s1 is now out of scope!
println!("{}", s1);
}
// After this function, drop is called on s
// This deallocates the memory!
fn add_to_length(s: String) -> i32 {
5 + s.len()
}
This seems like it would be problematic. Won't we want to call different functions with the same variable as an argument? We could work around this by giving back the reference through the return value. This requires the function to return a tuple.
fn main() {
let s1 = String::from("Hello");
let (length, s2) = add_to_length(s1);
// Works
println!("{}", s2);
}
fn add_to_length(s: String) -> (i32, String) {
(5 + s.len(), s)
}
But this is cumbersome. There's a better way.
Borrowing References
Like in C++, we can pass a variable by reference. We use the ampersand operator (&
) for this. It allows another function to "borrow" ownership, rather than "taking" ownership. When it's done, the original reference will still be valid. In this example, the s1
variable re-takes ownership of the memory after the function call ends.
fn main() {
let s1 = String::from("Hello");
let length = add_to_length(&s1);
// Works
println!("{}", s1);
}
fn add_to_length(s: &String) -> i32 {
5 + s.len()
}
This works like a const
reference in C++. If you want a mutable reference, you can do this as well. The original variable must be mutable, and then you specify mut
in the type signature.
fn main() {
let mut s1 = String::from("Hello");
let length = add_to_length(&mut s1);
// Prints "Hello World!"
println!("{}", s1);
}
fn add_to_length(s: &mut String) -> i32 {
s.push_str(", World!");
5 + s.len()
}
There's one big catch though! You can only have a single mutable reference to a variable at a time! Otherwise your code won't compile! This helps prevent a large category of bugs!
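Here's a quick sketch of the kind of code the borrow checker rejects for this reason:
fn main() {
    let mut s = String::from("Hello");
    let r1 = &mut s;
    let r2 = &mut s; // << Second mutable borrow: this won't compile!
    println!("{}, {}", r1, r2);
}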
As a final note, if you want to do a true deep copy of an object, you should use the clone
function.
fn main() {
let s1 = String::from("Hello");
let s2 = s1.clone();
// Works!
println!("{}", s1);
println!("{}", s2);
}
Notes On Slices
We can wrap up with a couple thoughts on slices. Slices give us an immutable, fixed-size reference to a contiguous part of an array. Often, we use the string slice type str
to refer to part of a String
. A slice itself is just a small, fixed-size value on the stack that points to data owned by something else. This means slices do not have ownership, and thus they do not de-allocate any memory when they go out of scope.
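Here's a small sketch of borrowing string slices from a String:
let s = String::from("Hello World");
let hello: &str = &s[0..5];  // borrows "Hello" without taking ownership
let world: &str = &s[6..11]; // borrows "World"
println!("{} {}", hello, world);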
What's Next?
Hopefully this gives you a better understanding of how memory works in Rust! Next time, we'll start digging into how we can define our own types. We'll start seeing some more ways that Rust acts like Haskell!
Digging Into Rust's Syntax
Last time we kicked off our study of Rust with a quick overview comparing it with Haskell. In this article, we'll start getting familiar with some of the basic syntax of Rust. The initial code looks a bit more C-like. But we'll also see how functional principles like those in Haskell are influential!
For a more comprehensive guide to starting out with Rust, take a look at our Rust video tutorial!
Hello World
As we should with any programming language, let's start with a quick "Hello World" program.
fn main() {
println!("Hello World!");
}
Immediately, we can see that this looks more like a C++ program than a Haskell program. We can call a print statement without any mention of the IO
monad. We see braces used to delimit the function body, and a semicolon at the end of the statement. If we wanted, we could add more print statements.
fn main() {
println!("Hello World!");
println!("Goodbye!");
}
There's nothing in the type signature of this main
function. But we'll explore more further down.
Primitive Types and Variables
Before we can start getting into type signatures though, we need to understand types more! In another nod to C++ (or Java), Rust distinguishes between primitive types and other more complicated types. We'll see that type names are a bit more abbreviated than in other languages. The basic primitives include:
- Various sizes of integers, signed and unsigned (
i32
,u8
, etc.) - Floating point types
f32
andf64
. - Booleans (
bool
) - Characters (
char
). Note these can represent unicode scalar values (i.e. beyond ASCII)
We mentioned last time how memory matters more in Rust. The main distinction between primitives and other types is that primitives have a fixed size. This means they are always stored on the stack. Other types with variable size must go into heap memory. We'll see next time what some of the implications of this are.
Like "do-syntax" in Haskell, we can declare variables using the let
keyword. We can specify the type of a variable after the name. Note also that we can use string interpolation with println
.
fn main() {
let x: i32 = 5;
let y: f64 = 5.5;
println!("X is {}, Y is {}", x, y);
}
So far, very much like C++. But now let's consider a couple Haskell-like properties. While variables are statically typed, it is typically unnecessary to state the type of the variable. This is because Rust has type inference, like Haskell! This will become more clear as we start writing type signatures in the next section. Another big similarity is that variables are immutable by default. Consider this:
fn main() {
let x: i32 = 5;
x = 6;
}
This will throw an error! Once the x
value gets assigned its value, we can't assign another! We can change this behavior though by specifying the mut
(mutable) keyword. This works in a simple way with primitive types. But as we'll see next time, it's not so simple with others! The following code compiles fine!
fn main() {
let mut x: i32 = 5;
x = 6;
}
Functions and Type Signatures
When writing a function, we specify parameters much like we would in C++. We have type signatures and variable names within the parentheses. Specifying the types on your signatures is required. This allows type inference to do its magic on almost everything else. In this example, we no longer need any type signatures in main
. It's clear from calling print_numbers
what x
and y
are.
fn main() {
let x = 5;
let y = 7;
print_numbers(x, y);
}
fn print_numbers(x: i32, y: i32) {
println!("X is {}, Y is {}", x, y);
}
We can also specify a return type using the arrow operator ->
. Our functions so far have no return value. This means the actual return type is ()
, like the unit in Haskell. We can include it if we want, but it's optional:
fn print_numbers(x: i32, y: i32) -> () {
println!("X is {}, Y is {}", x, y);
}
We can also specify a real return type though. Note that there's no semicolon here! This is important!
fn add(x: i32, y: i32) -> i32 {
x + y
}
This is because a value should get returned through an expression, not a statement. Let's understand this distinction.
Statements vs. Expressions
In Haskell most of our code is expressions. They inform our program what a function "is", rather than giving a set of steps to follow. But when we use monads, we often use something like statements in do
syntax.
addExpression :: Int -> Int -> Int
addExpression x y = x + y
addWithStatements :: Int -> Int -> IO Int
addWithStatements x y = do
putStrLn "Adding: "
print x
print y
return $ x + y
Rust has both these concepts. But it's a little more common to mix in statements with your expressions in Rust. Statements do not return values. They end in semicolons. Assigning variables with let
and printing are statements.
Expressions return values. Function calls are expressions. Block statements enclosed in braces are expressions. Here's our first example of an if
expression. Notice how we can still use statements within the blocks, and how we can assign the result of the function call:
fn main() {
let x = 45;
let y = branch(x);
}
fn branch(x: i32) -> i32 {
if x > 40 {
println!("Greater");
x * 2
} else {
x * 3
}
}
Unlike Haskell, it is possible to have an if
expression without an else
branch. But this wouldn't work in the above example, since we need a return value! As in Haskell, all branches need to have the same type. If the branches only have statements, that type can be ()
.
Note that an expression can become a statement by adding a semicolon! The following no longer compiles! Rust thinks the block has no return value, because it only has a statement! By removing the semicolon, the code will compile!
fn add(x: i32, y: i32) -> i32 {
x + y; // << Need to remove the semicolon!
}
This behavior is very different from both C++ and Haskell, so it takes a little bit to get used to it!
Tuples, Arrays, and Slices
Like Haskell, Rust has simple compound types like tuples and arrays (vs. lists for Haskell). These arrays are more like static arrays in C++ though. This means they have a fixed size. One interesting effect of this is that arrays include their size in their type. Tuples meanwhile have similar type signatures to Haskell:
fn main() {
let my_tuple: (u32, f64, bool) = (4, 3.14, true);
let my_array: [i8; 3] = [1, 2, 3];
}
Arrays and tuples composed of primitive types are themselves primitive! This makes sense, because they have a fixed size.
Another concept relating to collections is the idea of a slice. This allows us to look at a contiguous portion of an array. Slices use the &
operator though. We'll understand why more after the next article!
fn main() {
let an_array = [1, 2, 3, 4, 5];
let a_slice = &an_array[1..4]; // Gives [2, 3, 4]
}
What's Next
We've now got a foothold with the basics of Rust syntax. Next time, we'll start digging deeper into more complicated types. We'll discuss types that get allocated on the heap. We'll also learn the important concept of ownership that goes along with that.
Get Ready for Rust!
I'm excited to announce that for the next few weeks, we'll be exploring the Rust language! Rust is a very interesting language to compare to Haskell. It has some similar syntax. But it is not as similar as, say, Elm or Purescript. Rust can also look a great deal like C++. And its similarities with C++ are where a lot of its strong points lie.
In these next few weeks we'll go through some of the basics of Rust. We'll look at things like syntax and building small projects. In this article, we'll do a brief high level comparison between Haskell and Rust. Next time, we'll start digging deeper into some actual code.
To get a jump start on your Rust development, take a look at our Starting out with Rust video tutorial!
Why Rust?
Rust has a few key differences that make it better than Haskell for certain tasks and criteria. One of the big changes is that Rust gives more control over the allocation of memory in one's program.
Haskell is a garbage collected language. The programmer does not control when items get allocated or deallocated. Every so often, your Haskell program will stop completely. It will go through all the allocated objects, and deallocate ones which are no longer needed. This simplifies our task of programming, since we don't have to worry about memory. It helps enable language features like laziness. But it makes the performance of your program a lot less predictable.
I once proposed that Haskell's type safety makes it good for safety critical programs. There's still some substance to this idea. But the specific example I suggested was a self-driving car, a complex real-time system, and the performance unknowns of Haskell make it a poor choice for such real-time systems.
With more control over memory, a programmer can make stronger guarantees about performance. One could assert that a program never uses too much memory, and have confidence that it won't pause mid-calculation. Beyond this, Rust is also designed to be more performant in general. It strives to be like C/C++, perhaps the most performant of all mainstream languages.
Rust is also currently more popular with programmers. A larger community correlates to certain advantages, like a broader ecosystem of packages. Companies are more likely to use Rust than Haskell since it will be easier to recruit engineers. It's also a bit easier to bring engineers from non-functional backgrounds into Rust.
Similarities
That said, Rust still has a lot in common with Haskell! Both languages embrace strong type systems. They view the compiler as a key element in testing the correctness of our program. Both embrace useful syntactic features like sum types, typeclasses, polymorphism, and type inference. Both languages also use immutability to make it easier to write correct programs.
What's Next?
Next time, we'll start digging into the language itself. We'll go over some basic examples that show some of the important syntactic points about Rust. We'll explore some of the cool ways in which Rust is like Haskell, but also some of the big differences.