Practical differences between Rust closures and functions
Elizabeth asks an interesting question: “What’s the difference between a function and a closure that doesn’t enclose any variable?”
When I read that question, I was intrigued. Using closures instead of functions tends to be more costly because closures capture some of the environment, and that has some overhead. In practice, we usually don’t think about the difference when programming, and the choice of one over the other comes down to personal preference and what feels right.
However, Rust has a reasonably smart compiler that can take advantage of a wide array of optimisations, and zero-cost abstractions are one of Rust’s strengths, so it’s also reasonable to expect little difference between a function and a closure with no free variables (i.e., no enclosed variables) after optimisation.
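To make the terminology concrete, here is a small example of my own (not part of Elizabeth’s question): the first closure has no free variables, while the second captures factor from the enclosing scope and therefore has to carry that value around.
fn main() {
    // No free variables: nothing is captured from the environment.
    let double = |x: i32| 2 * x;

    // One free variable: factor is captured from the enclosing scope,
    // so the closure has to store it.
    let factor = 3;
    let triple = |x: i32| factor * x;

    println!("{} {}", double(2), triple(2));
}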
This matter can be unfolded into several smaller questions, but for the sake of pragmatism I will focus on the differences in the code generated by the current nightly release of the compiler, rustc 1.5.0-nightly (7bf4c885f 2015-09-26). The easiest way to compare the generated assembly code is to compile the two versions with rustc -C opt-level=2 --emit asm and check the differences with diff. Comparing only the optimised code will help us avoid drowning in inconsequential differences, since debug builds are only useful for development and aren’t supposed to be used in production. 😉
Alright, let’s go ahead and compare the following two examples (adapted from Elizabeth’s question):
Function
fn double(x: i32) -> i32 {
    2 * x
}

fn main() {
    let v = vec!(1, 2, 3);
    let w = v.into_iter().map(double);
    // Make sure the result isn't optimised away
    println!("{:?}", w.collect::<Vec<i32>>());
}
Closure
fn main() {
    let v = vec!(1, 2, 3);
    let w = v.into_iter().map(|x| 2 * x);
    // Make sure the result isn't optimised away
    println!("{:?}", w.collect::<Vec<i32>>());
}
I made both versions available on the Rust playground: function and closure. If you want to see for yourself the output of running diff on the resulting files, here you go. Be warned: the diff is “noisy” because some CPU registers are used in one version but not in the other, and there is also name mangling.
Looking at the generated assembly code, the only significant difference is that the code corresponding to the body of the function/closure, 2 * x, is preceded by a call to __rust_allocate (std::rt::heap::allocate) and followed by a call to __rust_deallocate (std::rt::heap::deallocate), but only in the version where a closure is used.
EDIT: After I published this post, Huon Wilson and Björn Steinbrink took a look at the generated assembly code and concluded that the extra allocation is actually an instance of the vector v that isn’t being optimised away in the closure version, which is a little surprising. Björn suspects [1, 2] this is due to a different set of optimisations being applied in each version. Huon also pointed out that closures are never implicitly allocated on the heap, as he explains in an excellent blog post about closures.
Thanks for the valuable feedback, Huon and Björn!
For more details about how closures are implemented in Rust, check the Closures chapter in the official book, as well as Huon Wilson’s blog post.
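As a rough illustration of what those references describe (my own sketch, not code from the post or from those articles): map, like the apply function below, is generic over anything callable, so it is monomorphised separately for a plain function and for a non-capturing closure, and the optimiser can inline either one.
fn double(x: i32) -> i32 {
    2 * x
}

// apply is generic over any Fn(i32) -> i32, so a separate copy of it is
// compiled for each concrete callable it is used with.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn main() {
    println!("{}", apply(double, 21));    // plain function
    println!("{}", apply(|x| 2 * x, 21)); // non-capturing closure
}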
I also ran a small benchmark pitting both versions against each other, but found no measurable difference between them.
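The benchmark code isn’t included in this post; a minimal sketch of how such a comparison could be written with the unstable test crate (nightly only, and the names here are my own) looks roughly like this:
// Hypothetical benchmark sketch, not the benchmark actually used for the post.
#![feature(test)]
extern crate test;

use test::{black_box, Bencher};

fn double(x: i32) -> i32 {
    2 * x
}

#[bench]
fn bench_function(b: &mut Bencher) {
    b.iter(|| black_box(vec!(1, 2, 3).into_iter().map(double).collect::<Vec<i32>>()));
}

#[bench]
fn bench_closure(b: &mut Bencher) {
    b.iter(|| black_box(vec!(1, 2, 3).into_iter().map(|x| 2 * x).collect::<Vec<i32>>()));
}
Compiling this with a nightly rustc --test -O (or running cargo bench) reports the time per iteration for each case.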
The bottom line is, for most users there is no practical difference between functions and closures without captured variables. Use whatever seems right. Personally, I prefer using closures when they’re small and used in one place (like in the previous examples), and functions otherwise. Remember Harold Abelson’s advice:
“Programs must be written for people to read, and only incidentally for machines to execute.”
Bonus round: named closure
Before hitting “Publish”, I discussed that observation with a friend unfamiliar with Rust, and he thought it might be interesting to check if there was any difference between using a closure directly in the argument, as previously, and using a named closure, i.e., assigning the closure to a variable and passing it to map, like this:
fn main() {
    let double = |x| 2 * x;
    let v = vec!(1, 2, 3);
    let w = v.into_iter().map(double);
    // Make sure the result isn't optimised away
    println!("{:?}", w.collect::<Vec<i32>>());
}
Since both the Rust compiler and LLVM (the compiler infrastructure rustc is built on) are smart, the intermediate variable is optimised away, and this version generates exactly the same code as the one with an anonymous closure.
Update
Jake Goulding noticed I goofed up and used the nightly release of the compiler instead of the stable release I meant to use. I recompiled the examples with the current stable rustc 1.3.0 (9a92aaf19 2015-09-15) and generated a new diff. As you can see in this diff, the extra allocation is optimised away, just as it should be. I apologize for the mistake. Thank you, Jake!