Rust: The `?` operator
For people who are not familiar with Haskell or Scala, Rust’s `Option` and `Result` types might feel a bit cumbersome and verbose to work with. To make them easier and less verbose to use, RFC PR #243: Trait-based exception handling has been proposed.
In this blog post I will go through some basics of the RFC and then compare it with a hypothetical `do`-notation.
The RFC proposes a `?` operator, which is a compiler-assisted rewrite of expressions around `?` characters. It is a unary suffix operator which can be placed on an expression to unwrap the value on the left-hand side of `?` while propagating any error through an early return:
File::create("foo.txt")?.write_all(b"Hello world!")
Would be transformed to:
match File::create("foo.txt") {
    // `mut` is needed because `write_all` takes `&mut self`
    Ok(mut t) => t.write_all(b"Hello world!"),
    Err(e) => return Err(e.into()),
}
On its own, `?` is just syntactic sugar for the `try!` macro, making it easier to write code that chains expressions which can fail:
try!(File::create("foo.txt")).write_all(b"Hello world!")
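To make the rewrite concrete, here is a minimal sketch that compiles on a toolchain where the `?` operator is available. It avoids the file system so it can be run anywhere; the `ConfigError` type and both function names are made up for this illustration and are not part of the RFC.

```rust
use std::num::ParseIntError;

// Hypothetical error type, used only for this illustration.
#[derive(Debug, PartialEq)]
enum ConfigError {
    BadNumber(String),
}

// This conversion is what the `e.into()` in the rewrite relies on.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e.to_string())
    }
}

// Using `?`: unwraps on success, returns the converted error on failure.
fn double_with_operator(input: &str) -> Result<i32, ConfigError> {
    let n = input.trim().parse::<i32>()?;
    Ok(n * 2)
}

// The same function with the rewrite spelled out by hand.
fn double_with_match(input: &str) -> Result<i32, ConfigError> {
    let n = match input.trim().parse::<i32>() {
        Ok(t) => t,
        Err(e) => return Err(e.into()),
    };
    Ok(n * 2)
}
```

Both functions should behave identically for every input. The only difference is how much of the error plumbing is visible.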
`try` and `catch`
The RFC also details a `try`-`catch` expression which would “catch” any early returns performed by the `?` operator. Essentially the early returns would jump to the `catch` block and the whole `try`-`catch` expression would assume that value. If no `catch` block is provided, the `try` block will return a wrapped result:
try {
    let mut f = File::create("foo.txt")?;
    f.write_all(b"Hello world!")?
}
// can also be written as
try { File::create("foo.txt")?.write_all(b"Hello world!")? }
Note that the `?` is required on the last line since we want a `Result<(), io::Error>`, not a `Result<Result<(), io::Error>, io::Error>`. If there is no `catch` block, the return value of the block is automatically re-wrapped in the `Result` type, so that the whole expression assumes a `Result<T, E>` without the need to wrap the return value yourself.
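Since `try` blocks are only a proposal here, one way to experiment with the same scoping is an immediately invoked closure: every `?` inside it returns from the closure rather than from the surrounding function. This is a sketch under that assumption, not the RFC's mechanism. The `Broken` writer and the `greet` function are hypothetical names introduced to exercise both paths.

```rust
use std::io::{self, Write};

// Hypothetical writer that always fails, used to exercise the error path.
struct Broken;

impl Write for Broken {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "broken writer"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

// Returns a description instead of propagating the error, which shows that
// `?` inside the closure does not leave `greet` early.
fn greet<W: Write>(out: &mut W) -> String {
    let outcome = (|| -> Result<(), io::Error> {
        out.write_all(b"Hello")?;
        out.write_all(b" world!")?;
        // Unlike the proposed `try` block, the final wrapping is manual here.
        Ok(())
    })();

    match outcome {
        Ok(()) => "written".to_owned(),
        Err(e) => format!("failed: {}", e),
    }
}
```

The closure has to end in an explicit `Ok(())`, which is exactly the wrapping step the proposed `try` block would perform automatically.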
Adding the `catch` would be equivalent to using `Result::unwrap_or_else` on the `try` block, since the handler has to produce a value of the same type as the body of the `try`:
try {
    let mut f = File::create("foo.txt")?;
    f.write_all(b"Hello world!")?
}
catch {
    // we only have one type to match on
    e => {
        println!("{}", e)
    }
}
Is equivalent to:
try {
    let mut f = File::create("foo.txt")?;
    f.write_all(b"Hello world!")?
}.unwrap_or_else(|e|
    println!("{}", e)
)
The difference here is that any `return` inside of the closure passed to `Result::unwrap_or_else` only returns from that closure; it cannot immediately result in an early return from the enclosing function.
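The point about `return` can be checked with a small sketch. It uses a closure passed to a combinator such as `Result::unwrap_or_else` on one side and a plain `match` on the other; the function names are hypothetical and chosen only for this comparison.

```rust
// `return` inside the closure only leaves the closure, so the function
// carries on and adds 1 to the fallback value.
fn with_closure(input: Result<i32, String>) -> i32 {
    let value = input.unwrap_or_else(|_e| return -1);
    value + 1
}

// `return` inside a `match` arm leaves the whole function immediately.
fn with_match(input: Result<i32, String>) -> i32 {
    let value = match input {
        Ok(v) => v,
        Err(_e) => return -1,
    };
    value + 1
}
```

For an `Err` input the two functions give different answers, which is the behavioural gap a `catch` block with real early returns would close.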
The `?` can also be used at an arbitrary nesting depth within the `try` block (and in code in general):
fn logging_on() -> Result<bool, io::Error> { ... }
fn read_values() -> Result<SomeData, io::Error> { ... }
fn log_values(values: &SomeData) -> Result<(), io::Error> { ... }
try {
    let data = read_values()?;
    if logging_on()? {
        // log_values takes a reference, so borrow instead of moving
        log_values(&data)?;
    }
    data
}
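A runnable variant of this nesting example can be written as an ordinary function, assuming a toolchain where `?` is available. The function bodies, the extra parameters and the `load` wrapper are stand-ins invented for this sketch, since the originals are elided.

```rust
use std::io;

// Hypothetical stand-in for the `SomeData` type from the example.
#[derive(Debug, PartialEq)]
struct SomeData(Vec<u32>);

fn logging_on(enabled: bool) -> Result<bool, io::Error> {
    Ok(enabled)
}

// Parses a comma-separated list, failing on the first bad entry.
fn read_values(raw: &str) -> Result<SomeData, io::Error> {
    let mut values = Vec::new();
    for part in raw.split(',') {
        let n = part
            .trim()
            .parse::<u32>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        values.push(n);
    }
    Ok(SomeData(values))
}

// Records the values in an in-memory sink instead of a real log.
fn log_values(values: &SomeData, sink: &mut Vec<String>) -> Result<(), io::Error> {
    sink.push(format!("{:?}", values.0));
    Ok(())
}

fn load(raw: &str, enabled: bool, sink: &mut Vec<String>) -> Result<SomeData, io::Error> {
    let data = read_values(raw)?;
    // `?` is used directly inside the `if` condition.
    if logging_on(enabled)? {
        log_values(&data, sink)?;
    }
    Ok(data)
}
```

Because this is a plain function rather than a `try` block, the final value has to be wrapped in `Ok` by hand.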
Do-notation
So-called `do`-notation is syntactic sugar which allows us to write statements and expressions dealing with a computation within a context. For example, values of types like `Option` and `Result` enable us to perform operations without having to worry about the failure state of the same inside of the `do`-expression:
do {
    mut f <- File::create("foo.txt");
    f.write_all(b"Hello world!")
}
The first line inside of the `do`-block is a so-called monadic bind: it will bind the value contained inside of the type on the right of `<-` to the identifier on the left for the rest of the block. The result of the rest of the block will be merged with the context from the value on the right of `<-`. In the case of `Result` and `Option` this is very simple and would just not evaluate the rest of the block if the right-hand side is an error variant.
The second line is an expression which is evaluated within the context of the first line: `f` is available and can be mutated, and the result is another `Result` which will be returned to the monadic-bind method of the first `Result` value for merging (in `Result` this would be a no-op since there is no state to merge).
The result of the `do`-block expression is `Result<(), io::Error>`, since that is the return value of `write_all`. The two expressions are compatible since they both return a value of type `Result<T, io::Error>`.
The above code desugars to:
File::create("foo.txt").and_then(|mut f|
    f.write_all(b"Hello world!"))
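The desugared form can be tried without touching the file system. In the sketch below, `create` is a hypothetical stand-in for `File::create` that hands out an in-memory buffer, and `write_hello` returns the buffer length only so that the result can be observed.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for `File::create`: yields an in-memory buffer
// and fails on an empty name.
fn create(name: &str) -> Result<Vec<u8>, io::Error> {
    if name.is_empty() {
        Err(io::Error::new(io::ErrorKind::InvalidInput, "empty file name"))
    } else {
        Ok(Vec::new())
    }
}

// Each `and_then` plays the role of one monadic bind: the closure only
// runs if the value to its left is `Ok`.
fn write_hello(name: &str) -> Result<usize, io::Error> {
    create(name).and_then(|mut f| {
        f.write_all(b"Hello world!")
            .and_then(|()| Ok(f.len()))
    })
}
```

Every further bind adds another level of closure nesting, which is the noise `do`-notation is meant to remove.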
Each expression in the `do`-notation above evaluates to some `Result<T, io::Error>` (for any `T`), which means that when adding expressions not resulting in the `Result<T, io::Error>` type they need to be wrapped (this is called “lifting” in Haskell terminology) to produce a `Result` which then fits into the `do`-block:
let h = "Hello".to_owned();
do {
    mut f <- File::create("foo.txt");
    s <- Ok(h + " world!");
    // write_all expects a byte slice, not a String
    f.write_all(s.as_bytes())
}
In the `try`-block we could just add it as usual:
let h = "Hello".to_owned();
try {
    let mut f = File::create("foo.txt")?;
    let s = h + " world!";
    // write_all expects a byte slice, not a String
    f.write_all(s.as_bytes())?
}
Though in the code examples above it would be more suitable to just move the expression assigned to `s` into the call to `write_all`. It would also be desirable to allow the use of normal `let`-binds inside of `do`-blocks to allow the declaration of variables without having to use monadic bind.
Another difference is that `do`-notation only works on the statement level whereas `?` works at any nesting inside of the `try`-block. A direct translation of the nesting example of the `try`-block would look like this:
fn logging_on() -> Result<bool, io::Error> { ... }
fn read_values() -> Result<SomeData, io::Error> { ... }
fn log_values(values: &SomeData) -> Result<(), io::Error> { ... }
do {
    data <- read_values();
    log <- logging_on();
    if log {
        // log_values takes a reference, so borrow instead of moving
        log_values(&data)
    } else {
        Ok(())
    };
    Ok(data)
}
Note that we cannot use `logging_on` directly as the condition of the `if`-expression and that the `if`-expression needs to return a `Result` from both branches.
Utility functions can easily alleviate some of this, but Higher Kinded Types are required to make many of them generic enough.
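As a rough illustration of such a utility function, the sketch below defines a helper modeled on Haskell's `when` and uses it in a hand-desugared version of the `do`-block. Everything here is hypothetical: `when` is specialised to `Result` precisely because the wrapper type cannot be abstracted over, and the stub functions use `String` errors and extra parameters to keep the example small.

```rust
// Hypothetical helper: run `action` only if `cond` holds, otherwise
// succeed with `()`. Specialised to `Result` rather than generic over
// the wrapper type.
fn when<E, F>(cond: bool, action: F) -> Result<(), E>
where
    F: FnOnce() -> Result<(), E>,
{
    if cond {
        action()
    } else {
        Ok(())
    }
}

// Stub versions of the functions from the example.
fn read_values() -> Result<Vec<u32>, String> {
    Ok(vec![1, 2, 3])
}

fn logging_on(enabled: bool) -> Result<bool, String> {
    Ok(enabled)
}

fn log_values(values: &[u32], sink: &mut Vec<String>) -> Result<(), String> {
    sink.push(format!("{:?}", values));
    Ok(())
}

// The `do`-block desugared by hand into nested `and_then` calls, with
// `when` replacing the `if`/`else` that had to return `Ok(())`.
fn load(enabled: bool, sink: &mut Vec<String>) -> Result<Vec<u32>, String> {
    read_values().and_then(|data| {
        logging_on(enabled).and_then(|log| {
            when(log, || log_values(&data, sink)).map(|()| data)
        })
    })
}
```

A version of `when` that also worked for `Option` or a parser type would need a separate copy per type, which is where Higher Kinded Types would come in.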
Monad with state
The `Result` and `Option` monads only carry state in the type, e.g. `Option` has `Some(T)` and `None`, but there is no extra value describing any state. But there are monads which carry state as another data item, like the State, Iterator and Parser monads (the State monad would probably not be very interesting for Rust, but list comprehensions and parser combinators certainly are).
The signature of the proposed `?` is “`M<T, E> -> T`”, essentially the same as `unwrap` but with the invisible addition that an early return or jump will be performed if the type decides that it is in an “error” state.
Monadic bind, on the other hand, has the signature `M<T> -> (Fn*(T) -> M<U>) -> M<U>`; we have an initial context `M<T>` which is then unwrapped to let the closure `Fn*(T) -> M<U>` act on it to produce another wrapped value, and then the returned `M<U>` is merged with the remaining state of the original `M<T>` (this is a simplification). Both the unwrapping and the merging of the value are under the control of the type which implements the bind-operator, and the original state is still available, which means that context can be carried through in an appropriate way.
The closure denoted `Fn*` is one of the three closure types `FnOnce`, `FnMut` and `Fn`, since different types of bind-implementations have different requirements (e.g. `Result` would use `FnOnce` since it is just run once, whereas an iterator would need `FnMut` since the closure would be executed once for each item). Letting a trait-signature be generic over closures in this way is not yet possible in Rust, but once it is, `do`-notation should not be far off.
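The different closure requirements are easy to see when bind is written as two free functions, one per type. This is only a sketch: `bind_result` and `bind_vec` are hypothetical names, and a `Vec` stands in for the iterator case.

```rust
// Bind for `Result`: the closure runs at most once, so `FnOnce` suffices.
fn bind_result<T, U, E, F>(m: Result<T, E>, f: F) -> Result<U, E>
where
    F: FnOnce(T) -> Result<U, E>,
{
    match m {
        Ok(t) => f(t),
        Err(e) => Err(e),
    }
}

// Bind for a list-like context: the closure runs once per item, so it
// has to be callable repeatedly and is therefore `FnMut`.
fn bind_vec<T, U, F>(m: Vec<T>, mut f: F) -> Vec<U>
where
    F: FnMut(T) -> Vec<U>,
{
    let mut out = Vec::new();
    for t in m {
        out.extend(f(t));
    }
    out
}
```

The two functions have the same shape but cannot share one trait method signature, because the closure bound differs between them.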
The proposal also mentions that the signature of `?` inside of `try` could be written as `M<T, E> -> (FnOnce(T) -> R, FnOnce(E) -> R) -> R` if the compiler rewrites it to the trait-part of the proposal. This `R` would then be wrapped using the static method `M<R, E>::normal`, which has the signature `R -> M<R, E>`, once it is returned from the `try`-block.
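The two signatures above can be sketched as a trait to see how they fit together. The trait below is purely illustrative: `Carrier`, `dispatch` and `describe` are names made up for this sketch and are not the RFC's actual trait, and only `normal` is taken from the text.

```rust
// Hypothetical trait mirroring the signatures described above.
trait Carrier<T, E>: Sized {
    // M<T, E> -> (FnOnce(T) -> R, FnOnce(E) -> R) -> R
    fn dispatch<R, N, X>(self, on_normal: N, on_error: X) -> R
    where
        N: FnOnce(T) -> R,
        X: FnOnce(E) -> R;

    // The wrapping step applied to the value leaving the block.
    fn normal(value: T) -> Self;
}

impl<T, E> Carrier<T, E> for Result<T, E> {
    fn dispatch<R, N, X>(self, on_normal: N, on_error: X) -> R
    where
        N: FnOnce(T) -> R,
        X: FnOnce(E) -> R,
    {
        match self {
            Ok(t) => on_normal(t),
            Err(e) => on_error(e),
        }
    }

    fn normal(value: T) -> Self {
        Ok(value)
    }
}

// Exactly one of the two closures runs, and both must agree on `R`.
fn describe(input: Result<i32, String>) -> String {
    input.dispatch(|n| format!("value {}", n), |e| format!("error {}", e))
}
```

Nothing in `dispatch` or `normal` receives the original carrier value once it has been taken apart, so there is no place to thread extra state through.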
This is very similar to monadic bind, but with some key differences in its use, and the return value `R` is not actually merged into the monadic context `M`. By adding an `= M<U>` constraint to `R` we can allow the wrapping method to investigate and update the state of any returned `M<U>` and actually provide a proper monadic bind for the type. However, the proposal is lacking one very important piece which is needed to make this possible at all, and that is how the state should be carried over from the original `M<T, E>` to the new `M<R, E>`.
The trait-version of the proposal for `?` and `try`-blocks is essentially a do-notation for a bi-monad (i.e. a monad carrying two values) but without the needed restrictions on the types or any way of carrying state from the left-hand side to the return value. This makes it impossible to actually use for anything but the simplest monad types.
`try` vs `do`
Differences:
- `do` requires users to lift expressions into the used type whereas `try` requires users to unwrap values out of the type.
- `do` only works on one level, requiring nested expressions to use their own `do`-blocks when necessary. `try` allows the `?` to short-circuit the whole thing whenever needed.
- `try` automatically wraps the resulting value from the block whereas `do` requires the block to return a wrapped value.
- `do` allows the type to control the state-management between statements, whereas `try` explicitly disallows the carrying of state between the original type and the resulting type.
Similarities:
- Both result in a wrapped value
Alternatives
The `try`-blocks and `?`-operator are intrinsically tied to the execution around a “failure-state” and do not consider any other type of state. There are alternatives which are more general and would open up for different types of contextual execution.
Personally I do not see any gain from the `catch` itself, since it can easily be constructed using existing constructs in the language. The `try`-blocks or `do`-notation are another matter: if implemented properly, this would be a nice composable way of dealing with computations in a context.
Method-position macros
Method-position macros could make the `?` operator without `try`-blocks superfluous, since we would be able to write the following:
File::create("foo.txt").try!.write_all(b"Hello world!").try!;
It does not replace the `?` + `try`-blocks functionality but could serve as a good complement to some kind of `do`-notation.
`do`-notation
As detailed above, `do`-notation would be much more composable with different types compared to the `try`-blocks, which would be limited to just short-circuiting types with simple control-flow.
If we compare a few of the examples from the RFC comments with how they look using `do`-notation, we can see that they are not so bad:
self.type_variables.borrow()
    .probe(v)
    .map(|t| self.shallow_resolve(t))
    .unwrap_or(typ)
// is equivalent to:
try {
    let t = self.type_variables.borrow().probe(v)?;
    self.shallow_resolve(t)
} catch {
    _ => typ
}
// which is equivalent to:
do {
    t <- self.type_variables.borrow().probe(v);
    // probe returns an Option here, so the lifted value is Some, not Ok
    Some(self.shallow_resolve(t))
}.unwrap_or(typ)
Making an `Option<(A, B)>` from two `Option`s:
// current:
a.and_then(|x| b.map(|y| (x, y)))
// try + ?:
try { (a?, b?) }
// do and map:
do { x <- a; b.map(|y| (x, y)) }
// only do:
do { x <- a; y <- b; Some((x, y)) }
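Two of these variants can be checked directly, assuming a toolchain where `?` works on `Option`. A helper function stands in for the `try` block, and both function names are hypothetical.

```rust
// Combinator version, as in the "current" line above.
fn pair_and_then<A, B>(a: Option<A>, b: Option<B>) -> Option<(A, B)> {
    a.and_then(|x| b.map(|y| (x, y)))
}

// `?` inside a helper function plays the role of the `try` block; the
// tuple has to be wrapped in `Some` by hand.
fn pair_question<A, B>(a: Option<A>, b: Option<B>) -> Option<(A, B)> {
    Some((a?, b?))
}
```

Both return `None` as soon as either input is `None`.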
Multiple `try!` macros in a row would also be easier to deal with, especially if their success-value is not needed:
// from libsyntax, printing code fragments:
try!(self.space_if_not_bol());
try!(self.ibox(indent_unit));
try!(self.word_nbsp("let"));
// can be written as:
do {
    self.space_if_not_bol();
    self.ibox(indent_unit);
    self.word_nbsp("let")
}.try!
Personally I am a fan of `do`-notation since it is much more general than just a control-flow-specific language construct and allows much more advanced ways of composing operations.
EDIT: Posted on reddit: /r/rust