
Parser Combinator Experiments in Rust

For the last week I have been working a bit on parser-combinator experiments in the Rust programming language. I have tried stacking structs, manually threading state, and boxed closures, with the last two seeming the most promising.

I am writing this as I would like feedback on my approach, as well as to announce that I have something in the works which looks pretty promising.

The code

The code can be found in my rust_parser_experiments repo; the master branch currently contains the manual state-threading version and the fourth branch contains the boxed closure version.

First, some numbers

I used attoparsec's http-header example as a simple benchmark to compare the two approaches, both against each other and against the other provided examples, to see how well they hold up. I also included the parser-combinator library Nom in this comparison.

The data used is the http-requests.txt file, as well as two copies of this file containing the same data repeated 100 times and 10,000 times, resulting in two files of 2 MB and 204 MB in size.

The tests were run on a MacBook Pro (Retina, 15-inch, Late 2013) with a 2.3 GHz Intel Core i7 and 16 GB RAM. All optimizations were turned on (-O2 for Haskell, -O3 for C and --release for Rust).

Parser          Time, 21 kB   Time, 2 MB   Time, 204 MB
C http-parser   0.003 s       0.009 s      0.62 s
Attoparsec      0.004 s       0.021 s      1.45 s
Parsec          0.009 s       0.490 s      47.75 s
Nom             0.003 s       0.018 s      1.42 s
Manual          0.003 s       0.015 s      1.19 s
Boxed           0.004 s       0.041 s      3.75 s

Some quick profiling using Instruments (bundled with Xcode) suggests that the Boxed version spends about a second in je_malloc allocating all the boxed closures which are used. In comparison, the Manual version spends most of its time (~600 ms) in the message_header function.

Usage

The basic monad laws apply to both of the approaches, the difference being that manual state threading requires the first parameter to always be the monad-state.

Haskell              Manual                         Boxed
m :: Parser a        m: Parser<A>                   m: Parser<A>
f :: a -> Parser b   f: Fn(Empty, A) -> Parser<B>   f: Fn(A) -> Parser<B>
g :: Parser b        g: Fn(Empty) -> Parser<B>      g: Fn() -> Parser<B>
m >>= f              bind(m, f)                     bind(m, f)
m >> g               bind(m, |m, _| g(m))           bind(m, |_| g())
return a             ret(m, a)                      ret(a)
fail a               err(m, a)                      err(a)

For the manual version above, Empty is an alias for Parser<()> to indicate state but no wrapped value (all the parsers require an Empty instance to act upon, to prevent accidental loss of data).

The usage itself does not differ much; all parsers just need to accommodate that Empty parameter.

Expanded version of do-syntax:

Manually threading state

fn request_line<'a>(p: Empty<'a, u8>) -> Parser<'a, u8, Request<'a>, Error<u8>> {
    bind(take_while1(p, is_token), |p, method|
         bind(take_while1(p, is_space), |p, _|
              bind(take_while1(p, is_not_space), |p, uri|
                   bind(take_while1(p, is_space), |p, _|
                        bind(http_version(p), |p, version|
                             ret(p, Request{method: method, uri: uri, version: version,}))))))
}

Boxed

fn request_line<'a>() -> Parser<'a, 'a, u8, Request<'a>, Error<u8>> {
    bind(take_while1(is_token), move |method|
         bind(take_while1(is_space), move |_|
              bind(take_while1(is_not_space), move |uri|
                   bind(take_while1(is_space), move |_|
                        bind(http_version(), move |version|
                             ret(Request{method: method, uri: uri, version: version,}))))))
}

And here is the difference in the http-example once the mdo! macro is used: Diff

Operation

The difference really becomes apparent once you look at how the parser is run. The manual threading of state parses the data eagerly as it moves through the defined parser functions; e.g. in request_line above the result contains the actual Request struct, fully populated, as well as the remaining data to be parsed. This is pretty straightforward, and inlining makes the resulting code appear similar to how a hand-written parser would look.
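
To illustrate the eager behaviour, here is a minimal, self-contained model of the state-threading style; the State type and the take_while1 below are stand-ins for illustration, not the types from the experiment repo:

#[derive(Debug)]
struct State<'a> { input: &'a [u8] }

// Eagerly consumes at least one byte matching the predicate and hands back
// the new state together with the matched slice.
fn take_while1<'a, F: Fn(u8) -> bool>(s: State<'a>, f: F)
    -> Result<(State<'a>, &'a [u8]), ()> {
    let end = s.input.iter().position(|&b| !f(b)).unwrap_or(s.input.len());
    if end == 0 { return Err(()); }
    Ok((State { input: &s.input[end..] }, &s.input[..end]))
}

fn main() {
    let s = State { input: &b"GET /"[..] };
    // Parsing happens immediately; the result holds the parsed token plus
    // the remaining, still unparsed, input.
    let (rest, method) = take_while1(s, |b| b != b' ').unwrap();
    assert_eq!(method, &b"GET"[..]);
    assert_eq!(rest.input, &b" /"[..]);
}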

The boxed-closure version, on the other hand, does not parse anything when a defined parser like request_line is called. Instead it allocates a bunch of structures on the heap (boxed FnOnces nested inside each other) which are then executed once you want to parse something. Due to requirements from ret the boxed closures cannot be Fn or FnMut, as that would force a Clone bound on the value passed to ret.
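
A minimal sketch of that constraint, using stand-in types rather than the experiment repo's API: a boxed Fn can be called repeatedly and must leave its captured value intact, which forces a clone, while a FnOnce is consumed by the call and may simply move the value out.

// A `ret` built on Fn can only hand out the captured value by cloning it.
fn ret_fn<T: Clone + 'static>(value: T) -> Box<dyn Fn() -> T> {
    Box::new(move || value.clone())
}

// A `ret` built on FnOnce is consumed by the call and moves the value out,
// so no Clone bound is required.
fn ret_once<T: 'static>(value: T) -> Box<dyn FnOnce() -> T> {
    Box::new(move || value)
}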

The FnOnce requirement also poses a problem when combinators like many and manyTill are used, as each parser can only be used once. This means that many and the like require a newly allocated parser for every iteration, taking a Fn() -> Parser<A> as a parameter, which causes a lot of heap allocation during parsing.
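
A rough, self-contained sketch of that shape follows; the Parser alias below is a stand-in rather than the repo's actual type, but it shows why every iteration needs a fresh heap allocation:

type Parser<A> = Box<dyn FnOnce(&[u8]) -> Option<(A, usize)>>;

// Because each Parser can only run once, `many` takes a parser constructor
// and builds (and heap-allocates) a brand new parser for every iteration.
fn many<A: 'static, F>(factory: F) -> Parser<Vec<A>>
where
    F: Fn() -> Parser<A> + 'static,
{
    Box::new(move |mut input: &[u8]| {
        let mut items = Vec::new();
        let mut consumed = 0;
        while let Some((item, used)) = factory()(input) {
            if used == 0 { break; } // stop if the inner parser matched nothing
            items.push(item);
            input = &input[used..];
            consumed += used;
        }
        Some((items, consumed))
    })
}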

My opinion

I am currently on the fence here, but leaning towards using the manual threading of state for the time being, since it optimizes much better and seems to correspond better to Rust's current feature set. Once abstract return types land, and work for plain functions, they should hopefully improve the performance of the closure-returning version and make it a clear winner.
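
As a hedged sketch of what that could look like, with stand-in types rather than the experiment repo's signatures: with abstract return types the combinators could hand back their closures unboxed, so composing parsers would no longer need to touch the heap at all.

// Returns its closure unboxed; nothing is allocated.
fn ret<A>(value: A) -> impl FnOnce(&[u8]) -> Option<(A, usize)> {
    move |_input: &[u8]| Some((value, 0usize))
}

// Composes two steps, still without a single Box.
fn bind<A, B, P, F, Q>(p: P, f: F) -> impl FnOnce(&[u8]) -> Option<(B, usize)>
where
    P: FnOnce(&[u8]) -> Option<(A, usize)>,
    F: FnOnce(A) -> Q,
    Q: FnOnce(&[u8]) -> Option<(B, usize)>,
{
    move |input: &[u8]| {
        let (value, used) = p(input)?;
        let (out, more) = f(value)(&input[used..])?;
        Some((out, used + more))
    }
}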

EDIT 2015-08-19: Updated performance numbers for Nom. The nom::Stepper introduced lots of memory operations and caused cycles to be spent on moving data which should not have needed to be moved.

This improved Nom's numbers from the previous 0.004 s and 6.902 s for the first two benchmarks, while the last benchmark could not complete within one hour when nom::Stepper was used. The nom::MemProducer was also tried along with nom::Stepper, but that did not make any noticeable difference.

EDIT: Posted on Reddit here: /r/rust