|
---
|
|
title: Monadic Parsing with User State
|
|
date: August 26, 2016
|
|
---
|
|
|
|
> {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
|
|
> module StatefulParsing where
|
|
> import Control.Monad.State.Class
|
|
> import Control.Applicative
|
|
|
|
In this post I propose an extension to the monadic parser framework
|
|
introduced in a previous post, _[You could have invented
|
|
Parsec](/posts/2016-08-17.html)_, that extends
|
|
the parser to also support embedded user state in your parsing.
|
|
|
|
This could be used, for example, for parsing a language with
|
|
user-extensible operators: The precedences and fixidities of operators
|
|
would be kept in a hashmap threaded along the bind chain.
|
|
|
|
Instead of posing these changes as diffs, we will rewrite the parser
|
|
framework from scratch with the updated type.
|
|
|
|
---
|
|
|
|
Parser `newtype`{.haskell}
|
|
=========================
|
|
|
|
Our new parser is polymorphic in both the return type and the user state
|
|
that, so we have to update the `newtype`{.haskell} declaration to match.
|
|
|
|
> newtype Parser state result
|
|
> = Parser { runParser :: String
|
|
> -> state
|
|
> -> Either String (result, state, String) }
|
|
|
|
Our tuple now contains the result of the parsing operation and the new
|
|
user state, along with the stream. We still need to supply a stream to
|
|
parse, and now also supply the initial state. This will be reflected in
|
|
our functions.
|
|
|
|
For convenience, we also make a `Parser' a`{.haskell} type alias for
|
|
parsers with no user state.
|
|
|
|
< type Parser' a = Parser () a
|
|
|
|
Seeing as type constructors are also curried, we can apply η-reduction
|
|
to get the following, which is what we'll go
|
|
with.
|
|
|
|
> type Parser' = Parser ()
|
|
|
|
`Functor`{.haskell} instance
|
|
============================
|
|
|
|
> instance Functor (Parser st) where
|
|
|
|
The functor instance remains mostly the same, except now we have to
|
|
thread the user state around, too.
|
|
|
|
The instance head also changes to fit the kind signature of the
|
|
`Functor`{.haskell} typeclass. Since user state can not change from
|
|
fmapping, this is fine.
|
|
|
|
> fn `fmap` (Parser p) = Parser go where
|
|
> go st us = case p st us of
|
|
> Left e -> Left e
|
|
> Right (r, us', st') -> Right (fn r, us', st')
|
|
|
|
As you can see, the new user state (`us'`) is just returned as is.
|
|
|
|
`Applicative`{.haskell} instance
|
|
================================
|
|
|
|
> instance Applicative (Parser st) where
|
|
|
|
The new implementations of `pure`{.haskell} and `<*>`{.haskell} need to
|
|
correctly manipulate the user state. In the case of `pure`, it's just passed
|
|
as-is to the `Right`{.haskell} constructor.
|
|
|
|
> pure ret = Parser go where
|
|
> go st us = Right (ret, us, st)
|
|
|
|
Since `(<*>)` needs to evaluate both sides before applying the function, we need
|
|
to pass the right-hand side's generated user state to the right-hand side for
|
|
evaluation.
|
|
|
|
> (Parser f) <*> (Parser v) = Parser go where
|
|
> go st us = case f st us of
|
|
> Left e -> Left e
|
|
> Right (fn, us', st') -> case v st' us' of
|
|
> Left e -> Left e
|
|
> Right (vl, us'', st'') -> Right (fn vl, us'', st'')
|
|
|
|
|
|
`Monad`{.haskell} instance
|
|
==========================
|
|
|
|
> instance Monad (Parser st) where
|
|
|
|
Since we already have an implementation of `pure`{.haskell} from the Applicative
|
|
instance, we don't need to worry about an implementation of `return`.
|
|
|
|
> return = pure
|
|
|
|
The monad instance is much like the existing monad instance, except now we have
|
|
to give the updated parser state to the new computation.
|
|
|
|
> (Parser p) >>= f = Parser go where
|
|
> go s u = case p s u of
|
|
> Left e -> Left e
|
|
> Right (x, u', s') -> runParser (f x) s' u'
|
|
|
|
`MonadState`{.haskell} instance
|
|
===============================
|
|
|
|
> instance MonadState st (Parser st) where
|
|
|
|
Since we now have a state transformer in the parser, we can make it an instance
|
|
of the MTL's `MonadState` class.
|
|
|
|
The implementation of `put`{.haskell} must return `()` (the unit value), and
|
|
needs to replace the existing state with the supplied one. This operation can
|
|
not fail.
|
|
|
|
Since this is a parsing framework, we also need to define how the stream is
|
|
going to be affected: In this case, it isn't.
|
|
|
|
> put us' = Parser go where
|
|
> go st _ = Right ((), us', st)
|
|
|
|
The `get`{.haskell} function returns the current user state, and leaves it
|
|
untouched. This operation also does not fail.
|
|
|
|
> get = Parser go where
|
|
> go st us = Right (us, us, st)
|
|
|
|
Since we're an instance of `MonadState`{.haskell}, we needn't an implementation
|
|
of `modify` and friends - They're given by the MTL.
|
|
|
|
`Alternative`{.haskell} instance
|
|
================================
|
|
|
|
> instance Alternative (Parser st) where
|
|
|
|
The `Alternative`{.haskell} instance uses the same state as it was given for
|
|
trying the next parse.
|
|
|
|
The `empty`{.haskell} parser just fails unconditionally.
|
|
|
|
> empty = Parser go where
|
|
> go _ _ = Left "empty parser"
|
|
|
|
`(<|>)` will try both parsers in order, reusing both the state and the stream.
|
|
|
|
> (Parser p) <|> (Parser q) = Parser go where
|
|
> go st us = case p st us of
|
|
> Left e -> q st us
|
|
> Right v -> Right v
|
|
|
|
Conclusion
|
|
==========
|
|
|
|
This was a relatively short post. This is because many of the convenience
|
|
functions defined in the previous post also work with this parser framework, if
|
|
you replace `Parser` with `Parser'`. You can now use `get`, `put` and `modify`
|
|
to work on the parser's user state. As a closing note, a convenience function
|
|
for running parsers with no state is given.
|
|
|
|
> parse :: Parser' a -> String -> Either String a
|
|
> parse str = case runParser str () of
|
|
> Left e -> Left e
|
|
> Right (x, _, _) -> x
|