my blog lives here now
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

172 lines
5.4 KiB

7 years ago
  1. ---
  2. title: Monadic Parsing with User State
  3. date: August 26, 2016
  4. ---
  5. > {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
  6. > module StatefulParsing where
  7. > import Control.Monad.State.Class
  8. > import Control.Applicative
  9. In this post I propose an extension to the monadic parser framework
  10. introduced in a previous post, _[You could have invented
  11. Parsec](/posts/2016-08-17.html)_, that extends
  12. the parser to also support embedded user state in your parsing.
  13. This could be used, for example, for parsing a language with
  14. user-extensible operators: The precedences and fixidities of operators
  15. would be kept in a hashmap threaded along the bind chain.
  16. Instead of posing these changes as diffs, we will rewrite the parser
  17. framework from scratch with the updated type.
  18. ---
  19. Parser `newtype`{.haskell}
  20. =========================
  21. Our new parser is polymorphic in both the return type and the user state
  22. that, so we have to update the `newtype`{.haskell} declaration to match.
  23. > newtype Parser state result
  24. > = Parser { runParser :: String
  25. > -> state
  26. > -> Either String (result, state, String) }
  27. Our tuple now contains the result of the parsing operation and the new
  28. user state, along with the stream. We still need to supply a stream to
  29. parse, and now also supply the initial state. This will be reflected in
  30. our functions.
  31. For convenience, we also make a `Parser' a`{.haskell} type alias for
  32. parsers with no user state.
  33. < type Parser' a = Parser () a
  34. Seeing as type constructors are also curried, we can apply η-reduction
  35. to get the following, which is what we'll go
  36. with.
  37. > type Parser' = Parser ()
  38. `Functor`{.haskell} instance
  39. ============================
  40. > instance Functor (Parser st) where
  41. The functor instance remains mostly the same, except now we have to
  42. thread the user state around, too.
  43. The instance head also changes to fit the kind signature of the
  44. `Functor`{.haskell} typeclass. Since user state can not change from
  45. fmapping, this is fine.
  46. > fn `fmap` (Parser p) = Parser go where
  47. > go st us = case p st us of
  48. > Left e -> Left e
  49. > Right (r, us', st') -> Right (fn r, us', st')
  50. As you can see, the new user state (`us'`) is just returned as is.
  51. `Applicative`{.haskell} instance
  52. ================================
  53. > instance Applicative (Parser st) where
  54. The new implementations of `pure`{.haskell} and `<*>`{.haskell} need to
  55. correctly manipulate the user state. In the case of `pure`, it's just passed
  56. as-is to the `Right`{.haskell} constructor.
  57. > pure ret = Parser go where
  58. > go st us = Right (ret, us, st)
  59. Since `(<*>)` needs to evaluate both sides before applying the function, we need
  60. to pass the right-hand side's generated user state to the right-hand side for
  61. evaluation.
  62. > (Parser f) <*> (Parser v) = Parser go where
  63. > go st us = case f st us of
  64. > Left e -> Left e
  65. > Right (fn, us', st') -> case v st' us' of
  66. > Left e -> Left e
  67. > Right (vl, us'', st'') -> Right (fn vl, us'', st'')
  68. `Monad`{.haskell} instance
  69. ==========================
  70. > instance Monad (Parser st) where
  71. Since we already have an implementation of `pure`{.haskell} from the Applicative
  72. instance, we don't need to worry about an implementation of `return`.
  73. > return = pure
  74. The monad instance is much like the existing monad instance, except now we have
  75. to give the updated parser state to the new computation.
  76. > (Parser p) >>= f = Parser go where
  77. > go s u = case p s u of
  78. > Left e -> Left e
  79. > Right (x, u', s') -> runParser (f x) s' u'
  80. `MonadState`{.haskell} instance
  81. ===============================
  82. > instance MonadState st (Parser st) where
  83. Since we now have a state transformer in the parser, we can make it an instance
  84. of the MTL's `MonadState` class.
  85. The implementation of `put`{.haskell} must return `()` (the unit value), and
  86. needs to replace the existing state with the supplied one. This operation can
  87. not fail.
  88. Since this is a parsing framework, we also need to define how the stream is
  89. going to be affected: In this case, it isn't.
  90. > put us' = Parser go where
  91. > go st _ = Right ((), us', st)
  92. The `get`{.haskell} function returns the current user state, and leaves it
  93. untouched. This operation also does not fail.
  94. > get = Parser go where
  95. > go st us = Right (us, us, st)
  96. Since we're an instance of `MonadState`{.haskell}, we needn't an implementation
  97. of `modify` and friends - They're given by the MTL.
  98. `Alternative`{.haskell} instance
  99. ================================
  100. > instance Alternative (Parser st) where
  101. The `Alternative`{.haskell} instance uses the same state as it was given for
  102. trying the next parse.
  103. The `empty`{.haskell} parser just fails unconditionally.
  104. > empty = Parser go where
  105. > go _ _ = Left "empty parser"
  106. `(<|>)` will try both parsers in order, reusing both the state and the stream.
  107. > (Parser p) <|> (Parser q) = Parser go where
  108. > go st us = case p st us of
  109. > Left e -> q st us
  110. > Right v -> Right v
  111. Conclusion
  112. ==========
  113. This was a relatively short post. This is because many of the convenience
  114. functions defined in the previous post also work with this parser framework, if
  115. you replace `Parser` with `Parser'`. You can now use `get`, `put` and `modify`
  116. to work on the parser's user state. As a closing note, a convenience function
  117. for running parsers with no state is given.
  118. > parse :: Parser' a -> String -> Either String a
  119. > parse str = case runParser str () of
  120. > Left e -> Left e
  121. > Right (x, _, _) -> x