my blog lives here now
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

173 lines
5.4 KiB

6 years ago
2 years ago
6 years ago
  1. ---
  2. title: Monadic Parsing with User State
  3. date: August 26, 2016
  4. synopsys: 2
  5. ---
  6. > {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
  7. > module StatefulParsing where
  8. > import Control.Monad.State.Class
  9. > import Control.Applicative
  10. In this post I propose an extension to the monadic parser framework
  11. introduced in a previous post, _[You could have invented
  12. Parsec](/posts/2016-08-17.html)_, that extends
  13. the parser to also support embedded user state in your parsing.
  14. This could be used, for example, for parsing a language with
  15. user-extensible operators: The precedences and fixidities of operators
  16. would be kept in a hashmap threaded along the bind chain.
  17. Instead of posing these changes as diffs, we will rewrite the parser
  18. framework from scratch with the updated type.
  19. ---
  20. Parser `newtype`{.haskell}
  21. =========================
  22. Our new parser is polymorphic in both the return type and the user state
  23. that, so we have to update the `newtype`{.haskell} declaration to match.
  24. > newtype Parser state result
  25. > = Parser { runParser :: String
  26. > -> state
  27. > -> Either String (result, state, String) }
  28. Our tuple now contains the result of the parsing operation and the new
  29. user state, along with the stream. We still need to supply a stream to
  30. parse, and now also supply the initial state. This will be reflected in
  31. our functions.
  32. For convenience, we also make a `Parser' a`{.haskell} type alias for
  33. parsers with no user state.
  34. < type Parser' a = Parser () a
  35. Seeing as type constructors are also curried, we can apply η-reduction
  36. to get the following, which is what we'll go
  37. with.
  38. > type Parser' = Parser ()
  39. `Functor`{.haskell} instance
  40. ============================
  41. > instance Functor (Parser st) where
  42. The functor instance remains mostly the same, except now we have to
  43. thread the user state around, too.
  44. The instance head also changes to fit the kind signature of the
  45. `Functor`{.haskell} typeclass. Since user state can not change from
  46. fmapping, this is fine.
  47. > fn `fmap` (Parser p) = Parser go where
  48. > go st us = case p st us of
  49. > Left e -> Left e
  50. > Right (r, us', st') -> Right (fn r, us', st')
  51. As you can see, the new user state (`us'`) is just returned as is.
  52. `Applicative`{.haskell} instance
  53. ================================
  54. > instance Applicative (Parser st) where
  55. The new implementations of `pure`{.haskell} and `<*>`{.haskell} need to
  56. correctly manipulate the user state. In the case of `pure`, it's just passed
  57. as-is to the `Right`{.haskell} constructor.
  58. > pure ret = Parser go where
  59. > go st us = Right (ret, us, st)
  60. Since `(<*>)` needs to evaluate both sides before applying the function, we need
  61. to pass the right-hand side's generated user state to the right-hand side for
  62. evaluation.
  63. > (Parser f) <*> (Parser v) = Parser go where
  64. > go st us = case f st us of
  65. > Left e -> Left e
  66. > Right (fn, us', st') -> case v st' us' of
  67. > Left e -> Left e
  68. > Right (vl, us'', st'') -> Right (fn vl, us'', st'')
  69. `Monad`{.haskell} instance
  70. ==========================
  71. > instance Monad (Parser st) where
  72. Since we already have an implementation of `pure`{.haskell} from the Applicative
  73. instance, we don't need to worry about an implementation of `return`.
  74. > return = pure
  75. The monad instance is much like the existing monad instance, except now we have
  76. to give the updated parser state to the new computation.
  77. > (Parser p) >>= f = Parser go where
  78. > go s u = case p s u of
  79. > Left e -> Left e
  80. > Right (x, u', s') -> runParser (f x) s' u'
  81. `MonadState`{.haskell} instance
  82. ===============================
  83. > instance MonadState st (Parser st) where
  84. Since we now have a state transformer in the parser, we can make it an instance
  85. of the MTL's `MonadState` class.
  86. The implementation of `put`{.haskell} must return `()` (the unit value), and
  87. needs to replace the existing state with the supplied one. This operation can
  88. not fail.
  89. Since this is a parsing framework, we also need to define how the stream is
  90. going to be affected: In this case, it isn't.
  91. > put us' = Parser go where
  92. > go st _ = Right ((), us', st)
  93. The `get`{.haskell} function returns the current user state, and leaves it
  94. untouched. This operation also does not fail.
  95. > get = Parser go where
  96. > go st us = Right (us, us, st)
  97. Since we're an instance of `MonadState`{.haskell}, we needn't an implementation
  98. of `modify` and friends - They're given by the MTL.
  99. `Alternative`{.haskell} instance
  100. ================================
  101. > instance Alternative (Parser st) where
  102. The `Alternative`{.haskell} instance uses the same state as it was given for
  103. trying the next parse.
  104. The `empty`{.haskell} parser just fails unconditionally.
  105. > empty = Parser go where
  106. > go _ _ = Left "empty parser"
  107. `(<|>)` will try both parsers in order, reusing both the state and the stream.
  108. > (Parser p) <|> (Parser q) = Parser go where
  109. > go st us = case p st us of
  110. > Left e -> q st us
  111. > Right v -> Right v
  112. Conclusion
  113. ==========
  114. This was a relatively short post. This is because many of the convenience
  115. functions defined in the previous post also work with this parser framework, if
  116. you replace `Parser` with `Parser'`. You can now use `get`, `put` and `modify`
  117. to work on the parser's user state. As a closing note, a convenience function
  118. for running parsers with no state is given.
  119. > parse :: Parser' a -> String -> Either String a
  120. > parse str = case runParser str () of
  121. > Left e -> Left e
  122. > Right (x, _, _) -> x