---
title: The Amulet Programming Language
date: January 18, 2018
---
As you might have noticed, I like designing and implementing programming
languages. This is another of these projects. Amulet is a
strictly-evaluated, statically typed impure roughly functional
programming language with support for parametric data types and rank-1
polymorphism _à la_ Hindley-Milner (but [no
let-generalization](#letgen)), along with row-polymorphic records. While
syntactically inspired by the ML family, it's a disservice to those
languages to group Amulet with them, mostly because of the (present)
lack of modules.

Planned features (that I haven't even started working on, as of writing
this post) include generalized algebraic data types, modules and modular
implicits, a reworked type inference engine based on _OutsideIn(X)_[^4]
to support the other features, and, perhaps most importantly, a back-end
that's not a placeholder (i.e. something that generates either C or LLVM
and can be compiled to a standalone executable).

The compiler is still very much a work in progress, and is actively
being improved in several ways: rewriting the parser for efficiency
concerns (see [Lexing and Parsing](#parser)), improving the quality of
generated code by introducing more intermediate representations, and
introducing several optimisations on the one intermediate language we
_do_ have.
## The Technical Bits

In this section, I'm going to describe the implementation of the
compiler as it exists at the time of writing - warts and all.
Unfortunately, we have a bit too much code for all of it to fit in this
blag post, so I'm only going to include the horribly broken bits here,
and leave the rest out. Of course, the compiler is open source, and is
available on my [GitHub][2].
### Lexing and Parsing {#parser}

To call what we have a _lexer_ is a bit of an overstatement: the
`Parser.Lexer` module, which underpins the actual parser, contains only
a handful of imports and some definitions for use with [Parsec's][3]
[`Text.Parsec.Token`][4] module; everything else is boilerplate, namely,
declaring, at top-level, the functions generated by `makeTokenParser`.

Our parser is then built on top of this infrastructure (and the other
combinators provided by Parsec) in a monadic style. Despite having
chosen to use strict `Text`s, many of the Parsec combinators return
`Char`s, and using the `Alternative`{.haskell} type class's ability to
repeat actions makes linked lists of these - the dreaded `String` type.
Due to this, and other inefficiencies, the parser is ridiculously bad
at memory management.
However, it does have some cute hacks. For example, the pattern parser
has to account for being used in the parsing of both `match`{.ml} and
`fun`{.ml} - in the former, destructuring patterns may appear without
parentheses, but in the latter, they _must_ be properly parenthesised:
since `fun`{.ml} may take multiple patterns, it would be ambiguous
whether `fun Foo x -> ...`{.ml} is destructuring a `Foo` or takes two
arguments. Instead of duplicating the pattern parser - one for
`match`{.ml}es and one for function arguments - we _parametrised_ the
parser over needing parentheses or not by adding a rank-2 polymorphic
continuation argument.
```haskell
patternP :: (forall a. Parser a -> Parser a) -> Parser Pattern'
patternP cont = wildcard <|> {- some bits omitted -} try destructure where
  destructure = withPos . cont $ do
    ps <- constrName
    Destructure ps <$> optionMaybe (patternP id)
```
When we're parsing a pattern `match`{.ml}-style, the continuation given
is `id`, and when we're parsing an argument, the continuation is
`parens`.

For the aforementioned efficiency concerns, however, we've decided to
scrap the Parsec-based parser and move to an Alex/Happy-based solution,
which is not only going to be more maintainable and more easily hackable
in the future, but will also be more efficient overall. Of course, for
a toy compiler such as this one, efficiency doesn't matter that much,
but using _one and a half gigabytes_ to compile a 20-line file is really
bad.
### Renaming {#renamer}

To simplify scope handling in both the type checker and optimiser, after
parsing, each variable is tagged with a globally unique integer that is
enough to compare variables. This also lets us use more efficient data
structures later in the compiler, such as `VarSet`, which stores only the
integer identifier of a variable in a big-endian Patricia tree[^1].

Our approach, described in _[Secrets of the Glasgow Haskell Compiler
inliner][5]_ as "the Sledgehammer", consists of duplicating _every_
bound variable to avoid name capture problems. However, while the first
of the listed disadvantages surely does apply, by doing all of the
_renaming_ in one go, we mostly avoid the latter. Of course, since then,
the Haskell ecosystem has evolved significantly, and the plumbing
required is a lot less intrusive.
In our compiler, we use MTL-style classes instead of concrete monad
transformer stacks. We also run every phase after parsing in a single
`GenT`{.haskell} monad, which provides a fresh supply of integers for
names. "Plumbing" the fresh name supply, then, only involves adding a
`MonadGen Int m` constraint to the context of functions that need it.

Since the string component of parsed names is not thrown away, we also
have to make up strings themselves. This is where another cute hack
comes in: we generate, lazily, an infinite stream of names that goes
`["a" .. "z", "aa" .. "az", "ba" .. "bz", ..]`, then use the
`MonadGen`{.haskell} counter as an index into that stream.
```haskell
alpha :: [Text]
alpha = map T.pack $ [1..] >>= flip replicateM ['a'..'z']
```
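As a sanity check, here's a self-contained sketch of how the counter
indexes into that stream - the `nameFor` helper is made up for
illustration; in the compiler the index would come from the
`MonadGen`{.haskell} counter:

```haskell
import Control.Monad (replicateM)
import qualified Data.Text as T

-- The same stream as above: all length-1 names, then length-2, ...
alpha :: [T.Text]
alpha = map T.pack $ [1..] >>= flip replicateM ['a'..'z']

-- Hypothetical helper: the display name for the n-th fresh integer.
nameFor :: Int -> T.Text
nameFor n = alpha !! n
```

So `nameFor 0` is `"a"`, `nameFor 25` is `"z"`, and `nameFor 26` rolls
over to `"aa"`.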
### Desugaring

The desugarer is a very simple piece of code which, through use of _Scrap
Your Boilerplate_-style generic programming, traverses the syntax tree
and rewrites nodes representing syntax sugar to their more explicit
versions.

Currently, the desugarer only expands _sections_: that is, expressions
of the form `(+ e)` become `fun x -> x + e` (where `x` is a fresh name),
expressions like `(e +)` become `fun x -> e + x`, and expressions like
`.foo` become `fun x -> x.foo`.

This is the only component of the compiler that I can reasonably
include, in its entirety, in this post.
```haskell
desugarProgram = everywhereM (mkM defaults) where
  defaults :: Expr Parsed -> m (Expr Parsed)
  defaults (BothSection op an) = do
    (ap, ar) <- fresh an
    (bp, br) <- fresh an
    pure (Fun ap (Fun bp (BinOp ar op br an) an) an)
  defaults (LeftSection op vl an) = do
    (cap, ref) <- fresh an
    pure (Fun cap (BinOp ref op vl an) an)
  defaults (RightSection op vl an) = do
    (cap, ref) <- fresh an
    pure (Fun cap (BinOp vl op ref an) an)
  defaults (AccessSection key an) = do
    (cap, ref) <- fresh an
    pure (Fun cap (Access ref key an) an)
  defaults x = pure x
```
### Type Checking

By far the most complicated stage of the compiler pipeline, our
inference algorithm is modelled after Algorithm W (extended with kinds
and kind inference), with constraint generation and solving being two
separate steps.

We first traverse the syntax tree, in order, making up constraints and
fresh type variables as needed, then invoke a unification algorithm to
produce a substitution, then apply that over both the generated type (a
skeleton of the actual result) and the syntax tree (which is explicitly
annotated with types everywhere).

The type inference code also generates and inserts explicit type
applications when instancing polymorphic types, since we internally
lower Amulet into a System F core language with explicit type
abstraction and application. We have `TypeApp` nodes in the syntax tree
that never get parsed or renamed, and are generated by the type checker
before lowering happens.
Our constraint solver is quite rudimentary, but it does the job nicely.
We operate in a State monad carrying the current substitution. When we
unify a variable with another type, it is added to the current
substitution. Everything else is just zipping the types together. When
we try to unify, say, a function type with a constructor, that's an
error. If a variable that has already been added to the current
substitution is encountered again, the new type is unified with the
previously recorded one.
```haskell
unify :: Type Typed -> Type Typed -> SolveM ()
unify (TyVar a) b = bind a b
unify a (TyVar b) = bind b a
unify (TyArr a b) (TyArr a' b') = unify a a' *> unify b b'
unify (TyApp a b) (TyApp a' b') = unify a a' *> unify b b'
unify ta@(TyCon a) tb@(TyCon b)
  | a == b = pure ()
  | otherwise = throwError (NotEqual ta tb)
```
This is only an excerpt, because we have very complicated types.
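The `bind` half, which the excerpt above calls into, can be sketched
roughly as follows - this is an illustration on made-up types (`Type`,
`Subst`) with a simplified occurs check, not the compiler's actual code:

```haskell
import qualified Data.Map as Map
import Control.Monad.State

-- Toy versions of the solver's types, for illustration only.
data Type = TyVar String | TyCon String | TyArr Type Type
  deriving (Eq, Show)

type Subst = Map.Map String Type

-- Does the variable occur in the type? Binding 'a := t when t
-- mentions 'a would produce an infinite type.
occurs :: String -> Type -> Bool
occurs v (TyVar v')  = v == v'
occurs v (TyArr a b) = occurs v a || occurs v b
occurs _ (TyCon _)   = False

-- Record v := t in the substitution, returning any previous binding
-- so the caller can unify the new type against it.
bind :: String -> Type -> State Subst (Maybe Type)
bind v (TyVar v') | v == v' = pure Nothing  -- trivial binding
bind v t
  | occurs v t = error "occurs check: infinite type"
  | otherwise  = do
      previous <- gets (Map.lookup v)
      case previous of
        Just old -> pure (Just old)  -- caller unifies t with old
        Nothing  -> Nothing <$ modify (Map.insert v t)
```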
#### Polymorphic Records

One of Amulet's selling points (if one could call it that) is its support
for row-polymorphic records. We have two types of first-class record
types: _closed_ record types (the type of literals) and _open_ record
types (the type inferred by record patterns and field getters). Open
record types have the shape `{ 'p | x_1 : t_1 ... x_n : t_n }`{.ml},
while closed records lack the type variable `'p`{.ml}.
Unification of records has three cases, but in all three it is checked
that fields present in both records have unifiable types.

- When unifying an open record with a closed one, the fields present in
  both records must have unifiable types, and the open record's type
  variable is instanced to contain the extra fields.
- When unifying two closed records, they must have exactly the same
  shape and unifiable types for common fields.
- When unifying two open record types, a new fresh type variable is
  created to use as the "hole" and tack the extra fields together.
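These three cases can be sketched on a toy representation of rows -
everything here (`RecTy`, the string-typed fields, returning variable
instantiations as a list) is invented for illustration, and the pairwise
unification of the common fields is elided:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Row = Map.Map String String  -- field name -> type, as a string for brevity

-- Closed records are just a row; open ones also carry a hole variable.
data RecTy = Closed Row | Open String Row
  deriving (Eq, Show)

-- Returns the (variable, leftover fields) instantiations, or Nothing
-- on a shape mismatch. Unifying the common fields is elided.
unifyRec :: RecTy -> RecTy -> Maybe [(String, Row)]
unifyRec (Closed a) (Closed b)
  | Map.keysSet a == Map.keysSet b = Just []  -- same shape required
  | otherwise                      = Nothing
unifyRec (Open p a) (Closed b)
  | Map.keysSet a `Set.isSubsetOf` Map.keysSet b =
      Just [(p, b `Map.difference` a)]  -- 'p picks up the extra fields
  | otherwise = Nothing
unifyRec c@(Closed _) o@(Open _ _) = unifyRec o c
unifyRec (Open p a) (Open q b) =
  -- In the real solver both leftovers would share a fresh hole variable.
  Just [(p, b `Map.difference` a), (q, a `Map.difference` b)]
```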
As an example, `{ x = 1 }` has type `{ x : int }`{.ml}, the function
`fun x -> x.foo` has type `{ 'p | foo : 'a } -> 'a`{.ml}, and
`(fun r -> r.x) { y = 2 }` is a type error[^2].
#### No Let Generalisation {#letgen}

Vytiniotis, Peyton Jones and Schrijvers argue[^5] that HM-style
`let`{.ml} generalisation interacts badly with complex type system
extensions such as GADTs and type families, and should therefore be
omitted from such systems. In a deviation from the paper, GHC 7.2
reintroduces `let`{.ml} generalisation for local definitions that meet
some criteria[^3].

> Here's the rule. With `-XMonoLocalBinds` (the default), a binding
> without a type signature is **generalised only if all its free variables
> are closed.**
>
> A binding is **closed** if and only if
>
> - It has a type signature, and the type signature has no free variables; or
> - It has no type signature, and all its free variables are closed, and it
>   is unaffected by the monomorphism restriction. And hence it is fully
>   generalised.

We, however, have chosen to follow that paper to a tee. Despite not
(yet!) having any of those fancy type system features that interact
poorly with let generalisation, we do not generalise _any_ local
bindings.
### Lowering

After type checking is done (and, conveniently, type applications have
been left in the correct places for us by the type checker), Amulet code
is converted into an explicitly-typed intermediate representation, in
direct style, which is used for (local) program optimisation. The AST is
simplified considerably: from 19 constructors to 9.

Type inference is no longer needed: the representation of core is packed
with all the information we need to check that programs are
type-correct. This includes types in every binder (lambda abstractions,
`let`{.ml}s, pattern bindings in `match`{.ml}), big-lambda abstractions
around polymorphic values (a $\lambda$ binds a value, while a $\Lambda$
binds a type), along with the already mentioned type applications.

Here, code also gets the error branches for non-exhaustive `match`{.ml}
expressions, and, as a general rule, gets a lot uglier.
```ocaml
let main _ = (fun r -> r.x) { x = 2 }
(* Is elaborated into *)
let main : ∀ 'e. 'e -> int =
  Λe : *. λk : 'e. match k {
    (p : 'e) : 'e -> (λl : { 'g | x : int }. match l {
      (r : { 'g | x : int }) : { 'g | x : int } -> match r {
        { (n : { 'g | x : int }) | x = (m : int) } : { 'g | x : int } -> m
      };
      (o : { 'g | x : int }) : { 'g | x : int } ->
        error @int "<test>[1:15 .. 1:27]"
    }) ({ {} | x : int = 2 });
    (q : 'e) : 'e -> error @int "<test>[1:14 .. 1:38]"
  }
```
### Optimisation

The code we initially get from lowering is ugly and inefficient: along
with being full of the abstractions functional programs have by nature,
it is full of redundant matches, created e.g. by the fact that functions
cannot pattern-match directly and that field access gets reduced to
pattern matching. The optimiser's job is to make it prettier, and more
efficient.
The optimiser works by applying, in order, a series of local
transformations operating on individual sub-terms to produce an efficient
program, 25 times over. The idea of applying them several times is that,
when a simplification pass kicks in, more simplification opportunities
might arise.
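The driver can be sketched like this - `optimise` and the pass list are
illustrative stand-ins, not the compiler's real pipeline:

```haskell
-- Run every pass, in order, over the program, and repeat the whole
-- pipeline 25 times so passes can feed each other opportunities.
optimise :: [a -> a] -> a -> a
optimise passes = (!! 25) . iterate round1
  where round1 program = foldl (flip ($)) program passes
```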
#### `dropBranches`, `foldExpr`, `dropUselessLets`

These trivial passes remove similarly trivial pieces of code that only
add noise to the program. `dropBranches` will do its best to remove
redundant arms from a `match`{.ml} expression, such as those that
appear after an irrefutable pattern. `foldExpr` reduces uses of
operators where both sides are known, e.g. `2 + 2` (replaced by the
literal `4`) or `"foo " ^ "bar"` (replaced by the literal `"foo
bar"`). `dropUselessLets` removes `let`{.ml}s that bind unused variables
whose right-hand sides are pure expressions.
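A minimal sketch of `foldExpr`-style constant folding, on a toy
expression type invented here (the compiler's real IR is much richer):

```haskell
-- A tiny expression type with just enough cases to demonstrate folding.
data Expr = IntLit Int | StrLit String
          | Add Expr Expr | Concat Expr Expr
  deriving (Eq, Show)

-- Fold operators whose operands are, after folding, both literals.
foldExpr :: Expr -> Expr
foldExpr (Add a b) = case (foldExpr a, foldExpr b) of
  (IntLit x, IntLit y) -> IntLit (x + y)
  (a', b')             -> Add a' b'
foldExpr (Concat a b) = case (foldExpr a, foldExpr b) of
  (StrLit x, StrLit y) -> StrLit (x ++ y)
  (a', b')             -> Concat a' b'
foldExpr e = e
```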
#### `trivialPropag`, `constrPropag`

The Amulet optimiser makes inlining decisions in two (well, three)
separate phases: one is called _propagation_, in which a `let` decides
to propagate its bound values into the expression, and the other is the
more traditional _inlining_, where variables get their values from the
context.

Propagation is by far the easier of the two: the compiler can see both
the definitions and all of the use sites, and could in theory decide if
propagating is beneficial or not. Right now, we propagate all literals
(and records made up solely of other trivial expressions), and do a
round of propagation that is best described as a rule.
```ocaml
let { v = C e } in ... v ...
(* becomes *)
let { v' = e } in ... C v' ...
```
This _constructor propagation_ allows the `match`{.ml} optimisations to
kick in more often, and is semantics-preserving.
#### `match`{.ml}-of-known-constructor

This pass identifies `match`{.ml} expressions where we can statically
determine the expression being analysed and, therefore, decide which
branch is going to be taken.

```ocaml
match C x with
| C e -> ... e ...
...
(* becomes *)
... x ...
```
#### `match`{.ml}-of-bottom

It is always safe to turn a `match`{.ml} where the term being matched is
a diverging expression into just that diverging expression, which can
reduce code size considerably.

```ocaml
match (error @int "message") with ...
(* becomes *)
error @int "message"
```
As a special case, when one of the arms is itself a diverging
expression, we use the type mentioned in that application to `error` to
fix up the type of the value being scrutinized.

```ocaml
match (error @foo "message") with
| _ -> error @bar "message 2"
...
(* becomes *)
error @bar "message"
```
#### `match`{.ml}-of-`match`{.ml}

This transformation turns `match`{.ml} expressions where the expression
being dissected is itself another `match`{.ml} "inside-out": we push the
branches of the _outer_ `match`{.ml} "into" the _inner_ `match`{.ml}
(what used to be the expression being scrutinized). In doing so,
sometimes, new opportunities for match-of-known-constructor arise, and
the code ends up simpler.
```ocaml
match (match x with
       | A -> B
       | C -> D) with
| B -> e
| D -> f
(* becomes *)
match x with
| A -> match B with
       | B -> e
       | D -> f
| C -> match D with
       | B -> e
       | D -> f
```
A clear area of improvement here is extracting the outer branches into
local `let`{.ml}-bound lambda abstractions to avoid an explosion in code
size.
#### `inlineVariable`, `betaReduce`

In this pass, the use of a variable is replaced with the definition of
that variable, if it meets the following conditions:

- The variable is bound to a lambda abstraction; and
- The lambda abstraction's body is not too _expensive_. Computing the
  cost of a term boils down to computing the depth of the tree
  representing that term, with some extra cost added for some specific
  types of expression.
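The cost heuristic might look something like this - the `Term` type, the
weights, and the threshold are all made up for illustration:

```haskell
-- A toy term type; the real IR has many more cases.
data Term = Var String | Lam String Term | App Term Term
          | Match Term [(String, Term)]
  deriving (Eq, Show)

-- Cost is essentially tree depth, with extra weight on matches.
cost :: Term -> Int
cost (Var _)      = 0
cost (Lam _ b)    = 1 + cost b
cost (App f x)    = 1 + max (cost f) (cost x)
cost (Match s as) = 3 + max (cost s) (maximum (0 : map (cost . snd) as))

-- Inline only cheap lambda abstractions; 10 is an arbitrary threshold.
inlineable :: Term -> Bool
inlineable t@(Lam _ _) = cost t <= 10
inlineable _           = False
```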
In doing this, however, we end up with pathological terms of the form
`(fun x -> e) y`{.ml}. The `betaReduce` pass turns this into `let x = y in
e`{.ml}. We generate `let`{.ml} bindings instead of substituting the
variable with the parameter to maintain the same evaluation order and
observable effects of the original code. This does mean that, often,
propagation kicks in and gives rise to new simplification opportunities.
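On a toy term type (again invented here, not the compiler's IR),
`betaReduce` is essentially a one-liner:

```haskell
-- Minimal term type for the sketch.
data Term = Var String | Lam String Term | App Term Term
          | Let String Term Term
  deriving (Eq, Show)

-- A redex becomes a let, preserving the argument's evaluation order
-- (Amulet is strict, so we must not simply substitute).
betaReduce :: Term -> Term
betaReduce (App (Lam x body) arg) = Let x arg body
betaReduce t                      = t
```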
## Epilogue

I was planning to write a section with a formalisation of the language's
semantics and type system, but it turns out I'm no mathematician, no
matter how hard I pretend. Maybe in the future.

Our code generator is wholly uninteresting, and, most of all, a
placeholder: this is why it is not described in detail (that is, at all)
in this post. I plan to write a follow-up when we actually finish the
native code generator.

As previously mentioned, the compiler _is_ open source: the code is
[here][2]. I recommend using the [Nix package manager][9] to acquire the
Haskell dependencies, but Cabal should work too. Current work in
rewriting the parser is happening in the `feature/alex-happy` branch.
[^1]: This sounds fancy, but in practice, it boils down to using
`Data.IntSet`{.haskell} instead of `Data.Set`{.haskell}.

[^2]: As shown [here][6]. Yes, the error messages need improvement.

[^3]: As explained in [this blog post][8].

[^4]: Dimitrios Vytiniotis, Simon Peyton Jones, Tom Schrijvers, and
Martin Sulzmann. 2011. [OutsideIn(X): Modular Type Inference With Local
Assumptions][1]. _Note that, although the paper has been published in
the Journal of Functional Programming, the version linked to here is a
preprint._

[^5]: Dimitrios Vytiniotis, Simon Peyton Jones, and Tom Schrijvers.
2010. [Let Should Not Be Generalised][7].
[1]: <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/jfp-outsidein.pdf>
[2]: <https://github.com/zardyh/amulet/tree/66a4143af32c3e261af51b74f975fc48c0155dc8>
[3]: <https://hackage.haskell.org/package/parsec-3.1.11>
[4]: <https://hackage.haskell.org/package/parsec-3.1.11/docs/Text-Parsec-Token.html>
[5]: <https://www.microsoft.com/en-us/research/wp-content/uploads/2002/07/inline.pdf>
[6]: </snip/sel.b0e94.txt>
[7]: <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tldi10-vytiniotis.pdf>
[8]: <https://ghc.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7>
[9]: <https://nixos.org/nix/>