---
title: Typed Type-Level Computation in Amulet
date: October 04, 2019
maths: true
---

Amulet, as a programming language, has a focus on strong static typing. This has led us to adopt
many features inspired by dependently-typed languages, the most prominent of which are typed holes
and GADTs, the latter being an imitation of indexed families.

However, Amulet was up until recently sorely lacking in a way to express computational content in
types: it was possible to index datatypes by other, regular datatypes ("datatype promotion", in the
Haskell lingo) since the type and kind levels are one and the same, but writing functions on those
indices was entirely impossible.

As of this week, the language supports two complementary mechanisms for typed type-level programming:
_type classes with functional dependencies_, a form of logic programming, and _type functions_, which
permit functional programming on the type level.

I'll introduce them in that order. This post is meant to serve as an introduction to type-level
programming using either technique in general, but it'll also present some concepts formally and with
some technical depth.

### Type Classes are Relations: Programming with Fundeps

In set theory[^1] a _relation_ $R$ over a family of sets $A, B, C, \dots$ is a subset of the
cartesian product $A \times B \times C \times \dots$. If $(a, b, c, \dots) \in R_{A,B,C,\dots}$ we
say that $a$, $b$ and $c$ are _related_ by $R$.

In this context, a _functional dependency_ is a term $X \leadsto Y$
where $X$ and $Y$ are both sets of natural numbers, read as positions in a tuple. A relation is said
to satisfy a functional dependency $X \leadsto Y$ when, for any tuple in
the relation, the values at $X$ uniquely determine the values at $Y$.

For instance, a relation $R_{A,B}$ satisfying $\{0\} \leadsto \{1\}$ is a partial function
$A \to B$, and if it additionally satisfies $\{1\} \leadsto \{0\}$ it is a partial one-to-one
mapping.
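
To make the definition tangible, here is a minimal sketch that checks a functional dependency against a finite relation. It is a value-level model in Python rather than anything Amulet does internally; the name `satisfies` and the encoding of a relation as a set of tuples are assumptions made for illustration.

```python
def satisfies(relation, xs, ys):
    """Return True if the values at positions `xs` determine those at `ys`."""
    seen = {}
    for row in relation:
        key = tuple(row[i] for i in sorted(xs))
        val = tuple(row[j] for j in sorted(ys))
        # The first tuple seen for a key fixes the value; any later
        # tuple with the same key must agree with it.
        if seen.setdefault(key, val) != val:
            return False
    return True


# A relation over (A, B) that is a partial function, but not one-to-one:
# both 1 and 2 are related to "a".
r = {(1, "a"), (2, "a"), (3, "b")}

print(satisfies(r, {0}, {1}))  # True
print(satisfies(r, {1}, {0}))  # False
```

The second check fails because `"a"` is related to both `1` and `2`, so position 1 does not determine position 0.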

One might wonder what all of this abstract nonsense[^2] has to do with type classes. The thing is, a
type class `class foo : A -> B -> constraint`{.amulet} is a relation $\text{Foo}_{A,B}$! With this in
mind, it becomes easy to understand what it might mean for a type class to satisfy a functional
dependency, and indeed the expressive power that fundeps bring.

To make it concrete:

```amulet
class r 'a 'b (* an arbitrary relation between a and b *)
class f 'a 'b | 'a -> 'b (* a function from a to b *)
class i 'a 'b | 'a -> 'b, 'b -> 'a (* a one-to-one mapping between a and b *)
```

#### The Classic Example: Collections

In Mark P. Jones' paper introducing functional dependencies, he presents as an example the class
`collects : type -> type -> constraint`{.amulet}, where `'e`{.amulet} is the type of elements in the
collection type `'ce`{.amulet}. This class can be used for all the standard, polymorphic collections
(of kind `type -> type`{.amulet}), but it also admits instances for monomorphic collections, like a
`bitset`.

```amulet
class collects 'e 'ce begin
  val empty : 'ce
  val insert : 'e -> 'ce -> 'ce
  val member : 'e -> 'ce -> bool
end
```

Omitting the standard implementation details, this class admits instances like:

```amulet
instance eq 'a => collects 'a (list 'a)
instance eq 'a => collects 'a ('a -> bool)
instance collects char string (* amulet strings are not list char *)
```

However, Jones points out that this class, as written, has a variety of problems. For starters,
`empty`{.amulet} has an ambiguous type, `forall 'e 'ce. collects 'e 'ce => 'ce`{.amulet}. This type
is ambiguous because the type variable `'e`{.amulet} is $\forall$-bound, and appears in the
constraint `collects 'e 'ce`{.amulet}, but doesn't appear to the right of the `=>`{.amulet}; thus, we
can't solve it using unification, and the program would have undefined semantics.

Moreover, this class leads to poor inferred types. Consider the two functions `f`{.amulet} and
`g`{.amulet}, below. These have the types
`(collects 'a 'c * collects 'b 'c) => 'a -> 'b -> 'c -> 'c`{.amulet} and
`(collects bool 'c * collects int 'c) => 'c -> 'c`{.amulet} respectively.

```amulet
let f x y coll = insert x (insert y coll)
let g coll = f true 1 coll
```

The problem with the type of `f`{.amulet} is that it is too general, if we wish to model homogeneous
collections only. This leads to the type of `g`{.amulet}, which really ought to be a type error, but
isn't: the programming error in its definition won't be reported here, but at the use site, which
might be in a different module entirely. This problem of poor type inference and bad error locality
motivates us to refine the class `collects`{.amulet}, adding a functional dependency:

```amulet
(* Read: 'ce determines 'e *)
class collects 'e 'ce | 'ce -> 'e begin
  val empty : 'ce
  val insert : 'e -> 'ce -> 'ce
  val member : 'e -> 'ce -> bool
end
```

This class admits all the same instances as before, but now the functional dependency lets Amulet
infer an improved type for `f`{.amulet} and report the type error at `g`{.amulet}.

```amulet
val f : collects 'a 'b => 'a -> 'a -> 'b -> 'b
```

```
  2 │ let g coll = f true 1 coll
    │                     ^
  Couldn't match actual type int
    with the type expected by the context, bool
```

One can see from the type of `f`{.amulet} that Amulet can simplify the conjunction of constraints
`collects 'a 'c * collects 'b 'c`{.amulet} into `collects 'a 'c`{.amulet} and substitute `'a`{.amulet}
for `'b`{.amulet} in the rest of the type (the type printed above additionally renames `'c`{.amulet}
to `'b`{.amulet}). This is because the second parameter of `collects`{.amulet} is enough to determine
the first parameter; since `'c`{.amulet} is obviously equal to itself, `'a`{.amulet} must be equal to
`'b`{.amulet}.
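
The improvement step can be modelled in a few lines. This is only a sketch in Python, not Amulet's solver: constraints are encoded as tuples of a class name followed by its arguments, and the `FUNDEPS` table and `improve` function are hypothetical names invented for the example.

```python
from itertools import combinations

# Hypothetical encoding: class name -> list of
# (determining positions, determined positions).
FUNDEPS = {"collects": [((1,), (0,))]}  # 'ce -> 'e


def improve(constraints, fundeps=FUNDEPS):
    """Collect the equalities implied by pairs of constraints on one class."""
    equalities = set()
    for (c1, *args1), (c2, *args2) in combinations(constraints, 2):
        if c1 != c2:
            continue
        for lhs, rhs in fundeps.get(c1, []):
            # If both constraints agree on the determining positions,
            # they must also agree on the determined ones.
            if all(args1[i] == args2[i] for i in lhs):
                for j in rhs:
                    if args1[j] != args2[j]:
                        equalities.add((args1[j], args2[j]))
    return equalities


# The constraints inferred for f: both mention the same collection type 'c.
print(improve([("collects", "'a", "'c"), ("collects", "'b", "'c")]))

# The constraints inferred for g: the implied equality is bool ~ int,
# which is exactly the type error that gets reported.
print(improve([("collects", "bool", "'c"), ("collects", "int", "'c")]))
```

In the first case the only implied equality is between `'a` and `'b`; in the second it is between `bool` and `int`, which cannot hold.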

We can observe improvement within the language using a pair of data types, `(:-) : constraint ->
constraint -> type`{.amulet} and `dict : constraint -> type`{.amulet}, which serve as witnesses of
implication between constraints and a single constraint respectively.

```amulet
type dict 'c = Dict : 'c => dict 'c
type 'p :- 'q = Sub of ('p => unit -> dict 'q)

let improve : forall 'a 'b 'c. (collects 'a 'c * collects 'b 'c) :- ('a ~ 'b) =
  Sub (fun _ -> Dict)
```

Because this program type-checks, we can be sure that `collects 'a 'c * collects 'b 'c`{.amulet}
implies `'a`{.amulet} is equal to `'b`{.amulet}. Neat!

### Computing with Fundeps: Natural Numbers and Vectors

If you saw this coming, pat yourself on the back.

I'm required by law to talk about vectors in every post about types. No, really; it's true.

I'm sure everyone's seen this by now, but vectors are cons-lists indexed by their length as a Peano
natural.

```amulet
type nat = Z | S of nat

type vect 'n 'a =
  | Nil : vect Z 'a
  | Cons : 'a * vect 'n 'a -> vect (S 'n) 'a
```

Our running objective for this post will be to write a function to append two vectors, such that the
length of the result is the sum of the lengths of the arguments.[^3] But how do we even write the
type of such a function?

Here we can use a type class with functional dependencies witnessing the fact that $a + b = c$, for
some $a$, $b$, $c$ all in $\mathbb{N}$. Obviously, knowing $a$ and $b$ is enough to know $c$, and the
functional dependency expresses that. Due to the way we're going to be implementing `add`{.amulet},
the other two functional dependencies aren't admissible.

```amulet
class add 'a 'b 'c | 'a 'b -> 'c begin end
```

Adding zero to something just results in that something, and if $a + b = c$ then $(1 + a) + b = 1 + c$.

```amulet
instance add Z 'a 'a begin end
instance add 'a 'b 'c => add (S 'a) 'b (S 'c) begin end
```
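
To see how these two instances compute, here is a rough Python model of resolving `add a b ?` by recursion on the first argument. The encoding of Peano numerals as nested tuples and the names `solve_add` and `to_int` are assumptions for the sake of the sketch; the real work is done by Amulet's instance resolution, not by code like this.

```python
Z = ("Z",)


def S(n):
    return ("S", n)


def solve_add(a, b):
    """Find the c such that `add a b c` holds, mirroring the two instances."""
    if a == Z:
        # instance add Z 'a 'a
        return b
    if a[0] == "S":
        # instance add 'a 'b 'c => add (S 'a) 'b (S 'c)
        return S(solve_add(a[1], b))
    raise ValueError("no instance: the first argument is not a known numeral")


def to_int(n):
    """Convert a Peano numeral to an int, for display only."""
    return 0 if n == Z else 1 + to_int(n[1])


two, one = S(S(Z)), S(Z)
print(to_int(solve_add(two, one)))  # 3
```

Note that the search is driven entirely by the shape of the first argument, which is one way to see why only the dependency `'a 'b -> 'c`{.amulet} is declared.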

With this in hand, we can write a function to append vectors.

```amulet
let append : forall 'n 'k 'm 'a. add 'n 'k 'm
          => vect 'n 'a -> vect 'k 'a -> vect 'm 'a =
  fun xs ys ->
    match xs with
    | Nil -> ys
    | Cons (x, xs) -> Cons (x, append xs ys)
```

Success!

... or maybe not. Amulet's complaining about our definition of `append`{.amulet} even though it's
correct; what gives?

The problem is that while functional dependencies let us conclude equalities from pairs of instances,
they don't do us any good if there's a single instance. So we need a way to reflect the equalities in
a way that can be pattern-matched on. If your GADT senses are going off, that's a good thing.

#### Computing with Evidence

This is terribly boring to do and what motivated me to add type functions to Amulet in the first
place, but the solution here is to have a GADT that mirrors the structure of the class instances, and
make the instances compute that. Then, in our append function, we can match on this evidence to reveal
equalities to the type checker.

```amulet
type add_ev 'k 'n 'm =
  | AddZ : add_ev Z 'a 'a
  | AddS : add_ev 'a 'b 'c -> add_ev (S 'a) 'b (S 'c)

class add 'a 'b 'c | 'a 'b -> 'c begin
  val ev : add_ev 'a 'b 'c
end

instance add Z 'a 'a begin
  let ev = AddZ
end

instance add 'a 'b 'c => add (S 'a) 'b (S 'c) begin
  let ev = AddS ev
end
```

Now we can write vector `append`{.amulet} using the `add_ev`{.amulet} type.

```amulet
let append' (ev : add_ev 'n 'm 'k)
            (xs : vect 'n 'a)
            (ys : vect 'm 'a)
          : vect 'k 'a =
  match ev, xs with
  | AddZ, Nil -> ys
  | AddS p, Cons (x, xs) -> Cons (x, append' p xs ys)
and append xs ys = append' ev xs ys
```

This type-checks and we're done.

### Functions on Types: Programming with Closed Type Functions

Look, duplicating the structure of a type class at the value level just so the compiler can figure out
equalities is stupid. Can't we make it do that work instead? Enter _closed type functions_.

```amulet
type function (+) 'n 'm begin
  Z + 'n = 'n
  (S 'k) + 'n = S ('k + 'n)
end
```

This declaration introduces the type constructor `(+)`{.amulet} (usually written infix) and two rules
for reducing types involving saturated applications of `(+)`{.amulet}. Type functions, unlike type
classes, which are defined like Prolog clauses, are defined in a pattern-matching style reminiscent of
Haskell.

Each type function has a set of (potentially overlapping) _equations_, and the compiler will reduce an
application using an equation as soon as it's sure that equation is the only possible equation based
on the currently-known arguments.
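
The two equations for `(+)`{.amulet} can be read as rewrite rules. The following is a hedged Python sketch of that reading, not a description of Amulet's implementation: types are nested tuples, type variables are plain strings such as `"'n"`, and `normalise` is a name made up for this example. It also normalises eagerly, which is a simplification.

```python
Z = ("Z",)


def S(n):
    return ("S", n)


def plus(n, m):
    return ("+", n, m)


def normalise(t):
    """Rewrite a type built from Z, S, (+) and type variables (strings)."""
    if isinstance(t, str) or t == Z:
        return t
    if t[0] == "S":
        return S(normalise(t[1]))
    # Otherwise t is an application of (+); look at its first argument.
    n, m = normalise(t[1]), normalise(t[2])
    if n == Z:
        # Z + 'n = 'n
        return m
    if not isinstance(n, str) and n[0] == "S":
        # (S 'k) + 'n = S ('k + 'n)
        return S(normalise(plus(n[1], m)))
    # Stuck: the first argument is not yet known to be Z or S of something,
    # so neither equation can be chosen.
    return plus(n, m)


print(normalise(plus(S(Z), S(S(Z)))))  # reduces all the way to a numeral
print(normalise(plus("'n", Z)))        # stays stuck on the variable 'n
```

An application whose first argument is a type variable stays as it is until more is known, which matches the "as soon as it's sure" condition above.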

Using the type function `(+)`{.amulet} we can use our original implementation of `append`{.amulet} and
have it type-check:

```amulet
let append (xs : vect 'n 'a) (ys : vect 'k 'a) : vect ('n + 'k) 'a =
  match xs with
  | Nil -> ys
  | Cons (x, xs) -> Cons (x, append xs ys)

let ys = append (Cons (1, Nil)) (Cons (2, Cons (3, Nil)))
```

Now, a bit of a strange thing is that Amulet reduces type family applications as lazily as possible,
so that `ys`{.amulet} above has type `vect (S Z + S (S Z)) int`{.amulet}. In practice, this isn't an
issue, as a simple ascription shows that this type is equal to the more orthodox `vect (S (S (S Z)))
int`{.amulet}.

```amulet
let zs : vect (S (S (S Z))) int = ys
```

Internally, type functions do pretty much the same thing as the functional dependency + evidence
approach we used earlier. Each equation gives rise to an equality _axiom_, represented as a
constructor because our intermediate language pretty much lets constructors return whatever they damn
want.

```amulet
type + '(n : nat) '(m : nat) =
  | awp : forall 'n 'm 'r. 'n ~ Z -> 'm ~ 'r -> ('n + 'm) ~ 'r
  | awq : forall 'n 'k 'm 'l. 'n ~ (S 'k) -> 'm ~ 'l
       -> ('n + 'm) ~ (S ('k + 'l))
```

These symbols have ugly autogenerated names because they're internal to the compiler and should never
appear to users, but you can see that `awp`{.amulet} and `awq`{.amulet} correspond to each clause of
the `(+)`{.amulet} type function, with a bit more freedom in renaming type variables.

### Custom Type Errors: Typing Better

Sometimes - I mean, pretty often - you have better domain knowledge than Amulet. For instance, you
might know that it's impossible to `show`{.amulet} a function. The `type_error`{.amulet} type family
lets you tell the type checker this:

```amulet
instance
  (type_error (String "Can't show functional type:" :<>: ShowType ('a -> 'b)))
  => show ('a -> 'b)
begin
  let show _ = ""
end
```

Now trying to use `show`{.amulet} on a function value will give you a nice error message:

```amulet
let _ = show (fun x -> x + 1)
```

```
  1 │ let _ = show (fun x -> x + 1)
    │         ^^^^^^^^^^^^^^^^^^^^^
  Can't show functional type: int -> int
```

### Type Families can Overlap

Type families can tell when two types are equal or not:

```amulet
type function equal 'a 'b begin
  equal 'a 'a = True
  equal 'a 'b = False
end
```
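
One way to picture how the two overlapping equations of `equal`{.amulet} are chosen between is the following Python sketch. It handles atomic types only, and it is my own model based on the reduction rule described earlier ("as soon as it's sure"), not a transcription of Amulet's algorithm; `is_var` and the use of `None` for a stuck application are assumptions.

```python
def is_var(t):
    """Type variables are written with a leading quote, as in Amulet."""
    return t.startswith("'")


def equal(a, b):
    """Reduce `equal a b` for atomic types; None means the application is stuck."""
    if a == b:
        # The first equation matches whatever the variables turn out to be.
        return True
    if not is_var(a) and not is_var(b):
        # Two distinct known types can never match the first equation,
        # so the second equation is the only candidate.
        return False
    # A variable might still be instantiated so that both sides coincide,
    # so it is too early to pick an equation.
    return None


print(equal("int", "int"))   # True
print(equal("int", "bool"))  # False
print(equal("'a", "int"))    # None
```

The stuck case is the interesting one: answering `False` for `equal 'a int`{.amulet} would be wrong if `'a`{.amulet} were later found to be `int`{.amulet}.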

But overlapping equations need to agree:

```amulet
type function overlap_not_ok 'a begin
  overlap_not_ok int = string
  overlap_not_ok int = int
end
```

```
Overlapping equations for overlap_not_ok int
  • Note: first defined here,
  2 │ overlap_not_ok int = string
    │ ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  but also defined here
  3 │ overlap_not_ok int = int
    │ ^^^^^^^^^^^^^^^^^^^^^^^^
```

### Conclusion

Type families and type classes with functional dependencies are both ways to introduce computation in
the type system. They both have their strengths and weaknesses: fundeps allow improvement to inferred
types, but type families interact better with GADTs (since they generate more equalities). Both are
important in a language with a focus on type safety, in my opinion.

[^1]: This is not actually the definition of a relation with full generality; set theorists are
concerned with arbitrary families of sets indexed by some $i \in I$, where $I$ is a set of indices.
Here, we've set $I = \mathbb{N}$ and restricted ourselves to the case where the elements of a
relation are tuples.

[^2]: At least it's not category theory.

[^3]: In the shower today I actually realised that the `append`{.amulet} function on vectors is a
witness to the algebraic identity $a^n * a^m = a^{n + m}$. Think about it: the `vect 'n`{.amulet}
functor is representable by `fin 'n`{.amulet}, i.e. it is isomorphic to functions
`fin 'n -> 'a`{.amulet}. By definition, `fin 'n`{.amulet} is the type with `'n`{.amulet} elements, and
arrow types `'a -> 'b`{.amulet} have $\text{size}(b)^{\text{size}(a)}$ elements, which leads us to
conclude that `vect 'n 'a`{.amulet} has $\text{size}(a)^n$ elements.