---
title: Dependent Types
date: September 08, 2017
maths: true
---

Dependent types are pretty cool, yo. This post is a semi-structured
ramble about [dtt](https://ahti-saarelainen.zgrep.org/git/hydraz/dtt),
a small dependently-typed "programming language" inspired by Thierry
Coquand's Calculus of (inductive) Constructions (though, note that the
_induction_ part is still lacking: There is support for defining
inductive data types, and destructuring them by pattern matching, but
since there's no totality checker, recursion is disallowed).

`dtt` is written in Haskell, and served as a learning experience both in
type theory and in writing programs using [extensible
effects](https://hackage.haskell.org/package/freer). I *do* partly regret
the choice of effects implementation (the more popular
[`extensible-effects`](https://hackage.haskell.org/package/extensible-effects)
did not build on the Nixpkgs channel I had, so I went with `freer`;
refactoring between the two should be easy enough, but I still haven't
gotten around to it yet).

I originally intended for this post to be a Literate Haskell file,
interleaving explanation with code. However, for a pet project, `dtt`'s
code base quickly spiralled out of control, and is now over a thousand
lines long: It's safe to say I did not expect this one bit.

### The language

`dtt` is a very standard $\lambda_{\prod{}}$ calculus. We have all three
axes of Barendregt's lambda cube, by virtue of having types be first-class
values: on top of values depending on values (functions), we get values
depending on types (polymorphism), types depending on types (type
operators), and types depending on values (dependent types). This places
dtt squarely at the top of the cube, along with other type theories such
as the Calculus of Constructions (the theoretical basis for the Coq proof
assistant) and TT (the type theory behind the Idris programming language).

The syntax is very simple. We have the standard lambda calculus
constructs - $\lambda$-abstraction, application and variables - along
with `let`{.haskell}-bindings, pattern-matching `case` expressions, and
the dependent type goodies: $\prod$-abstraction and `Set`{.haskell}.

_As an aside_, pi types are so called because the dependent function
space may (if you follow the "types are sets of values" line of
thinking) be viewed as a cartesian product of types, indexed by the
domain. Consider a type `A`{.haskell} with inhabitants `Foo`{.haskell},
`Bar`{.haskell} and a type `B`{.haskell} with inhabitant
`Quux`{.haskell}. The dependent product
$\displaystyle\prod_{(x: \mathtt{A})}\mathtt{B}$, then, has a single
inhabitant: the function pairing `Foo`{.haskell} with `Quux`{.haskell}
and `Bar`{.haskell} with `Quux`{.haskell} - as a set,
$\{(\mathtt{Foo}, \mathtt{Quux}), (\mathtt{Bar}, \mathtt{Quux})\}$.

You'll notice that dtt does not have a dedicated arrow type. Indeed, the
dependent product subsumes both the $\forall$ quantifier of System $F$,
and the arrow type $\to$ of the simply-typed lambda calculus. Keep this
in mind: It'll be important later.

Since dtt's syntax is unified (i.e., there's no stratification of terms
and types), the language can be - and is - entirely contained in
a single algebraic data type. All binders are _explicitly typed_, seeing
as inference for dependent types is undecidable (and, therefore,
bad).[^1]

```haskell
type Type = Term

data Term
  = Variable Var
  | Set Int
  | TypeHint Term Type
  | Pi Var Type Type
  | Lam Var Type Term
  | Let Var Term Term
  | App Term Term
  | Match Term [(Pattern, Term)]
  deriving (Eq, Show, Ord)
```

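To make the earlier point about the missing arrow type concrete, here is
a sketch of the polymorphic identity function written directly against
this AST (the `Name`s and the comments' surface syntax are mine, not an
excerpt from `dtt`): a single `Pi`{.haskell} constructor plays the role
of both System $F$'s $\forall$ and the simply-typed $\to$.

```haskell
-- id : (a : Set 0) -> (x : a) -> a
idType :: Type
idType =
  Pi (Name "a") (Set 0) $
    Pi (Name "x") (Variable (Name "a")) (Variable (Name "a"))

-- id = \(a : Set 0). \(x : a). x
idTerm :: Term
idTerm =
  Lam (Name "a") (Set 0) $
    Lam (Name "x") (Variable (Name "a")) (Variable (Name "x"))
```
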
The `TypeHint`{.haskell} term constructor, not mentioned before, is
merely a convenience: It allows the programmer to check their
assumptions and help the type checker by supplying a type. (Note that we
don't assume this type is correct, as you'll see later; it merely helps
guide inference.)

Variables aren't merely strings because of the large number of
substitutions we have to perform: Instead of generating a new name, we
increment a counter attached to the variable - the pretty printer uses
the original name, to great effect, when unambiguous.

```haskell
data Var
  = Name String
  | Refresh String Int
  | Irrelevant
  deriving (Eq, Show, Ord)
```

The `Irrelevant`{.haskell} variable constructor is used to support $a
\to b$ as sugar for $\displaystyle\prod_{(x: a)} b$ when $x$ does not
appear free in $b$. As soon as the type checker encounters an
`Irrelevant`{.haskell} variable, it is refreshed with a new name.

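A minimal sketch of what that sugar amounts to at the level of the AST
(the helper name `arrow` is made up for this post; the real parser is
more involved):

```haskell
-- `a -> b` elaborates to a ∏ whose binder is never referenced; the
-- checker later swaps Irrelevant for a fresh, counter-tagged name
-- before going under the binder.
arrow :: Type -> Type -> Type
arrow a b = Pi Irrelevant a b
```
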
`dtt` does not support implicit arguments (as Idris does), so all
parameters, including type parameters, must be bound explicitly. For
this, we support several kinds of syntactic sugar. First, all
abstractions support multiple variables in a _binding group_. This
allows the programmer to write `(a, b, c : α) -> β` instead of
`(a : α) -> (b : α) -> (c : α) -> β`. Furthermore, there is special
syntax `/\a` for single-parameter abstraction with type `Set
0`{.haskell}, and lambda abstractions support multiple binding groups.

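As a rough sketch, elaborating a binding group is just a fold over the
bound names into nested $\prod$s (`piGroup` is a name I'm inventing for
illustration, not the actual parser code):

```haskell
-- (a, b, c : α) -> β  becomes  (a : α) -> (b : α) -> (c : α) -> β
piGroup :: [Var] -> Type -> Type -> Type
piGroup vars ty body = foldr (\v rest -> Pi v ty rest) body vars
```
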
As mentioned before, the language does not support recursion (either
general or well-founded). Though I would like to support it, writing
a totality checker is hard - way harder than type checking
$\lambda_{\prod{}}$, in fact. However, an alternative way of inspecting
inductive values _does_ exist: eliminators. These are dependent versions
of catamorphisms, and basically encode a proof by induction. An
inductive data type such as `Nat` gives rise to an eliminator much like
it gives rise to a natural catamorphism.

```
inductive Nat : Type of {
  Z : Nat;
  S : Nat -> Nat
}

natElim : (P : Nat -> Type)
        -> P Z
        -> ((k : Nat) -> P k -> P (S k))
        -> (n : Nat)
        -> P n
```

If you squint, you'll see that the eliminator models a proof by
induction (of the proposition $P$) on the natural number $n$: The type
signature basically states "Given a proposition $P$ on $\mathbb{N}$,
a proof of $P_0$, a proof that $P_{(k + 1)}$ follows from $P_k$, and
a natural number $n$, I'll give you a proof of $P_n$."

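To see how an eliminator stands in for recursion, here is the same idea
transplanted to plain Haskell (non-dependent, so the motive $P$
collapses into an ordinary result type `r`): addition on `Nat`, written
without ever calling `add` recursively.

```haskell
data Nat = Z | S Nat

-- The non-dependent eliminator: a base case, a step case, and the
-- number to eliminate. The recursion lives here, once and for all.
natElim :: r -> (Nat -> r -> r) -> Nat -> r
natElim base _    Z     = base
natElim base step (S k) = step k (natElim base step k)

-- Addition by induction on the first argument.
add :: Nat -> Nat -> Nat
add n m = natElim m (\_ rec -> S rec) n
```
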
This understanding of computations as proofs and types as propositions,
by the way, is called the [Curry-Howard
Isomorphism](https://en.wikipedia.org/wiki/Curry-Howard_correspondence).
The regular, simply-typed lambda calculus corresponds to natural
deduction, while $\lambda_{\prod{}}$ corresponds to predicate logic.

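For a (simply-typed) taste of the correspondence: implication becomes
the function type, and modus ponens is just function application.

```haskell
-- "From P implies Q and P, conclude Q" - the proof term is application.
modusPonens :: (p -> q) -> p -> q
modusPonens pImpliesQ p = pImpliesQ p
```
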
### The type system

Should this be called the term system?

Our type inference algorithm, contrary to what you might expect for such
a complicated system, is actually quite simple. Unfortunately, the code
isn't, and thus isn't reproduced in its entirety below.

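Before diving into the cases, here is roughly the interface the snippets
below are written against. These signatures are my reconstruction from
the snippets themselves, not an excerpt from the source - in particular,
`TypeCheck`{.haskell} and `Context`{.haskell} are stand-in names, and the
real functions are computations in a `freer` effect stack.

```haskell
infer      :: Term -> TypeCheck Type                   -- infer the type of a term
inferSet   :: Type -> TypeCheck Int                    -- universe index; errors if the argument isn't a type
lookupType :: Var -> TypeCheck (Maybe Type)            -- consult the typing context
insertType :: (Var, Type) -> Context -> Context        -- extend the context (used with `local`)
subst      :: [(Var, Term)] -> Term -> TypeCheck Term  -- substitution (effectful, for fresh names)
```
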
#### Variables

The simplest case in any type system. The typing judgement that gives
rise to this case is pretty much the identity: $\Gamma \vdash \alpha:
\tau \therefore \Gamma \vdash \alpha: \tau$. If, from the current typing
context we know that $\alpha$ has type $\tau$, then we know that
$\alpha$ has type $\tau$.

```haskell
Variable x -> do
  ty <- lookupType x -- (I)
  case ty of
    Just t -> pure t -- (II)
    Nothing -> throwError (NotFound x) -- (III)
```

1. Look up the type of the variable in the current context.
2. If we found a type for it, then return that (this is the happy path).
3. If we didn't find a type for it, we raise a type error.

#### `Set`{.haskell}s

Since dtt has a cumulative hierarchy of universes, $\mathtt{Set}_k:
\mathtt{Set}_{(k + 1)}$. This helps us avoid the logical inconsistency
introduced by having _type-in-type_[^2], i.e. $\mathtt{Type}:
\mathtt{Type}$. We say that $\mathtt{Set}_0$ is the type of _small
types_: in fact, $\mathtt{Set}_0$ is where most computation actually
happens, seeing as $\mathtt{Set}_k$ for $k \ge 1$ is reserved for
$\prod$-abstractions quantifying over such types.

```haskell
Set k -> pure . Set . (+1) $ k
```

#### Type hints

Type hints are the first appearance of the unification engine, by far
the most complex part of dtt's type checker. For now, it suffices to
know that ``t1 `assertEquality` t2``{.haskell} errors if the types
`t1` and `t2` can't be made to _line up_, i.e., unify.

For type hints, we infer the type of the given expression, and compare
it against the user-provided type, raising an error if they don't match.
Because of how the unification engine works, the given type may be more
general (or specific) than the inferred one.

```haskell
TypeHint v t -> do
  it <- infer v
  t `assertEquality` it
  pure t
```

#### $\prod$-abstractions

This is where it starts to get interesting. First, we mandate that the
parameter type is inhabited (basically, that it _is_, in fact, a type).
The dependent product $\displaystyle\prod_{(x : 0)} \alpha$, while allowed by the
language's grammar, is entirely meaningless: There's no way to construct
an inhabitant of $0$, and thus this function may never be applied.

Then, in the context extended with the parameter and its type, we
require that the consequent is also a type itself: The function
$\displaystyle\prod_{(x: \mathbb{N})} 0$, while again a valid parse, is
also meaningless.

The type of the overall abstraction is, then, the maximum of the
universe indices of the parameter and the consequent.

```haskell
Pi x p c -> do
  k1 <- inferSet p
  k2 <- local (insertType (x, p)) $
    inferSet c
  pure $ Set (k1 `max` k2)
```

#### $\lambda$-abstractions

Much like in the simply-typed lambda calculus, the type of
a $\lambda$-abstraction is an arrow between the type of its parameter
and the type of its body. Of course, $\lambda_{\prod{}}$ incurs the
additional constraint that the type of the parameter is inhabited.

Alas, we don't have arrows. So, we "lift" the lambda's parameter to the
type level, and bind it in a $\prod$-abstraction.

```haskell
Lam x t b -> do
  _ <- inferSet t
  Pi x t <$> local (insertType (x, t)) (infer b)
```

Note that, much like in the `Pi`{.haskell} case, we type-check the body
in a context extended with the parameter's type.

#### Application

Application is the most interesting rule, as it has to handle not only
inference but also instantiation of $\prod$-abstractions.
Instantiation is, much like application, handled by $\beta$-reduction,
with the difference being that instantiation happens during type
checking (applying a $\prod$-abstraction is meaningless) and application
happens during normalisation (instancing a $\lambda$-abstraction is
meaningless).

The type of the function being applied needs to be
a $\prod$-abstraction, while the type of the operand needs to be
inhabited. Note that the second constraint is not written out
explicitly: It's handled by the `Pi`{.haskell} case above, and
furthermore by the unification engine.

```haskell
App e1 e2 -> do
  t1 <- infer e1
  case t1 of
    Pi vr i o -> do
      t2 <- infer e2
      t2 `assertEquality` i
      N.normalise =<< subst [(vr, e2)] o -- (I)
    e -> throwError (ExpectedPi e) -- (II)
```

1. Notice that, here, we don't substitute the $\prod$-bound variable by
the type of $e_2$: That'd make us equivalent to System $F$. The whole
_deal_ with dependent types is that types depend on values, and that
entirely stems from this one line. By instancing a type variable with
a value, we allow _types_ to depend on _values_ (there's a tiny worked
example just after this list).
2. Oh, and if we didn't get a $\prod$-abstraction, error.

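To make (I) concrete: applying a function of type
$\displaystyle\prod_{(n : \mathtt{Nat})} P\, n$ to the value
$\mathtt{S\ Z}$ substitutes that _value_ for the bound variable in the
codomain, so the application as a whole has type $P\ (\mathtt{S\ Z})$ -
a type that mentions a value.
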
---

You'll notice that two typing rules are missing here: One for handling
`let`{.haskell}s, which was not included because it is entirely
uninteresting, and one for `case ... of`{.haskell} expressions, which
was redacted because it is entirely a mess.

Hopefully, in the future, the typing of `case` expressions will be
simpler - if not, they'll probably be replaced by eliminators.

### Unification and Constraint Solving

The unification engine is the man behind the curtain in type checking:
We often don't pay attention to it, but it's the driving force behind it
all. Fortunately, in our case, unification is entirely trivial: Solving
is the hard bit.

The job of the unification engine is to produce a set of constraints
that have to be satisfied in order for two types to be equal. Then, the
solver is run on these constraints to assert that they are logically
consistent, and potentially produce substitutions that _reify_ those
constraints.

Our solver isn't that cool, though, so it just verifies consistency.

The kinds of constraints we can generate are given by the data type below.

```haskell
data Constraint
  = Instance Var Term -- (1)
  | Equal Term Term -- (2)
  | EqualTypes Type Type -- (3)
  | IsSet Type -- (4)
  deriving (Eq, Show, Ord)
```

1. The constraint `Instance v t`{.haskell} corresponds to a substitution
between `v` and the term `t`.
2. A constraint `Equal a b`{.haskell} states that the two terms `a` and
`b` are equal under normalisation.
3. Ditto, but with their _types_ (we normalise, infer, and check for
equality).
4. A constraint `IsSet t`{.haskell} asserts that the provided type has
inhabitants.

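The `unify`{.haskell} code below doesn't build these constructors
directly, but goes through little helpers (`instanceC`, `equalC`,
`equalTypesC`, `isSetC`) whose definitions aren't reproduced in this
post. I take them to be essentially the following - monadic only so
they compose inside `unify`{.haskell}; the real ones may well do a bit
more:

```haskell
instanceC :: Var -> Term -> TypeCheck [Constraint]
instanceC v t = pure [Instance v t]

equalC, equalTypesC :: Term -> Term -> TypeCheck [Constraint]
equalC a b      = pure [Equal a b]
equalTypesC a b = pure [EqualTypes a b]

isSetC :: Type -> TypeCheck [Constraint]
isSetC t = pure [IsSet t]
```
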
#### Unification

Unification of most terms is entirely uninteresting. Simply line up the
structures and produce the appropriate equality (or instance)
constraints.

```haskell
unify (Variable a) b = instanceC a b
unify b (Variable a) = instanceC a b
unify (Set a) (Set b) | a == b = pure []
unify (App x y) (App x' y') =
  (++) <$> unify x x' <*> unify y y'
unify (TypeHint a b) (TypeHint c d) =
  (++) <$> unify a c <*> unify b d
unify a b = throwError (NotEqual a b)
```

Those are all the boring cases, and I'm not going to comment on them.
Similarly boring are binders, which were abstracted out because hlint
told me to.

```haskell
unify (Lam v1 t1 b1) (Lam v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Pi v1 t1 b1) (Pi v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Let v1 t1 b1) (Let v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)

unifyBinder (v1, v2) (t1, t2) (b1, b2) = do
  (a, b) <- (,) <$> unify (Variable v1) (Variable v2) <*> unify t1 t2
  ((a ++ b) ++) <$> unify b1 b2
```

There are two interesting cases: Unification between some term and a pi
abstraction, and unification between two variables.

```haskell
unify ta@(Variable a) tb@(Variable b)
  | a == b = pure []
  | otherwise = do
      (x, y) <- (,) <$> lookupType a <*> lookupType b
      case (x, y) of
        (Just _, Just _) -> do
          ca <- equalTypesC ta tb
          cb <- equalC ta tb
          pure (ca ++ cb)
        (Just x', Nothing) -> instanceC b x'
        (Nothing, Just x') -> instanceC a x'
        (Nothing, Nothing) -> instanceC a (Variable b)
```

If the variables are syntactically the same, then we're done, and no
constraints have to be generated (Technically you could generate an
entirely trivial equality constraint, but this puts unnecessary pressure
on the solver).

If exactly one of the variables has a known type, we generate an
instance constraint between the variable whose type is unknown and the
other's type.

If both variables have known types, we equate both their types and the
variables themselves. This is done mostly for error messages' sakes,
seeing as if two values are propositionally equal, so are their types.

Unification between a term and a $\prod$-abstraction is the most
interesting case: We check that the $\prod$ type abstracts over a type
(i.e., it corresponds to a System F $\forall$ instead of a System
F $\to$), and _instance_ the $\prod$ with a fresh type variable.

```haskell
unifyPi v1 t1 b1 a = do
  id <- refresh Irrelevant
  ss <- isSetC t1
  pi' <- subst [(v1, Variable id)] b1
  (++ ss) <$> unify a pi'

unify a (Pi v1 t1 b1) = unifyPi v1 t1 b1 a
unify (Pi v1 t1 b1) a = unifyPi v1 t1 b1 a
```

#### Solving

Solving is a recursive function over the list of constraints (a
catamorphism!) with some additional state: Namely, a strict map of
already-performed substitutions. Let's work through the cases in reverse
order of complexity (and, interestingly, the reverse of the order in
which they appear in the source code).

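The `solve` called by `assertEquality` at the very end is then,
presumably, just `solveInner` seeded with an empty substitution map
(this is a sketch, not the actual source):

```haskell
import qualified Data.Map.Strict as M

solve :: [Constraint] -> TypeCheck ()
solve = solveInner M.empty
```
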
##### No constraints

Solving an empty list of constraints is entirely trivial.

```haskell
solveInner _ [] = pure ()
```

##### `IsSet`{.haskell}

We infer the index of the universe of the given type, much like in the
inference case for $\prod$-abstractions, and check the remaining
constraints.

```haskell
solveInner map (IsSet t:xs) = do
  _ <- inferSet t
  solveInner map xs
```

##### `EqualTypes`{.haskell}

We infer the types of both provided values, and generate an equality
constraint.

```haskell
solveInner map (EqualTypes a b:xs) = do
  ta <- infer a
  tb <- infer b
  solveInner map (Equal ta tb:xs)
```

##### `Equal`{.haskell}

We merely have to check for syntactic equality of the (normal forms of)
terms, because the heavy lifting of destructuring and lining up was done
by the unification engine.

```haskell
solveInner map (Equal a b:xs) = do
  a' <- N.normalise a
  b' <- N.normalise b
  eq <- equal a' b'
  if eq
    then solveInner map xs
    else throwError (NotEqual a b)
```

##### `Instance`{.haskell}

If the variable we're instancing is already in the map, and the thing
we're instancing it to _now_ is not the same as before, we have an
inconsistent set of substitutions and must error.

```haskell
solveInner map (Instance a b:xs)
  | a `M.member` map
  , b /= map M.! a
  , Irrelevant /= a
  = throwError $ InconsistentSubsts (a, b) (map M.! a)
```

Otherwise, if we have a coherent set of instances, we add the instance
both to scope and to our local state map and continue checking.

```haskell
  | otherwise =
    local (insertType (a, b)) $
      solveInner (M.insert a b map) xs
```

---

Now that we have both `unify` and `solve`, we can write
`assertEquality`: We unify the two types, and then try to solve the set
of constraints.

```haskell
assertEquality t1 t2 = do
  cs <- unify t1 t2
  solve cs
```

The real implementation will catch and re-throw any errors raised by
`solve` to add appropriate context, and that's not the only case where
"real implementation" and "blag implementation" differ.

### Conclusion

Wow, that was a lot of writing. This conclusion begins on exactly the
500th line of the Markdown source of this article, and this is the
longest article on this blag (by far). However, that's not to say it's
bad: It was amazing to write, and writing `dtt` was also amazing. I am
not good at conclusions.

`dtt` is available under the BSD 3-clause licence, though I must warn
you that the source code doesn't have many comments.

I hope you learned nearly as much by reading this as I did by writing it.

[^1]: As [proven](https://link.springer.com/chapter/10.1007/BFb0037103) by Gilles Dowek.

[^2]: See [System U](https://en.wikipedia.org/wiki/System_U), also
Girard's paradox - the type theory equivalent of [Russell's
paradox](https://en.wikipedia.org/wiki/Russell%27s_paradox).