title | date | maths |
---|---|---|
Dependent Types | September 08, 2017 | true |
Dependent types are pretty cool, yo. This post is a semi-structured ramble about dtt, a small dependently-typed "programming language" inspired by Thierry Coquand's Calculus of (inductive) Constructions (though, note that the induction part is still lacking: There is support for defining inductive data types, and destructuring them by pattern matching, but since there's no totality checker, recursion is disallowed).
dtt is written in Haskell, and served as a learning experience both in type theory and in writing programs using extensible effects. I do partly regret the implementation of effects I chose (the more popular `extensible-effects` did not build on the Nixpkgs channel I had, so I went with `freer`; refactoring between the two should be easy enough, but I still haven't gotten around to it).
I originally intended for this post to be a Literate Haskell file, interleaving explanation with code. However, for a pet project, dtt's code base quickly spiralled out of control, and is now over a thousand lines long: It's safe to say I did not expect this one bit.
dtt is a very standard $\lambda_{\prod{}}$ calculus. We have all 4 axes of Barendregt's lambda cube, by virtue of having types be first-class values: Values depending on values (functions), values depending on types (polymorphism), types depending on types (type operators), and types depending on values (dependent types). This places dtt squarely at the top, along with other type theories such as the Calculus of Constructions (the theoretical basis for the Coq proof assistant) and TT (the type theory behind the Idris programming language).
The syntax is very simple. We have the standard lambda calculus constructs - $\lambda$-abstraction, application and variables - along with `let`{.haskell}-bindings, pattern-matching `case`{.haskell} expressions, and the dependent type goodies: $\prod$-abstraction and `Set`{.haskell}.
As an aside, pi types are so called because the dependent function space may (if you follow the "types are sets of values" line of thinking) be viewed as the cartesian product of types. Consider a type `A`{.haskell} with inhabitants `Foo`{.haskell}, `Bar`{.haskell} and a type `B`{.haskell} with inhabitant `Quux`{.haskell}. A dependent product $\displaystyle\prod_{(x: \mathtt{A})}\mathtt{B}$, then, has inhabitants `(Foo, Quux)`{.haskell} and `(Bar, Quux)`{.haskell}.
You'll notice that dtt does not have a dedicated arrow type. Indeed, the dependent product subsumes both the $\forall$ quantifier of System $F$, and the arrow type $\to$ of the simply-typed lambda calculus. Keep this in mind: It'll be important later.
Since dtt's syntax is unified (i.e., there's no stratification of terms and types), the language can be - and is - entirely contained in a single algebraic data type. All binders are explicitly typed, seeing as inference for dependent types is undecidable (and, therefore, bad).1
```haskell
type Type = Term

data Term
  = Variable Var
  | Set Int
  | TypeHint Term Type
  | Pi Var Type Type
  | Lam Var Type Term
  | Let Var Term Term
  | App Term Term
  | Match Term [(Pattern, Term)]
  deriving (Eq, Show, Ord)
```
The `TypeHint`{.haskell} term constructor, not mentioned before, is merely a convenience: It allows the programmer to check their assumptions and help the type checker by supplying a type (note that we don't assume this type is correct, as you'll see later; it merely helps guide inference).
Variables aren't merely strings because of the large number of substitutions we have to perform: For this, instead of generating a new name, we increment a counter attached to the variable - the pretty printer uses the original name to great effect, when unambiguous.
```haskell
data Var
  = Name String
  | Refresh String Int
  | Irrelevant
  deriving (Eq, Show, Ord)
```
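The refreshing operation itself isn't reproduced in this post; as a rough, pure sketch of the idea (the real `refresh`{.haskell} lives inside the effect stack and pulls its counter from state, and `refreshVar` here is just an illustrative name):

```haskell
-- A pure sketch of refreshing: given the next unused counter, produce a
-- fresh variable that still remembers its user-facing name for printing.
refreshVar :: Int -> Var -> Var
refreshVar n (Name s)      = Refresh s n
refreshVar n (Refresh s _) = Refresh s n
refreshVar n Irrelevant    = Refresh "x" n  -- placeholder name for Irrelevant binders
```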
The `Irrelevant`{.haskell} variable constructor is used to support $a \to b$ as sugar for $\displaystyle\prod_{(x: a)} b$ when $x$ does not appear free in $b$. As soon as the type checker encounters an `Irrelevant`{.haskell} variable, it is refreshed with a new name.
dtt does not have implicit parameters (as in Idris), so all parameters, including type parameters, must be bound explicitly. For this, we support several kinds of syntactic sugar. First, all abstractions support multiple variables in a binding group. This allows the programmer to write `(a, b, c : α) -> β` instead of `(a : α) -> (b : α) -> (c : α) -> β`. Furthermore, there is special syntax `/\a` for single-parameter abstraction with type `Set 0`{.haskell}, and lambda abstractions support multiple binding groups.
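As a sketch of what that sugar amounts to (with a hypothetical `desugarPis` helper - the real parser does this while building the AST), a binding group simply becomes a right-nested chain of `Pi`{.haskell} binders sharing one domain:

```haskell
-- (a, b, c : α) -> β  becomes  (a : α) -> (b : α) -> (c : α) -> β.
desugarPis :: [Var] -> Type -> Type -> Type
desugarPis vars domain codomain =
  foldr (\v body -> Pi v domain body) codomain vars
```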
As mentioned before, the language does not support recursion (either general or well-founded). Though I would like to add it, writing a totality checker is hard - way harder than type checking $\lambda_{\prod{}}$, in fact. However, an alternative way of inspecting inductive values _does_ exist: eliminators. These are dependent versions of catamorphisms, and basically encode a proof by induction. An inductive data type such as `Nat` gives rise to an eliminator much like it gives rise to a natural catamorphism.
```
inductive Nat : Type of {
  Z : Nat;
  S : Nat -> Nat
}

natElim : (P : Nat -> Type)
       -> P Z
       -> ((k : Nat) -> P k -> P (S k))
       -> (n : Nat)
       -> P n
```
If you squint, you'll see that the eliminator models a proof by induction (of the proposition $P$) on the natural number $n$: The type signature basically states "Given a proposition $P$ on $\mathbb{N}$, a proof of $P_0$, a proof that $P_{(k + 1)}$ follows from $P_k$ and a natural number $n$, I'll give you a proof of $P_n$."
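To make the computational content of the eliminator concrete, here is a Haskell sketch (not part of dtt): since Haskell can't express the motive $P$, this is only the non-dependent shadow of `natElim`{.haskell}, i.e. an ordinary fold, together with addition defined through it.

```haskell
-- The non-dependent shadow of natElim: with the motive fixed to a single
-- result type r, the eliminator degenerates into a fold over Nat.
data Nat = Z | S Nat

natElim :: r -> (Nat -> r -> r) -> Nat -> r
natElim z _ Z     = z
natElim z s (S k) = s k (natElim z s k)

-- Addition by "induction" on the first argument: the base case is m, and
-- the inductive step turns a proof for k into a proof for (S k).
plus :: Nat -> Nat -> Nat
plus n m = natElim m (\_ pk -> S pk) n
```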
This understanding of computations as proofs and types as propositions, by the way, is called the Curry-Howard Isomorphism. The regular, simply-typed lambda calculus corresponds to natural deduction, while $\lambda_{\prod{}}$ corresponds to predicate logic.
Should this be called the term system?
Our type inference algorithm, contrary to what you might expect for such a complicated system, is actually quite simple. Unfortunately, the code isn't, and thus isn't reproduced in its entirety below.
The simplest case in any type system. The typing judgement that gives rise to this case is pretty much the identity: $\Gamma \vdash \alpha: \tau \therefore \Gamma \vdash \alpha: \tau$. If, from the current typing context we know that $\alpha$ has type $\tau$, then we know that $\alpha$ has type $\tau$.
```haskell
Variable x -> do
  ty <- lookupType x -- (I)
  case ty of
    Just t -> pure t -- (II)
    Nothing -> throwError (NotFound x) -- (III)
```
`Set`{.haskell}s
Since dtt has a cumulative hierarchy of universes, $\mathtt{Set}_k : \mathtt{Set}_{(k + 1)}$. This helps us avoid the logical inconsistency introduced by having type-in-type[^2], i.e. $\mathtt{Type}: \mathtt{Type}$. We say that $\mathtt{Set}_0$ is the type of small types: in fact, $\mathtt{Set}_0$ is where most computation actually happens, seeing as $\mathtt{Set}_k$ for $k \ge 1$ is reserved for $\prod$-abstractions quantifying over such types.
```haskell
Set k -> pure . Set . (+1) $ k
```
Type hints are the first appearance of the unification engine, by far the most complex part of dtt's type checker. But for now, it suffices to know that ``t1 `assertEquality` t2``{.haskell} errors if the types `t1` and `t2` can't be made to line up, i.e., unify.
For type hints, we infer the type of given expression, and compare it against the user-provided type, raising an error if they don't match. Because of how the unification engine works, the given type may be more general (or specific) than the inferred one.
```haskell
TypeHint v t -> do
  it <- infer v
  t `assertEquality` it
  pure t
```
This is where it starts to get interesting. First, we mandate that the parameter type is inhabited (basically, that it is, in fact, a type). The dependent product $\displaystyle\prod_{(x : 0)} \alpha$, while allowed by the language's grammar, is entirely meaningless: There's no way to construct an inhabitant of $0$, and thus this function may never be applied.
Then, in the context extended with $(\alpha : \tau)$, we require that the consequent is also a type itself: The function $\displaystyle\prod_{(x: \mathbb{N})} 0$, while again a valid parse, is also meaningless.
The type of the overall abstraction is, then, the maximum value of the indices of the universes of the parameter and the consequent.
```haskell
Pi x p c -> do
  k1 <- inferSet p
  k2 <- local (insertType (x, p)) $
    inferSet c
  pure $ Set (k1 `max` k2)
```
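`inferSet`{.haskell}, used above, isn't reproduced in this post; the idea (a sketch - the real version also goes through normalisation, and `ExpectedSet` is a made-up error constructor here) is to infer a type and insist that it is some universe, handing back its index:

```haskell
-- Sketch of inferSet: the inferred type must be a universe Set k;
-- the index k feeds the `max` computation in the Pi case above.
inferSet t = do
  ty <- infer t
  case ty of
    Set k -> pure k
    e     -> throwError (ExpectedSet e)
```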
Much like in the simply-typed lambda calculus, the type of a $\lambda$-abstraction is an arrow between the type of its parameter and the type of its body. Of course, $\lambda_{\prod{}}$ incurs the additional constraint that the type of the parameter is inhabited.
Alas, we don't have arrows. So, we "lift" the lambda's parameter to the type level, and bind it in a $\prod$-abstraction.
```haskell
Lam x t b -> do
  _ <- inferSet t
  Pi x t <$> local (insertType (x, t)) (infer b)
```
Note that, much like in the `Pi`{.haskell} case, we type-check the body in a context extended with the parameter's type.
Application is the most interesting rule, as it has to handle not only inference but also instantiation of $\prod$-abstractions.
Instantiation is, much like application, handled by $\beta$-reduction, with the difference being that instantiation happens during type checking (applying a $\prod$-abstraction is meaningless) and application happens during normalisation (instancing a $\lambda$-abstraction is meaningless).
The type of the function being applied needs to be a $\prod$-abstraction, while the type of the operand needs to be inhabited. Note that the second constraint is not written out explicitly: It's handled by the `Pi`{.haskell} case above, and furthermore by the unification engine.
```haskell
App e1 e2 -> do
  t1 <- infer e1
  case t1 of
    Pi vr i o -> do
      t2 <- infer e2
      t2 `assertEquality` i
      N.normalise =<< subst [(vr, e2)] o -- (I)
    e -> throwError (ExpectedPi e) -- (II)
```
Notice that, here, we don't substitute the $\prod$-bound variable by the type of $e_2$: That'd make us equivalent to System $F$. The whole deal with dependent types is that types depend on values, and that entirely stems from this one line. By instancing a type variable with a value, we allow types to depend on values.
Oh, and if we didn't get a $\prod$-abstraction, error.
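For a concrete example using only what's above: applying `natElim`{.haskell} to a motive $P$, the two proof arguments, and then to the numeral $\mathtt{S\ Z}$ substitutes the *value* $\mathtt{S\ Z}$ for the $\prod$-bound $n$, so the inferred type of the whole application is $P\ (\mathtt{S\ Z})$ - a type that literally mentions the value it was applied to.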
You'll notice that two typing rules are missing here: One for handling `let`{.haskell}s, which was not included because it is entirely uninteresting, and one for `case ... of`{.haskell} expressions, which was redacted because it is entirely a mess.
Hopefully, in the future, the typing of `case`{.haskell} expressions will be simpler.
The unification engine is the man behind the curtain in type checking: We often don't pay attention to it, but it's the driving force behind it all. Fortunately, in our case, unification is entirely trivial: Solving is the hard bit.
The job of the unification engine is to produce a set of constraints that have to be satisfied in order for two types to be equal. Then, the solver is run on these constraints to assert that they are logically consistent, and potentially produce substitutions that reify those constraints.
Our solver isn't that cool, though, so it just verifies consistency.
The kinds of constraints we can generate are as in the data type below.
```haskell
data Constraint
  = Instance Var Term -- (1)
  | Equal Term Term -- (2)
  | EqualTypes Type Type -- (3)
  | IsSet Type -- (4)
  deriving (Eq, Show, Ord)
```
1. `Instance v t`{.haskell} corresponds to a substitution between the variable `v` and the term `t`.
2. `Equal a b`{.haskell} states that the two terms `a` and `b` are equal under normalisation.
3. `EqualTypes a b`{.haskell} states that the types of the terms `a` and `b` are equal.
4. `IsSet t`{.haskell} asserts that the provided type has inhabitants.

Unification of most terms is entirely uninteresting. Simply line up the structures and produce the appropriate equality (or instance) constraints.
```haskell
unify (Variable a) b = instanceC a b
unify b (Variable a) = instanceC a b
unify (Set a) (Set b) | a == b = pure []
unify (App x y) (App x' y') =
  (++) <$> unify x x' <*> unify y y'
unify (TypeHint a b) (TypeHint c d) =
  (++) <$> unify a c <*> unify b d
unify a b = throwError (NotEqual a b)
```
Those are all the boring cases, and I'm not going to comment on them. Similarly boring are binders, which were abstracted out because hlint told me to.
```haskell
unify (Lam v1 t1 b1) (Lam v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Pi v1 t1 b1)  (Pi v2 t2 b2)  = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Let v1 t1 b1) (Let v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)

unifyBinder (v1, v2) (t1, t2) (b1, b2) = do
  (a, b) <- (,) <$> unify (Variable v1) (Variable v2) <*> unify t1 t2
  ((a ++ b) ++) <$> unify b1 b2
```
There are two interesting cases: Unification between some term and a pi abstraction, and unification between two variables.
```haskell
unify ta@(Variable a) tb@(Variable b)
  | a == b = pure []
  | otherwise = do
      (x, y) <- (,) <$> lookupType a <*> lookupType b
      case (x, y) of
        (Just _, Just _) -> do
          ca <- equalTypesC ta tb
          cb <- equalC ta tb
          pure (ca ++ cb)
        (Just x', Nothing) -> instanceC b x'
        (Nothing, Just x') -> instanceC a x'
        (Nothing, Nothing) -> instanceC a (Variable b)
```
If the variables are syntactically the same, then we're done, and no constraints have to be generated (Technically you could generate an entirely trivial equality constraint, but this puts unnecessary pressure on the solver).
If either variable has a known type, then we generate an instance constraint between the unknown variable and the known one.
If both variables have a value, we equate both the variables themselves and their types. This is done mostly for error messages' sake, seeing as if two values are propositionally equal, so are their types.
Unification between a term and a $\prod$-abstraction is the most interesting case: We check that the $\prod$ type abstracts over a type (i.e., it corresponds to a System F $\forall$ instead of a System F $\to$), and instance the $\prod$ with a fresh type variable.
```haskell
unifyPi v1 t1 b1 a = do
  id <- refresh Irrelevant
  ss <- isSetC t1
  pi' <- subst [(v1, Variable id)] b1
  (++ ss) <$> unify a pi'

unify a (Pi v1 t1 b1) = unifyPi v1 t1 b1 a
unify (Pi v1 t1 b1) a = unifyPi v1 t1 b1 a
```
Solving is a recursive function of the list of constraints (a catamorphism!) with some additional state: Namely, a strict map of already-performed substitutions. Let's work through the cases in reverse order of complexity (and, interestingly, reverse order of how they're in the source code).
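Not shown in the snippets below is the `solve`{.haskell} entry point itself; in this sketch it is just `solveInner`{.haskell} started with an empty substitution map (assuming `Data.Map.Strict` is what's imported as `M`, as the `M.member`/`M.insert` calls suggest):

```haskell
import qualified Data.Map.Strict as M

-- Start constraint solving with no substitutions committed yet; the map
-- then records each Instance constraint as it is accepted.
solve cs = solveInner M.empty cs
```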
Solving an empty list of constraints is entirely trivial.
```haskell
solveInner _ [] = pure ()
```
IsSet
We infer the index of the universe of the given type, much like in the inference case for $\prod$-abstractions, and check the remaining constraints.
```haskell
solveInner map (IsSet t:xs) = do
  _ <- inferSet t
  solveInner map xs
```
EqualTypes
We infer the types of both provided values, and generate an equality constraint.
```haskell
solveInner map (EqualTypes a b:xs) = do
  ta <- infer a
  tb <- infer b
  solveInner map (Equal ta tb:xs)
```
Equal
We merely have to check for syntactic equality of the (normal forms of) terms, because the heavy lifting of destructuring and lining up was done by the unification engine.
```haskell
solveInner map (Equal a b:xs) = do
  a' <- N.normalise a
  b' <- N.normalise b
  eq <- equal a' b'
  if eq
    then solveInner map xs
    else throwError (NotEqual a b)
```
Instance
If the variable we're instancing is already in the map, and the thing we're instancing it to now is not the same as before, we have an inconsistent set of substitutions and must error.
```haskell
solveInner map (Instance a b:xs)
  | a `M.member` map
  , b /= map M.! a
  , Irrelevant /= a
  = throwError $ InconsistentSubsts (a, b) (map M.! a)
```
Otherwise, if we have a coherent set of instances, we add the instance both to scope and to our local state map and continue checking.
```haskell
  | otherwise =
      local (insertType (a, b)) $
        solveInner (M.insert a b map) xs
```
Now that we have both `unify`{.haskell} and `solve`{.haskell}, we can write `assertEquality`{.haskell}: We unify the two types, and then try to solve the set of constraints.
```haskell
assertEquality t1 t2 = do
  cs <- unify t1 t2
  solve cs
```
The real implementation will catch and re-throw any errors raised by `solve`{.haskell} to add appropriate context, and that's not the only case where "real implementation" and "blag implementation" differ.
Wow, that was a lot of writing. This conclusion begins on exactly the 500th line of the Markdown source of this article, and this is the longest article on this blag (by far). However, that's not to say it's bad: It was amazing to write, and writing dtt was also amazing. I am not good at conclusions.
dtt is available under the BSD 3-clause licence, though I must warn you that the source code hasn't many comments.
I hope you learned nearly as much as I did writing this by reading it.
[^2]: See System U, also Girard's paradox - the type theory equivalent of Russell's paradox.