title | date | maths |
---|---|---|
Dependent Types | September 08, 2017 | true |
Dependent types are pretty cool, yo. This post is a semi-structured ramble about dtt, a small dependently-typed "programming language" inspired by Thierry Coquand's Calculus of (inductive) Constructions (though, note that the induction part is still lacking: There is support for defining inductive data types, and destructuring them by pattern matching, but since there's no totality checker, recursion is disallowed).
dtt is written in Haskell, and served as a learning experience both in type theory and in writing programs using extensible effects. I do partly regret the implementation of effects I chose (the more popular `extensible-effects` did not build on the Nixpkgs channel I had, so I went with `freer`; refactoring between the two should be easy enough, but I still haven't gotten around to it).
I originally intended for this post to be a Literate Haskell file, interleaving explanation with code. However, for a pet project, dtt's code base quickly spiralled out of control, and is now over a thousand lines long: It's safe to say I did not expect this one bit.
dtt is a very standard $\lambda_{\prod{}}$ calculus. We have all 4 axes of Barendregt's lambda cube, by virtue of having types be first-class values: Values depending on values (functions), values depending on types (polymorphism), types depending on types (type operators), and types depending on values (dependent types). This places dtt squarely at the top, along with other type theories such as the Calculus of Constructions (the theoretical basis for the Coq proof assistant) and TT (the type theory behind the Idris programming language).
The syntax is very simple. We have the standard lambda calculus constructs - $\lambda$-abstraction, application and variables - along with `let`{.haskell}-bindings, pattern-matching `case`{.haskell} expressions, and the dependent type goodies: $\prod$-abstraction and `Set`{.haskell}.
As an aside, pi types are so called because the dependent function space may (if you follow the "types are sets of values" line of thinking) be viewed as the cartesian product of types. Consider a type `A`{.haskell} with inhabitants `Foo`{.haskell}, `Bar`{.haskell} and a type `B`{.haskell} with inhabitant `Quux`{.haskell}. A dependent product $\displaystyle\prod_{(x: \mathtt{A})}\mathtt{B}$, then, has inhabitants `(Foo, Quux)`{.haskell} and `(Bar, Quux)`{.haskell}.
You'll notice that dtt does not have a dedicated arrow type. Indeed, the dependent product subsumes both the $\forall$ quantifier of System $F$, and the arrow type $\to$ of the simply-typed lambda calculus. Keep this in mind: It'll be important later.
Since dtt's syntax is unified (i.e., there's no stratification of terms and types), the language can be - and is - entirely contained in a single algebraic data type. All binders are explicitly typed, seeing as inference for dependent types is undecidable (and, therefore, bad).1
```haskell
type Type = Term

data Term
  = Variable Var
  | Set Int
  | TypeHint Term Type
  | Pi Var Type Type
  | Lam Var Type Term
  | Let Var Term Term
  | App Term Term
  | Match Term [(Pattern, Term)]
  deriving (Eq, Show, Ord)
```
The `TypeHint`{.haskell} term constructor, not mentioned before, is merely a convenience: It allows the programmer to check their assumptions and help the type checker by supplying a type (note that we don't assume this type is correct, as you'll see later; it merely helps guide inference).
Variables aren't merely strings because of the large number of substitutions we have to perform: For this, instead of generating a new name, we increment a counter attached to the variable - the pretty printer uses the original name to great effect, when unambiguous.
```haskell
data Var
  = Name String
  | Refresh String Int
  | Irrelevant
  deriving (Eq, Show, Ord)
```
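The refreshing operation itself isn't reproduced in this post; as a rough, pure sketch of the idea (the real `refresh`{.haskell} lives inside the effect stack and pulls its counter from state, and `refreshVar` here is just an illustrative name):

```haskell
-- A pure sketch of refreshing: given the next unused counter, produce a
-- fresh variable that still remembers its user-facing name for printing.
refreshVar :: Int -> Var -> Var
refreshVar n (Name s)      = Refresh s n
refreshVar n (Refresh s _) = Refresh s n
refreshVar n Irrelevant    = Refresh "x" n  -- placeholder name for Irrelevant binders
```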
The `Irrelevant`{.haskell} variable constructor is used to support $a \to b$ as sugar for $\displaystyle\prod_{(x: a)} b$ when $x$ does not appear free in $b$. As soon as the type checker encounters an `Irrelevant`{.haskell} variable, it is refreshed with a new name.
dtt does not have implicit parameters (as in Idris), so all parameters, including type parameters, must be bound explicitly. For this, we support several kinds of syntactic sugar. First, all abstractions support multiple variables in a binding group. This allows the programmer to write `(a, b, c : α) -> β` instead of `(a : α) -> (b : α) -> (c : α) -> β`. Furthermore, there is special syntax `/\a` for single-parameter abstraction with type `Set 0`{.haskell}, and lambda abstractions support multiple binding groups.
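As a sketch of what that sugar amounts to (with a hypothetical `desugarPis` helper - the real parser does this while building the AST), a binding group simply becomes a right-nested chain of `Pi`{.haskell} binders sharing one domain:

```haskell
-- (a, b, c : α) -> β  becomes  (a : α) -> (b : α) -> (c : α) -> β.
desugarPis :: [Var] -> Type -> Type -> Type
desugarPis vars domain codomain =
  foldr (\v body -> Pi v domain body) codomain vars
```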
As mentioned before, the language does not support recursion (either general or well-founded). Though I would like to add it, writing a totality checker is hard - way harder than type checking $\lambda_{\prod{}}$, in fact. However, an alternative way of inspecting inductive values _does_ exist: eliminators. These are dependent versions of catamorphisms, and basically encode a proof by induction. An inductive data type such as `Nat` gives rise to an eliminator much like it gives rise to a natural catamorphism.
```
inductive Nat : Type of {
  Z : Nat;
  S : Nat -> Nat
}

natElim : (P : Nat -> Type)
       -> P Z
       -> ((k : Nat) -> P k -> P (S k))
       -> (n : Nat)
       -> P n
```
If you squint, you'll see that the eliminator models a proof by induction (of the proposition $P$) on the natural number $n$: The type signature basically states "Given a proposition $P$ on $\mathbb{N}$, a proof of $P_0$, a proof that $P_{(k + 1)}$ follows from $P_k$ and a natural number $n$, I'll give you a proof of $P_n$."
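To make the computational content of the eliminator concrete, here is a Haskell sketch (not part of dtt): since Haskell can't express the motive $P$, this is only the non-dependent shadow of `natElim`{.haskell}, i.e. an ordinary fold, together with addition defined through it.

```haskell
-- The non-dependent shadow of natElim: with the motive fixed to a single
-- result type r, the eliminator degenerates into a fold over Nat.
data Nat = Z | S Nat

natElim :: r -> (Nat -> r -> r) -> Nat -> r
natElim z _ Z     = z
natElim z s (S k) = s k (natElim z s k)

-- Addition by "induction" on the first argument: the base case is m, and
-- the inductive step turns a proof for k into a proof for (S k).
plus :: Nat -> Nat -> Nat
plus n m = natElim m (\_ pk -> S pk) n
```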
This understanding of computations as proofs and types as propositions, by the way, is called the Curry-Howard Isomorphism. The regular, simply-typed lambda calculus corresponds to natural deduction, while $\lambda_{\prod{}}$ corresponds to predicate logic.
Should this be called the term system?
Our type inference algorithm, contrary to what you might expect for such a complicated system, is actually quite simple. Unfortunately, the code isn't, and thus isn't reproduced in its entirety below.
The simplest case in any type system. The typing judgement that gives rise to this case is pretty much the identity: $\Gamma \vdash \alpha: \tau \therefore \Gamma \vdash \alpha: \tau$. If, from the current typing context we know that $\alpha$ has type $\tau$, then we know that $\alpha$ has type $\tau$.
```haskell
Variable x -> do
  ty <- lookupType x -- (I)
  case ty of
    Just t -> pure t -- (II)
    Nothing -> throwError (NotFound x) -- (III)
```
`Set`{.haskell}s
Since dtt has a cumulative hierarchy of universes, $\mathtt{Set}_k : \mathtt{Set}_{(k + 1)}$. This helps us avoid the logical inconsistency introduced by having type-in-type[^2], i.e. $\mathtt{Type}: \mathtt{Type}$. We say that $\mathtt{Set}_0$ is the type of small types: in fact, $\mathtt{Set}_0$ is where most computation actually happens, seeing as $\mathtt{Set}_k$ for $k \ge 1$ is reserved for $\prod$-abstractions quantifying over such types.
```haskell
Set k -> pure . Set . (+1) $ k
```
Type hints are the first appearance of the unification engine, by far the most complex part of dtt's type checker. But for now, it suffices to know that ``t1 `assertEquality` t2``{.haskell} errors if the types `t1` and `t2` can't be made to line up, i.e., unify.
For type hints, we infer the type of given expression, and compare it against the user-provided type, raising an error if they don't match. Because of how the unification engine works, the given type may be more general (or specific) than the inferred one.
```haskell
TypeHint v t -> do
  it <- infer v
  t `assertEquality` it
  pure t
```
This is where it starts to get interesting. First, we mandate that the parameter type is inhabited (basically, that it is, in fact, a type). The dependent product $\displaystyle\prod_{(x : 0)} \alpha$, while allowed by the language's grammar, is entirely meaningless: There's no way to construct an inhabitant of $0$, and thus this function may never be applied.
Then, in the context extended with $(\alpha : \tau)$, we require that the consequent is also a type itself: The function $\displaystyle\prod_{(x: \mathbb{N})} 0$, while again a valid parse, is also meaningless.
The type of the overall abstraction is, then, the maximum value of the indices of the universes of the parameter and the consequent.
```haskell
Pi x p c -> do
  k1 <- inferSet p
  k2 <- local (insertType (x, p)) $
    inferSet c
  pure $ Set (k1 `max` k2)
```
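`inferSet`{.haskell}, used above, isn't reproduced in this post; the idea (a sketch - the real version also goes through normalisation, and `ExpectedSet` is a made-up error constructor here) is to infer a type and insist that it is some universe, handing back its index:

```haskell
-- Sketch of inferSet: the inferred type must be a universe Set k;
-- the index k feeds the `max` computation in the Pi case above.
inferSet t = do
  ty <- infer t
  case ty of
    Set k -> pure k
    e     -> throwError (ExpectedSet e)
```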
Much like in the simply-typed lambda calculus, the type of a $\lambda$-abstraction is an arrow between the type of its parameter and the type of its body. Of course, $\lambda_{\prod{}}$ incurs the additional constraint that the type of the parameter is inhabited.
Alas, we don't have arrows. So, we "lift" the lambda's parameter to the type level, and bind it in a $\prod$-abstraction.
```haskell
Lam x t b -> do
  _ <- inferSet t
  Pi x t <$> local (insertType (x, t)) (infer b)
```
Note that, much like in the `Pi`{.haskell} case, we type-check the body in a context extended with the parameter's type.
Application is the most interesting rule, as it has to handle not only inference but also instantiation of $\prod$-abstractions.
Instantiation is, much like application, handled by $\beta$-reduction, with the difference being that instantiation happens during type checking (applying a $\prod$-abstraction is meaningless) and application happens during normalisation (instancing a $\lambda$-abstraction is meaningless).
The type of the function being applied needs to be a $\prod$-abstraction, while the type of the operand needs to be inhabited. Note that the second constraint is not written out explicitly: It's handled by the `Pi`{.haskell} case above, and furthermore by the unification engine.
```haskell
App e1 e2 -> do
  t1 <- infer e1
  case t1 of
    Pi vr i o -> do
      t2 <- infer e2
      t2 `assertEquality` i
      N.normalise =<< subst [(vr, e2)] o -- (I)
    e -> throwError (ExpectedPi e) -- (II)
```
Notice that, here, we don't substitute the $\prod$-bound variable by the type of $e_2$: That'd make us equivalent to System $F$. The whole deal with dependent types is that types depend on values, and that entirely stems from this one line. By instancing a type variable with a value, we allow types to depend on values.
Oh, and if we didn't get a $\prod$-abstraction, error.
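For a concrete example using only what's above: applying `natElim`{.haskell} to a motive $P$, the two proof arguments, and then to the numeral $\mathtt{S\ Z}$ substitutes the *value* $\mathtt{S\ Z}$ for the $\prod$-bound $n$, so the inferred type of the whole application is $P\ (\mathtt{S\ Z})$ - a type that literally mentions the value it was applied to.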
You'll notice that two typing rules are missing here: One for handling `let`{.haskell}s, which was not included because it is entirely uninteresting, and one for `case ... of`{.haskell} expressions, which was redacted because it is entirely a mess.
Hopefully, in the future, the typing of `case`{.haskell} expressions will be simpler.
The unification engine is the man behind the curtain in type checking: We often don't pay attention to it, but it's the driving force behind it all. Fortunately, in our case, unification is entirely trivial: Solving is the hard bit.
The job of the unification engine is to produce a set of constraints that have to be satisfied in order for two types to be equal. Then, the solver is run on these constraints to assert that they are logically consistent, and potentially produce substitutions that reify those constraints.
Our solver isn't that cool, though, so it just verifies consistency.
The kinds of constraints we can generate are as in the data type below.
```haskell
data Constraint
  = Instance Var Term -- (1)
  | Equal Term Term -- (2)
  | EqualTypes Type Type -- (3)
  | IsSet Type -- (4)
  deriving (Eq, Show, Ord)
```
1. `Instance v t`{.haskell} corresponds to a substitution between the variable `v` and the term `t`.
2. `Equal a b`{.haskell} states that the two terms `a` and `b` are equal under normalisation.
3. `EqualTypes a b`{.haskell} states that the types of the terms `a` and `b` are equal.
4. `IsSet t`{.haskell} asserts that the provided type has inhabitants.

Unification of most terms is entirely uninteresting. Simply line up the structures and produce the appropriate equality (or instance) constraints.
```haskell
unify (Variable a) b = instanceC a b
unify b (Variable a) = instanceC a b
unify (Set a) (Set b) | a == b = pure []
unify (App x y) (App x' y') =
  (++) <$> unify x x' <*> unify y y'
unify (TypeHint a b) (TypeHint c d) =
  (++) <$> unify a c <*> unify b d
unify a b = throwError (NotEqual a b)
```
Those are all the boring cases, and I'm not going to comment on them. Similarly boring are binders, which were abstracted out because hlint told me to.
```haskell
unify (Lam v1 t1 b1) (Lam v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Pi v1 t1 b1)  (Pi v2 t2 b2)  = unifyBinder (v1, v2) (t1, t2) (b1, b2)
unify (Let v1 t1 b1) (Let v2 t2 b2) = unifyBinder (v1, v2) (t1, t2) (b1, b2)

unifyBinder (v1, v2) (t1, t2) (b1, b2) = do
  (a, b) <- (,) <$> unify (Variable v1) (Variable v2) <*> unify t1 t2
  ((a ++ b) ++) <$> unify b1 b2
```
There are two interesting cases: Unification between some term and a pi abstraction, and unification between two variables.
```haskell
unify ta@(Variable a) tb@(Variable b)
  | a == b = pure []
  | otherwise = do
      (x, y) <- (,) <$> lookupType a <*> lookupType b
      case (x, y) of
        (Just _, Just _) -> do
          ca <- equalTypesC ta tb
          cb <- equalC ta tb
          pure (ca ++ cb)
        (Just x', Nothing) -> instanceC b x'
        (Nothing, Just x') -> instanceC a x'
        (Nothing, Nothing) -> instanceC a (Variable b)
```
If the variables are syntactically the same, then we're done, and no constraints have to be generated (Technically you could generate an entirely trivial equality constraint, but this puts unnecessary pressure on the solver).
If either variable has a known type, then we generate an instance constraint between the unknown variable and the known one.
If both variables have a value, we equate both the variables themselves and their types. This is done mostly for error messages' sake, seeing as if two values are propositionally equal, so are their types.
Unification between a term and a $\prod$-abstraction is the most interesting case: We check that the $\prod$ type abstracts over a type (i.e., it corresponds to a System F $\forall$ instead of a System F $\to$), and instance the $\prod$ with a fresh type variable.
```haskell
unifyPi v1 t1 b1 a = do
  id <- refresh Irrelevant
  ss <- isSetC t1
  pi' <- subst [(v1, Variable id)] b1
  (++ ss) <$> unify a pi'

unify a (Pi v1 t1 b1) = unifyPi v1 t1 b1 a
unify (Pi v1 t1 b1) a = unifyPi v1 t1 b1 a
```
Solving is a recursive function of the list of constraints (a catamorphism!) with some additional state: Namely, a strict map of already-performed substitutions. Let's work through the cases in reverse order of complexity (and, interestingly, reverse order of how they're in the source code).
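Not shown in the snippets below is the `solve`{.haskell} entry point itself; in this sketch it is just `solveInner`{.haskell} started with an empty substitution map (assuming `Data.Map.Strict` is what's imported as `M`, as the `M.member`/`M.insert` calls suggest):

```haskell
import qualified Data.Map.Strict as M

-- Start constraint solving with no substitutions committed yet; the map
-- then records each Instance constraint as it is accepted.
solve cs = solveInner M.empty cs
```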
Solving an empty list of constraints is entirely trivial.
```haskell
solveInner _ [] = pure ()
```
IsSet
We infer the index of the universe of the given type, much like in the inference case for $\prod$-abstractions, and check the remaining constraints.
```haskell
solveInner map (IsSet t:xs) = do
  _ <- inferSet t
  solveInner map xs
```
EqualTypes
We infer the types of both provided values, and generate an equality constraint.
```haskell
solveInner map (EqualTypes a b:xs) = do
  ta <- infer a
  tb <- infer b
  solveInner map (Equal ta tb:xs)
```
Equal
We merely have to check for syntactic equality of the (normal forms of) terms, because the heavy lifting of destructuring and lining up was done by the unification engine.
```haskell
solveInner map (Equal a b:xs) = do
  a' <- N.normalise a
  b' <- N.normalise b
  eq <- equal a' b'
  if eq
    then solveInner map xs
    else throwError (NotEqual a b)
```
Instance
If the variable we're instancing is already in the map, and the thing we're instancing it to now is not the same as before, we have an inconsistent set of substitutions and must error.
```haskell
solveInner map (Instance a b:xs)
  | a `M.member` map
  , b /= map M.! a
  , Irrelevant /= a
  = throwError $ InconsistentSubsts (a, b) (map M.! a)
```
Otherwise, if we have a coherent set of instances, we add the instance both to scope and to our local state map and continue checking.
```haskell
  | otherwise =
      local (insertType (a, b)) $
        solveInner (M.insert a b map) xs
```
Now that we have both `unify`{.haskell} and `solve`{.haskell}, we can write `assertEquality`{.haskell}: We unify the two types, and then try to solve the set of constraints.
```haskell
assertEquality t1 t2 = do
  cs <- unify t1 t2
  solve cs
```
The real implementation will catch and re-throw any errors raised by `solve`{.haskell} to add appropriate context, and that's not the only case where "real implementation" and "blag implementation" differ.
Wow, that was a lot of writing. This conclusion begins on exactly the 500th line of the Markdown source of this article, and this is the longest article on this blag (by far). However, that's not to say it's bad: It was amazing to write, and writing dtt was also amazing. I am not good at conclusions.
dtt is available under the BSD 3-clause licence, though I must warn you that the source code hasn't many comments.
I hope you learned nearly as much as I did writing this by reading it.
[^2]: See System U, also Girard's paradox - the type theory equivalent of Russell's paradox.