
---
title: Cubical Type Theory
date: March 7th, 2021
---

Hello, everyone! It's been a while, hasn't it? Somehow, after every post, I manage to convince myself that I'm gonna be better and not let a whole season go by between posts, but it never happens. For the last two posts I've been going on at length about fancy type theories, and this post, as the title implies, is no exception. In fact, two posts ago I mentioned, offhand, cubical type theory as a possibility for realising HoTT in a constructive way, but 128 days ago I did not understand cubical type theory in the slightest.

Now, however, I do! I still don't know what the hell the word "fibration" is supposed to mean, or indeed "fibrant", but we can gloss over that and present cubical type theory with as little category-theoretical jargon as possible. In fact, I have a mostly¹-complete implementation of cubical type theory for us to use as a discussion framework.

As mentioned in Reflections on Equality, the main idea of Cubical Type Theory is the type of paths, so let's talk about that at length.

Paths

Even in boring old Martin-Löf type theory, as soon as we have equalities and induction, we can prove a very interesting theorem: Every function preserves paths. This is actually a simplification of a more fundamental fact in MLTT, its groupoid structure, in which functions are interpreted as functors. Like a category-theoretical functor has an action on objects and an action on morphisms, a type-theoretical function has an action on values and an action on paths.

Using path induction, we can prove it (roughly) like this. Suppose, given a function $f : A \to B$, there is a path $p : x \equiv_A y$. By induction, we may assume $y$ is $x$ and $p$ is $\mathrm{refl}$, in which case what we need to prove is $f(x) \equiv_{B} f(x)$. But this is what $\mathrm{refl}$ states. This isn't a complicated proof, because it's not a complicated theorem: the images of equal elements are equal, big deal.

This is where things get a little mind-bending. What would happen if we had a type with "two" values, with a path between them? The values of the function at either end could be different, but they would still be... equal. This is the main idea of cubical type theory: We add an interval type, $\mathbb{I}$, which denotes the interval object $[0,1]$ in our model. Then we can drop the inductive definition of equalities as generated by $\mathrm{refl}$ and simply define equalities in $A$ as functions $\mathbb{I} \to A$.

Let's not get ahead of ourselves, though, and talk a bit more about the interval type. It has two elements, $i0$ and $i1$, but it's not isomorphic to the type of booleans: Internally to the type theory, we have no way of distinguishing $i0$ and $i1$, since every function must be continuous.

Since it denotes $[0,1]$, we can define lattice operations on elements of the interval, enough to equip it with the structure of a De Morgan algebra, but not a Boolean algebra. We have meets, $a \land b$, the logical operation of "and", interpreted as $\min(a,b)$; joins, $a \lor b$, the logical operation of "or", interpreted as $\max(a,b)$; and an involution, $\neg a$, which denotes the algebraic operation $1 - a$.²

These operations follow the usual laws of Boolean logic save for two: in general, $\min(x, 1 - x)$ is not $0$ and $\max(x, 1 - x)$ is not $1$; those equations only hold at the endpoints. While internally to the type theory we have no element representing "half", the object which $\mathbb{I}$ denotes does have these filler points, so we can't in general expect those equations to hold. Hence, De Morgan algebra, not Boolean.

Another thing to keep in mind is that, while the interval is an expression which other expressions have as a type (namely, $\Gamma \vdash i0 : \mathbb{I}$ and $\Gamma \vdash i1 : \mathbb{I}$), we do not call it a type. We reserve the word type for objects with more structure (which we will discuss later). For now, it's enough to think of the interval as a "pre"type, something which is almost, but not quite, a type. Cubical type theory has plenty of these pretypes, so we include a separate universe $\mathscr{U}_{\omega}$ to classify them.

Now that we're familiar with the interval, we can discuss the actual title of this section, paths. We define the type of paths in $A$ as a refinement of the function space $\mathbb{I} \to A$, where the values a function $f$ takes at $i0$ and $i1$ are indicated in the type. Hence the formation rule, the first of the two below:

$$\frac{\Gamma, i : \mathbb{I} \vdash e : A}{\Gamma \vdash \lambda i. e : \mathrm{Path}\ A\ e[i0/i]\ e[i1/i]}$$

$$\frac{\Gamma \vdash p : \mathrm{Path}\ A\ x\ y\quad \Gamma \vdash i : \mathbb{I}}{\Gamma \vdash p(i) : A}$$

The second rule is the elimination rule, which says that if we have an element of the interval, we can project the value the path takes at that point. Alternatively, we could represent paths by the type with an inclusion $\mathrm{inP} : \prod{(f : \mathbb{I} \to A)} \mathrm{Path}\ A\ f(i0)\ f(i1)$ and projection $\mathrm{outP} : \mathrm{Path}\ A\ x\ y \to \mathbb{I} \to A$. Furthermore, we impose a pair of "boundary" equations, which state that $p(i0) = x$ and $p(i1) = y$ for paths $p : \mathrm{Path}\ A\ x\ y$.

One important difference between functions out of the interval and paths is that, while the former would be put in the universe $\mathscr{U}_{\omega}$ by virtue of its domain being a pretype, paths do have the required additional structure to be in the universe $\mathscr{U}$ of "proper types", as long as the type $A$ of the endpoints does.

Using the algebraic structure of the interval we can define some operations on paths, which we may represent diagrammatically. For simplicity, paths will be drawn as straight lines between their endpoints, and the type will be left to be inferred from the context; a path whose bound variable is $i$ will be drawn in the left-to-right direction, and a path whose bound variable is $j$ will be drawn in the upwards direction.

Since bound interval variables are variables, they have all the same structural rules as normal variables! In particular, weakening lets us drop an interval variable to have a constant path. This is a proof of reflexivity, which we diagram as follows:

The reflexivity path for $a$ is represented by a constant path.
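In cubical's syntax, this is just a lambda that ignores its interval argument. A minimal sketch (the definition in the proof file may differ in name or implicitness):

-- the constant path: ignore the interval variable
refl : {A : Type} {x : A} -> Path x x
refl {A} {x} = \i -> x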

Given a path $p$ with endpoints $x$ and $y$ (concisely written as $p : x \equiv y$) we compute its inversion, $sym(p) : y \equiv x$ by "precomposition" with the interval involution:

By inverting the interval argument, we can invert paths: the inverse of a path from $a$ to $b$ runs from $b$ to $a$.
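As a sketch in code, inversion really is just precomposition with inot:

-- invert a path by precomposing with the interval involution
sym : {A : Type} {x : A} {y : A} -> Path x y -> Path y x
sym p i = p (inot i)

Checking the endpoints: at $i0$ we get p (inot i0) = p i1 = y, and dually at $i1$, which is exactly the type we claimed.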

The meet and join operations on the interval let us define two kinds of squares called connections, which let us concisely turn a one-dimensional path into a two-dimensional square, which gives us paths between paths (paths in the second dimension). The connection generated by $i \land j$ is going to be especially helpful in a bit, when we prove that singletons are contractible, and hence that paths are a suitable definition of equality.

The square generated by $\lambda i\ j. p(i \land j)$ (the "and" connection).
The square generated by $\lambda i\ j. p(i \lor j)$ (the "or" connection).

Let's walk through the construction of the left square, keeping in mind that $i$ goes right and $j$ goes up. Since the top and bottom faces vary in the $i$ direction but not the $j$ direction, they'll all have a prefixed $\lambda i$; The left and right faces just correspond to applying the outermost lambda inside the square. For the faces, we have:

  • Left: $(\lambda i\ j. p(i \land j))\ i0$, which reduces to $\lambda j. p(i0)$, the constant path at $a$;
  • Right: $(\lambda i\ j. p(i \land j))\ i1$, which reduces to $\lambda j. p(j)$--- you guessed it, it's $p$;
  • Top: $\lambda i. (\lambda i\ j. p(i \land j))\ i\ i1$, which reduces to $\lambda i. p(i)$, the path $p$;
  • Bottom: $\lambda i. (\lambda i\ j. p(i \land j))\ i\ i0$, which reduces to $\lambda i. p(i0)$: again, $\mathrm{refl}_a$.

You can see that in either the $i$ or $j$ direction the inside of this square connects the path $p$ with the constant path at its left endpoint. This is exactly what we need for the following proof that singletons are contractible:

singContr : {A : Type} {a : A} -> isContr (Singl A a)
singContr {A} {a} = ((a, \i -> a), \y i -> (y.2 i, \j -> y.2 (iand i j)))

This proof is written syntactically, in the language of cubical. This proof appears on line 114 of the massive source file which has everything I've tried to prove with this so far. What's a module system? The actual proof file has some preliminaries which would be interesting if you care about how cubical type theory is actually implemented.

Another operation on equalities which is very hard in MLTT, but trivial with cubes, is function extensionality. You can see why this would be simple if you consider that a pointwise equality between functions would be an element of $A \to \mathbb{I} \to B$, while an equality between functions themselves is an element of $\mathbb{I} \to A \to B$. By simply swapping the binders, we get the naive function extensionality.
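In code, the binder swap might look like the following sketch (happly is the inverse direction, which the next paragraph talks about; the statements in the proof file may be phrased a bit differently):

-- from a path between functions to a family of paths
happly : {A : Type} {B : A -> Type} {f : (x : A) -> B x} {g : (x : A) -> B x}
      -> Path f g -> (x : A) -> Path (f x) (g x)
happly p x i = p i x

-- and back: swap the binders the other way
funext : {A : Type} {B : A -> Type} {f : (x : A) -> B x} {g : (x : A) -> B x}
      -> ((x : A) -> Path (f x) (g x)) -> Path f g
funext h i x = h x i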

The proof of full function extensionality as per the HoTT book is also very simple, but it requires quite a bit more infrastructure to talk about; For now, rather than saying happly (line 667) is an equivalence, we can simply say that happly has funext as right and left inverses, and the proof is trivial in both directions (line 675).

With the infrastructure so far we can't prove a whole lot, though. For instance, we have proved that singletons are contractible, but this doesn't freely get us axiom J; neither can we prove that every property respects equalities, or anything like that. For that sort of proof, we need to introduce a transport operation, which, given the left endpoint of a path of types, returns the right endpoint. However, cubical type theory refuses to be simple.

Quick sidenote, path application corresponds to the eliminator for $\mathbb{I}$, since it conceptually has the type in the box below. We use here the type of dependent paths, PathP.

iElim : {A : I -> Type} {x : A i0} {y : A i1} -> PathP A x y
     -> (i : I) -> A i
iElim p i = p i

Simplicity is disallowed

While providing a primitive $\mathrm{transp} : \prod{(A : \mathbb{I} \to \mathscr{U})} A(i0) \to A(i1)$ might seem like all we need to make paths a sensible notion of equality, reality is not that simple. In particular, transport on paths is hard to define with such an operation, so, as is tradition in type theory, we make things simpler by making them more general. Rather than providing a primitive transport, we provide a primitive composition operation, which generalises transport and composition of paths.

Composition expresses the funny-sounding principle that "every open box has a lid". No, that is not a joke; That's actually what we're talking about. A description in (almost!) English would be to say that composition, given a shape, a partial cube of that shape, and a face (which must agree with the partial cube), returns the opposite face. If you think that description is nonsensical, strap in, because interpreting it type-theoretically requires another 67 lines of definitions in the code! For reference, the almost 2000 words which precede this paragraph covered roughly 20 lines of actual code.

Crying over what I still have to write won't help me get this blog post out any sooner though, so let's get to it.

Cofibrations

Again that god damn word. In addition to the interval object, to define a cubical model of type theory, we need a notion of cofibration, which is a fancy way of saying "shape of a partial cube". In the papers which introduced cubical type theory, they use a "face lattice", $\mathbb{F}$. However, this turns out to be needlessly complicated, as we can get this structure from the interval object.

To each element $\phi : \mathbb{I}$ (referred to as a formula) we assign a cofibrant proposition $[\phi] : \mathscr{U}_{\omega}$³ which is inhabited when $\phi = i1$. In the code, we write IsOne phi for $[\phi]$ and it is inhabited by a distinguished element itIs1 : IsOne i1. This family of types is definitionally proof-irrelevant, which means that any two inhabitants of IsOne phi are equal.

A note on terminology

Throughout the rest of this post I'll refer to elements of the interval as either "endpoints" or "formulas" depending on how they're used. These aren't technical terms, and are meant to be just indicative. The convention is roughly that, if $i : \mathbb{I}$ is used as the argument to a path, or to a filler, or it's the bound variable in a composition (or etc), it's called an endpoint; If it's used to denote a restriction (i.e., there might reasonably be an element of $[\phi]$ in the context), it's called a formula.

Also I apologise for the garbled terminology (or even ambiguity) when talking about $[\phi]$ vs $\phi$, since both can reasonably be called formulas.

We can interpret these propositions as being shapes of partial cubes. For instance, the proposition $[i \lor \neg i]$ (for $i : \mathbb{I}$) represents a "line" which is defined when $i = i0$ or $i = i1$, but not in the middle; This isn't a line as much as it is a pair of points.

Thinking back to the "human-readable" description of the composition operation, the proposition $\phi$ specifies the shape of the open box, but not the box itself.

Partial Elements

We call a function $f : [\phi] \to A$ a partial element of $A$, that is, an element of $A$ which is only defined when $[\phi]$ is inhabited. For these we have a special pattern-matching notation, termed a system, which is written between brackets.

partialBool : (i : I) -> Partial (ior i (inot i)) Bool
partialBool = \i [ (i = i0) -> false, (i = i1) -> true ]

The element partialBool above is a boolean with different values when i = i0 or i = i1. However, this does not lead to a contradiction, because to extract the underlying bool we need to apply partialBool not only to an element of the interval, but also to an inhabitant of IsOne (ior i (inot i)). This is why it's critical that the type checker distinguishes between $i \lor \neg i$ and $i1$!

As another implementation note, the type Partial phi A is a version of IsOne phi -> A with a more extensional equality. Two elements of Partial phi A are equal when they represent the same subcube, i.e., they take equal values for every assignment of variables which makes phi = i1.

Furthermore, there is a dependent version of Partial, PartialP⁴, which allows the type A itself to be a partial element of $\mathscr{U}$. This will be used later when we introduce the glueing operation.

In the composition operation, the partial element with shape $\phi$ specifies the open box itself.

Extensibility

Given a type $A$ and a partial element $u : [\phi] \to A$, we can define the type of elements $a : A$ which extend $u$. These are total elements, in that their existence does not depend on the inhabitation of $[\phi]$ (for any $\phi$). To say they extend $u$ is to say that, given $[\phi]$, we have that $u(\mathtt{itIs1}) = a$. In the theory, where we have all the fancy symbols, we write $A[\phi \to u]$ for the type of extensions of $u$, but in the code, where we're limited to boring ASCII, we just write Sub A phi u.

We can make any total element u : A into a partial element, with any formula that we want, by ignoring the proof. The constructor inS for the Sub-types expresses that this partial element agrees with u on any phi that we choose.

inS : {A : Type} {phi : I} (u : A) -> Sub A phi (\x -> u)

We also have a projection operation for Sub types, which undoes inS. Furthermore, outS {A} {i1} {u} x computes to u itIs1, since x agrees with u on phi.

outS : {A : Type} {phi : I} {u : Partial phi A} -> Sub A phi u -> A

With the idea of a cubical Sub-type we can express the type of the fourth argument of the composition operation, the "bottom" face of an open box, which agrees with (extends!) the partial element specifying the sides.

Composition

As stated before, the composition operation takes as input the description of an open cube with a face removed and computes that missing face. However this is not a helpful definition if we do not yet have intuition for what "cubes with missing faces" look like! So before explaining the computational behaviour of the composition operation (which is... quite something), let me show you some examples.

Before we get to the examples, for reference, this is the type of the composition operation, written out in syntax:

comp : (A : I -> Type) {phi : I} (u : (i : I) -> Partial phi (A i))
    -> (a0 : Sub (A i0) phi (u i0))
    -> A i1

A trivial use of composition is one where we take the formula $\phi = i0$, that is, the partial cube specifying the sides is defined nowhere. In this case we may illustrate the input face of the composition operation as agreeing with... nothing.

The arguments to `comp A {i0} (\k [])`, illustrated.

That's right, in the case where the formula is always false and the partial cube is empty, the input of the composition operation is just a point a0 : A i0, the left endpoint of a path. And by looking at the type of the composition operation, or thinking about its description, you can see where this is going! We give it a0 : A i0, and it gives us an element comp A {i0} (\k []) a0 : A i1!

That's right, by ignoring the extra power which the composition operation gives us over boring transport, we get back boring transport. Not too surprising, let's keep going.
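As a code sketch, the derived transport operation would look like this (in the implementation transp may well be a primitive; it shows up undefined further down this post, so take this as one possible definition):

-- transport: composition with an empty system of sides
transp : (A : I -> Type) -> A i0 -> A i1
transp A x = comp A {i0} (\k []) (inS x)

I'll reuse this transp in the sketches that follow.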

For an example which illustrates composition with a cube, suppose we have three points, $x$, $y$, and $z$, all in some type $A$. Furthermore suppose that we have paths $p : x \equiv y$ and $q : y \equiv z$. By the transitive property of equality, we know there should be a path between $x$ and $z$. Furthermore, we know that transporting along this composite should be equivalent to transporting along $p$ and then along $q$. But how can we, using cubical methods, build the composite of $p$ and $q$?

If you guessed the answer was "using composition", you... don't get a lot of extra points. It was heavily implied. But you can still have a cookie, since I suppose it can't be helped. To create this composite we need to draw a square with 3 lines, such that the missing line connects $x$ and $z$. Furthermore, the requirement that transporting along the composite transports along both constituent paths will guide us in creating this drawing. We only have two paths, though!

The lid of this square gives us the composite $q \circ p$ of $p$ and $q$.

Turns out that only having two paths is not an issue, since we can always take the reflexivity path to get the side we didn't have. To make it clearer, the partial element $u : (j : \mathbb{I}) \to \mathrm{Partial}\ (\neg i \lor i)\ A$ is the tube with sides $x$ and $q(j)$, and the input $p(i) : A$ is the bottom side. These agree because when $j$ (the direction of composition) is $i0$ (the base), $u$ has left endpoint $x$ and right endpoint $y$; a path between these is exactly what $p(i)$ ($i$ is the direction of the path) is.

trans : {A : Type} {x : A} {y : A} {z : A}
     -> Path x y
     -> Path y z
     -> Path x z
trans {A} {x} p q i =
  comp (\i -> A)
    {ior i (inot i)}
    (\j [ (i = i0) -> x, (i = i1) -> q j ])
    (inS (p i))

This expression is a syntactic representation of the composition drawn above; The dotted line in that diagram is the result of the composition operation.

Cubical Complication 2: Computing Compositions

It doesn't suffice to describe the composition operation in types, we also need to describe how it computes when applied to enough arguments. The composition operation reduces to a canonical element of the type $A(i1)$ based on the structure of the function $A : \mathbb{I} \to \mathscr{U}$, by cases. For example, when $A$ computes to a function type, the composition will evaluate to a lambda expression; When $A$ is a $\sum$-type, it computes to a pair, etc.

Before we get started, one thing to note is that, since we have the $i \land j$ operation on elements of the interval, the composition operation can compute not only missing faces, but the missing inside of a cube, which we call its filler. For instance, the filler fill A {i0} (\k []) a0 i connects a0 and comp A {i0} (\k []) a0 in the i direction, since it is the 1-dimensional cube (path) between the given and missing faces.

fill : (A : I -> Type) {phi : I}
       (u : (i : I) -> Partial phi (A i))
       (a0 : Sub (A i0) phi (u i0))
    -> (i : I) -> A i
fill A {phi} u a0 i =
    comp (\j -> A (iand i j))
         {ior phi (inot i)}
         (\j [ (phi = i1) as p -> u (iand i j) p, (i = i0) -> outS a0 ])
         (inS (outS a0))

Fillers will be fundamental in reducing compositions in dependent types, including pairs, functions, and general inductive types.

Simple types

A good place to start is composition for inductive types without parameters, since there it is trivial. For instance, any composition in the booleans just evaluates to its argument. This is also the case for many other types: the natural numbers, the integers, the rational numbers, etc.

$$\mathrm{comp}\ (\lambda i. \mathrm{Bool})\ [\phi \to u]\ a0 = a0$$

For parametrised types like lists, we need to explain composition by recursion. In the nil case it's trivial, we can just return nil. In the cons case, though, we need to recursively apply composition in the head and the tail, to end up with a list of the right type, agreeing with the right faces.

$$ \mathrm{comp}\ (\lambda i. \mathrm{List}(A))\ [\phi \to \mathtt{cons}\ x\ xs]\ (\mathtt{cons}\ a\ as) =\
\mathtt{cons}\ (\mathrm{comp}\ (\lambda i. A) [\phi \to x]\ a) (\mathrm{comp}\ (\lambda i. \mathrm{List}(A)) [\phi \to xs]\ as)$$

Dependent functions

Starting with the full reduction rule for composition in functions would be a lot, so I'll build it up incrementally. First, I'll explain transport in simple functions. Then, transport in dependent functions. After I've explained those two we can add back the sides to get the full composition for functions.

So, consider for starters transport in a line of $A \to B$, where both $A$ and $B$ are functions $\mathbb{I} \to \mathscr{U}$. We're given a function $f : A(i0) \to B(i0)$ and want to compute a function $f' : A(i1) \to B(i1)$. Start by introducing a $\lambda$ abstraction binding a single variable $x : A(i1)$, under which we'll work.

Since to get any sort of element of $B$ we need to apply $f$, we must first transport $x$ to get an element of $A(i0)$, to be the argument of $f$. The line $\lambda i. A(\neg i)$ connects $A(i1)$ and $A(i0)$, so that's what we transport over. Take $x\prime = \mathrm{comp}\ (\lambda i. A (\neg i))\ (\lambda k [])\ x$.

The application $f\ x\prime$ has type $B(i0)$, and we need to transport that to an element of $B(i1)$. Again we invoke the trivial composition to get $y = \mathrm{comp}\ B\ (\lambda k [])\ (f\ x\prime)$. Since we have computed an element of $B(i1)$, we're done: we can take $\mathrm{comp}\ (\lambda i. A \to B)\ (\lambda k [])\ f = \lambda x. y$.
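Putting both steps together, here's a sketch of the simply-typed case in terms of the transp sketch from earlier (transpFun is a name I made up for illustration):

-- transport x backwards along A, apply f, then transport the
-- result forwards along B
transpFun : (A : I -> Type) (B : I -> Type)
         -> (A i0 -> B i0) -> A i1 -> B i1
transpFun A B f x = transp B (f (transp (\i -> A (inot i)) x))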

To see the details of how composition generalises to dependent functions, consult the appendix, since it's a bit verbose to be here.

Dependent pairs

The composition for pairs is what you'd expect. We have to transport the first element of the pair, and use a filler when transporting the second element to make sure the endpoints line up. Again, the details are in the appendix if knowing more about composition strikes your fancy, but it's not too necessary to follow the proofs.

To be concise here, a simple equation that should clarify the behaviour of transport on pairs is the simply-typed definition of transport:

$$ \mathrm{transp}\ (\lambda i. A \times B)\ (x, y) =\
(\mathrm{transp}\ (\lambda i. A)\ x, \mathrm{transp}\ (\lambda i. B)\ y) $$
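Sketched in code, again reusing the transp sketch, and assuming the non-dependent pair type is written A * B like the dependent one:

-- transport each component separately
transpPair : (A : I -> Type) (B : I -> Type)
          -> A i0 * B i0 -> A i1 * B i1
transpPair A B p = (transp A p.1, transp B p.2)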

Paths

In the case of paths, composition is composition. We're given a path $p0 : \mathrm{Path}\ A\ u\ v$, where all of $A$, $u$ and $v$ can depend on a variable $i : \mathbb{I}$, which is the direction of composition. Furthermore we have a family of partial paths $p$ with which $p0$ agrees, and with which the result must also agree.

We start by assuming the existence of a dimension $j : \mathbb{I}$, which will be bound later. When $j = i0$, the resulting composition has to have value $u(i)$, and when $j = i1$, the result must be $v(i)$. Furthermore, when $\phi = i1$, the result must have the same value as $p(j)$. We can package these constraints straightforwardly in the partial element $[ \phi \to p(j), (j = i0) \to u, (j = i1) \to v ]$, again abusing notation for the applications of $u(i)$ and $v(i)$.

$$\mathrm{comp}\ (\lambda i. \mathrm{Path}\ A(i)\ u(i)\ v(i))\ [\phi \to p]\ p0 =\
\lambda j. \mathrm{comp}\ A\ [ \phi \to p(j), (j = i0) \to u, (j = i1) \to v ]\ (p0(j))$$

A note on naming: Pretypes

All of the types we explained composition for above are, well, types. In cubical type theory, or at least in this presentation, we reserve the word type for those objects which have a composition structure. The ones which don't have a composition structure are called pretypes.

Alternatively we could call the types for which we have composition the fibrant types, since they have a fibrancy structure, as in the CHM paper: they have a transport structure and a homogeneous composition structure, with which we can assemble a composition structure as above.

All of the type formers inherited from MLTT ($\prod$ and $\sum$), the path types, and every inductive and higher inductive type made out of types are fibrant, leaving only the cubical primitives (the interval, partial elements, and cubical subtypes) as pretypes. However, we could consider an extension of type theory where both sorts are given equal importance: This would be a two-level type theory, a realisation of Voevodsky's Homotopy Type System.

Auxiliary Definitions

In this section we're going to talk about a handful of operations, which can be defined in terms of what we have so far, which will be used in discussing the $\mathrm{Glue}$ types, which are used in interpreting the univalence axiom. In contrast to the CCHM paper, I'll only talk about the notions which are mandatory for defining the glueing operation. Composition for glue is very complex, and needlessly detailed for the purposes of this post.

Contractible Types

We define a type $A$ to be contractible if, and only if, there exists an element $x : A$ (called the centre of contraction) to which all other elements $y : A$ are Path-equal. Cubically, we can give an alternative formulation of contractibility: $A$ is contractible iff. every partial element $u : \mathrm{Partial}\ \phi\ A$ is extensible.

Let $p$ be the proof that $A$ is contractible, a pair containing the centre of contraction and the proof that any element of the type is equal to the centre. We define $\mathrm{contr}\ [\phi \to u] = \mathrm{comp}\ (\lambda i. A)\ [\phi \to (p.2\ u)(i)]\ p.1$.
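Spelled out in the syntax we've been using, this might look like the following sketch (assuming isContr A is the dependent pair of the centre and the contraction, as described above):

contr : {A : Type} {phi : I} -> isContr A -> Partial phi A -> A
contr {A} {phi} p u =
  comp (\i -> A) {phi}
    (\i [ (phi = i1) as o -> (p.2 (u o)) i ]) -- contract towards u where it is defined
    (inS p.1)                                 -- start at the centre of contraction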

Conversely, if we have an extension for any partial element, we can prove that type is contractible in the typical sense: Take the centre of contraction to be $\mathrm{contr}\ []$ and the proof that any $y$ is equal to that is given by extending the partial element $[ (i = i0) \to \mathrm{contr}\ [], (i = i1) \to y]$.

As an example of contractible types, we have already seen Singl A a, the type of "elements of A equal to a". This has a centre at (a, refl), which can be proven by a connection. The unit (or top) type is also contractible, having tt as a centre, which can be proven by induction. It can be proven that any contractible type is equivalent to the unit type, making all of them maximally uninteresting.

Equivalences

Since we have the univalence axiom, it is important for soundness that we define a notion of equivalence for which "being an equivalence" is a mere proposition: Either a function is an equivalence, or it isn't. We choose one which is cubically convenient, namely that of "contractible fibers".

The fiber of a function $f : A \to B$ at a point $y : B$ is a pair of an input $x : A$ together with a proof that $f(x) \equiv y$. We define $f$ to be an equivalence if for every element $y : B$, the fiber $\mathrm{fiber}\ f\ y$ is contractible. That means that, for every element in the range, there is a corresponding element in the domain, and this element is unique.
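In code, these notions might be written as the following sketch (the names fiber and isEquiv match how they're used in the snippets below):

-- the fiber of f at y: a point of A together with a proof
-- that f maps it to y
fiber : {A : Type} {B : Type} -> (A -> B) -> B -> Type
fiber {A} {B} f y = (x : A) * Path (f x) y

-- f is an equivalence when every one of its fibers is contractible
isEquiv : {A : Type} {B : Type} -> (A -> B) -> Type
isEquiv {A} {B} f = (y : B) -> isContr (fiber f y)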

Using this notion of equivalence we can prove that every equivalence has an inverse, by taking the first element of the centre of contraction for every fiber:

inverse : {A : Type} {B : Type} {f : A -> B} -> isEquiv f -> B -> A
inverse eqv y = (eqv y) .1 .1

Furthermore, this function is an actual inverse:

section : {A : Type} {B : Type} (f : A -> B) (eqv : isEquiv f)
       -> Path (\x -> f (inverse eqv x)) id
section f eqv i y = (eqv y) .1 .2 i

We can also formulate the requirement that a function has contractible fibers cubically: A function is an equivalence iff every one of its partial fibers is extensible.

Glueing & Univalence

Since I like quoting the impenetrable definitions of the paper, glueing expresses that "extensibility is invariant under equivalence". Concretely, though, it's better to think that the $\mathrm{Glue}$ operation "glues" together a partial type $T$ onto a total type $A$ (which we call the base) to get a total type which extends $T$. We can't do this freely, though, so we require an extra datum: A (partial) equivalence between $T$ and $A$.

Glue : (A : Type) {phi : I} -> Partial phi ((T : Type) * Equiv T A) -> Type

The type $\mathrm{Glue}\ A\ [\phi \to (T, f)]$ extends $T$ in the sense that, when $\phi = i1$, $\mathrm{Glue}\ A\ [\phi \to (T, f)] = T$.

The "user-friendly" typing rule for Glue is as presented above. Internally we separate the type $T$ from the equivalences $f$ to make defining composition in Glue simpler. These types come with a constructor, $\mathrm{glue}$, which says that, given an inhabitant $t : \mathrm{PartialP}\ \phi\ T$, and a total element $a : A$ which extends the image of $f\ t$ (the equivalence), we can make an inhabitatnt of $\mathrm{Glue}\ A\ [\phi \to (T, f)]$.

Conversely there is a projection, $\mathrm{unglue}$, which extracts a value of $A$ from a value of $\mathrm{Glue}\ A\ [\phi \to (T, f)]$. When applied to an element constructed with $\mathrm{glue}$, unglueing simply extracts it; When applied to a neutral value, as long as $\phi = i1$, the value of the glued type will be a value of $T$, and the equivalence is defined; We can then apply the equivalence to get a value of $A$.

Using the boundary conditions for $\mathrm{Glue}$ we can define, from any equivalence $A \simeq B$, a path $A \equiv B$.

univalence : {A : Type} {B : Type} -> Equiv A B -> Path A B
univalence {A} {B} equiv i =
  Glue B (\[ (i = i0) -> (A, equiv),
             (i = i1) -> (B, the B, idEquiv {B}) ]) 

For the proof that transporting along this path has the effect of applying the equivalence, I'll need to handwave some stuff about the behaviour of transport in $\mathrm{Glue}$. First, we can illustrate the Glue done above as the dotted line in the square below:

This square represents the glueing used for univalence. The left and right sides are equivalences.

How would one go about transporting an element across the dotted line there? Well, I have a three-step program, which, since we're talking about squares, has to be rounded up to a neat 4. Suppose we have an element $x : A$ which we want to turn into an inhabitant of $B$.

  • First, we can apply the equivalence $\mathrm{equiv}$ to $x$, getting us an element $\mathrm{equiv}.1\ x : B$. In the ideal world we'd be done here, but, in a more general case, we still have to do the other three filled-in lines.

  • Then, we transport $\mathrm{equiv}.1\ x$ along the path $\lambda i. B$, getting an element $\mathrm{comp}\ (\lambda i. B)\ (\lambda i [])\ (\mathrm{equiv}.1\ x) : B$.

  • Finally we can apply the inverse of the identity equivalence (which is, again, the identity) which does not alter what we've done so far.

We'd be done here, but since transport is a special case of composition, we need to compose along the line $\lambda i. B$ with the faces of the overall composition to get a proper element of the type $B$. Of course, in this case, the faces are trivial and the system is empty, but we still have to do it.

To construct a $\mathrm{Path}\ (\mathrm{transp}\ (\lambda i. \mathrm{univalence}\ f\ i))\ f.1$, there is a bit of cubical trickery which needs to be done. This proof is commented in the repository here, so I recommend you read it there for the details. The short of it is that $\mathrm{univalence}$ plus this path, which we call $\mathrm{univalence}\beta$, implies the full univalence axiom, namely that $(A \simeq B) \simeq (A \equiv B)$.

Proofs using univalence

With univalence, and a proof that isomorphisms give rise to equivalences, we can get to proving some stuff about types! That's exciting, right? I'm excited. The proof that isomorphisms give rise to equivalences is, uh, very complicated, so I won't explain it here. Full disclosure, it seems like this proof is a bit of folklore: I got it from the cubicaltt repo, and I think the version in Cubical Agda's base library is the same!

One very simple use of univalence, which doesn't require more fancy types, is proving that the universe $\mathscr{U}$ is not a set, in the sense of HoTT. Recall that a set (or h-set, to be more precise), is a type where any parallel equalities are themselves equal. In a type:

isHSet : Type -> Type
isHSet A = {x : A} {y : A} (p : Path x y) (q : Path x y) -> Path p q

We are going to prove that any inhabitant of $\mathrm{isHSet}\ \mathscr{U}$ is baloney. For this, we must define the type of booleans, the discrete space with two points:

data Bool : Type where
  true : Bool
  false : Bool

First, we can prove that $\mathrm{true} \not\equiv \mathrm{false}$. For this, suppose it were: given a proof $p : \mathrm{true} \equiv \mathrm{false}$, we can build the path $\lambda i. \mathrm{if}\ p(i)\ \mathrm{then}\ \mathrm{Bool}\ \mathrm{else}\ \bot$, which connects $\mathrm{Bool}$ (an arbitrary choice) and $\bot$. Transporting $\mathrm{true}$ (another arbitrary choice) along this path gives us an inhabitant $\mathrm{transp}\ (\lambda i. \dots)\ \mathrm{true} : \bot$, which is what we wanted.^[No, there is no reason to use the QED symbol here. It's my blog, though!] $\blacksquare$
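Here is that argument as a code sketch. I'm assuming an empty type Bottom, which hasn't appeared in this post, and writing the if/then/else from the prose as a pattern-matching helper:

-- a large elimination of Bool into the universe
discrim : Bool -> Type
discrim = \case
  true -> Bool
  false -> Bottom

-- transporting true along the discriminating line lands in Bottom
trueNotFalse : Path true false -> Bottom
trueNotFalse p = transp (\i -> discrim (p i)) true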

Define the function $\mathrm{not}\ x = \mathrm{if}\ x\ \mathrm{then}\ \mathrm{false}\ \mathrm{else}\ \mathrm{true}$. By induction, one can prove that $\mathrm{not}\ (\mathrm{not}\ x) \equiv x$ for any boolean, and thus $\mathrm{not}$ is its own inverse. Appealing to the fact that isomorphisms are equivalences, and then to univalence, we get a path $\mathrm{notp} : \mathrm{Bool} \equiv \mathrm{Bool}$ such that $\mathrm{transp}\ \mathrm{notp}\ x = \mathrm{not}\ x$.
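In code, the function and its involution proof might be sketched as follows (assuming the dependent \case used for higher inductive types below also works on plain Bool):

not : Bool -> Bool
not = \case
  true -> false
  false -> true

-- not is its own inverse, by cases; both sides compute away
notInvol : (x : Bool) -> Path (not (not x)) x
notInvol = \case
  true -> refl
  false -> refl

Getting from notInvol to the path notp then goes through the folklore iso-to-equivalence proof mentioned earlier, followed by univalence, so I'll leave that step abstract.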

Now we assume an inhabitant $\mathrm{sure}$ (to be read in a very sarcastic voice) of $\mathrm{isHSet}\ \mathscr{U}$ and derive a contradiction, that is, an inhabitant of $\bot$. The path $\mathrm{sure}\ \mathrm{notp}\ \mathrm{refl}$ connects $\mathrm{notp}$ and $\mathrm{refl}$ in the direction $i$. From this we build the path $\lambda i. \mathrm{transp}\ (\mathrm{sure}\ \mathrm{notp}\ \mathrm{refl})(i)\ \mathrm{false}$, which has as endpoints $\mathrm{true}$ and $\mathrm{false}$. To see this, compute:

  • For $i = i0$, we have $\mathrm{transp}\ \mathrm{notp}\ \mathrm{false} = \mathrm{not}\ \mathrm{false} = \mathrm{true}$.
  • For $i = i1$, we have $\mathrm{transp}\ \mathrm{refl}\ \mathrm{false} = \mathrm{false}$.

Applying the proof that $\mathrm{true} \not\equiv \mathrm{false}$ we have a contradiction, which is exactly what we wanted.^[I know, I know, I have to stop. Did you know I had to add the word "exactly" there so the paragraph overflowed onto the next line and the QED symbol would show up right? It's terrible!]$\blacksquare$

"Big deal," I hear you say. "So what, the universe isn't a set?" Well, you're right. This isn't an exciting fact, or an exciting proof. To read. Getting this to go through was incredibly satisfying. But if we want to prove non-trivial facts using univalence, we're going to need a bigger boat universe. Ours doesn't have enough types.

Higher Induction

To say that our universe $\mathscr{U}$ with its infinitely many types is lacking some is... weird, I'll admit. However, it's missing a lot of them! A countably infinite amount, in fact. While we have all inductive types, we only have the zero-dimensional inductive types, and not the higher inductive types!

I've written about these a bit in the previous post, about induction. In short, while inductive types allow us to define types with points, higher inductive types let us define types with points and paths. Full disclosure: at the time of writing, the implementation of HITs in cubical is partial, in that their fibrancy structure is a big error. However, we can still write some simple proofs involving them.

The Interval

Wait, didn't we talk about this before? No, no, this is the right interval. We're still on track.

The $\mathrm{Interval}$ is the inductive type freely generated by two constructors, $\mathrm{ii0}$ and $\mathrm{ii1}$, and a path $\mathrm{seg}$ connecting them. Well, that's the theory, but the reality is a bit different. In order to support eliminating (read: pattern matching on) inductive types, we can't simply assume paths exist, even in cubical type theory. What we end up with instead is a constructor parametrised by some interval (that's $\mathbb{I}$) variables, and an attached boundary condition.

In the case of the Interval, we have this definition:

data Interval : Type where
  ii0 : Interval
  ii1 : Interval
  seg i : Interval [ (i = i0) -> ii0
                   , (i = i1) -> ii1
                   ]

This says that seg i0 is definitionally equal to ii0, and seg i1 is definitionally equal to ii1. We can get a path connecting them by abstracting over the $i$ variable: $\lambda i. \mathrm{seg}\ i : \mathrm{Path}\ \mathrm{ii0}\ \mathrm{ii1}$. To pattern match on an element of the interval we need three (really, four, but one is details---and automated) things:

  • A case for c0 : P ii0
  • A case for c1 : P ii1
  • A proof cseg which says the cases for c0 and c1 agree.

To express the type of cseg, we need to power up our path types a bit. Conceptually, just like a $\mathrm{Path}$ is a specialised version of $\mathbb{I} \to A$, we need a dependent path, called $\mathrm{PathP}$, which specialises $\prod{(i : \mathbb{I})} A\ i$; that is, the type of the endpoints is allowed to depend on the interval variable. With that, the type of cseg becomes PathP (\i -> P (seg i)) c0 c1, since c0 : P (seg i0) and c1 : P (seg i1).

As for that fourth thing I mentioned? In addition to preserving each of the constructor data, a map between Interval-algebras needs to be fibrancy preserving: Compositions in the domain are mapped to the "appropriate" compositions in the range. In implementations of cubical type theory, this is automatic, since the range has a fibrancy structure (since it is in $\mathscr{U}$), and preserving compositions can be done automatically and uniformly.

Since we already have an interval pretype $\mathbb{I}$, having an interval type isn't too interesting. One thing we can do is prove function extensionality... again... reproducing an argument from the HoTT book.

iFunext : {A : Type} {B : A -> Type}
          (f : (x : A) -> B x)
          (g : (x : A) -> B x)
       -> ((x : A) -> Path (f x) (g x)) -> Path f g
iFunext f g p i = h' (seg i) where
  h : (x : A) -> Interval -> B x
  h x = \case
    ii0 -> f x
    ii1 -> g x
    seg i -> p x i

  h' : Interval -> (x : A) -> B x
  h' i x = h x i 

I'm pretty sure that I had reproduced this proof in the previous blog post as well, so you can check there for a more thorough explanation. Let's move on to some more exciting higher inductive types.

Synthetic Homotopy Theory: $\mathbb{S}^1$

I am not a homotopy theorist, but I am a homotopy type theorist, which means I am qualified to prove some facts about spaces. A particularly simple space, which is nonetheless non-trivial, is the circle, $\mathbb{S}^1$, the type freely generated by a point and a loop.

data S1 : Type where
  base : S1
  loop i : S1 [ (i = i1) -> base, (i = i0) -> base ]

We can illustrate this type like this:

The circle.

The elimination principle for this is just like for the interval. We need a point b : P base and a dependent path l : PathP (\i -> P (loop i)) b b (since loop i0 = loop i1 = base the dependent path is not strictly necessary). For example, to define a function $\mathbb{S}^1 \to \mathscr{U}$, we need to pick a type $X : \mathscr{U}$ and a path $X \equiv X$. All non-trivial paths in types are going to be generated by univalence on some interesting equivalence.

Allow me one paragraph's worth of digression before we get to the point. The type of integers is defined as the coproduct $\mathbb{N} + \mathbb{N}$^[In the implementation, this definition is unfolded], where $\mathrm{inl}\ x$ is interpreted as $+x$ and $\mathrm{inr}\ x$ as $-(x + 1)$. With this representation, one can define the functions $\mathrm{sucZ}\ x = x + 1$ and $\mathrm{predZ}\ x = x - 1$, and prove that they are inverses, such that $\mathrm{sucZ}$ is an autoequivalence of $\mathbb{Z}$.

Consider the function $\mathrm{helix} : \mathbb{S}^1 \to \mathscr{U}$ which maps $\mathrm{base}$ to $\mathbb{Z}$ and $\mathrm{loop}(i)$ to $(\mathrm{univalence}\ \mathrm{sucZ})(i)$. It's easy to check that this definition is type-correct (and boundary-correct), so we can apply it to elements of the circle and get back types and equivalences. Now we can define the function $\mathrm{winding} : \mathrm{base} \equiv \mathrm{base} \to \mathbb{Z}$ by

winding : Path base base -> Int
winding p = transp (\i -> helix (p i)) (pos zero)

This map counts, for any loop $x : \mathrm{base} \equiv \mathrm{base}$, the number of times x "goes around" the $\mathrm{loop}$. For example, going around it once:

windingLoop : Path (winding (\i -> loop i)) (pos (succ zero))
windingLoop = refl

or once in the other direction:

windingSymLoop : Path (winding (\i -> loop (inot i))) (neg zero)
windingSymLoop = refl

or no times at all:

windingBase : Path (winding (\i -> base)) (pos zero)
windingBase = refl

If we also write a function $wind : \mathbb{Z} \to \mathrm{base} \equiv \mathrm{base}$ and prove that they are inverses, what we end up with is a fully synthetic, machine-checked proof that $\Omega(\mathbb{S}^1) \equiv \mathbb{Z}$. Of course, we could also define $\mathbb{Z}$ as $\Omega(\mathbb{S}^1)$, but in that case the proof is a lot less interesting!

Category Theory: The Homotopy Pushout

Category theory has the notion of limits and colimits of diagrams, which give rise to lots of important concepts. A full explanation of colimits is out of scope here, but it should suffice to say that if we want to do mathematics internally to cubical type theory, a complete and co-complete category is a fine setting to do it. Given a diagram like the first one below, a cocone under it is a diagram like the second, which commutes. The pushout of a span is its colimit, that is, the "smallest" such cocone.

A **span** is a triple of types $A$, $B$, $C$ with maps $f : A \to B$ and $g : A \to C$.
A **cocone under a span** is a type $P$ with inclusions $i_1 : B \to P$ and $i_2 : C \to P$ such that $i_1 \circ f = i_2 \circ g$.

Normal Martin-Löf type theory does not give us the tools to define pushouts, but, as you will have guessed, cubical type theory does. We can define pushouts as a higher inductive type, like this:

data Pushout {A B C : Type} (f : A -> B) (g : A -> C) : Type where
  inl : (x : B) -> Pushout f g
  inr : (y : C) -> Pushout f g
  push i : (a : A) -> Pushout f g [ (i = i0) -> inl (f a)
                                  , (i = i1) -> inr (g a) ]

The push path constructor is parametrised by an element $a : A$ and an endpoint $i : \mathbb{I}$. Applying function extensionality, one can turn this into a path between $\mathrm{inl} \circ f$ and $\mathrm{inr} \circ g$, which is what we need for the diagram to commute. Homotopy pushouts are very general and can be used to define a number of homotopy-theoretic constructions. Quoting the HoTT book, section 6.8, we have:

  • The pushout of $1 \leftarrow A \to 1$ is the suspension $\Sigma A$
  • The pushout of $A \leftarrow A \times B \to B$ is the join of $A$ and $B$, written $A * B$
  • The pushout of $1 \leftarrow A \xrightarrow{f} B$ is the cone or cofiber of $f$

The big file with all the proofs in cubical features a proof that the suspension $\Sigma A$ defined directly as a HIT is the same as the one defined by the pushout of $1 \leftarrow A \to 1$.
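As a sketch, that first pushout can be written directly with the type above (assuming a unit type spelled Unit with constructor tt, which is a guess at the proof file's naming):

-- the suspension of A, as the pushout of 1 <- A -> 1
Susp : Type -> Type
Susp A = Pushout {A} {Unit} {Unit} (\x -> tt) (\x -> tt)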

But Why?

The motivation for cubical type theory was made explicit two posts ago, when I was talking about equality for the first time, but it's worth mentioning it again, especially after all^[Actually, the most complex part of Cubical Type Theory is the definition of composition for $\mathrm{Glue}$, which is far too hardcore for a blog post, even for its appendix.] of its complexity has been exposed like this. And let me be clear, it is very complex. No amount of handwaving away details can make cubical type theory seem like a "natural" extension: it's not something we found, like the groupoid interpretation of type theory. It's something we built.

And what did we find? A type system with great computational behaviour for all of Homotopy Type Theory. In particular, an argument based on the cubical set model of type theory, rather than on the syntax, proves that cubical type theory enjoys canonicity: Every boolean in the empty context is strictly equal to either $\mathrm{true}$ or $\mathrm{false}$, and other types enjoy similar properties for their canonical elements.

The big failing of Homotopy Type Theory before the cubes came to save us was that there were closed inhabitants of types not equal to any of their constructors. In particular, any construction with path induction would get stuck on the terms $\mathrm{ua}(e)$ of the univalence axiom. Cubical type theory solves this twofold: It gives us ways of working with paths directly, using operations on the interval and composition, and it explains what the computational behaviour of $\mathrm{univalence}$ is.

So, if you ask me, the complexity is justified. It's one of those things that took me a while to get my head around, but where the learning process and the result (knowing about cubes) were beneficial. And god, did it take a while. The first time I encountered the cubical type theory paper was in mid 2019, almost two years ago! It took me that long to go from "what the hell is this" to "this is neat but it confuses me" to "I understand this" to "I can implement this" (we are here).

Writing about it has been my white whale for that long---I'll need a new one, suggestions welcome! Maybe I should write a monad tutorial? Heard those are complicated, too.

If you made it this far, I thank you deeply. This post is a behemoth! In fact the next word is the 7000th, which almost makes this post longer than my two previous longest posts combined! If you haven't abandoned me yet, I swear: I will never make you read this much again. However, if you made it this far and understood everything, I only have one thing to say: Go forth, dear reader, and fill those cubes.

Well, that sounded weird. I won't say it again.


Appendix: CCC

CCC stands for "computing cubical compositions", I'm so sorry

Functions

Now we add one step of generalisation, and consider transport in a line of $\prod{(x : A)} B\ x$, where $A : \mathbb{I} \to \mathscr{U}$ as before but $B : \prod{(i : \mathbb{I})} A(i) \to \mathscr{U}$. A given $f : (x : A(i0)) \to B(i0)\ x$ will become, through a trick of magic, a function $(x : A(i1)) \to B(i1)\ x$.

The first step is to define $x\prime : A(i0)$ as before, apply $f$ to get an element of $B(i0)\ x\prime$, then cast the result of the application along $\lambda i. B(i)$... Wait. The function is dependent. Can we cast along $\lambda i. B(i)\ x$? No, not quite: $x : A(i1)$, but we need an element of $A(i)$. $\lambda i. B(i)\ x\prime$ won't do either, since $x\prime$ has type $A(i0)$.

What we need is a line, dependent on $i$, which connects $x\prime$ and $x$, call it $v$; then we can transport along $\lambda i. B(i)\ v(i)$ to get the element of $B(i1)\ x$ which we want. The filler of the composition which generated $x\prime$, run backwards, is exactly what we need. Define $v(i) = \mathrm{fill}\ (\lambda j. A (\neg j))\ (\lambda j [])\ x\ (\neg i)$, so that $v(i0) = x\prime$ and $v(i1) = x$; then we may define $y = \mathrm{comp}\ (\lambda i. B(i)\ v(i)) \ (\lambda k [])\ (f\ x\prime)$, and the composition is $\lambda x. y$ as before.

To generalise this to non-empty compositions only requires a very small change. If you think of functions as extensional black boxes, like we do, one thing to realise is that it doesn't really matter how we turn $x$ into an argument to the function, as long as we do; The only thing which needs to respect the constraints of the composition is the overall function, that is, its result. So we can simply take $x\prime$, $v(i)$ as in the case for dependent compositions and define the full composition to be:

$$ \mathrm{comp}\ (\lambda i. \prod{(x : A(i))} B(i)\ x)\ [\phi \to u]\ a0 =\
\lambda x. \mathrm{comp}\ (\lambda i. B(i)\ v(i))\ [\phi \to u\ v]\ (a0\ x\prime) $$

Note the light abuse of notation we use in the mathematics; More properly, the system of sides in the resulting composition would be written $\lambda i\ x. u(i)(x)\ v(i)$.

Pairs

Assume we're given an element $p : \sum{(x : A)} B\ x$, and take $x = p.1$ and $y = p.2$. Just like in the case for dependent functions, $A$ is a line and $B$ is a dependent line; what we want is an element $p\prime : \sum{(x : A(i1))} B(i1)\ x$.

To define $\mathrm{comp}\ (\lambda i. \sum{(x : A(i))} B(i)\ x)\ [\phi \to u]\ p$, first define $v(i) = \mathrm{fill}\ A\ [\phi \to u.1]\ x$, which is a line connecting $x$ and $\mathrm{comp}\ A\ [\phi \to u.1]\ x$. For the second element we'll do the same thing as we did for dependent functions, and define $y\prime = \mathrm{comp}\ (\lambda i. B(i)\ v(i))\ [\phi \to u.2]\ y$. Then we can define composition as follows:

$$\mathrm{comp}\ (\lambda i. \textstyle\sum{(x : A(i))} B(i)\ x)\ [\phi \to u]\ p = (v(i1), y\prime)$$


  1. At the time of writing, in the very early AM between Saturday and Sunday, the only thing missing is the implementation of composition for higher inductive types. However, this is mostly because I'm hella bored of writing code and wanted to write words instead. This way I can have more fun! ↩︎

  2. If you, like me, are always confused by why $a \land b$ is min and $a \lor b$ is max, check out these Desmos links: min and max. Keep these in mind the next time you're confused :) ↩︎

  3. As a funny sidenote, the object in a category (if it exists) which corresponds to the type-theoretical universe of propositions is called the subobject classifier, written $\Omega$. So $[]$ is a family of maps $\mathbb{I} \to \Omega_{\omega}$. If only we could fit another $\Omega$ in there... ↩︎

  4. By convention we call the dependent versions of cubical primitives their name suffixed with a big P. PathP, PartialP, etc. Don't ask me why. ↩︎