title | date |
---|---|
Optimisation through Constraint Propagation | August 06, 2017 |
Constraint propagation is a new optimisation proposed for implementation in the Urn compiler1. It is a variation on the idea of flow-sensitive typing in that it is not applied to increasing program safety, rather being used to improve speed.
The Urn compiler is decently fast for being implemented in Lua.
Currently, it manages to compile itself (and a decent chunk of the
standard library) in about 4.5 seconds (when using LuaJIT; When using
the lua.org interpreter, this time roughly doubles). Looking at
a call-stack profile of the compiler, we notice a very interesting data
point: about 11% of compiler runtime is spent in the (type)
function.
There are two ways to fix this: Either we introduce a type system (which
is insanely hard to do for a language as dynamic as Urn - or Lisp in
general) or we reduce the number of calls to (type)
by means of
optimisation. Our current plan is to do the latter.
The proposed solution is to collect all the branches that the program has taken to end up in the state it currently is. Thus, every branch grows the set of "constraints" - the predicates which have been invoked to get the program here.
Most useful predicates involve a variable: Checking if it is or isn't nil, if is positive or negative, even or odd, a list or a string, and etc. However, when talking about a single variable, this test only has to be performed once (in general - mutating the variable invalidates the set of collected constraints), and their truthiness can be kept, by the compiler, for later use.
As an example, consider the following code. It has three branches, all
of which imply something different about the type of the variable x
.
(cond
[(list? x)] ; first case
[(string? x)] ; second case
[(number? x)]) ; third case
If, in the first case, the program then evaluated (car x)
, it'd end up
doing a redundant type check. (car)
, is, in the standard library,
implemented like so:
(defun car (x)
(assert-type! x list)
(.> x 0))
assert-type!
is merely a macro to make checking the types of arguments
more convenient. Let's make the example of branching code a bit more
complicated by making it take and print the car
of the list.
(cond
[(list? x)
(print! (car x))])
; other branches elided for clarity
To see how constraint propagation would aid the runtime performance of this code, let's play optimiser for a bit, and see what this code would end up looking like at each step.
First, (car x)
is inlined.
(cond
[(list? x)
(print! (progn (assert-type! x list)
(.> x 0)))])
assert-type!
is expanded, and the problem becomes apparent: the type
of x
is being computed twice!
(cond
[(list? x)
(print! (progn (if (! (list? x))
(error! "the argument x is not a list"))
(.> x 0)))])
If the compiler had constraint propagation (and the associated code motions), this code could be simplified further.
(cond
[(list? x)
(print! (.> x 0))])
Seeing as we already know that (list? x)
is true, we don't need to
test anymore, and the conditional can be entirely eliminated. Figuring
out (! (list? x))
from (list? x)
is entirely trivial constant
folding (the compiler already does it)
This code is optimal. The (list? x)
test can't be eliminated because
nothing else is known about x
. If its value were statically known, the
compiler could eliminate the branch and invocation of (car x)
completely by constant propagation and folding ((car)
is, type
assertion notwithstanding, a pure function - it returns the same results
for the same inputs. Thus, it is safe to execute at compile time)
In this section I'm going to outline a very simple implementation of the constraint propagation algorithm to be employed in the Urn compiler. It'll work on a simple Lisp with no quoting or macros (thus, basically the lambda calculus).
(lambda (var1 var2) exp) ; λ-abstraction
(foo bar baz) ; procedure application
var ; variable reference
(list x y z) ; list
t, nil ; boolean
(cond [t1 b1] [t2 b2]) ; conditional
The language has very simple semantics. It has three kinds of values (closures, lists and booleans), and only a couple reduction rules. The evaluation rules are presented as an interpretation function (in Urn, not the language itself).
(defun interpret (x env)
(case x
[(lambda ?params . ?body)
`(:closure ,params ,body ,(copy env))] ; 1
[(list . ?xs)
(map (cut interpret <> env) xs)] ; 2
[t true] [nil false] ; 3
[(cond . ?alts) ; 4
(interpret
(block (map (lambda (alt)
(when (interpret (car alt) env)
(break (cdr alt))))))
env)]
[(?fn . ?args)
(case (eval fn env)
[(:closure ?params ?body ?cl-env) ; 5
(map (lambda (a k)
(.<! cl-env (symbol->string a) (interpret k env)))
params args)
(last (map (cut interpret <> env) body))]
[_ (error! $"not a procedure: ${fn}")])]
[else (.> env (symbol->string x))]))
Collecting constraints in a language as simple as this is fairly easy, so here's an implementation.
(defun collect-constraints (expr (constrs '()))
(case expr
[(lambda ?params . ?body)
`(:constraints (lambda ,params
,@(map (cut collect-constraints <> constrs) body))
,constrs)]
Lambda expressions incur no additional constraints, so the inner expressions (namely, the body) receive the old set. The same is true for lists:
[(list . ?xs)
`(:constraints (list ,@(map (cut collect-constraints <> constrs) xs))
,constrs)]
Booleans are simpler:
[t `(:constraints ,'t ,constrs)]
[nil `(:constraints ,'nil ,constrs)]
Since there are no sub-expressions to go through, we only associate the constraints with the boolean values.
Conditionals are where the real work happens. For each case, we add that case's test as a constraint in its body.
[(cond . ?alts)
`(:constraints
(cond
,@(map (lambda (x)
`(,(collect-constraints (car x) constrs)
,(collect-constraints (cadr x) (cons (car x) constrs))))
alts))
,constrs)]
Applications are as simple as lists. Note that we make no distinction between valid applications and invalid ones, and just tag both.
[(?fn . ?args)
`(:constraints
(,(collect-constraints fn constrs)
,@(map (cut collect-constraints <> constrs)
args))
,constrs)]
References are also straightforward:
[else `(:constraints ,expr ,constrs)]))
That's it! Now, this information can be exploited to select a case branch at compile time, and eliminate the overhead of performing the test again.
This is really easy to do in a compiler that already has constant folding of alternatives. All we have to do is associate constraints to truthy values. For instance:
(defun fold-on-constraints (x)
(case x
[((:constraints ?e ?x)
:when (known? e x))
't]
[else x]))
That's it! We check if the expression is in the set of known constraints, and if so, reduce it to true. Then, the constant folding code will take care of eliminating the redundant branches.
This is a really complicated question. The Urn core language, unfortunately, is a tad more complicated, as is the existing optimiser. Collecting constraints and eliminating tests would be in completely different parts of the compiler.
There is also a series of code motions that need to be in place for constraints to be propagated optimally, especially when panic edges are involved. Fortunately, these are all simple to implement, but it's still a whole lot of work.
I don't feel confident setting a specific timeframe for this, but I will post more blags on this topic. It's fascinating (for me, at least) and will hopefully make the compiler faster!