title | date |
---|---|
Multimethods in Urn | August 15, 2017 |
multimethod, noun. A procedure which decides runtime behaviour based
on the types of its arguments.
At some point, most programming language designers realise that they've outgrown the language's original feature set and must somehow expand it. Sometimes this expansion is painless: for example, if the language already had features in place to facilitate it, such as type classes or message passing.
In our case, however, we had to decide on and implement a performant
system for extensibility in the standard library, from scratch. For
a while, Urn was using Lua's scheme for modifying the behaviour of
standard library functions: metamethods in metatables. For the
uninitiated, Lua tables can have meta-tables attached to modify their
behaviour with respect to several language features. As an example, the
metamethod `__add` controls how Lua will add two tables.
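For readers who have not met metatables before, here is a minimal Lua sketch of the `__add` case. The names `Vec` and `vec` are hypothetical and exist only for this illustration; nothing here is taken from the Urn standard library.

```lua
-- Hypothetical example: a metatable whose __add metamethod defines "+".
local Vec = {}
Vec.__add = function(a, b)
  -- The result gets the same metatable so it can be added again.
  return setmetatable({ x = a.x + b.x, y = a.y + b.y }, Vec)
end

-- The constructor is what attaches the metatable to each new instance.
local function vec(x, y)
  return setmetatable({ x = x, y = y }, Vec)
end

local sum = vec(1, 2) + vec(3, 4)
print(sum.x, sum.y)
```

Note that the behaviour reaches a value only because its constructor called `setmetatable` on it; a plain table with the same fields would not support `+`.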
However, this was not satisfactory. The most important reason is that metamethods are associated with particular object instances rather than with the types themselves. This meant that every operation you wanted to customise had to be set up in one big go, inside the constructor. Consider the constructor for hash-sets as it was implemented before the addition of multimethods.
(defun make-set (hash-function)
  (let* [(hash (or hash-function id))]
    (setmetatable
      { :tag "set"
        :hash hash
        :data {} }
      { :--pretty-print
        (lambda (x)
          (.. "«hash-set: " (concat (map pretty (set->list x)) " ") "»"))
        :--compare #| elided for brevity |# })))
That second table, the metatable, is entirely noise. The fact that constructors also had to specify behaviour, instead of just data, was annoying from a code style point of view and terrible from a reuse point of view. Behaviour is closely tied to the implementation - remember that metamethods are tied to the instance. To extend the behaviour of standard library functions (which you can't redefine) for a type you do not control (whose constructor you also cannot override), you suddenly need to wrap the constructor and add your own metamethods.
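The wrapping problem can be sketched in plain Lua. Everything below is hypothetical: `make_point` stands in for a constructor we do not control, and `__tostring` stands in for Urn's own pretty-printing hook, which this sketch does not try to reproduce.

```lua
-- Hypothetical third-party constructor that we cannot edit.
local function make_point(x, y)
  return setmetatable({ tag = "point", x = x, y = y }, {})
end

-- To add behaviour we must wrap the constructor and patch each instance.
local function make_printable_point(x, y)
  local p = make_point(x, y)
  getmetatable(p).__tostring = function(self)
    return "(" .. self.x .. ", " .. self.y .. ")"
  end
  return p
end

local wrapped = make_printable_point(1, 2)
local plain = make_point(1, 2) -- built elsewhere, so it misses the behaviour

print(tostring(wrapped))
print(tostring(plain))
```

Only values built through the wrapper pick up the new behaviour, so any code that still calls the original constructor silently produces values without it.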
Displeased with the situation as it stood, I set out to discover what other Lisps did, and it seemed like the consensus solution was to implement open multimethods. And so we did.
Multimethods - or multiple dispatch in general - are one of the best solutions to the expression problem. We can easily add new types, and new operations on existing types - and, most importantly, we can do so without touching any existing code.
Our implementation is, like almost everything in Urn, a combination of
clever (ab)use of macros, tables and functions. A method is represented
as a table - more specifically, an n-ary tree of possible cases, with
a metamethod, `__call`, which means multimethods can be called and
passed around like regular functions - they are first-class.
Upon calling a multimethod, it'll look up the correct method body to call for the given arguments - or the default method, or throw an error, if no default method is provided - and tail-call that, with all the arguments.
Before diving into the ridiculously simple implementation, let's look at a handful of examples.
Pretty printing is, quite possibly, the simplest application of multiple
dispatch to extensibility. As of ba289d2d, the standard library
implementation of `pretty` is a multimethod.
Before, the implementation[1] would perform a series of type tests and decide on the behaviour, including testing if the given object had a metatable which overrides the pretty-printing behaviour.
The new implementation is significantly shorter, so much so that I'm comfortable pasting it here.
(defgeneric pretty (x)
  "Pretty-print a value.")
That's it! All of the logic that used to exist is now provided by the
`defgeneric` macro, and adding support for your types is as simple as
using `defmethod`.[2]
(defmethod (pretty string) (x)
  (format "%q" x))
As another example, let's define - and assume the following are separate modules - a new type, and add pretty printing support for that.
; Module A - A box.
(defun box (x)
  { :tag "box"
    :value x })
The Urn function `type` will look for a `tag` element in tables and
report that as the type if it is present, and that function is what the
multimethod infrastructure uses to determine the correct body to call.
This means that all we need to do if we want to add support for
pretty-printing boxes is use `defmethod` again!
(defmethod (pretty box) (x) "🎁")
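As a rough illustration of the tag lookup described above, a tag-aware type function might look like the following in Lua. This is a sketch under assumptions, not the actual Urn implementation: `urn_type` is a hypothetical name, and the real function may handle more cases.

```lua
-- Hypothetical sketch: report a table's tag field as its type when present,
-- and fall back to Lua's built-in type otherwise.
local function urn_type(value)
  local ty = type(value)
  if ty == "table" and value.tag ~= nil then
    return value.tag
  end
  return ty
end

print(urn_type({ tag = "box", value = 1 }))
print(urn_type({}))
print(urn_type("hello"))
```

Because dispatch keys on this string, any table that carries a `tag` field takes part in multimethod dispatch without further registration.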
A more complicated application of multiple dispatch for extensibility is
the implementation of the `eq?` method in the standard library.
Before,[3] based on a series of conditionals, the equality test was
chosen at runtime.
Anyone with experience optimising code is wincing at the mere thought of
this code.
The new implementation of `eq?` is also comically short - a mere 2 lines
for the definition, and only a handful of lines for all the previously
existing cases.
(defgeneric eq? (x y)
  "Compare values for equality deeply.")
(defmethod (eq? symbol symbol) (x y)
  (= (get-idx x :contents) (get-idx y :contents)))
(defmethod (eq? string symbol) (x y) (= x (get-idx y :contents)))
(defmethod (eq? symbol string) (x y) (= (get-idx x :contents) y))
If we were to add support for comparing boxes, for example, the implementation would be similarly short.
(defmethod (eq? box box) (x y)
  (= (.> x :value) (.> y :value)))
`defgeneric` and `defmethod` are, quite clearly, macros. However,
contrary to what one would expect, both their implementations are
quite simple.
(defmacro defgeneric (name ll &attrs)
  (let* [(this (gensym 'this))
         (method (gensym 'method))]
    `(define ,name
       ,@attrs
       (setmetatable
         { :lookup {} }
         { :__call (lambda (,this ,@ll)
                     (let* [(,method (deep-get ,this :lookup ,@(map (lambda (x)
                                                                      `(type ,x)) ll)))]
                       (unless ,method
                         (if (get-idx ,this :default)
                           (set! ,method (get-idx ,this :default))
                           (error "elided for brevity")))
                       (,method ,@ll))) }))))
Everything `defgeneric` has to do is define a top-level symbol to hold
the multimethod table, and generate, at compile time, a lookup function
specialised for the correct number of arguments. In a language without
macros, multimethod calls would have to - at runtime - loop over the
provided arguments, take their types, and access the correct elements in
the table.
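To make the contrast concrete, here is a sketch in plain Lua of the macro-free approach just described: a callable table that walks the lookup tree one argument at a time, at runtime. The names `make_generic`, `add_method` and `type_of` are hypothetical and are not part of Urn; the `lookup` and `default` fields mirror the ones used by the macro above.

```lua
-- Tag-aware type lookup, standing in for Urn's type function (assumption).
local function type_of(v)
  if type(v) == "table" and v.tag ~= nil then return v.tag end
  return type(v)
end

-- Build a callable table that dispatches on the types of all its arguments.
local function make_generic(name)
  return setmetatable({ lookup = {} }, {
    __call = function(self, ...)
      local node = self.lookup
      -- Runtime loop over the arguments: the work defgeneric unrolls.
      for i = 1, select("#", ...) do
        if type(node) ~= "table" then node = nil; break end
        node = node[type_of((select(i, ...)))]
      end
      local method = type(node) == "function" and node or self.default
      if not method then
        error("No matching method to call for " .. name)
      end
      return method(...)
    end,
  })
end

-- Store a method body at the leaf named by a list of type names.
local function add_method(generic, types, body)
  local node = generic.lookup
  for i = 1, #types - 1 do
    node[types[i]] = node[types[i]] or {}
    node = node[types[i]]
  end
  node[types[#types]] = body
end

local eq = make_generic("eq?")
add_method(eq, { "number", "number" }, function(x, y) return x == y end)
add_method(eq, { "box", "box" }, function(x, y) return eq(x.value, y.value) end)

print(eq({ tag = "box", value = 1 }, { tag = "box", value = 1 }))
```

The loop and the `select` calls run on every invocation here, which is the overhead a lookup function specialised for a fixed number of arguments avoids.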
As an example of how generating the lookup function at compile time is
better for performance, consider the (cleaned up[4]) lookup function
generated for the `eq?` method defined above.
function(this, x, y)
  local method
  -- Walk the lookup tree, one level per argument type.
  if this.lookup then
    local temp1 = this.lookup[type(x)]
    if temp1 then
      method = temp1[type(y)] or nil
    else
      method = nil
    end
  end
  -- Fall back to the default method only when no specific case matched.
  if not method then
    if this.default then
      method = this.default
    else
      error("No matching method to call for...")
    end
  end
  return method(x, y)
end
`defmethod` and `defdefault` are very simple and uninteresting macros:
all they do is wrap the provided body in a lambda expression along with
the proper argument list and associate them to the correct element in
the tree.
(defmacro defmethod (name ll &body)
  `(put! ,(car name) (list :lookup ,@(map s->s (cdr name)))
     (let* [(,'myself nil)]
       (set! ,'myself (lambda ,ll ,@body))
       ,'myself)))
Switching to methods instead of a big if-else chain improved compiler performance by 12% under LuaJIT, and 2% under PUC Lua. The performance increase under LuaJIT can be attributed to the use of polymorphic inline caches to speed up dispatch, which is now just a handful of table accesses; doing the same with the if-else chain is much harder.
Defining complex multiple-dispatch methods used to be an unthinkable hassle, what with keeping track of which cases had and hadn't been defined, but they're now very simple to define: just state the number of arguments and list all possible cases.
The fact that multimethods are open means that new cases can be added on the fly, at runtime (though this is not officially supported, and we don't claim responsibility if you shoot your own foot), and that modules loaded later may improve upon the behaviour of modules loaded earlier. This means less coupling within the standard library, which has been growing to be quite large.
This change has, in my opinion, made Urn a lot more expressive as
a language, and I'd like to take a minute to point out the power of the
Lisp family in adding complicated features such as these as merely
library code: no changes were made to the compiler, apart from a tiny
one regarding environments in the REPL - previously, it'd use the
compiler's version of `pretty` even if the user had overridden it,
which wasn't a problem with the metatable approach, but definitely is
with the multimethod approach.
Of course, no solution is all good. Compiled code size has increased a fair bit, and for the Urn compiler to inline across multimethod boundaries would be incredibly difficult: these functions are essentially opaque boxes to the compiler.
Dead code elimination is harder, what with defining functions now being a side effect performed at runtime: telling which method cases are or aren't used is incredibly difficult given how dynamic the system is.
[1] Here. Do keep in mind that the implementation is quite hairy, and grew to be like that because of our lack of a standard way of making functions extensible.
[2] `%q` is the format specifier for quoted strings.
[3] Here. Do keep in mind that the above warnings apply to this one, too.
[4] The original generated code is quite similar, except the generated variable names make it a tad harder to read.