---
title: Multimethods in Urn
date: August 15, 2017
---

`multimethod`, noun. A procedure which decides runtime behaviour based
on the types of its arguments.

### Introduction

At some point, most programming language designers realise that they've
outgrown the language's original feature set and must somehow expand it.
Sometimes, this expansion is painless: for example, if the language
already had features in place to facilitate it, such as type classes or
message passing.

In our case, however, we had to decide on and implement a performant
system for extensibility in the standard library, from scratch. For
a while, Urn was using Lua's scheme for modifying the behaviour of
standard library functions: metamethods in metatables. For the
uninitiated, Lua tables can have _meta_-tables attached to modify their
behaviour with respect to several language features. As an example, the
metamethod `__add`{.lua} controls how Lua will add two tables.

However, this was not satisfactory. The most important reason is that
metamethods are associated with particular object _instances_, instead
of being associated with the _types_ themselves. This meant that all the
operations you'd like to modify had to be modified in one big go - inside
the constructor. Consider the constructor for hash-sets as it was
implemented before the addition of multimethods.

```lisp
(defun make-set (hash-function)
  (let* [(hash (or hash-function id))]
    (setmetatable
      { :tag "set"
        :hash hash
        :data {} }
      { :--pretty-print
        (lambda (x)
          (.. "«hash-set: " (concat (map pretty (set->list x)) " ") "»"))
        :--compare #| elided for brevity |# })))
```

That second table, the metatable, is entirely noise. The fact that
constructors also had to specify behaviour, instead of just data, was
annoying from a code style point of view and _terrible_ from a reuse
point of view. Behaviour is closely tied to the implementation - remember
that metamethods are tied to the _instance_. To extend the behaviour of
standard library functions (which you can't redefine) for a type you do
not control (whose constructor you also cannot override), you suddenly
need to wrap the constructor and add your own metamethods.

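
To make the reuse problem concrete outside of Urn, here is a rough Python
analogue. It is not Urn or Lua code: `make_set`, `make_comparable_set` and
the `meta` key are hypothetical names invented for illustration, with a plain
dictionary standing in for a metatable. Because behaviour is stored on each
instance by the constructor, adding an operation to a type you don't own means
wrapping its constructor, and values built by the original constructor never
see the new behaviour.

```python
def make_set(items=()):
    """'Library' constructor: bundles behaviour with the data of each instance."""
    return {
        "tag": "set",
        "data": set(items),
        # Behaviour lives on the instance, like a metatable set in the constructor.
        "meta": {
            "pretty": lambda s: "<hash-set: " + " ".join(map(str, sorted(s["data"]))) + ">"
        },
    }


def make_comparable_set(items=()):
    """'User' wrapper: the only way to add an operation is to wrap the constructor."""
    instance = make_set(items)
    instance["meta"]["compare"] = lambda a, b: a["data"] == b["data"]
    return instance


plain = make_set([1, 2])
wrapped = make_comparable_set([1, 2])

# Values built by the original constructor never gain the new behaviour.
assert "compare" not in plain["meta"]
assert "compare" in wrapped["meta"]
assert plain["meta"]["pretty"](plain) == "<hash-set: 1 2>"
```
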
### Finding a Solution

Displeased with the situation as it stood, I set out to discover what
other Lisps did, and it seemed like the consensus solution was to
implement open multimethods. And so we did.

Multimethods - or multiple dispatch in general - are one of the best
solutions to the expression problem. We can easily add new types, and
new operations to work on existing types - and most importantly, this
means touching _no_ existing code.

Our implementation is, like almost everything in Urn, a combination of
clever (ab)use of macros, tables and functions. A method is represented
as a table - more specifically, an n-ary tree of possible cases, with
a metamethod, `__call`{.lua}, which means multimethods can be called and
passed around like regular functions - they are first-class.

Upon calling a multimethod, it'll look up the correct method body to
call for the given arguments - or the default method, or throw an error,
if no default method is provided - and tail-call that, with all the
arguments.

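
The lookup described above can be sketched outside of Urn. What follows is
a minimal Python analogue, not the actual implementation: the `Multimethod`
class, its `add` method and the use of Python type names as keys are all
assumptions made for illustration, and `__call__` stands in for Lua's
`__call`{.lua} metamethod. Calling the object walks one level of the tree per
argument, falls back to the default body if there is one, and raises an error
otherwise.

```python
class Multimethod:
    """A callable dispatch tree keyed by the type name of each argument."""

    def __init__(self, name):
        self.name = name
        self.lookup = {}     # nested dicts: one level per argument
        self.default = None  # optional fallback body

    def add(self, type_names, body):
        """Register `body` for the given sequence of argument type names."""
        node = self.lookup
        for type_name in type_names[:-1]:
            node = node.setdefault(type_name, {})
        node[type_names[-1]] = body

    def __call__(self, *args):
        node = self.lookup
        for arg in args:
            # Descend one level per argument; give up once a level is missing.
            node = node.get(type(arg).__name__) if isinstance(node, dict) else None
        method = node if callable(node) else self.default
        if method is None:
            raise TypeError(f"no matching method to call for {self.name}")
        return method(*args)


eq = Multimethod("eq?")
eq.add(("int", "int"), lambda x, y: x == y)
eq.add(("str", "int"), lambda x, y: x == str(y))

assert eq(1, 1) is True
assert eq("1", 1) is True

# With no matching case and no default, the call fails...
try:
    eq(1.5, 1)
    raise AssertionError("expected a TypeError")
except TypeError:
    pass

# ...and once a default is installed, it is used instead.
eq.default = lambda x, y: False
assert eq(1.5, 1) is False
```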

Before diving into the ridiculously simple implementation, let's look at
a handful of examples.

#### Pretty printing

Pretty printing is, quite possibly, the simplest application of multiple
dispatch to extensibility. As of
[`ba829d2d`](https://gitlab.com/urn/urn/commit/ba829d2de30e3b1bef4fa1a22a5e4bbdf243426b),
the standard library implementation of `pretty` is a multimethod.


Before, the implementation[^1] would perform a series of type tests and
decide on the behaviour, including testing if the given object had
a metatable which overrides the pretty-printing behaviour.

The new implementation is _significantly_ shorter, so much so that I'm
comfortable pasting it here.

```lisp
(defgeneric pretty (x)
  "Pretty-print a value.")
```

That's it! All of the logic that used to exist is now provided by the
`defgeneric` macro, and adding support for your types is as simple as
using `defmethod`.[^2]

```lisp
(defmethod (pretty string) (x)
  (format "%q" x))
```

As another example, let's define - and assume the following are separate
modules - a new type, and add pretty printing support for that.

```lisp
; Module A - A box.
(defun box (x)
  { :tag "box"
    :value x })
```

The Urn function `type` will look for a `tag` element in tables and
report that as the type if it is present, and that function is what the
multimethod infrastructure uses to determine the correct body to call.
This means that all we need to do if we want to add support for
pretty-printing boxes is use `defmethod` again!

```lisp
(defmethod (pretty box) (x) "🎁")
```
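
For readers who don't speak Urn, the role of the `tag` field can be
approximated in Python. This is only a sketch under stated assumptions:
`type_of`, `pretty_methods` and the dictionary-based `box` are hypothetical
stand-ins, and the real `type` function certainly handles more cases than this
one does. The point is that the constructor describes data only, while the
pretty-printing case is registered separately under the tag's name.

```python
def type_of(value):
    """Return the `tag` of a tagged dict, else the Python type name."""
    if isinstance(value, dict) and "tag" in value:
        return value["tag"]
    return type(value).__name__


pretty_methods = {}  # type name -> pretty-printing body


def pretty(value):
    """Dispatch on the (possibly tag-derived) type name of `value`."""
    return pretty_methods[type_of(value)](value)


# "Module A": a constructor that only describes data.
def box(x):
    return {"tag": "box", "value": x}


# "Module B": pretty-printing support, added without touching `box`.
pretty_methods["str"] = lambda s: '"' + s + '"'
pretty_methods["box"] = lambda b: "🎁"

assert type_of(box(1)) == "box"
assert type_of({"value": 1}) == "dict"  # untagged dicts keep their own type
assert pretty(box(1)) == "🎁"
assert pretty("hi") == '"hi"'
```
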
#### Comparison

A more complicated application of multiple dispatch for extensibility is
the implementation of the `eq?` method in the standard library.

Before[^3], based on a series of conditionals, the equality test was
chosen at runtime.

Anyone with experience optimising code is wincing at the mere thought of
this code.

The new implementation of `eq?` is also comically short - a mere 2 lines
for the definition, and only a handful of lines for all the previously
existing cases.

```lisp
(defgeneric eq? (x y)
  "Compare values for equality deeply.")

(defmethod (eq? symbol symbol) (x y)
  (= (get-idx x :contents) (get-idx y :contents)))
(defmethod (eq? string symbol) (x y) (= x (get-idx y :contents)))
(defmethod (eq? symbol string) (x y) (= (get-idx x :contents) y))
```

If we wanted to add support for comparing boxes, for example, the
implementation would be similarly short.

```lisp
(defmethod (eq? box box) (x y)
  (= (.> x :value) (.> y :value)))
```

### Implementation

`defgeneric` and `defmethod` are, quite clearly, macros. However,
contrary to what one would expect, both their implementations are
_quite_ simple.

```lisp
(defmacro defgeneric (name ll &attrs)
  (let* [(this (gensym 'this))
         (method (gensym 'method))]
    `(define ,name
       ,@attrs
       (setmetatable
         { :lookup {} }
         { :__call (lambda (,this ,@ll)
                     (let* [(,method (deep-get ,this :lookup ,@(map (lambda (x)
                                                                      `(type ,x)) ll)))]
                       (unless ,method
                         (if (get-idx ,this :default)
                           (set! ,method (get-idx ,this :default))
                           (error "elided for brevity")))
                       (,method ,@ll))) }))))
```

All `defgeneric` has to do is define a top-level symbol to hold
the multimethod table, and generate, at compile time, a lookup function
specialised for the correct number of arguments. In a language without
macros, multimethod calls would have to - at runtime - loop over the
provided arguments, take their types, and access the correct elements in
the table.

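
The difference between the two approaches can be sketched in Python, with the
caveat that this is only an analogy: `generic_lookup` and
`make_specialised_lookup` are hypothetical names, and `exec` on generated
source stands in for what the Urn macro does at compile time. The first
function loops over its arguments on every call; the second builds, once,
a function whose lookups are unrolled for a fixed number of arguments. Both
return the same result for the same table.

```python
def generic_lookup(table, *args):
    """Runtime approach: loop over however many arguments were passed."""
    node = table
    for arg in args:
        if node is None:
            return None
        node = node.get(type(arg).__name__)
    return node


def make_specialised_lookup(arity):
    """Generate, once, a lookup function unrolled for exactly `arity` arguments."""
    params = [f"a{i}" for i in range(arity)]
    lines = [f"def lookup(table, {', '.join(params)}):", "    node = table"]
    for param in params:
        # One straight-line step per argument: no loop survives in the result.
        lines.append("    if node is None:")
        lines.append("        return None")
        lines.append(f"    node = node.get(type({param}).__name__)")
    lines.append("    return node")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["lookup"]


table = {"int": {"str": "int-str body"}}
lookup2 = make_specialised_lookup(2)

assert generic_lookup(table, 1, "x") == "int-str body"
assert lookup2(table, 1, "x") == "int-str body"
assert lookup2(table, "x", 1) is None  # no case registered for (str, int)
```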

As an example of how generating the lookup function at compile time is
better for performance, consider the (cleaned up[^4]) lookup function
generated for the `(eq?)` method defined above.

```lua
-- `type` here is Urn's `type`, which reports a table's `tag` if present.
function(this, x, y)
  local method
  if this.lookup then
    local temp1 = this.lookup[type(x)]
    if temp1 then
      method = temp1[type(y)] or nil
    else
      method = nil
    end
  end
  -- Fall back to the default method only when no case matched,
  -- mirroring the `unless` branch of the macro.
  if not method then
    if this.default then
      method = this.default
    else
      error("No matching method to call for...")
    end
  end
  return method(x, y)
end
```

`defmethod` and `defdefault` are very simple and uninteresting macros:
All they do is wrap the provided body in a lambda expression along with
the proper argument list and associate them to the correct element in
the tree.

```lisp
(defmacro defmethod (name ll &body)
  `(put! ,(car name) (list :lookup ,@(map s->s (cdr name)))
     (let* [(,'myself nil)]
       (set! ,'myself (lambda ,ll ,@body))
       ,'myself)))
```

### Conclusion

Switching to methods instead of a big if-else chain improved compiler
performance by 12% under LuaJIT, and 2% under PUC Lua. The performance
increase under LuaJIT can be attributed to the use of polymorphic inline
caches to speed up dispatch, which is now just a handful of table
accesses - doing it with the if-else chain is _much_ harder.

Defining complex multiple-dispatch methods used to be an unthinkable
hassle, what with keeping straight which cases had already been defined
and which hadn't, but they're now very simple to define: just state the
number of arguments and list all possible cases.

The fact that multimethods are _open_ means that new cases can be added
on the fly, at runtime (though this is not officially supported, and we
don't claim responsibility if you shoot your own foot), and that modules
loaded later may improve upon the behaviour of modules loaded earlier.
This means less coupling within the standard library, which has been
growing to be quite large.

This change has, in my opinion, made Urn a lot more expressive as
a language, and I'd like to take a minute to point out the power of the
Lisp family in adding complicated features such as these as merely
library code: no changes were made to the compiler, apart from a tiny
one regarding environments in the REPL - previously, it'd use the
compiler's version of `(pretty)` even if the user had overridden it,
which wasn't a problem with the metatable approach, but definitely is
with the multimethod approach.

Of course, no solution is all _good_. Compiled code size has increased
a fair bit, and for the Urn compiler to inline across multimethod
boundaries would be incredibly difficult - these functions are
essentially opaque boxes to the compiler.

Dead code elimination is harder, what with defining functions now being
a side-effect to be performed at runtime - telling which method cases
are or aren't used is incredibly difficult with the extent of the
dynamicity.

[^1]:
    [Here](https://gitlab.com/urn/urn/blob/e1e9777498e1a7d690e3b39c56f616501646b5da/lib/base.lisp#L243-270).
    Do keep in mind that the implementation is _quite_ hairy, and grew to be
    like that because of our lack of a standard way of making functions
    extensible.

[^2]: `%q` is the format specifier for quoted strings.

[^3]:
    [Here](https://gitlab.com/urn/urn/blob/e1e9777498e1a7d690e3b39c56f616501646b5da/lib/type.lisp#L116-1420).
    Do keep in mind that the above warnings apply to this one, too.

[^4]: [The original generated code](/static/generated_code.lua.html) is
    quite similar, except the generated variable names make it a tad harder
    to read.