An introduction to deep code-walking macros with Clojure
With the release of the core.async library the Clojure community has been exposed to the joys1 of Communicating Sequential Processes and Go-style asynchronousity 2. While this is all very exciting and new for the Clojure community at-large, a likewise interesting aspect of core.async’s release has been exposed — deep code-walking macros — a subject I will give an overview of herein.3
In chapter 17 of The Joy of Clojure we show an example of a macro called defformula
that allows you to define spreadsheet-like cells in Clojure using watchers. For that example we intentionally required that the formulas defined be preceded by a binding vector, like so:
(defformula avg [at-bats ab hits h] (float (/ @hits @at-bats)))
This allowed us to tie formula-internal variable names with existing reference types without requiring our implementation to garner the ties programmatically. The point of that section had to do with the Observer Pattern, so we didn’t want to clutter the discussion with other concerns. However, we can use the same example to illustrate deep code-walking macros (DCWM) and doing so will lead to a different implementation. For this post I’ll talk about only two types4 of DCWMs, namely: additive transformers and in-place transformers.
Deep walking for additive transforms
The defformula
macro in The Joy of Clojure was indeed an additive transformer. That is, the macro took an expression and augmented it with additional stuff, specifically calls to add-watch
. However, having the binding vector allowed us to assume reference type dereferencing forms (things like @foo
). However, if I want to infer dereferencing forms within an expression then I will need to dive into it and find them. To do that I’d first like to have a function deref-form?
that takes something and checks if it’s a dereferencing form, implemented below:
(ns formulas (:require [clojure.walk :as walk])) (defn deref-form? [form] (boolean (when (seq? form) ;; <1> (let [[op & _] form] (and op (symbol? op) (or ;; <2> (= op 'deref) (= op 'clojure.core/deref)))))))
- Only work on seqs
- Check for
deref
calls
You’ll notice that the deref-form
assumes that the form
given is a read-expanded dereferencing form rather than a reader-macro for (i.e. (deref foo)
rather than @foo
). Given that assumption, it’ll work as follows:
(deref-form? 42) ;;=> false (deref-form? [deref :foo]) ;;=> false (deref-form? (quote (deref foo))) ;;=> true (deref-form? `(deref foo)) ;;=> true
Seems right. Now that I can identify a dereferencing form I’d like to dive into an expression and gather up all of the names that serve as the target in said dereferencing forms. The function to do this very task named find-derefs
is implemented below:
(defn find-derefs [expr] (if (coll? expr) (->> expr walk/macroexpand-all ;; <1> (tree-seq coll? seq) ;; <2> (filter deref-form?) ;; <3> (map second) ;; <4> set) ;; <5> #{}))
- Expand all sub-forms
- Build a seq of all composite forms
- Take only the
deref
forms - Take only their operands
- Make a set out of them
You’ll notice that with the use of walk/macroexpand-all
I can ensure that any nested @foo
s are turned into the equivalent (clojure.core/deref foo)
calls. This lets me build up a set of the embedded dereferenced targets in expressions:
(find-derefs :a) ;;=> #{} (find-derefs 42) ;;=> #{} (find-derefs []) ;;=> #{} (find-derefs '(inc @c1)) ;;=> #{c1} (find-derefs '[{:foo @bar, :baz [@quux]}]) ;;=> #{quux bar}
The find-derefs
function basically builds the information that the binding form in JoC provided explicitly. Now that I can get at the names embedded in an expression, I can now implement a macro to build spreadsheet-like formulas whose values are dependent on the embedded references within:
(defmacro formula [expr] `(let [formula# (agent ~expr) ;; <1> update-fn# (fn [key# ref# o# n#] ;; <2> (send formula# (fn [_#] ~expr)))] (doseq [r# ~(find-derefs expr)] ;; <3> (add-watch r# ;; <4> :update-formula update-fn#)) formula#))
- Set the Agent’s initial value based on result of the formula
- Build a function to update the Agent with the new formula calculation
- For every internal dereference in
expr
… - … build a call setting a watcher that updates the formula value on change
The body of a formula
is just an expression that depends on zero or more embedded dereferences. Not only does formula
deeply walk the expression to find the dereferences, but it also augments a normal call to agent
(holding the result of the formula) with calls to add-watch
on the embedded (i.e. its dependents) reference types that automatically update the formula
‘s value whenever any of its dependents change. That is, rather than just define an Agent that holds the result of some expression like (agent (float (/ @hits @at-bats)))
the formula
macro builds something like the following:
(let* [formula (agent (float (/ (deref hits) (deref at-bats)))) update-fn (fn [key ref o n] (send formula (fn [_] (float (/ (deref hits) (deref at-bats))))))] (doseq [r #{hits at-bats}] (add-watch r :update-formula update-fn)) formula)
That’s quite a transformation! Because of this transformation formula
allows you to do the following:
(def hits (ref 25)) (def at-bats (ref 100)) (def avg (formula (float (/ @hits @at-bats)))) @avg ;;=> 0.25
As shown, the avg
formula holds the result of the division of the numbers stored in the Refs hits
and at-bats
. What is so cool about DCWMs is that they allow me to define rich semantics for the simple formula
expression and augment it with spreadsheet-like capabilities:
(dosync (alter at-bats inc) (alter hits inc)) @avg ;;=> 0.25742576
As shown, changing the values stored in the Refs automatically updates the value stored in the formula avg
. This is cool stuff, but that’s not all as I’ll show below.
Deep walking for in-place transforms
Not only can I change the semantics of an expression using a DCWM, but I can also completely transform the expression as it’s traversed. Assume that for whatever reason I wanted to not deal directly with the embedded references inside of a formula. Instead, I want to refer to them through the Vars that hold them. That is, instead of doing this:
(float (/ (clojure.core/deref hits) (clojure.core/deref at-bats)))
I want to do this:
(float (/ @@#'hits @@#'at-bats))
What that odd looking carrier-lost-like menagerie means is:
@ @ #'hits deref deref get the Var holding the reference
Or:
- Get the Var
- Dereference it
- Then dereference whatever it held
As an intermediate step toward transforming dereferencing calls into this indirect Var-lookup, I’ll use the clojure.walk/postwalk-replace
function to insert some magic, like so:
(defn via-magic [expr] (walk/postwalk-replace ;; <1> '{clojure.core/deref magic ;; <2> deref magic} expr))
You can see what via-magic
does below:
(via-magic (quote (float (/ @hits @at-bats)))) ;;=> (float (/ (magic hits) (magic at-bats)))
So it just replaces the symbol deref
with the symbol magic
— nothing drastic. The cool part is that I can implement magic
as a macro that itself transforms a symbolic name5 into a series of calls like the one shown before:
(defmacro magic [ref-name] `(let [v# (resolve (quote ~ref-name))] ;; <1> (deref (deref v#)))) ;; <2>
- Look up the Var by symbol
- Dereference the Var first, then dereference what it contained
The transformation above ensures that any time a dereference occurs it happens through the Var rather than directly. You’ll notice that if I use magic
directly I’ll get the same answer:
(magic at-bats) ;;=> 101
Because magic
will eventually get me the value stored in the Var-held reference type, I can interpose its use within a new version of the formula
macro at the right place:
(defmacro formula [expr] `(let [formula# (agent ~expr) ;; <1> update-fn# (fn [key# ref# o# n#] ;; <2> (send formula# (fn [_#] ~(via-magic expr))))] ;; <3> (doseq [r# ~(find-derefs expr)] (add-watch r# ;; <4> :update-formula update-fn#)) formula#))
- Assume Vars exist for the initial value, so don’t use
magic
- Build the updating watch function like before, except…
- … use
magic
to look up through the Var - Again add watches to all of the dereferencing sub-forms
You’ll notice that I only chose to lookup through the Vars in one place, namely, the point of update of the formula
itself. There would be problems if I re-define either of the formula
‘s dependencies. That is, the update will break because the new embedded reference will not have the needed watchers. In any case, the new implementation of formula
works like the old:
(def hits (ref 25)) (def at-bats (ref 100)) (def avg (formula (float (/ @hits @at-bats)))) @avg ;;=> 0.25 (dosync (alter hits #(+ 10 %)) @avg ;;=> 0.35
The new formula
macro, in addition to walking the expression to look for dereferncing forms, changed the call-sites of the embedded deref
s at the point of update only. Specifically it generated something like the following code:
(let* [formula (agent (float (/ (deref hits) ;; <1> (deref at-bats)))) update-fn (fn [key ref o n] (send formula (fn [_] (float (/ (let* [v (resolve 'hits)] @@v) ;; <2> (let* [v (resolve 'at-bats)] @@v))))))] (doseq [r #{hits at-bats}] ;; <3> (add-watch r :update-formula update-fn)) formula)
- Non-
magic
expansion - Using
magic
the Vars are looked up at the time of the formula’s update - Assume that the Var will never be re-bound
I could make formula
even more complicated by ensuring that even on a Var re-define the formula updates and the proper watches are applied to the new reference type, but I leave that as an exercise to the reader.6
I could probably go on7 and on about deep code-walking macros, but I hope that this introduction gives an idea not only why they’re so powerful, but also why people (myself included) in the Clojure community are so excited by this technique.
Happy hacking!
:F
-
I’ve missed writing blog posts about Clojure. It feels good y’all. ↩
-
And the misconceptions of asynchronous APIs. ↩
-
If you’ve ever watched my Macronomicon talk and wondered what the heck my koan in the beginning meant then this post is a hint. ↩
-
The amazing book Let Over Lambda talks about these types of DCWMs in depth, and much much more (although he doesn’t use the same terminology). ↩
-
A Common Lisp-like symbol-macrolet would be nice for this purpose too. The Clojure implementation of this feature is a master-class in DCWMs. ↩
-
Javelin is a nice ClojureScript library that takes this idea to the nth degree. ↩
-
Maybe one day I’ll write a book about this stuff. How does The Macronomicon sound as a title? I actually started writing that book but lost interest after a while because as it turned out I needed to learn a lot more. Macros are funny. Just when you think you know all about them, you learn about some other interesting aspect of them right out of the blue. It’s crazy. So as it stands I’ve decided to hold off starting such a book because, well, because I’ve just not built an idea in my head about what it should be. This is partly because I need to learn a lot more and study a crap-ton of existing code. One day. Maybe. ↩
7 Comments, Comment or Ping
Sam Aaron
Awesome post, thanks :-) I’ll reserve a space on my bookshelf for a shiny copy of the Macronomicon when it’s out…
Just one questions though – is tree walking always safe over Clojure forms? Are there any edge cases or gotchas you need to look out for?
Jul 17th, 2013
fogus
@Sam
I use
macroexpand-all
in my code and it works fine for my purposes, but in some circumstances it might not produce the same form that Clojure will during its normal read-expand-compile-eval-print cycle. The sad part is that I forget the ways that it fails. :-( The use ofclojure.walk
should work, although I’ve not used it as extensively as I could have. Maybe someone else reading this can shed some light on these two aspect.Jul 17th, 2013
Timothy Baldridge
One of the hardest parts I’ve found (when writing the DCWM of core.async) is when you want to override existing clojure macros. For instance, case is a macro who’s expansion is completely useless to core.async. Therefore we can’t just do macroexpand-all, instead we have to recursively macroexpand-1 and check to make sure we want don’t want to override the macro at each step. This ends up being a bit harder than it sounds.
Jul 17th, 2013
Patrick Logan
macrolet is useful in common lisp for deep code-walking macros. macrolet essentially defines lexically scoped macros to be expanded during expansion of some other macro.
Jul 17th, 2013
Ben Mabey
Great post, thanks for the clear explanation.
BTW Patrick, Clojure’s tools.macro provides a macrolet: https://github.com/clojure/tools.macro#macrolet-local-macro-definitions I had seen macrolet before but never knew when it would be useful, but after reading this post I can already see of how useful it could be in cleaning up some macros.
Jul 19th, 2013
pozorvlak
Start writing the book; in the course of writing it you’ll learn enough to write it :-)
Jul 20th, 2013
Alex Baranosky
Excellent post. You don’t see a ton published on this subect. Thanks.
Aug 2nd, 2013
Reply to “An introduction to deep code-walking macros with Clojure”