read


read
or learn more

An introduction to deep code-walking macros with Clojure

Jul 17, 2013

With the release of the core.async library the Clojure community has been exposed to the joys1 of Communicating Sequential Processes and Go-style asynchronousity 2. While this is all very exciting and new for the Clojure community at-large, a likewise interesting aspect of core.async’s release has been exposed — deep code-walking macros — a subject I will give an overview of herein.3

In chapter 17 of The Joy of Clojure we show an example of a macro called defformula that allows you to define spreadsheet-like cells in Clojure using watchers. For that example we intentionally required that the formulas defined be preceded by a binding vector, like so:

(defformula avg [at-bats ab hits h]
  (float (/ @hits @at-bats)))

This allowed us to tie formula-internal variable names with existing reference types without requiring our implementation to garner the ties programmatically. The point of that section had to do with the Observer Pattern, so we didn’t want to clutter the discussion with other concerns. However, we can use the same example to illustrate deep code-walking macros (DCWM) and doing so will lead to a different implementation. For this post I’ll talk about only two types4 of DCWMs, namely: additive transformers and in-place transformers.

Deep walking for additive transforms

The defformula macro in The Joy of Clojure was indeed an additive transformer. That is, the macro took an expression and augmented it with additional stuff, specifically calls to add-watch. However, having the binding vector allowed us to assume reference type dereferencing forms (things like @foo). However, if I want to infer dereferencing forms within an expression then I will need to dive into it and find them. To do that I’d first like to have a function deref-form? that takes something and checks if it’s a dereferencing form, implemented below:

(ns formulas 
  (:require [clojure.walk :as walk]))

(defn deref-form? [form]
  (boolean
    (when (seq? form)       ;; <1>                                                                                       
      (let [[op & _] form]
        (and
         op
         (symbol? op)
         (or                 ;; <2>                                                                                      
          (= op 'deref)
          (= op 'clojure.core/deref)))))))
  1. Only work on seqs
  2. Check for deref calls

You’ll notice that the deref-form assumes that the form given is a read-expanded dereferencing form rather than a reader-macro for (i.e. (deref foo) rather than @foo). Given that assumption, it’ll work as follows:

(deref-form? 42)
;;=> false

(deref-form? [deref :foo])
;;=> false

(deref-form? (quote (deref foo)))
;;=> true

(deref-form? `(deref foo))
;;=> true

Seems right. Now that I can identify a dereferencing form I’d like to dive into an expression and gather up all of the names that serve as the target in said dereferencing forms. The function to do this very task named find-derefs is implemented below:

(defn find-derefs [expr]
  (if (coll? expr)
    (->> expr
         walk/macroexpand-all  ;; <1>                                                                                  
         (tree-seq coll? seq)  ;; <2>                                                                                  
         (filter deref-form?)  ;; <3>                                                                                  
         (map second)          ;; <4>                                                                                  
         set)                  ;; <5>                                                                                  
    #{}))
  1. Expand all sub-forms
  2. Build a seq of all composite forms
  3. Take only the deref forms
  4. Take only their operands
  5. Make a set out of them

You’ll notice that with the use of walk/macroexpand-all I can ensure that any nested @foos are turned into the equivalent (clojure.core/deref foo) calls. This lets me build up a set of the embedded dereferenced targets in expressions:

(find-derefs :a)
;;=> #{}

(find-derefs 42)
;;=> #{}

(find-derefs [])
;;=> #{}

(find-derefs '(inc @c1))
;;=> #{c1}

(find-derefs '[{:foo @bar, :baz [@quux]}])
;;=> #{quux bar}

The find-derefs function basically builds the information that the binding form in JoC provided explicitly. Now that I can get at the names embedded in an expression, I can now implement a macro to build spreadsheet-like formulas whose values are dependent on the embedded references within:

(defmacro formula [expr]
  `(let [formula#   (agent ~expr)               ;; <1>                                                               
         update-fn# (fn [key# ref# o# n#]       ;; <2>                                                               
                      (send formula#
                            (fn [_#] ~expr)))]
     (doseq [r# ~(find-derefs expr)]            ;; <3>
       (add-watch r#                            ;; <4>
         :update-formula
         update-fn#))

     formula#))
  1. Set the Agent’s initial value based on result of the formula
  2. Build a function to update the Agent with the new formula calculation
  3. For every internal dereference in expr
  4. … build a call setting a watcher that updates the formula value on change

The body of a formula is just an expression that depends on zero or more embedded dereferences. Not only does formula deeply walk the expression to find the dereferences, but it also augments a normal call to agent (holding the result of the formula) with calls to add-watch on the embedded (i.e. its dependents) reference types that automatically update the formula‘s value whenever any of its dependents change. That is, rather than just define an Agent that holds the result of some expression like (agent (float (/ @hits @at-bats))) the formula macro builds something like the following:

(let* [formula (agent (float (/ (deref hits) 
                                (deref at-bats))))
       update-fn (fn [key ref o n]
                   (send formula
                     (fn [_]
                       (float (/ (deref hits) 
                                 (deref at-bats))))))]
   (doseq [r #{hits at-bats}]
     (add-watch r :update-formula update-fn))

   formula)

That’s quite a transformation! Because of this transformation formula allows you to do the following:

(def hits (ref 25))
(def at-bats (ref 100))

(def avg (formula (float (/ @hits @at-bats))))

@avg
;;=> 0.25

As shown, the avg formula holds the result of the division of the numbers stored in the Refs hits and at-bats. What is so cool about DCWMs is that they allow me to define rich semantics for the simple formula expression and augment it with spreadsheet-like capabilities:

(dosync
  (alter at-bats inc)
  (alter hits inc))

@avg
;;=> 0.25742576

As shown, changing the values stored in the Refs automatically updates the value stored in the formula avg. This is cool stuff, but that’s not all as I’ll show below.

Deep walking for in-place transforms

Not only can I change the semantics of an expression using a DCWM, but I can also completely transform the expression as it’s traversed. Assume that for whatever reason I wanted to not deal directly with the embedded references inside of a formula. Instead, I want to refer to them through the Vars that hold them. That is, instead of doing this:

(float (/ (clojure.core/deref hits) 
          (clojure.core/deref at-bats)))

I want to do this:

(float (/ @@#'hits @@#'at-bats))

What that odd looking carrier-lost-like menagerie means is:

@      @      #'hits
deref  deref  get the Var holding the reference

Or:

  • Get the Var
  • Dereference it
  • Then dereference whatever it held

As an intermediate step toward transforming dereferencing calls into this indirect Var-lookup, I’ll use the clojure.walk/postwalk-replace function to insert some magic, like so:

(defn via-magic [expr]
  (walk/postwalk-replace        ;; <1>
    '{clojure.core/deref magic  ;; <2>
      deref magic}
    expr))

You can see what via-magic does below:

(via-magic (quote (float (/ @hits @at-bats))))

;;=> (float (/ (magic hits) (magic at-bats)))

So it just replaces the symbol deref with the symbol magic — nothing drastic. The cool part is that I can implement magic as a macro that itself transforms a symbolic name5 into a series of calls like the one shown before:

(defmacro magic [ref-name]
  `(let [v# (resolve (quote ~ref-name))] ;; <1>
     (deref (deref v#))))                ;; <2>
  1. Look up the Var by symbol
  2. Dereference the Var first, then dereference what it contained

The transformation above ensures that any time a dereference occurs it happens through the Var rather than directly. You’ll notice that if I use magic directly I’ll get the same answer:

(magic at-bats)
;;=> 101

Because magic will eventually get me the value stored in the Var-held reference type, I can interpose its use within a new version of the formula macro at the right place:

(defmacro formula [expr]
  `(let [formula#   (agent ~expr)          ;; <1>
         update-fn# 
          (fn [key# ref# o# n#]            ;; <2>
            (send formula#
                  (fn [_#]
                    ~(via-magic expr))))]  ;; <3>
     (doseq [r# ~(find-derefs expr)]
       (add-watch r#                       ;; <4>
                  :update-formula
                  update-fn#))

     formula#))
  1. Assume Vars exist for the initial value, so don’t use magic
  2. Build the updating watch function like before, except…
  3. … use magic to look up through the Var
  4. Again add watches to all of the dereferencing sub-forms

You’ll notice that I only chose to lookup through the Vars in one place, namely, the point of update of the formula itself. There would be problems if I re-define either of the formula‘s dependencies. That is, the update will break because the new embedded reference will not have the needed watchers. In any case, the new implementation of formula works like the old:

(def hits (ref 25))
(def at-bats (ref 100))

(def avg (formula (float (/ @hits @at-bats))))

@avg
;;=> 0.25

(dosync (alter hits #(+ 10 %))

@avg
;;=> 0.35

The new formula macro, in addition to walking the expression to look for dereferncing forms, changed the call-sites of the embedded derefs at the point of update only. Specifically it generated something like the following code:

(let* [formula 
        (agent (float (/ (deref hits)        ;; <1>
                         (deref at-bats))))
       update-fn 
        (fn [key ref o n]
          (send formula
                (fn [_]
                  (float 
                    (/ (let* [v (resolve 'hits)] 
                         @@v)                ;; <2>
                       (let* [v (resolve 'at-bats)] 
                         @@v))))))]
   (doseq [r #{hits at-bats}]                ;; <3>         
     (add-watch r :update-formula update-fn))

   formula)
  1. Non-magic expansion
  2. Using magic the Vars are looked up at the time of the formula’s update
  3. Assume that the Var will never be re-bound

I could make formula even more complicated by ensuring that even on a Var re-define the formula updates and the proper watches are applied to the new reference type, but I leave that as an exercise to the reader.6

I could probably go on7 and on about deep code-walking macros, but I hope that this introduction gives an idea not only why they’re so powerful, but also why people (myself included) in the Clojure community are so excited by this technique.

Happy hacking!

:F


  1. I’ve missed writing blog posts about Clojure. It feels good y’all. 

  2. And the misconceptions of asynchronous APIs

  3. If you’ve ever watched my Macronomicon talk and wondered what the heck my koan in the beginning meant then this post is a hint. 

  4. The amazing book Let Over Lambda talks about these types of DCWMs in depth, and much much more (although he doesn’t use the same terminology). 

  5. A Common Lisp-like symbol-macrolet would be nice for this purpose too. The Clojure implementation of this feature is a master-class in DCWMs. 

  6. Javelin is a nice ClojureScript library that takes this idea to the nth degree. 

  7. Maybe one day I’ll write a book about this stuff. How does The Macronomicon sound as a title? I actually started writing that book but lost interest after a while because as it turned out I needed to learn a lot more. Macros are funny. Just when you think you know all about them, you learn about some other interesting aspect of them right out of the blue. It’s crazy. So as it stands I’ve decided to hold off starting such a book because, well, because I’ve just not built an idea in my head about what it should be. This is partly because I need to learn a lot more and study a crap-ton of existing code. One day. Maybe. 

7 Comments, Comment or Ping

  1. Awesome post, thanks :-) I’ll reserve a space on my bookshelf for a shiny copy of the Macronomicon when it’s out…

    Just one questions though – is tree walking always safe over Clojure forms? Are there any edge cases or gotchas you need to look out for?

  2. @Sam

    I use macroexpand-all in my code and it works fine for my purposes, but in some circumstances it might not produce the same form that Clojure will during its normal read-expand-compile-eval-print cycle. The sad part is that I forget the ways that it fails. :-( The use of clojure.walk should work, although I’ve not used it as extensively as I could have. Maybe someone else reading this can shed some light on these two aspect.

  3. Timothy Baldridge

    One of the hardest parts I’ve found (when writing the DCWM of core.async) is when you want to override existing clojure macros. For instance, case is a macro who’s expansion is completely useless to core.async. Therefore we can’t just do macroexpand-all, instead we have to recursively macroexpand-1 and check to make sure we want don’t want to override the macro at each step. This ends up being a bit harder than it sounds.

  4. Patrick Logan

    macrolet is useful in common lisp for deep code-walking macros. macrolet essentially defines lexically scoped macros to be expanded during expansion of some other macro.

  5. Great post, thanks for the clear explanation.

    BTW Patrick, Clojure’s tools.macro provides a macrolet: https://github.com/clojure/tools.macro#macrolet-local-macro-definitions I had seen macrolet before but never knew when it would be useful, but after reading this post I can already see of how useful it could be in cleaning up some macros.

  6. Start writing the book; in the course of writing it you’ll learn enough to write it :-)

  7. Alex Baranosky

    Excellent post. You don’t see a ton published on this subect. Thanks.

Reply to “An introduction to deep code-walking macros with Clojure”