Metalinguistic Abstraction

Computer Languages, Programming, and Free Software

Archive for the ‘mathematics’ Category

Monads for Schemers/Lispers

with 4 comments

I was originally writing a much more ambitious post that tried to introduce category theory and its uses, but have been having a hard time writing it. Unfortunately it is not something that I think can be easily explained tersely, although my attempts to do so have lead me to learn a lot more about category theory than I thought I’d ever want to know. Yet, I would still like to expose monads in a way that Schemers and Lispers would relate to them and perhaps grok the potential usefulness of the concept, which at times may seem to be a strange dodge performed by those “pure” functional people. I am forgoing some of the background theory I learned in exchange for trying to highlight some of the key ideas as applied to writing programs by example and narrative.

The only obvious prerequisite knowledge here is that of higher order functions and using functions as data. These concepts are well-covered and taught by a variety of computer science texts, such as the venerable SICP, webcast university lectures, and many other mediums. I assert that this is an important prerequisite because monads provide you is a rigorous and structured abstraction for dealing with data transformations and the application of higher order functions.


Consider the “Maybe” monad that is often used as a primer. We’re going to offer a quick rough translation into Scheme. The semantics we’re interested in of the Maybe monad are as follows:

  • Any function should be able to be applied to the monad
    • Not necessarily successfully such as in the case of a type error, although with a little more work we could get these semantics as well via dynamic-wind or unwind-protect.
  • The monad will prevent us from operating on the value NIL, and instead simply ignore computations that would attempt to operate on NIL.
    • This is similar to how computer architecture will propagate NaNs in computations.

We only need three tools to get this behavior:

  1. A procedure to construct the monad from some object in the category of Scheme values
  2. A procedure to transform functions that would normally operate on those values into a function that will operate on the monad value
  3. A procedure to merge double-wrapped “Maybe” monads to normalize them

Without further ado, here are some function definitions:

(define (make-maybe value)

(define (map-function-to-maybe fn)
  (lambda (maybe-object)
    (if (null? maybe-object)
        (make-maybe (fn maybe-object)))))

(define (join-maybe maybe-object)

This may seem incredibly pointless, but consider this motivating example:

> (define maybe-cdr (map-function-to-maybe cdr))
> (maybe-cdr (make-maybe '(1 2)))
> (maybe-cdr (maybe-cdr (make-maybe '(1 2))))
> (maybe-cdr (maybe-cdr (maybe-cdr (maybe-cdr (make-maybe '(1 2))))))

As you can see you would have normally had to have check for NIL values to prevent crashing in the last expression, but we now have a generic way to instrument functions with checks. The result is a function that accepts and returns the monad’s type. This is the essence of Functors (of which monads are a special type of) to take home: Functors are morphisms between categories, categories themselves contain objects and morphisms, and in order to a be a functor we must be able to transform both the objects in a category and the morphisms between them. There are few additional properties and rules to be considered a functor or monad that can be useful in proving correctness or getting certain guaranteed behavior, but the main take-away idea is writing a uniform “layer” that will map both functions and the data they operate on to another “type” with different semantics. Monads also have to obey rules for compositions of “join” and “map” to maintain invariants, but it is sometimes okay to bend these rules: there are many not-quite-monad functors that can prove very useful. Consider the monad and the functor as guiding metaphors.

Moving on: Why do we bother at all with make-maybe and join-maybe? It so happens that the underlying representation is simple in this case (since we only need to tweak functions) and thus the construction and join operators are trivial. This would not be the case in say, Haskell, where the typing system will prevent happy accidents like this one; (make-maybe (make-maybe ‘(7))) would actually have a double-wrapped Maybe-type and “join” would have to do some thinking on how to sensibly put the nested Maybes together. (This is also called “multiplication” in category theory parlance)

Also notice there is no way to “get out of” the monad. Getting “out of” the monad type is an additional functionality that goes above and beyond what it means to be a monad; it would be a monad with an additional transformation to whatever is seen to be fit. However, once again, dynamic typing provides us a happy accident in this particular case and we can actually use this particular monad’s representation directly in our programs. Generally this is not the case and probably should not be encouraged: one should think carefully before breaking into a monad to deal with its representation ad-hoc, otherwise we lose some benefits of containment.

List, the monad you’ve used before

Let’s just rephrase “list” as a monad here:

(define (make-list value)
  (list value))

(define (map-function-to-list fn)
  (lambda (list-object)
    ;; "Map" has the proper return type "for free"
    (map fn list-object)))

(define (join-list value)
  (apply append value))

We have now defined a monad. Again, it may seem stupidly trivial, but consider what it has in common with “Maybe”:

  • We can accept anything and create a list monad instance
  • We have some transformation that modifies a function to operate on list-monads instances.
  • We can get rid of monad layers
    • Notice that this definition maintains composition, a useful property of monads. This means you can perform a map of a join and then another join on the result or simply perform two joins to get the same value.
    • This presumes the lists are well-formed, i.e. have no non-lists as elements when join is called. Otherwise we rightly should have a type error because you are applying “join” to a non-list, although you could imagine writing a more forgiving implementation with Maybe-like qualities, or force all internal values to be singleton lists rather than scalars. Such is the flexibility of thinking in terms of this abstraction.
  • We have no way of getting anything out of the monad, and this time it’s more visible: we can never go from ‘(1) -> 1, unless we provide another transformation, such as “car”, which would have the type “List-Monad -> *”.

Something less contrived, a “Watchful” monad

The final example I will be presenting here is not-quite useful, but pretty close. It is not complicated, but it is non-trivial. We seek to accomplish the simple task of letting us hook into the application of a function to a value, useful in debugging or notifying other components of a system of changes. This is often a task given over to mutation because it’s a pain to pass around this state all the time. Here is an attempt to try to give an alternative to mutating global state while allowing arbitrary procedures to be notified of when a value is being operated on by any function that has been transformed by our “map-function-to-watchful” procedure. I implement some simple functionality using this monad to attach arbitrary functions (called “snoopers”) that are allowed to look at the value being passed to a function being applied to the monad and their own previous state.

In this section I will first show how one can use define and use the snoopers, how using them appears in an interpreter, and then finally the monad implementation itself.

Using the Watchful monad

Here is how one would define some “snoopers” to watch a value, along with defining a watched-value instantiation for the number zero.

;; Some watchers
(define (modification-watcher state thing)
  ;;Counts the number of times the value has functions applied to it.
  ;;Notice that we do not use "thing," but we must accept it to have the
  ;;proper number of arguments
  (if (null? state)
      (+ state 1)))

(define (previous-values-watcher state thing)
  ;; Retains previous values in the stat
  (cons thing state))

;;Using watchers
;;Add both watchers, notice how each add-watcher call returns a monad in turn.
(define zero-being-watched (add-watcher previous-values-watcher
                                        (add-watcher modification-watcher
                                                     (make-watchful 0))))

;; Let's transform a function...say an increment function
(define watcher-incr (map-function-to-watchful
                      (lambda (x) (+ x 1))))

Showing use of the watched-value and the watch-increment function in the interpreter

Here’s an example interaction:

> zero-being-watched
(((#<procedure:previous-values-watcher> ()) (#<procedure:modification-watcher> ())) . 0) 0)
> (watcher-incr zero-being-watched)
(((#<procedure:previous-values-watcher> (0)) (#<procedure:modification-watcher> 1)) . 1)
> (watcher-incr (watcher-incr zero-being-watched))
(((#<procedure:previous-values-watcher> (1 0)) (#<procedure:modification-watcher> 2)) . 2)
> (watcher-incr (watcher-incr (watcher-incr zero-being-watched)))
(((#<procedure:previous-values-watcher> (2 1 0)) (#<procedure:modification-watcher> 3)) . 3)

As you may be able to tell from the above, the internal representation format of the monad is an association list paired with the watched value. That association list contains procedures and state that they are allowed to store things in (you could do an even cleaner job with continuations, but then they wouldn’t print as nicely), so right now all state-saving has to be done explicitly.

If this example were to be used for anything serious one would have to fix up a couple of things, but it’s pretty functional as-is (And purely functional in the other meaning of the word). The following definition is not as long as it looks; there are lots of comments and some stuff (like filter and the merge-and-remove-duplicates for “join”) that eat up lines without being intrinsically related to monads. There is no subtle twist; it’s exactly the same as the previous two examples (make, map, and join) with an additional function (add-watcher) that operates on the monad directly to add snoopers. Notice that functions passed through map-function-to-watchful are instrumented to call all the snoopers and construct a new list full of the procedure reference and the new state each snooper returns for its next invocation.

One final thing deserving explanation is that my add-watcher reuses the join operator: basically, I construct a new monad with the input monad as the wrapped value, then run join. This is so that I can get the duplication elimination and state list construction for free. It certainly makes the definition of add-watcher short. (Despite its relatively huge commenting)

The watchful monad implementation

;;Watchful Monad
(define (make-watchful value)
  ;; the car of the monad is a list of snoopers, the cdr is the value being watched
  ;; Each snooper is in the form (function state), where state is returned
  ;; by the snooper after every invocation so it can store things as it chooses.
  ;; The snooper function should accept two args: state and data, so that they
  ;; can have some memory and report on the data being operated on by the
  ;; watched procedure.
  (cons '() value))

(define (map-function-to-watchful fn)
  (lambda (watchful-object)
    (let* ((snoopers (car watchful-object))
           (arg-data (cdr watchful-object)))
      ;; Compute new snooper state, now they all know what "fn" is operating on.
      (cons (map (lambda (snooper)
                   (let ((snooper-fn (car snooper))
                         (snooper-state (cadr snooper)))
                     (list snooper-fn (snooper-fn snooper-state arg-data))))
            (fn arg-data)))))

(define (join-watchful value)
  (letrec ((inner-snoopers (car (cdr value)))
           (outer-snoopers (car value))
           (inner-value (cddr value))
           ;; The standard "filter" we all know and love, but not included in
           ;; R5RS
           (filter (lambda (predicate seq)
                     (if (null? seq)
                         (if (predicate (car seq))
                             (cons (car seq) (filter predicate (cdr seq)))
                             (filter predicate (cdr seq)))))))
    ;; Merge the outer-snoopers and inner-snoopers, outer-snoopers win in event
    ;; of a collision, which in this case means the same procedure with the same
    ;; state.
    (cons (append outer-snoopers (filter (lambda (inner-snooper)
                                           (not (member inner-snooper outer-snoopers)))

;; functions that operate on the watcher-monad directly (e.g. Watcher-Monad -> Watcher-Monad)
(define (add-watcher watcher-fn watcher-object)
  ;;watcher-fn must accept two args: state and the value in the monad when a
  ;;function is called on that value and return a new state. We abuse the
  ;;join-watchful function I wrote by first encapsulating the watcher-object
  ;;as a value inside another watcher-object and then employing "join-watchful"

   ;; We make a singleton assoc-list with watcher-fn with state NIL to start.
   ;; We know our monad internally is just a cons pair, for brevity break the
   ;; abstraction here...
   (cons `((,watcher-fn ()))


Hopefully these examples shed more light on monads and their uses for those who come from the Lisp-ish backgrounds, or dynamic typing backgrounds in general. I often found the Haskell-annotated versions caught up in typing which sometimes made it more difficult than absolutely necessary to explain the generalities the idea and motivation for monads with languages where mutation is more convenient and typing is more relaxed. As such, I wrote this for the programmer already adept in using higher order functions and thinking of functions as data yet unsure of what the fuss is about monads and how they can be useful.

Written by fdr

July 21, 2007 at 5:37 am

Posted in lisp, mathematics, theory

SAGE Computer Algebra System

leave a comment »

While skimming the RSS feed for Lambda the Ultimate I spied an entry about a functional definition of TeX’s formula typesetting algorithm. For me the real gem here was not some insight about TeX, but a solitary (at the time of this writing) comment about the CAS SAGE (and its integration with jsMath). SAGE is pretty interesting, but we’ll get to why it’s interesting in a moment after discussing some other well known CAS and scientific computation programs.

I have never understood some people’s obsession with Matlab, Mathematica, and similar tools. I mean, I kind of do: they are allegedly well designed tools, the former providing well-optimized versions of just about every well known numerical recipe you want. Matlab was popular with the EE guys in college for plotting and numerical methods, and I know that the Mathematica guys can do some impressive symbolic manipulation and interactive visualization with relatively minimal effort…

But something feels wrong. It’s not just a matter of the proprietary nature of these tools, although that and the high price can sometimes be most…dissuading. Especially when the pricing information is hidden after a login screen, or one of the first calls to action is to request a quote. One knows that they are not in for a good time under these circumstances, generally speaking. But projects like GNU Octave also set off my radar, even though it has low cost. Why?

The real answer is the “tool among tools[0]” philosophy. This is also possibly the main reason why Lisp lost out to C and UNIX, as well as the relative obscurity of Smalltalk, despite its alleged brilliance (it is not something I have used myself). Part of this is cultural (GNU Octave: a scientific computation system mostly compatible with MATLAB, nowhere in the mission is high levels of integration) and part can be more sinister motivation (MATLAB has a negative incentive to ease transition to Mathematica and vice versa). Mathworks and Wolfram are both given incentive to instead make their own platform as attractive as possible and trying to one-up everyone else instead of wasting time making sure that everyone can move their codes around: the latter is just not good business.

Enter SAGE. This package is shamelessly and unabashedly about integration, much in the Python philosophy. In fact it’s written in Python with liberal FFI bindings using the most excellent Pyrex extension language and compiler. It has unidirectional bindings to an impressive number of CAS systems (including many I have never heard of) with some basic parsing/data type normalization. And, most importantly, it avoids the problem of getting yourself locked into a particular CAS with no hope for escape[1].

For me personally it may be useful for plotting; I still typically write my actual math codes in C (maybe Lisp for prototyping, but that’s another post). I will have to take a look. I hope that in the future that they may gain some interoperability with statistical tools such as SAS and R (which already has a Python integration package that, alas, only works on Python 2.4), as I think this would only increase the leverage of this tool tremendously. One hopes that the proprietary vendors won’t take offense and start trying to break things, ala Microsoft. If things get even better then maybe they’ll pay due notice and at least try to avoid breaking things all the time. At best, they’ll start to feature SAGE (or similar system) interoperability as a feature, and the world may be better off.

[0]: I first heard this phrasing from Guido van Rossum when he gave a talk at Berkeley.
[1]: As detailed by another blogger, who also gives a different take and overview of SAGE. Actually, his is a lot more detailed in just about every respect, so I’d highly recommend reading this if you are seriously interested.

Additional Resources:

Written by fdr

June 28, 2007 at 1:51 am

Posted in mathematics