Metalinguistic Abstraction

Computer Languages, Programming, and Free Software

Monads for Schemers/Lispers

with 4 comments

I was originally writing a much more ambitious post that tried to introduce category theory and its uses, but I have been having a hard time writing it. Unfortunately it is not something that I think can be explained tersely, although my attempts to do so have led me to learn a lot more about category theory than I thought I’d ever want to know. Still, I would like to present monads in a way that Schemers and Lispers can relate to, so that they might grok the potential usefulness of a concept that can at times seem like a strange dodge performed by those “pure” functional people. I am forgoing some of the background theory I learned in exchange for highlighting, by example and narrative, some of the key ideas as applied to writing programs.

The only obvious prerequisite here is knowledge of higher order functions and of using functions as data. These concepts are well covered and taught by a variety of computer science texts, such as the venerable SICP, webcast university lectures, and many other media. I assert that this is an important prerequisite because what monads provide you is a rigorous and structured abstraction for dealing with data transformations and the application of higher order functions.

Maybe

Consider the “Maybe” monad that is often used as a primer. We’re going to offer a quick, rough translation into Scheme. The semantics of the Maybe monad that we’re interested in are as follows:

  • Any function should be able to be applied to the monad
    • Not necessarily successfully, as in the case of a type error, although with a little more work we could get those semantics as well via dynamic-wind or unwind-protect.
  • The monad will prevent us from operating on the value NIL, and instead simply ignore computations that would attempt to operate on NIL.
    • This is similar to how floating-point hardware propagates NaNs through computations.

We only need three tools to get this behavior:

  1. A procedure to construct the monad from some object in the category of Scheme values
  2. A procedure to transform functions that would normally operate on those values into a function that will operate on the monad value
  3. A procedure to merge double-wrapped “Maybe” monads to normalize them

Without further ado, here are some function definitions:

(define (make-maybe value)
  ;; In this untyped encoding the wrapper is invisible: the "monad" is
  ;; just the value itself
  value)

(define (map-function-to-maybe fn)
  ;; Lift fn into a function that propagates NIL instead of applying fn to it
  (lambda (maybe-object)
    (if (null? maybe-object)
        '()
        (make-maybe (fn maybe-object)))))

(define (join-maybe maybe-object)
  ;; Double-wrapping is invisible in this representation, so join is
  ;; the identity too
  maybe-object)

This may seem incredibly pointless, but consider this motivating example:

> (define maybe-cdr (map-function-to-maybe cdr))
> (maybe-cdr (make-maybe '(1 2)))
(2)
> (maybe-cdr (maybe-cdr (make-maybe '(1 2))))
()
> (maybe-cdr (maybe-cdr (maybe-cdr (maybe-cdr (make-maybe '(1 2))))))
()

As you can see, in the last expression you would normally have had to check for NIL values to prevent crashing, but we now have a generic way to instrument functions with those checks. The result is a function that accepts and returns the monad’s type. This is the essence of Functors (of which monads are a special type) to take home: Functors are morphisms between categories; categories themselves contain objects and morphisms; and in order to be a functor we must be able to transform both the objects in a category and the morphisms between them. There are a few additional properties and rules required to be a functor or monad that can be useful in proving correctness or getting certain guaranteed behavior, but the main take-away idea is writing a uniform “layer” that will map both functions and the data they operate on to another “type” with different semantics. Monads also have to obey rules for compositions of “join” and “map” to maintain invariants, but it is sometimes okay to bend these rules: there are many not-quite-monad functors that can prove very useful. Consider the monad and the functor as guiding metaphors.
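
To make the “transform the morphisms too” point concrete, here is a small demonstration of the composition property (my own addition; compose, maybe-car, and maybe-cadr are helper names I introduce here): mapping a composed function gives the same answer as composing the mapped functions.

(define (compose f g)
  (lambda (x) (f (g x))))

(define maybe-car (map-function-to-maybe car))
(define maybe-cdr (map-function-to-maybe cdr))
(define maybe-cadr (map-function-to-maybe (compose car cdr)))

;; Both of these evaluate to 2:
(maybe-cadr (make-maybe '(1 2)))
(maybe-car (maybe-cdr (make-maybe '(1 2))))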

Moving on: why do we bother at all with make-maybe and join-maybe? It so happens that the underlying representation is simple in this case (since we only need to tweak functions) and thus the construction and join operators are trivial. This would not be the case in, say, Haskell, where the typing system would prevent happy accidents like this one; (make-maybe (make-maybe '(7))) would actually have a double-wrapped Maybe type, and “join” would have to do some thinking about how to sensibly put the nested Maybes together. (This is also called “multiplication” in category theory parlance.)
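
To see why join would need to think, here is a sketch (my own hypothetical encoding, not part of the definitions above) of an explicitly tagged Maybe in Scheme, where the wrapping is visible and join has real work to do:

;; A Maybe that is explicitly tagged, so wrapping is observable
(define (make-tagged-maybe value)
  (list 'maybe value))

(define (tagged-maybe-value m)
  (cadr m))

(define (tagged-maybe? m)
  (and (pair? m) (eq? (car m) 'maybe)))

(define (join-tagged-maybe m)
  ;; Collapse (maybe (maybe x)) into (maybe x); leave single wraps alone
  (if (tagged-maybe? (tagged-maybe-value m))
      (tagged-maybe-value m)
      m))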

Also notice there is no way to “get out of” the monad. Getting “out of” the monad type is additional functionality that goes above and beyond what it means to be a monad; it would be a monad plus an additional transformation to whatever type is seen fit. However, once again, dynamic typing provides us a happy accident in this particular case, and we can actually use this particular monad’s representation directly in our programs. Generally this is not the case and probably should not be encouraged: one should think carefully before breaking into a monad to deal with its representation ad hoc, otherwise we lose some benefits of containment.
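
If you did want such an escape hatch, it might look something like this sketch (my own addition, with the hypothetical name maybe-get-or; it is not part of the post’s definitions): an extra transformation out of the monad that supplies a default for the NIL case.

(define (maybe-get-or maybe-object default)
  ;; Escape the monad: hand back the value, or a default if it is NIL
  (if (null? maybe-object)
      default
      maybe-object))

;; => nothing
(maybe-get-or (maybe-cdr (maybe-cdr (make-maybe '(1 2)))) 'nothing)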

List, the monad you’ve used before

Let’s just rephrase “list” as a monad here:

(define (make-list value)
  ;; The monad's unit: wrap any value as a singleton list
  (list value))

(define (map-function-to-list fn)
  (lambda (list-object)
    ;; "Map" has the proper return type "for free"
    (map fn list-object)))

(define (join-list value)
  ;; Flatten one layer of nesting: a list of lists becomes a list
  (apply append value))

We have now defined a monad. Again, it may seem stupidly trivial, but consider what it has in common with “Maybe”:

  • We can accept anything and create a list monad instance
  • We have some transformation that modifies a function to operate on list-monad instances.
  • We can get rid of monad layers
    • Notice that this definition maintains composition, a useful property of monads. This means you can perform a map of a join and then another join on the result, or simply perform two joins, and get the same value. (This is demonstrated in the sketch after this list.)
    • This presumes the lists are well-formed, i.e. have no non-lists as elements when join is called. Otherwise we rightly should have a type error because you are applying “join” to a non-list, although you could imagine writing a more forgiving implementation with Maybe-like qualities, or force all internal values to be singleton lists rather than scalars. Such is the flexibility of thinking in terms of this abstraction.
  • We have no way of getting anything out of the monad, and this time it’s more visible: we can never go from ‘(1) -> 1, unless we provide another transformation, such as “car”, which would have the type “List-Monad -> *”.
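
Here is the composition property spelled out on a doubly nested list, reusing the definitions just given (my own demonstration; both routes flatten to the same result):

(define nested '(((1 2) (3)) ((4))))

;; join after mapping join => (1 2 3 4)
(join-list ((map-function-to-list join-list) nested))

;; join after join => (1 2 3 4)
(join-list (join-list nested))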

Something less contrived, a “Watchful” monad

The final example I will be presenting here is not quite useful, but pretty close. It is not complicated, but it is non-trivial. We seek to accomplish the simple task of letting us hook into the application of a function to a value, which is useful in debugging or in notifying other components of a system of changes. This is often a task given over to mutation, because it’s a pain to pass around this state all the time. Here is an attempt to give an alternative to mutating global state while allowing arbitrary procedures to be notified when a value is operated on by any function that has been transformed by our “map-function-to-watchful” procedure. I implement some simple functionality using this monad to attach arbitrary functions (called “snoopers”) that are allowed to look at the value being passed to a function applied to the monad, along with their own previous state.

In this section I will first show how one can define and use the snoopers, then how using them appears in an interpreter, and finally the monad implementation itself.

Using the Watchful monad

Here is how one would define some “snoopers” to watch a value, along with defining a watched-value instantiation for the number zero.

;; Some watchers
(define (modification-watcher state thing)
  ;;Counts the number of times the value has functions applied to it.
  ;;Notice that we do not use "thing," but we must accept it to have the
  ;;proper number of arguments
  (if (null? state)
      1
      (+ state 1)))

(define (previous-values-watcher state thing)
  ;; Retains previous values in the state
  (cons thing state))

;;Using watchers
;;Add both watchers, notice how each add-watcher call returns a monad in turn.
(define zero-being-watched (add-watcher previous-values-watcher
                                        (add-watcher modification-watcher
                                                     (make-watchful 0))))

;; Let's transform a function...say an increment function
(define watcher-incr (map-function-to-watchful
                      (lambda (x) (+ x 1))))

Showing use of the watched value and the watcher-incr function in the interpreter

Here’s an example interaction:

> zero-being-watched
(((#<procedure:previous-values-watcher> ()) (#<procedure:modification-watcher> ())) . 0)
> (watcher-incr zero-being-watched)
(((#<procedure:previous-values-watcher> (0)) (#<procedure:modification-watcher> 1)) . 1)
> (watcher-incr (watcher-incr zero-being-watched))
(((#<procedure:previous-values-watcher> (1 0)) (#<procedure:modification-watcher> 2)) . 2)
> (watcher-incr (watcher-incr (watcher-incr zero-being-watched)))
(((#<procedure:previous-values-watcher> (2 1 0)) (#<procedure:modification-watcher> 3)) . 3)

As you may be able to tell from the above, the internal representation of the monad is an association list paired with the watched value. That association list contains procedures along with state they are allowed to store things in (you could do an even cleaner job with continuations, but then they wouldn’t print as nicely), so right now all state-saving has to be done explicitly.

If this example were to be used for anything serious one would have to fix up a couple of things, but it’s pretty functional as-is (and purely functional in the other meaning of the word). The following definition is not as long as it looks; there are lots of comments and some stuff (like filter and the merge-and-remove-duplicates for “join”) that eats up lines without being intrinsically related to monads. There is no subtle twist; it’s exactly the same as the previous two examples (make, map, and join) with an additional function (add-watcher) that operates on the monad directly to add snoopers. Notice that functions passed through map-function-to-watchful are instrumented to call all the snoopers and construct a new list pairing each procedure reference with the new state that snooper returns for its next invocation.

One final thing deserving explanation is that my add-watcher reuses the join operator: basically, I construct a new monad with the input monad as the wrapped value, then run join. This is so that I can get the duplication elimination and state list construction for free. It certainly makes the definition of add-watcher short. (Despite its relatively huge commenting)

The watchful monad implementation

;;Watchful Monad
(define (make-watchful value)
  ;; the car of the monad is a list of snoopers, the cdr is the value being watched
  ;; Each snooper is in the form (function state), where state is returned
  ;; by the snooper after every invocation so it can store things as it chooses.
  ;; The snooper function should accept two args: state and data, so that they
  ;; can have some memory and report on the data being operated on by the
  ;; watched procedure.
  (cons '() value))

(define (map-function-to-watchful fn)
  (lambda (watchful-object)
    (let* ((snoopers (car watchful-object))
           (arg-data (cdr watchful-object)))
      ;; Compute new snooper state, now they all know what "fn" is operating on.
      (cons (map (lambda (snooper)
                   (let ((snooper-fn (car snooper))
                         (snooper-state (cadr snooper)))
                     (list snooper-fn (snooper-fn snooper-state arg-data))))
                 snoopers)
            (fn arg-data)))))

(define (join-watchful value)
  (letrec ((inner-snoopers (car (cdr value)))
           (outer-snoopers (car value))
           (inner-value (cddr value))
           ;; The standard "filter" we all know and love, but not included in
           ;; R5RS
           (filter (lambda (predicate seq)
                     (if (null? seq)
                         '()
                         (if (predicate (car seq))
                             (cons (car seq) (filter predicate (cdr seq)))
                             (filter predicate (cdr seq)))))))
    ;; Merge the outer-snoopers and inner-snoopers, outer-snoopers win in event
    ;; of a collision, which in this case means the same procedure with the same
    ;; state.
    (cons (append outer-snoopers (filter (lambda (inner-snooper)
                                           (not (member inner-snooper outer-snoopers)))
                                         inner-snoopers))
          inner-value)))

;; functions that operate on the watcher-monad directly (e.g. Watcher-Monad -> Watcher-Monad)
(define (add-watcher watcher-fn watcher-object)
  ;;watcher-fn must accept two args: state and the value in the monad when a
  ;;function is called on that value and return a new state. We abuse the
  ;;join-watchful function I wrote by first encapsulating the watcher-object
  ;;as a value inside another watcher-object and then employing "join-watchful"

  (join-watchful
   ;; We make a singleton assoc-list with watcher-fn with state NIL to start.
   ;; We know our monad internally is just a cons pair, for brevity break the
   ;; abstraction here...
   (cons `((,watcher-fn ()))
         watcher-object)))

Conclusion

Hopefully these examples shed more light on monads and their uses for those who come from Lisp-ish backgrounds, or dynamic-typing backgrounds in general. I often found the Haskell-annotated treatments caught up in typing, which sometimes made it more difficult than absolutely necessary to explain the general idea of and motivation for monads to those using languages where mutation is more convenient and typing is more relaxed. As such, I wrote this for the programmer already adept in using higher order functions and thinking of functions as data, yet unsure of what the fuss is about monads and how they can be useful.


Written by fdr

July 21, 2007 at 5:37 am

Posted in lisp, mathematics, theory

Large Binary Data is (not) a Weakness of Erlang

with 4 comments

Update: Good news in the comments, with some good details. To sum up:

split_binary/2 is the call to use. Binaries are also reference-counted rather than garbage collected. The next release will fix pattern matching on large binaries, which is broken in the current release due to improper handling of bignums. So the internet wins again, and the original blog author should be happy.

Finally — even if we get all this bookkeeping right — we have to deal with the possibility of obscene wastage. Suppose someone loads five hundred megabytes of binary data into memory, takes a five-hundred-byte slice, and discards the old instance. We can’t deallocate just parts of an array with standard [mc]alloc(), so we have to make a decision about how many bytes of wastage are worth not copying the salient part of the array to a fresh, appropriately sized heap allocation. Sometimes it may be clear, but with lots of small-ish binary instances each wasting some memory it may be less clear. Or I guess we could just call realloc() every time, which may be a little bit of overkill but would be simple-ish and predictable…

One thing that does remain, however, is shrinking the region allocated for the binary memory. There is a danger of keeping big chunks of binary around when one has taken only a tiny slice, so either some GC work needs to be put in place to realloc() properly, or the user just needs to be cognizant and copy the binary if they feel doing so may release a large chunk of memory, and then let all previous references die.

Venturing onward on this posting should be reserved for archivists and the curious.


Written by fdr

July 9, 2007 at 3:24 am

Posted in erlang, languages

A Case for Erlang

with 17 comments

Brief prelude on how I got to this topic, for flavor and context: earlier this week I was having a discussion with an old friend from Berkeley about methodologies in scaling, set off by a set of Motorola slides[0] I had forwarded to him comparing Erlang, C-plus-Erlang, and C(++) implementations of telecom equipment code. He was aware of Erlang and its general properties, since I had been talking about Erlang some time back when I had been playing around with it as a way to coordinate and cluster nodes for KM[1].

He then remarked that he was working on some stuff that needed to be parallelized and to support high concurrency, and he referred me to the SEDA research project as a guiding influence on his budding architecture. SEDA emphasized using event queues to send messages between parallel parts of the architecture, with feedback loops for control and updates. I took a look at this and felt there were a few problems:

  1. SEDA is a mothballed research project, so there’s no up-to-date code
  2. No project I know of maintains a well proven, high quality implementation to abstract a comfortable amount of the mundane details away from me.
  3. Sometimes the model you are working with calls for a thread and not events.

Qualms one and two are strictly at a practical and not an ideological level. SEDA has some high-level ideas to which I have no strongly crystallized negative reaction, and it is probably good reading… but at a nuts-and-bolts level I am left unenthusiastic by the general prospect of not having powerful abstractions readily available to achieve these aims.

Qualm three is much more philosophical and meaty. I have seen a paper that pretty much sums up some of my feelings on the matter: sometimes, you don’t want an event-driven abstraction. The paper even mentions SEDA by name as an example of an event-oriented tool used by the authors to set up high-concurrency servers. Despite the fact that SEDA tried to make this easy (or easier), the authors felt that sometimes a thread was really what one wanted, and the event-driven model was just not as clear or easy to write as the thread-oriented one. Not being as clear or easy to write means more bugs. However, not all is lost: the paper concludes that there’s no fundamental reason why events should be the only way to achieve high concurrency. There is some passing mention of Erlang, but nothing substantial. But what have we gained here? Validation of threads? Aren’t threads the road to madness? We can probably do better than just threads and synchronization constructs, which themselves pose a substantial risk to program reliability.

With this background information it’s easy to imagine a relatively annoying scenario with a {C, Cpp, C#, Java, Python, Ruby, Lisp, Haskell, damn-near-anything} program: what happens when you write part of your system in a threaded manner (because it felt natural, and that’s not necessarily a dirty instinct, as supported by the paper in qualm three) but then need to extract this threaded functionality because it needs to handle more concurrency or be made network-accessible to work over multiple machines? Generally you get to rewrite a lot of code to fit into the SEDA diagram, including rewriting threaded code to be event-driven and network-accessible, which also means having to take care of the network and protocol issues. Don’t forget having to update your old code to use the event-driven version, a painful affair if you used synchronization constructs. Your only alternative to these rewrites is to defensively program everything with the intention of being event-driven, which will only waste a lot of time and make your program less efficient, unless you provide even more code to do shared-memory interactions as a secondary mode; otherwise, you will be stuck doing lots of serialization/deserialization to interact with stuff on the same machine. Let’s not even mention that code that could have been handled more gracefully with context switches rather than event handling will end up being maddening to write and more opaque than one feels it should be. Welcome to Tartarus, enjoy your stay.

So now we finally talk about Erlang. Erlang attacks a lot of these problems on many fronts, including in its implementation, syntax, and semantics. Yet people seem to be unfazed by the idea of re-inventing Erlang’s wheels when it comes to Erlang’s choice application domain, and I suspect a large part of the reason is that most people who are vaguely aware of Erlang and its reputation don’t know what wheels have already been invented in Erlang. Some of those are wheels they probably haven’t thought of yet when starting out and could use to assist implementation; others are wheels (some quite elaborate) that they’d otherwise be forced to implement, test, and maintain on their own, or suffer from something painful.

Here is a list of some of the more important things that came to my mind that you get “for free” for using Erlang:

  • A generally expressive syntax that reduces the amount of code to somewhere between one tenth and one fourth of the roughly equivalent code in C/C++[3]. Error density was seen to be about the same, so that also means about a tenth to a fourth of the number of bugs.
  • A virtual machine that itself supports kernel event polling (at least under Linux) to allow you to easily handle tens of thousands of persistent connections (such as TCP streams) modeled with a simple one-context-per-connection abstraction[4]. This is not the default and can be enabled with “+K true” when starting the “erl” interpreter.
  • The overhead of a process is 318 machine words.
  • A virtual machine that can automatically and efficiently handle SMP (at least under Linux) and distribute processes between nodes
  • Semantically simple process-oriented concurrency (which avoids a lot of bugs seen with shared-state threads) with high-speed message passing and process scheduling (how else could it handle 80,000 processes?), thanks in large part to no-mutate language semantics
  • Extensive heap sharing between processes to avoid message copying, once again thanks to no-mutate language semantics (use the “-shared” switch)
    • This is not the default behavior: otherwise, per-process heaps and full-copies are used to maintain short GC pauses for real-time applications
  • Network-transparency between processes, even when passing higher-order objects like closures(!)
    • Some of the more “leaky” abstractions made for performance such as ETS or process tables that allow for mutation can have opaque continuation returns that cannot be serialized in this way. In any case, it’s in a very small minority.
  • Node and process monitoring and restart strategies to allow you to write robust, idiomatic, fault-tolerant programs
  • Automatic transitive connection of Erlang nodes for maximum reliability/message-passing speed, as well as message routing between intermediate nodes when that full connectivity is impossible.
  • Pattern-matching of several data types[2]. Not only the obvious tuples, but also binary streams, largely eliminating temptations to use inefficient ASCII-based wire protocols.
  • Distributed synchronization constructs
  • Safe local and global name registration
  • A powerful distributed database, MNesia
  • Code hot-swapping
  • Application versioning and packaging support
  • Metaprogramming using the AST, so Lispish-style macros exist. NOTE: in Erlang parlance, this is called a “parse_transform”. “Macros” in the Erlang lexicon refer to something more like the C preprocessor.
  • Generic constructs for common tasks: finite state machines, event handlers (for when events are the most natural model), and the ever-useful and flexible “generic server” (gen_server) behavior.
  • A community that is focused on reliability, performance, and distributed applications
  • More community effort trying to give Erlang additional tools with “general appeal,” such as a web application framework
  • Heavy documentation, both of the libraries and of methodology refined by twenty years of language development, research, and application

Let’s revisit the scenario above, where you were stuck in Tartarus rewriting your threads into events and making them accessible via the network, but now starting out with an Erlang codebase.

Once again, as before, some of your code which you had written using a process-oriented model has outgrown its britches and needs to be made scalable and concurrent. The latter part of that is mostly taken care of for you: simply spawn a process for every work item, as you were before. Thanks to the Erlang VM, your eight processes turned eight thousand are having no problem handling the flood of work and utilizing as many machine resources as possible. You don’t need to coerce yourself into writing an event-based server, introducing bugs and obfuscating code as you go, a huge win already.

Now you get to worry about making things distributed, which is a little more complicated. Your first attempt is to allocate some dedicated machines to running the code in question for more power. Since message passing is network-transparent, the changes to your code elsewhere in the application are minimal. A send is still a send, whether across a network or on the same node. You write some code to decide how to allocate queries across these machine resources, which themselves may dynamically reallocate work, carrying along the process name to send the return message to in order to avoid centralized response-multiplexing overhead[5]. Ultimately some node in the cluster sends the response or the requester times out. In many cases you are now done; you can just keep adding machines to this glob to get more power.

To spice up the story, let us suppose that you notice there’s a lot of state-passing going on to synchronize nodes that is too network-intensive, something that wasn’t a problem when this was a single-node solution, so you rewrite some of your code to pass a closure that contains instructions on how to update the node’s state. This means you just avoided having to write some sort of fancy differential state-transfer procedure; you’ve simply told the node at the other end how to compute the new state from an old one by sending the procedure itself over the wire instead of the finished product.

Finally, if you had followed some of the OTP design principles[6] to begin with (which is not uncommon, even when working in a single-node environment; they are exceedingly convenient abstractions) and used the gen_server (or similar) behavior, you can get (or might already have had) a supervision tree going that’ll make your application serving this stuff fight down to the last man. And so ends our tale.

Don’t let this glowing report make you think there aren’t difficulties here; there are definitely real, tangible downsides to Erlang, not the least of which are recruiting programmers, questionable string handling, and the somewhat-warty records. It’s also considerably slower than C when it comes to sequential computation. However, consider that it is clearly not a toy, and that groups of programmers — not all of them Gods, I’m sure — have employed it to process unfathomable quantities of our telephony data with nine nines of reliability[7], all in two million lines of code[8]. Erlang is an artifact designed with a purpose and a vision, and it will be difficult to best in technical merit in its chosen problem domain without embarking on a project of similar or greater scope.

Footnotes:

[0]: Gist: Erlang is more terse, degrades better under high-load conditions, and is more reliable. You know, what you might expect against hand-rolled C++ that’s significantly less complex and tested than Erlang’s implementation itself. (See Greenspun’s Tenth Rule, with the obvious reapplication.)

[1]: Yet unfinished. Actually, stalled for some other priorities. The clustering part of it is done; the missing pieces are the appropriate bindings between the Erlang nodes and the Lisp process, as well as a job allocator to decide how to allocate work to the nodes. We also don’t yet have a lot of machines to run this thing on, a circumstance that may change in coming months. In the off chance that you are interested in contributing to a parallel, clustered knowledge base, let me know in the comments.

[2]: As opposed to full-blown Prolog-style unification, where everything is in terms of rules. That is, you can say sum(3, 4, A) and conclude A is 7, but you can also ask Prolog sum(A, B, 7) and repeatedly ask it for legitimate values of A and B, which it’ll happily return for as long as you’d like. This is why a simplistic fixed-size Sudoku solver in Prolog looks just like an articulation of the rules rather than a procedure for actually finding the answers.

[4]: An oft-cited “benchmark” of sorts showing an Erlang web server, YAWS, vs. Apache, which uses pthreads. Both of these servers are handling a trivial load — a slow “GET” request — so the main determinant of who wins here is who is least choked by the massive number of requests. Since Apache uses a pthreads-based server, it is largely limited by the operating system’s threading implementation.

[5]: Notice the security problem here? Erlang by default will only communicate with other nodes that have the same magic cookie, a simple and robust security mechanism to prevent “messages from strangers.” In case you were wondering: message encryption in Erlang is supported in case you don’t trust your link. It’s not as straightforward, but someone has written something about it.

[6]: See OTP Design Principles which discuss supervision trees, monitors, and links, among many other things. Also see Joe Armstrong’s Thesis; it’s easy to read and extremely informative as a perspective on writing reliable software, even if you are not writing Erlang.

[7]: An article written by Armstrong. He links to his thesis, but this is a little bit more conversational and brings out some highlights. I have excerpted the relevant portion for lazy clickers:

Does it work?

Yes. Erlang is used all over the world in high-tech projects where reliability counts. The Erlang flagship project (built by Ericsson, the Swedish telecom company) is the AXD301. This has over 2 million lines of Erlang.

The AXD301 has achieved a NINE nines reliability (yes, you read that right, 99.9999999%). Let’s put this in context: 5 nines is reckoned to be good (5.2 minutes of downtime/year). 7 nines almost unachievable … but we did 9.

Why is this? No shared state, plus a sophisticated error recovery model. You can read all the details in my PhD thesis.
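
To put a number on that (my arithmetic, not Armstrong’s): a year is roughly 31.6 million seconds, so 99.9999999% availability allows about 31.6e6 × 10^-9 ≈ 0.03 seconds of downtime per year; five nines (99.999%) allows about 316 seconds, which is the 5.2 minutes quoted above.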

[8]: In case you thought this was a piddly amount of code, a paper pegs the equivalent amount of C/C++ code at somewhere between four and ten times as much to get the same stuff done. This is not an extraordinary claim considering Erlang’s advantages in automatic memory management and functional programming constructs such as the ever-useful map(). This is a huge win, despite what some people try to tell me… to be explored in a future blog post, tentatively titled in my head “Verbosity is a valid complaint!,” or something like that.

Written by fdr

July 4, 2007 at 4:33 am

Posted in erlang, languages

DeckWiki: Proposed Collaborative Presentation Creation

with one comment

This is something I have been turning over in my head for at least six or seven months, as I have discovered the prevalence of Microsoft PowerPoint™ in business settings. It has a number of shortcomings:

  • Big binary files exchanged by email[0]
  • Relatively weak version control/merging capabilities
  • Out of date borrowed slides[1]
  • Lobotomizing an old set of slides just to get the same style in an integrated Microsoft PowerPoint™ file
  • Having to reprocess slides manually to get them to conform to some new style
  • Weak metadata/commenting capabilities, leading to bad slides
  • Lack of visual support for “off the rails” discussion, leading to unnecessary hand waving
  • Almost no serious collaboration support whatsoever[2]
  • Doesn’t facilitate location of useful, other slides available in the organization

This is kind of a solved problem, if you could get everyone to accept using something like Prosper and a bunch of TeX files on some sort of version control plus a few scripts. But that’s not going to happen. Even I will admit that it would probably be a little painful.

So let’s discuss something viable.

Many people at this point are pretty familiar with the idea of a wiki… reading one, at least. Instead of some heavy-handed new tool that has to convince everyone its way is the One True Way and takes no prisoners[3], I suggest using a wiki package, or writing something employing the wiki model, to facilitate authorship of slides for presentations. Most professionals in technology sort of understand the idea of a wiki; even if in practice they never use or contribute to them, at least it’s not completely alien and probably not too scary to new users. I think. Let’s start with that premise. Let’s also presume the wiki has a notion of “Presentations”: a path through a series of presentations, including the empty presentation (no pages) and automatic singleton presentations (consisting of one page, so every slide is a presentation). This is an important recurrence relation, because I will henceforth never organize things in terms of slides except when referring to the current way of doing things: in this model all presentations are formed by composing presentations.

Let’s consider how we can address the issues raised above:

  1. Big binary files exchanged by email
    If the presentations are held in a wiki, each presentation can be maintained independently and there will be a common place to view and download presentations. Hopefully this will prevent attaching big files and sending them around all the time. By exploiting page versions it is possible to make sure that a given presentation will remain with the same content for all-time.
  2. Relatively weak version control/merging capabilities
    Since each presentation is tracked and worked on independently, one need only be concerned about clashing revisions of a single atom of content. The definition of an “atom” is most obviously at the per-page level, but could be as fine as the word/paragraph level by using diffing/patching, although this gets more complicated. This also doesn’t seem to present a humongous problem on Wikipedia, even on the busiest pages: they still seem to get contributed to and updated.
  3. Out of date borrowed slides
    One can easily obtain a list of updated sub-presentations for any or all presentations and accept or reject changes. The result is a new, unique presentation; old presentations are never lost, so reverting is easy.
  4. Lobotomizing an old set of slides just to get the same style in an integrated Microsoft PowerPoint™ file
    Styles should only be loosely connected to a presentation. One should be able to paint a new style over any presentation unit, so absolute conformity is an option.
  5. Having to reprocess slides manually to get them to conform to some new style
    Simply apply a new style to the presentation
  6. Weak metadata/commenting capabilities, leading to bad slides
    Right now presentation slides are meant as much to be distributed and read as they are meant to be shown in person. The results are slides with entirely too much text and visual noise, distracting the viewers during presentations. Cutting out some detail upsets some stakeholders, because then the slides no longer communicate everything that was said during the presentation in person. The relatively wimpy commenting facilities seen in most presentation packages don’t seem to please anyone, but an up-to-date, nicely formatted, cross-referenced wiki page associated with each presentation may be better. This idea is not new at all[4], although its proper implementation may be tricky.
  7. Lack of visual support for “off the rails” discussion, leading to unnecessary hand waving
    If a presentation goes off the rails — and this is not always a bad thing — one is often left without visual aids and must resort to blackboarding and waving of hands. It’d be better to have a big list of short presentations that one is at least moderately familiar with, so that if discussion wanders to another engaging topic that deserves more thorough discussion, one can pull up more appropriate presentations and documentation. Another way to use this is to include “see-also” sub-presentations that one can visit, then jump back from the aside to the main presentation. In this way a presentation flow resembles an NDFA.
  8. Almost no serious collaboration support whatsoever
    All this version tracking buys us a nice, fluid system that allows for synchronized updating of artifacts by a number of authors far exceeding two (which is, in my experience, the limit with more traditional methods). Wikipedia is an empirical example of this model working.
  9. Doesn’t facilitate location of useful, other slides available in the organization
    This is an exciting one. With all this graph-connectivity information, searching for and finding useful material may be a lot easier. One can also tell roughly how much a presentation is being used elsewhere. There are many potential uses for this.

Here’s a conceptual sketch of a more formal treatment[5]:

presentation -> {id: Id, presentation: [From], presentation: [To], page: P, [presentation]: Ancestry}
presentation -> Nil
id -> UniqueIdentifier (probably a 64-bit integer)
page -> {Version, Data} (A version and some payload)

A presentation from Nil to Nil is the empty presentation. If From is Nil, then this presentation is the head of a presentation; if To is Nil, then it’s the terminating point of a presentation. The Ancestry variable allows for tracking the evolution of presentations over time by showing which presentations derived the current one[6].
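
For the Lisp-inclined, the sketch above might be rendered in Scheme roughly as follows (my own hypothetical rendering; the names are mine, and the post implies no particular implementation):

;; A presentation is Nil or a node: an id, lists of entry and exit
;; presentations, a page, and the list of presentations it derives from.
(define (make-presentation id from to page ancestry)
  (list id from to page ancestry))

(define nil-presentation '())

(define (make-page version data)
  (cons version data))

;; Every page is automatically a singleton presentation: it both starts
;; and ends a path.
(define (singleton-presentation id page)
  (make-presentation id (list nil-presentation) (list nil-presentation)
                     page '()))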

Addendum:
Another interesting idea is to break free of stack-based thought and use continuation-style thought to model presentation traversal, but I suspect that will break the minds of many people. That way you may not jump back to the datum you started at, but somewhere else entirely… possibly never to return, or perhaps carrying all your exit continuations along with you for possible use. In any case, the ‘presentation’ datatype can support this, since it has an ‘environment’ (the page and version) and a ‘label’ (the From and To nodes). In fact, a standard linear presentation is similar to invoking a continuation that never calls a return; this is another way to think about the model. It’s certainly all within reach if presentations are simply recursively connected to presentations, should you consider the ‘page’ datatype an adequate ‘environment’ and the presentation datatype (which includes the page) the ‘code.’ In any case, thinking of it as an NDFA (as mentioned above) is probably easier to understand and a less-stretched analogy, but the idea of carrying multiple exits is an intriguing one that deserves at least cursory attention. An NDFA analogy would also suggest a simple and well-known form of visualization.

I have changed my type definitions somewhat to give more indication of the many-entrance, many-exit nature of presentations by listifying To and From, although the same thing could have been accomplished with a large number of presentation-type instances. I think this is more like what an actual implementation would look like. It may make ancestry tracking less hairy, too. My original goal was to define as little as possible to get things done, but then I decided this was silly and that I should spend a few more characters to carry the idea more adequately.

Footnotes:

[0]: Now slightly less binary with the new Microsoft Office™ 2007 XML format, if anyone can hope to actually understand it in a reasonable amount of time.

[1]: Common scenario: “HR says we’re at X employees… oh, that’s old, we’re actually at X + N, let’s move on…” In isolation this is really not so bad, except N is sometimes wrong due to misremembering, and with enough such errata it becomes distracting. There’s no obvious reason why staying up to date has to be so hard.

[2]: Especially with more than two people. The edit-tracking feature is useful, but have you ever tried giving slides out to five people and merging the changes?

[3]: Sound familiar? It’s Lisp vs UNIX again

[4]: Possibly a variant of Knuth’s notion of Literate Programming

[5]: This is a hand-wavy amalgam of syntax/semantics borrowed from Prolog, ML, Erlang, Lisp (for Nil only, really) and/or Haskell. If anyone complains I’ll provide something more rigorous. My variables start with capital letters; my types start with lower case and, when describing a variable, use a colon; lists are denoted with […], so [presentation] is a list filled with “presentation”-typed things. -> is my reduction symbol. Braces denote tuples. Comments are in parentheses.

[6]: One of the interesting problems presented here is that wikis generally attempt to converge on an authoritative page that everyone sees, as opposed to branching and derivative works, which can create a mess of cross-generational merging. This can be seen as the same problem that divides the distributed and centralized version-control camps. As merging still seems not to have been solved to everyone’s satisfaction, implementors would do well to pay attention to the subtle issues engaged by both camps in that community in an attempt to make an informed decision.

Written by fdr

June 29, 2007 at 12:05 am

Posted in projects

SAGE Computer Algebra System

leave a comment »

While skimming the RSS feed for Lambda the Ultimate I spied an entry about a functional definition of TeX’s formula-typesetting algorithm. For me the real gem here was not some insight about TeX, but a solitary (at the time of this writing) comment about the CAS SAGE (and its integration with jsMath). SAGE is pretty interesting, but we’ll get to why in a moment, after discussing some other well-known CAS and scientific computation programs.

I have never understood some people’s obsession with Matlab, Mathematica, and similar tools. I mean, I kind of do: they are allegedly well-designed tools, the former providing well-optimized versions of just about every well-known numerical recipe you could want. Matlab was popular with the EE guys in college for plotting and numerical methods, and I know that the Mathematica guys can do some impressive symbolic manipulation and interactive visualization with relatively minimal effort…

But something feels wrong. It’s not just a matter of the proprietary nature of these tools, although that and the high price can sometimes be most… dissuading. Especially when the pricing information is hidden behind a login screen, or one of the first calls to action is to request a quote. One knows, generally speaking, that one is not in for a good time under these circumstances. But projects like GNU Octave also set off my radar, even though it is free. Why?

The real answer is the “tool among tools[0]” philosophy. This is also possibly the main reason why Lisp lost out to C and UNIX, and a reason for the relative obscurity of Smalltalk, despite its alleged brilliance (it is not something I have used myself). Part of this is cultural (GNU Octave is a scientific computation system mostly compatible with MATLAB; nowhere in its mission is a high level of integration) and part can be more sinister motivation (MATLAB has a negative incentive to ease transition to Mathematica, and vice versa). Mathworks and Wolfram are both given incentive to make their own platforms as attractive as possible and to one-up everyone else, rather than wasting time making sure that everyone can move their code around: the latter is just not good business.

Enter SAGE. This package is shamelessly and unabashedly about integration, much in the Python philosophy. In fact it’s written in Python, with liberal FFI bindings using the most excellent Pyrex extension language and compiler. It has unidirectional bindings to an impressive number of CAS systems (including many I have never heard of), with some basic parsing and data-type normalization. And, most importantly, it avoids the problem of getting locked into a particular CAS with no hope for escape[1].

For me personally it may be useful for plotting; I still typically write my actual math code in C (maybe Lisp for prototyping, but that’s another post). I will have to take a look. I hope that in the future they may gain some interoperability with statistical tools such as SAS and R (which already has a Python integration package that, alas, only works on Python 2.4), as I think this would increase the leverage of this tool tremendously. One hopes that the proprietary vendors won’t take offense and start trying to break things, à la Microsoft. If things get even better then maybe they’ll pay due notice and at least try to avoid breaking things all the time. At best, they’ll start to feature SAGE (or similar) interoperability as a selling point, and the world may be better off.

Footnotes:
[0]: I first heard this phrasing from Guido van Rossum when he gave a talk at Berkeley.
[1]: As detailed by another blogger, who also gives a different take and overview of SAGE. Actually, his is a lot more detailed in just about every respect, so I’d highly recommend reading this if you are seriously interested.


Written by fdr

June 28, 2007 at 1:51 am

Posted in mathematics

KPMake: A Proposed Dependency Resolution Tool

with 3 comments

One of the insane ideas I have floating around is to write yet another Make replacement. My rationale is as follows:

  • Make is simple, but not very powerful
  • Absurd macro languages and generation facilities grant enough power for “real work,” but are painful and ugly.
  • People widely use Ant, which would probably be a step down from Make were it not for the incredible dedication to tool support that makes it somewhat convenient for Java.

“But wait!” you say, “haven’t you ever heard of SCons or Rake, you imbecile?”

Why yes, yes I have. The nice thing about these approaches is that they recognize that most non-trivial build systems are going to demand the power of a “real” programming language; certainly not the looseness of shell scripting nor the inflexibility of Ant buildfiles. (To quote a colleague: “it’s like Lisp, without any of the advantages of Lisp!”) The downside is that they lose out on clarity. A hand-written Makefile tends to stay very simple, whereas any convoluted thing can happen in a SConstruct file (the rough equivalent of Make’s “Makefile”), leaving the user scratching their head as to why something-or-other happened, or why something did or did not get built, and how. On the plus side, the machinery of SCons also gets you things like a “clean” target for free. Another downside is lost agnosticism: Makefiles, thanks to their simple nature, can easily be written to control complex toolchains that do things other than compiling programs[0]. The barrier to entry in writing a builder in SCons is considerably higher than the simple rules dictated by a Makefile, such as

foo : bar baz
	cat bar baz > foo

(Yes, I realize you can use automatic variables like ‘$@’ and ‘$^’, but let’s keep it less cryptic for now.)
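
For the curious, the same rule written with those automatic variables looks like this, where ‘$@’ names the target and ‘$^’ the prerequisites:

foo : bar baz
	cat $^ > $@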

I have been working on and off on a project that uses a Makefile to orchestrate around a dozen individual tools. Not only is it robust, but after almost a year of inactivity it is possible to read the Makefile and recall the roles of all the tiny moving parts and how they relate to one another. I then went back and wrote documentation. But now I want to improve on Make by allowing for complex behaviors while retaining a succinct, declarative syntax and keeping the system easy to debug. How do I propose to do this?

KPMake, or “Knowledge Powered Make,” is one possible solution. In a nutshell: I want to leverage state-of-the-art symbolic AI techniques to solve the relatively mundane problem of dependency resolution, while including modern explanation-generation facilities to increase user understanding of why any particular actions were taken and to help debug those actions. I will probably use the KM knowledge base, partially because of being exposed to it at work and partially because I know it has been successfully used in large projects such as Vulcan Inc’s ambitious Project HALO. I also happen to know it’s available as a single Lisp file and has practically no set-up or portability problems. The base package for KPMake should simply be a distribution of KM, some basic models to handle generic dependency resolution, and macro/procedure support to make writing KPMakefiles easier. Knowledge models for, say, particular languages (such as a gcc or a javac model) will be distributed and maintained separately and ought to have an eye towards making things even more convenient.

Addendum:
A neat feature enabled by this design, which I didn’t include in this write-up to begin with, is using situation calculus to allow programmers to load, save, and debug situational traces as well as perform advanced error recovery. KM supports situational reasoning and simulations that can be used to test build scripts (and write fuzz tests!) to make sure they respond in at least a reasonable way to unexpected events, as well as increasing visibility into every step of the build process: right now one generally has to do a lot of manual logging to get a trace of what’s going on or, if one is lazy, set a breakpoint and only be able to view a few slices of the process. You could also have your build engineers in India, send your broken-build “core dump” abroad, and mostly avoid the “well, it works on my machine” scenario.

Footnotes:
[0]: I owe my appreciation of this technique to Professor Strain. Imagine a Makefile to build the program, run the tests, spit out any pictures, compile the TeX sources, and finally deliver you a postscript file to send off to the journal. I’m sure that if submissions had to be done often, there would be a “mail” target that would also deliver the resultant postscript to the relevant journal(s).

Written by fdr

June 27, 2007 at 9:10 pm

Posted in lisp, projects

Why did I start this thing anyway?

leave a comment »

Great. Another blog, just what the world needed. I have been persuaded to start this syndication partially because I have been stumbling across so many interesting things in the space of computer programming and computer science and yet have had no way to store or search them all, and partially for the reasons detailed elsewhere. Although I hope I will be able to avoid a job search altogether and always have a colleague with an interesting workplace they can refer me to, I figure simply putting my thoughts down for documentation couldn’t hurt. I may have gotten lucky finding an interesting job right out of Berkeley, but one should never count on luck all the time…

I also figured that it would be possible, if improbable, that someone else may be able to use my writings.

And so, let us take this forward…

Written by fdr

June 27, 2007 at 8:38 am

Posted in meta-blogging