Archive for the ‘lisp’ Category
This has been annoying me for some time: develock mode doesn’t support Python out of the box. I had a hacked-up develock where I simply changed all references of Ruby (via string-replace) to Python, and things worked pretty well…but I had to constantly load my hacked up develock.
Well, I finally got around to customizing develock properly post-facto, I think. Rejoice, fellow whitespace pedants. Here’s the snippet:
;; ;; develock-py.el ;; ;; Made by Daniel Farina ;; Login <email@example.com> ;; ;; Started on Sun Feb 14 09:21:21 2010 Daniel Farina ;; Last update Sun Feb 14 09:27:12 2010 Daniel Farina ;; (require 'develock) (defcustom develock-python-font-lock-keywords '(;; a long line (develock-find-long-lines (1 'develock-long-line-1 t) (2 'develock-long-line-2 t)) ;; long spaces (develock-find-tab-or-long-space (1 'develock-whitespace-2) (2 'develock-whitespace-3 nil t)) ;; trailing whitespace ("[^\t\n ]\\([\t ]+\\)$" (1 'develock-whitespace-1 t)) ;; spaces before tabs ("\\( +\\)\\(\t+\\)" (1 'develock-whitespace-1 t) (2 'develock-whitespace-2 t)) ;; tab space tab ("\\(\t\\) \t" (1 'develock-whitespace-2 append)) ;; only tabs or spaces in the line ("^[\t ]+$" (0 'develock-whitespace-2 append)) ;; reachable E-mail addresses ("<?[-+.0-9A-Z_a-z]+@[-0-9A-Z_a-z]+\\(\\.[-0-9A-Z_a-z]+\\)+>?" (0 'develock-reachable-mail-address t)) ;; things to be paid attention ("\\<\\(?:[Ff][Ii][Xx][Mm][Ee]\\|[Tt][Oo][Dd][Oo]\\)\\(?::\\|\\>\\)" (0 'develock-attention t))) "Extraordinary level highlighting for the Python mode." :type develock-keywords-custom-type :set 'develock-keywords-custom-set :group 'develock :group 'font-lock) (defvar python-font-lock-keywords-x nil "Extraordinary level font-lock keywords for the Python mode.") (setq develock-keywords-alist (cons '(python-mode python-font-lock-keywords-x develock-python-font-lock-keywords) develock-keywords-alist)) (plist-put develock-max-column-plist 'python-mode 79)
This text is made available under the license CC0 1.0 Universal (CC0 1.0).
Many wax poetic on the virtues of Lisp, and I would say for good reason: it was a language and philosophy that was (and is) far ahead of its time in principle and oftentimes in practice. But I have to cede the following: the foundations of Common Lisp are becoming somewhat ancient and there are many places that have more modern roots where I would have it borrow heavily to assist in creating my programming nirvana. In talking with yet another friend from Berkeley (and the author of sudo random) we had discussed some of these things and I decided it was worth enumerating some of them and pointing to ongoing work that implements those fragments or something close to it.
The reason this post is titled in such a sober way is because the Lisp I envision is probably many lifetimes of work to accomplish, and as such, I cannot see myself accomplishing everything on my own. Granted, I still have a lot of life ahead of me yet, but that only makes the equation all the more depressing. Implementation could probably span many PhD theses and industrial man-decades. As such, I can only hope that it’s the Lisp that more or less exists before The End Of My Lifetime. I would be glad to one day say that I contributed in some or large part to any one piece of it. This whole post smacks of the “sufficiently smart compiler” daydreaming, so turn away if you must. Alternatively, you can sit back, enjoy, and nit-pick at the details of compiler theory and implementation, some (or many) of which I’m sure have been overlooked by me.
Finally, this is not by any means a list of things that current implementations do not have, just things that I feel would seem most valuable. Some are not even necessarily technical challenges so much as social and design ones. I view this hypothetical Lisp as not only some new features, but a set of idioms that I more programmers generally agree on. “The Zen of Python” is an excellent example of this. There are definitely some lisp-idioms, but they have become somewhat antiquated and are hard to enumerate in some part because of the baroque and aging Common Lisp specification. The hardest idiom to get around is fearlessness and ease of metaprogramming, which in part is great, but also can make standardization difficult socially as it assists in making herding Lisp programmers difficult. Herding lisp programmers is about as tough as herding cats armed with machine guns.
However, I think Lisp’s guiding intentions have lied in flexibility. Common Lisp, for its time, was the kitchen sink. It still is, in large part, but may benefit from new idioms and a fresh slate, as well as deeper and more integrated compiler support for some of the features mentioned below.
1. The Compiler is your Friend.
Leaders in this area: SLIME and its Swank component
Honorable Mentions: DrScheme, IPython
Nowadays modern IDEs seem to do everything up to the semantic analysis step in compilation to give you advanced searching and refactoring capabilities. Oftentimes a lot of compiler work is reimplemented to support the features of the given IDE at hand, and much work is duplicated, sometimes to the point of implementing a whole compiler, as in Eclipse.
SLIME and Swank have a twist on this that I like: Swank is responsible for asking the compiler implementation itself (in my case SBCL) for information on various symbols, their source location, documentation, and so on. It communicates all this information through a socket to a frontend, which comprises the rest of SLIME. In doing so it gains the authoritative answer to queries about the program because the compiler of choice itself is delivering its opinion on the matter, even as it runs.
This allows for an accurate way to track down references that may be created dynamically by asking the figurative question “What would the compiler do?”. From this SLIME gains extremely powerful auto-completion facilities that are robust to techniques are either unavailable in other programming cultures or, if used, would defeat the programmer’s completeness of assessment of the program. Lisp is the only runtime/language I know of where I can eval a string and still be able to access the resulting, say, function definition and documentation strings with full auto-completion and hinting in my editing environment.
Were Lisp more popular, I would bet Swank-compatibility and feature-richness would be a defining feature for Lisp implementations, and frontends using Swank would be prolific. The socket interface was definitely the way to go here.
2. Networked & Concurrent Programming
Leaders in this area: Erlang
Honorable mentions: Termite & Gambit, Rhino, SISC, Parallel Python, Stackless Python, and many others.
Sun Microsystems, despite its beleaguered business, had at least one thing very, very right: “The network is the computer.” The ability to talk on multiple computers on a network is increasingly important in our era, and making it convenient can lead to extraordinarily powerful, robust applications. Erlang definitely leads the pack in this area: an industrial strength, reasonably efficient compiler that can do I/O pumping using efficient kernel-assisted event polling as well as automatically distributing computation across multiple processors. It also can support sending of most higher-order objects – such as closures – across the network parts of messages, as well as a powerful pattern-matching syntax that allows for relatively easy handling of binary (and other) protocols.
With processors increasing the number of cores and computers continually falling in price the ability to (mostly) correctly use multiple machines and multiple processors on each machine will become a dominant influence for writing programs that require high performance. Erlang has been demonstrated to be excellent at managing network I/O switching and handling, which is not surprising considering that is its main application as a tool. It could, however, stand to improve upon sequential execution performance: let’s just say I won’t be rewriting my numeric codes in Erlang just yet, despite the potential for mass distribution of computation. I also miss some of my amenities I’ve gotten used to in Lisp, but Erlang excels in its area for sure and has many lessons to teach.
3. First Class Environments
Leaders in this area: T, MIT-Scheme — mostly academia
Honorable Mentions: Python, Common Lisp
First class environments are the beginning and end of many problems, but I feel that having this facility would be useful for debugging and implementing creative namespacing and many other important features. Opaque environments can sometimes still be handled with mostly reasonable performance, but as far as I know nice, transparent environments — i.e., things that look like property or assoc lists straight out of the SICP — are just an absolute killer for performance and make compiler optimizations nigh near impossible. But that’s OK…because there’s nothing more annoying that shying away from using thunks or currying when these techniques are the most simple and expressive solution because you are afraid that it will become a chore to poke at the environment to debug these anonymous function instances later. By contrast, “locals()” in Python, for example, can be a godsend for special tasks and quick debugging, even if it only returns the local (and generally most useful) environment.
First class environments also help in “fixing” tricky issues that crop up and are cause for Scheme’s motivation for hygienic macros, famous for being hellishly picky to get right (Lisp-2 fans always seem to harp on this point, although what I’m suggesting may be something more like dynamic-lisp-N). I still feel that the quasiquote, despite its sometimes-ugliness, is the right primitive model to follow. And, in fact, since there seem to be hygienic macro packages build on top of the primitive variants, one could get those almost for free. Perhaps hygienic macros could also be idiomatic, I know not.
In conclusion, the goal is to break down some of the final barriers between code and data and allow for some interesting if unorthodox transformations and redefinitions at run-time and compile-time. It’s also important to have this functionality if one wants to dynamically redistribute computations across machines or perform run-time metaprogramming, which may be a great way to introduce new compiler features that can be toggled on and off.
4. An External Native-Code Generator
Leaders in this area: LLVM, JVM
Honorable Mentions: Parrot (if only because of relative vaporwareness), Mono, C–, Bit-C
More important than the individual merits of any of these specific VMs is that they are maintained separately by Other People™. It is high time to stop re-inventing architecture-specific code generators and local optimizers over and over. With the JVM catching up or passing up language-specific native code generators (it’s now more or less tied with OCaml on the Alioth compiler shootout with Java and doing well with Scala) and LLVM recently showing on-par and sometimes better performance than vanilla versions of GCC for some C code, I am buoyed with hope that one can generate relatively high-level (or at least architecture-independent-ish) bytecode and still get respectable or even good performance. JIT, ironically enough, may be more well suited to the lispy world than the Java one (although its instrumental in the Java world for sure) considering that it’s pretty common to go in and rebind definitions in Lisp while a system is running. One might argue that changing declaim/proclaim statements and evaluating code is in fact better than JIT, and I could see there being a case for that, but it just seems that lots of work is being poured into run-time code generators that could be leveraged.
One interesting idea is compiling to Bit-C, which has support for low-level manipulations and type-verification, yet also is a lisp.
5. Optionally Exposed Type Inference and Static Typing
Leaders in this area: Epigram, Qi, Haskell, the ML family
Honorable Mentions: CMUCL and descendant SBCL
Inferred static typing and type inference is all the rage these days, with claims for increased program execution and correctness. And I’m all for that, and Qi is an excellent example of the ability to do considerable amount with standard Common Lisp facilities. Qi has the extremely sensible goal of remaining in Common Lisp, and thus ensuring that it has measurable chance of having traction in my lifetime.
Although I’m not sure that the interface to typing I would expose is necessarily (or necessarily not) Qi’s, but I do want my compiler to tell me what it thinks about various tokens littered throughout my code, allowing my editor to do things like red-flagging unsafe operations and type disagreements or have a mode to show expensive dynamic operations or inferred types when I’m seeking optimization. Ultimately, not all of my code will fit neatly into the pure-functional paradigm and may be better served by the occasional side-effect or global state, and I would like type rigor to extend as far as possible, but not become a burden. Sometimes I just want a heterogeneous hash table of elements without any baggage. I think it makes sense to rigorously type nuggets of code, but the Lisp in question should not be fascist about maintaining ‘perfect’ consistency throughout an entire program. Epigram and Qi have this model exactly right: pay as you go. Flexibilty when you need it, but not to the point where it is fascist. In the future, it’d be nice to see some efficiency benefits from compiler-awareness of carefully statically typed nuggets of code that otherwise would not be possible, such as eliminating some bounds checking.
Finally, CMUCL and SBCL already do quite a bit of type-inference, it’s just not exposed to the user so nicely in SLIME except through warnings blown out of stdout. Even then, they can be very useful. Ideally I could simply ask SLIME to access the type of a given symbol and (CMU|SB)CL could tell me what it thinks.
6. Pattern Matching
Leaders in this area: Many. MLs, Haskell, Erlang, lisp macros for Scheme and CL.
This is an amenity that should become standard part of the lisper’s idiom for convenience if nothing else. It’s just that there are a number of pattern matchers and none of them that I am aware of has become the idiomatic one.
7. Continuations and Dynamic-Wind
Leaders in this area: Scheme, almost exclusively, and an implementation: STALIN
Scheme is probably the canonical continuation and dynamic-wind implementation. Implementation is subtle and performance impacts can be significant, but give pleasant generality to schemers when designing new control constructs. Combined with first-class environments one could do quite a few interesting things, such as save the entire program state as an environment-continuation pair. Unfortunately, implementation is incredibly painful. Yet, it has been done, and with a pay-as-you-go model it may not need to hurt most code’s performance very much (few people ought to be writing code riddled with continuations). See the STALIN compiler, which does all sorts of rather insane things, along with the insane compilation time. It’s mostly intended for numerical codes, though.
8. Pretty and easy (but optional) Laziness
Leaders in this area: Haskell, Python, Ruby, Common Lisp Iterate package, many others
Honorable Mentions: Anything with closures, Screamer. Scheme for call/cc allowing even more strange general flow control.
I like Python’s yield operator that transforms a normal function into a lazy one that is expressed as an iterator. In particular, I like to avoid, when possible, specifying representation formats for a sequence or set of things when they aren’t strictly necessary. With continuations one can get very nice looking implementations of generator-type functions although this may have an undesirable performance impact. As such, most languages that have laziness or generators implement special restricted-case behavior to get good performance, and that should probably sit pretty high up on the optimization list for this Lisp.
9. Convenient and Pervasive Tail Recursion
Leaders in this area: Erlang, Scheme
Honorable mentions: anything with tail recursion elimination and optional arguments
I like using tail recursion to express loops, as I find them more flexible, easier to debug, and understandable than loops and mutation. Combined with pattern matching it’s fiendishly convenient at times and can in some circumstances greatly assist a compiler if assignments go unused. I don’t meant to say this should be the only means, but I would like to see it be idiomatic and terse to write. The Scheme named-let and Erlang’s pattern matching both assist this process. One of the main priorities that is key is making it easy to hide the extra arguments often required in a tail-recursive function to hold state from the outside world, and Scheme’s named-let, I think, handles this rather beautifully for common cases.
I was originally writing a much more ambitious post that tried to introduce category theory and its uses, but have been having a hard time writing it. Unfortunately it is not something that I think can be easily explained tersely, although my attempts to do so have lead me to learn a lot more about category theory than I thought I’d ever want to know. Yet, I would still like to expose monads in a way that Schemers and Lispers would relate to them and perhaps grok the potential usefulness of the concept, which at times may seem to be a strange dodge performed by those “pure” functional people. I am forgoing some of the background theory I learned in exchange for trying to highlight some of the key ideas as applied to writing programs by example and narrative.
The only obvious prerequisite knowledge here is that of higher order functions and using functions as data. These concepts are well-covered and taught by a variety of computer science texts, such as the venerable SICP, webcast university lectures, and many other mediums. I assert that this is an important prerequisite because monads provide you is a rigorous and structured abstraction for dealing with data transformations and the application of higher order functions.
Consider the “Maybe” monad that is often used as a primer. We’re going to offer a quick rough translation into Scheme. The semantics we’re interested in of the Maybe monad are as follows:
- Any function should be able to be applied to the monad
- Not necessarily successfully such as in the case of a type error, although with a little more work we could get these semantics as well via dynamic-wind or unwind-protect.
- The monad will prevent us from operating on the value NIL, and instead simply ignore computations that would attempt to operate on NIL.
- This is similar to how computer architecture will propagate NaNs in computations.
We only need three tools to get this behavior:
- A procedure to construct the monad from some object in the category of Scheme values
- A procedure to transform functions that would normally operate on those values into a function that will operate on the monad value
- A procedure to merge double-wrapped “Maybe” monads to normalize them
Without further ado, here are some function definitions:
(define (make-maybe value) value) (define (map-function-to-maybe fn) (lambda (maybe-object) (if (null? maybe-object) '() (make-maybe (fn maybe-object))))) (define (join-maybe maybe-object) maybe-object)
This may seem incredibly pointless, but consider this motivating example:
> (define maybe-cdr (map-function-to-maybe cdr)) > (maybe-cdr (make-maybe '(1 2))) (2) > (maybe-cdr (maybe-cdr (make-maybe '(1 2)))) () > (maybe-cdr (maybe-cdr (maybe-cdr (maybe-cdr (make-maybe '(1 2)))))) ()
As you can see you would have normally had to have check for NIL values to prevent crashing in the last expression, but we now have a generic way to instrument functions with checks. The result is a function that accepts and returns the monad’s type. This is the essence of Functors (of which monads are a special type of) to take home: Functors are morphisms between categories, categories themselves contain objects and morphisms, and in order to a be a functor we must be able to transform both the objects in a category and the morphisms between them. There are few additional properties and rules to be considered a functor or monad that can be useful in proving correctness or getting certain guaranteed behavior, but the main take-away idea is writing a uniform “layer” that will map both functions and the data they operate on to another “type” with different semantics. Monads also have to obey rules for compositions of “join” and “map” to maintain invariants, but it is sometimes okay to bend these rules: there are many not-quite-monad functors that can prove very useful. Consider the monad and the functor as guiding metaphors.
Moving on: Why do we bother at all with make-maybe and join-maybe? It so happens that the underlying representation is simple in this case (since we only need to tweak functions) and thus the construction and join operators are trivial. This would not be the case in say, Haskell, where the typing system will prevent happy accidents like this one; (make-maybe (make-maybe ‘(7))) would actually have a double-wrapped Maybe-type and “join” would have to do some thinking on how to sensibly put the nested Maybes together. (This is also called “multiplication” in category theory parlance)
Also notice there is no way to “get out of” the monad. Getting “out of” the monad type is an additional functionality that goes above and beyond what it means to be a monad; it would be a monad with an additional transformation to whatever is seen to be fit. However, once again, dynamic typing provides us a happy accident in this particular case and we can actually use this particular monad’s representation directly in our programs. Generally this is not the case and probably should not be encouraged: one should think carefully before breaking into a monad to deal with its representation ad-hoc, otherwise we lose some benefits of containment.
List, the monad you’ve used before
Let’s just rephrase “list” as a monad here:
(define (make-list value) (list value)) (define (map-function-to-list fn) (lambda (list-object) ;; "Map" has the proper return type "for free" (map fn list-object))) (define (join-list value) (apply append value))
We have now defined a monad. Again, it may seem stupidly trivial, but consider what it has in common with “Maybe”:
- We can accept anything and create a list monad instance
- We have some transformation that modifies a function to operate on list-monads instances.
- We can get rid of monad layers
- Notice that this definition maintains composition, a useful property of monads. This means you can perform a map of a join and then another join on the result or simply perform two joins to get the same value.
- This presumes the lists are well-formed, i.e. have no non-lists as elements when join is called. Otherwise we rightly should have a type error because you are applying “join” to a non-list, although you could imagine writing a more forgiving implementation with Maybe-like qualities, or force all internal values to be singleton lists rather than scalars. Such is the flexibility of thinking in terms of this abstraction.
- We have no way of getting anything out of the monad, and this time it’s more visible: we can never go from ‘(1) -> 1, unless we provide another transformation, such as “car”, which would have the type “List-Monad -> *”.
Something less contrived, a “Watchful” monad
The final example I will be presenting here is not-quite useful, but pretty close. It is not complicated, but it is non-trivial. We seek to accomplish the simple task of letting us hook into the application of a function to a value, useful in debugging or notifying other components of a system of changes. This is often a task given over to mutation because it’s a pain to pass around this state all the time. Here is an attempt to try to give an alternative to mutating global state while allowing arbitrary procedures to be notified of when a value is being operated on by any function that has been transformed by our “map-function-to-watchful” procedure. I implement some simple functionality using this monad to attach arbitrary functions (called “snoopers”) that are allowed to look at the value being passed to a function being applied to the monad and their own previous state.
In this section I will first show how one can use define and use the snoopers, how using them appears in an interpreter, and then finally the monad implementation itself.
Using the Watchful monad
Here is how one would define some “snoopers” to watch a value, along with defining a watched-value instantiation for the number zero.
;; Some watchers (define (modification-watcher state thing) ;;Counts the number of times the value has functions applied to it. ;;Notice that we do not use "thing," but we must accept it to have the ;;proper number of arguments (if (null? state) 1 (+ state 1))) (define (previous-values-watcher state thing) ;; Retains previous values in the stat (cons thing state)) ;;Using watchers ;;Add both watchers, notice how each add-watcher call returns a monad in turn. (define zero-being-watched (add-watcher previous-values-watcher (add-watcher modification-watcher (make-watchful 0)))) ;; Let's transform a function...say an increment function (define watcher-incr (map-function-to-watchful (lambda (x) (+ x 1))))
Showing use of the watched-value and the watch-increment function in the interpreter
Here’s an example interaction:
> zero-being-watched (((#<procedure:previous-values-watcher> ()) (#<procedure:modification-watcher> ())) . 0) 0)
> (watcher-incr zero-being-watched) (((#<procedure:previous-values-watcher> (0)) (#<procedure:modification-watcher> 1)) . 1)
> (watcher-incr (watcher-incr zero-being-watched)) (((#<procedure:previous-values-watcher> (1 0)) (#<procedure:modification-watcher> 2)) . 2)
> (watcher-incr (watcher-incr (watcher-incr zero-being-watched))) (((#<procedure:previous-values-watcher> (2 1 0)) (#<procedure:modification-watcher> 3)) . 3)
As you may be able to tell from the above, the internal representation format of the monad is an association list paired with the watched value. That association list contains procedures and state that they are allowed to store things in (you could do an even cleaner job with continuations, but then they wouldn’t print as nicely), so right now all state-saving has to be done explicitly.
If this example were to be used for anything serious one would have to fix up a couple of things, but it’s pretty functional as-is (And purely functional in the other meaning of the word). The following definition is not as long as it looks; there are lots of comments and some stuff (like filter and the merge-and-remove-duplicates for “join”) that eat up lines without being intrinsically related to monads. There is no subtle twist; it’s exactly the same as the previous two examples (make, map, and join) with an additional function (add-watcher) that operates on the monad directly to add snoopers. Notice that functions passed through map-function-to-watchful are instrumented to call all the snoopers and construct a new list full of the procedure reference and the new state each snooper returns for its next invocation.
One final thing deserving explanation is that my add-watcher reuses the join operator: basically, I construct a new monad with the input monad as the wrapped value, then run join. This is so that I can get the duplication elimination and state list construction for free. It certainly makes the definition of add-watcher short. (Despite its relatively huge commenting)
The watchful monad implementation
;;Watchful Monad (define (make-watchful value) ;; the car of the monad is a list of snoopers, the cdr is the value being watched ;; Each snooper is in the form (function state), where state is returned ;; by the snooper after every invocation so it can store things as it chooses. ;; The snooper function should accept two args: state and data, so that they ;; can have some memory and report on the data being operated on by the ;; watched procedure. (cons '() value)) (define (map-function-to-watchful fn) (lambda (watchful-object) (let* ((snoopers (car watchful-object)) (arg-data (cdr watchful-object))) ;; Compute new snooper state, now they all know what "fn" is operating on. (cons (map (lambda (snooper) (let ((snooper-fn (car snooper)) (snooper-state (cadr snooper))) (list snooper-fn (snooper-fn snooper-state arg-data)))) snoopers) (fn arg-data))))) (define (join-watchful value) (letrec ((inner-snoopers (car (cdr value))) (outer-snoopers (car value)) (inner-value (cddr value)) ;; The standard "filter" we all know and love, but not included in ;; R5RS (filter (lambda (predicate seq) (if (null? seq) '() (if (predicate (car seq)) (cons (car seq) (filter predicate (cdr seq))) (filter predicate (cdr seq))))))) ;; Merge the outer-snoopers and inner-snoopers, outer-snoopers win in event ;; of a collision, which in this case means the same procedure with the same ;; state. (cons (append outer-snoopers (filter (lambda (inner-snooper) (not (member inner-snooper outer-snoopers))) inner-snoopers)) inner-value))) ;; functions that operate on the watcher-monad directly (e.g. Watcher-Monad -> Watcher-Monad) (define (add-watcher watcher-fn watcher-object) ;;watcher-fn must accept two args: state and the value in the monad when a ;;function is called on that value and return a new state. We abuse the ;;join-watchful function I wrote by first encapsulating the watcher-object ;;as a value inside another watcher-object and then employing "join-watchful" (join-watchful ;; We make a singleton assoc-list with watcher-fn with state NIL to start. ;; We know our monad internally is just a cons pair, for brevity break the ;; abstraction here... (cons `((,watcher-fn ())) watcher-object)))
Hopefully these examples shed more light on monads and their uses for those who come from the Lisp-ish backgrounds, or dynamic typing backgrounds in general. I often found the Haskell-annotated versions caught up in typing which sometimes made it more difficult than absolutely necessary to explain the generalities the idea and motivation for monads with languages where mutation is more convenient and typing is more relaxed. As such, I wrote this for the programmer already adept in using higher order functions and thinking of functions as data yet unsure of what the fuss is about monads and how they can be useful.
One of the insane ideas I have floating around is to write yet another Make replacement. My rationale is as follows:
- Make is simple, but not very powerful
- Absurd macro languages and generation facilities grant enough power for “real work,” but are painful and ugly.
- People widely use ant, probably a step down from Make if not for the incredible dedication to tool support to make it convenient for Java somewhat.
Why yes, yes I have. The nice thing about these approaches is that they recognize that most non-trivial build-systems are going to demand the power of a “real” programming language; certainly not the looseness of shell scripting nor the inflexibility of ant buildfiles. (Quote a colleague: “it’s like Lisp, without any of the advantages of Lisp!” ) The downside is they lose out on clarity. A hand-written Makefile tends to stay very simple whereas any convoluted thing can happen in a SConstruct file (the rough equivalent to Make’s “Makefile”), leaving the user doing some head scratching as to why something-or-other happened or why-or-not something built and how or why. On the plus side, the machinery of SCons also gets you things like a “clean” target for free. Another downside is agnosticism: Makefiles, thanks to their simple nature, can be easily written to control complex toolchains that do things other than compiling programs. The barrier for entry in writing a builder in SCons is considerably higher than simple rules as dictated by a Makefile such as
foo : bar baz cat bar baz > foo
(Yes, I realize you can use automatic variables like ‘$@’ and ‘$^’, but let’s keep it less cryptic for now)
I have been working on and off on a project that uses a Makefile to orchestrate around a dozen individual tools. Not only is it robust, but after almost a year of inactivity it’s possible to read the Makefile and recall the roles of all the tiny moving parts and how they relate to one another. I then went back and wrote documentation. But now I want to improve on Make by allowing for complex behaviors while retaining succinct, declarative syntax and keeping the system easy to debug. How do I portend to do this?
KPMake, or “Knowledge Powered Make” is one possible solution. In a nutshell: I want to leverage state of the art symbolic AI techniques to solve a relatively mundane problem of dependency resolution while including modern explanation generation facilities to increase user understanding of why any particular actions were taken and to help debug these actions. I will probably use the KM knowledge base, partially because of being exposed to it at work and partially because I know it has been successfully been used in large projects such as Vulcan Inc’s ambitious Project HALO. I also happen to know it’s available as a single lisp file and has practically no set-up or portability problems. The base package for KPMake should simply be a distribution of KM, some basic models to handle basic generic dependency resolution, and macro/procedure support to make writing KPMakefiles easier. Knowledge models to build for say, particular languages (such as a gcc or a javac model) will be distributed and maintained separately and ought to have an eye towards making things even more convenient.
A neat feature that is enabled that I didn’t include in this writing to begin with is using situation calculus to allow programmers to load, save, and debug situational traces as well as perform advanced error recovery. KM supports situational reasoning and simulations that can be used to test build scripts (and write fuzz testing!) to make sure they respond in at least a reasonable way to unexpected events as well as increasing visibility of every step of the build process: right now one generally has to do a lot of manual logging to get a trace of what’s going on or, if one is lazy, set a breakpoint and only be able to view a few slices in the process. You can also have your build engineers in India and send your broken build “core dump” abroad and mostly avoid the “well it works on my machine” scenario.
: I owe my appreciation of this technique to Professor Strain. Imagine a Makefile to build the program, run the tests, spit out any pictures, compile the TeX sources, and finally deliver you a postscript file to send off to the journal. I’m sure if submissions were something that had to be done often there would be a “mail” target that’d also deliver the resultant postscript to the relevant journal(s).