Metalinguistic Abstraction

Computer Languages, Programming, and Free Software

The Lisp Before the End of My Lifetime

with 14 comments

Many wax poetic on the virtues of Lisp, and I would say for good reason: it was a language and philosophy that was (and is) far ahead of its time in principle and oftentimes in practice. But I have to cede the following: the foundations of Common Lisp are becoming somewhat ancient, and there are many places with more modern roots from which I would have it borrow heavily to assist in creating my programming nirvana. In talking with yet another friend from Berkeley (and the author of sudo random), we discussed some of these things, and I decided it was worth enumerating a few of them and pointing to ongoing work that implements those fragments, or something close to them.

The reason this post is titled in such a sober way is that the Lisp I envision is probably many lifetimes of work to accomplish, and as such, I cannot see myself accomplishing everything on my own. Granted, I still have a lot of life ahead of me, but that only makes the equation all the more depressing. Implementation could probably span many PhD theses and industrial man-decades. As such, I can only hope that it's the Lisp that more or less exists before The End Of My Lifetime. I would be glad to one day say that I contributed, in small or large part, to any one piece of it. This whole post smacks of "sufficiently smart compiler" daydreaming, so turn away if you must. Alternatively, you can sit back, enjoy, and nit-pick at the details of compiler theory and implementation, some (or many) of which I'm sure I have overlooked.

Finally, this is not by any means a list of things that current implementations lack, just things I feel would be most valuable. Some are not even necessarily technical challenges so much as social and design ones. I view this hypothetical Lisp as not only a set of new features, but a set of idioms that more programmers generally agree on. "The Zen of Python" is an excellent example of this. There are definitely some Lisp idioms, but they have become somewhat antiquated and are hard to enumerate, in part because of the baroque and aging Common Lisp specification. The hardest idiom to get around is fearlessness and ease of metaprogramming, which is in part great, but can also make standardization socially difficult, since it helps make herding Lisp programmers difficult. Herding Lisp programmers is about as tough as herding cats armed with machine guns.

However, I think Lisp's guiding intention has always lain in flexibility. Common Lisp, for its time, was the kitchen sink. It still is, in large part, but it may benefit from new idioms and a fresh slate, as well as deeper and more integrated compiler support for some of the features mentioned below.

1. The Compiler is your Friend.

Leaders in this area: SLIME and its Swank component
Honorable Mentions: DrScheme, IPython

Modern IDEs seem to do everything up to the semantic-analysis step of compilation to give you advanced searching and refactoring capabilities. Often a great deal of compiler work is reimplemented to support the features of the IDE at hand, sometimes to the point of implementing a whole compiler, as in Eclipse.

SLIME and Swank have a twist on this that I like: Swank is responsible for asking the compiler implementation itself (in my case SBCL) for information on various symbols, their source location, documentation, and so on. It communicates all this information through a socket to a frontend, which comprises the rest of SLIME. In doing so it gains the authoritative answer to queries about the program because the compiler of choice itself is delivering its opinion on the matter, even as it runs.

This allows for an accurate way to track down references that may be created dynamically, by asking the figurative question "What would the compiler do?" From this, SLIME gains extremely powerful auto-completion facilities that are robust to techniques that are either unavailable in other programming cultures or, if used, would defeat the programmer's complete assessment of the program. Lisp is the only runtime/language I know of where I can eval a string and still be able to access the resulting function definition and documentation strings, with full auto-completion and hinting in my editing environment.
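The flavor of this is easiest to sketch outside Lisp. Here is a rough Python analogue (purely illustrative; the function and names are my own invention) of defining a function from a string at runtime and still having the live runtime answer tooling queries about it:

```python
# Sketch of the idea behind Swank-style introspection: a function defined
# from a string at runtime is fully visible to tooling, because queries go
# to the live runtime rather than to a static parse of the source files.
import inspect

source = '''
def greet(name):
    "Return a greeting for NAME."
    return "Hello, " + name
'''

namespace = {}
exec(source, namespace)       # roughly analogous to eval-ing a string in Lisp
fn = namespace["greet"]

# The runtime can now answer the same queries an editor would ask.
print(fn.__doc__)             # -> Return a greeting for NAME.
print(inspect.signature(fn))  # -> (name)
```

Swank does essentially this, but over a socket and against the compiler's own knowledge rather than surface reflection.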

Were Lisp more popular, I would bet Swank-compatibility and feature-richness would be a defining feature for Lisp implementations, and frontends using Swank would be prolific. The socket interface was definitely the way to go here.

2. Networked & Concurrent Programming

Leaders in this area: Erlang
Honorable mentions: Termite & Gambit, Rhino, SISC, Parallel Python, Stackless Python, and many others.

Sun Microsystems, despite its beleaguered business, had at least one thing very, very right: "The network is the computer." The ability to talk to multiple computers on a network is increasingly important in our era, and making it convenient can lead to extraordinarily powerful, robust applications. Erlang definitely leads the pack in this area: an industrial-strength, reasonably efficient compiler that can do I/O pumping using efficient kernel-assisted event polling, as well as automatically distributing computation across multiple processors. It also supports sending most higher-order objects (such as closures) across the network as parts of messages, as well as a powerful pattern-matching syntax that allows relatively easy handling of binary (and other) protocols.

With processors gaining cores and computers continually falling in price, the ability to (mostly) correctly use multiple machines, and multiple processors on each machine, will become a dominant influence on writing programs that require high performance. Erlang has been demonstrated to be excellent at managing network I/O switching and handling, which is not surprising considering that is its main application. It could, however, stand to improve its sequential execution performance: let's just say I won't be rewriting my numeric codes in Erlang just yet, despite the potential for mass distribution of computation. I also miss some of the amenities I've gotten used to in Lisp, but Erlang excels in its area for sure and has many lessons to teach.
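For flavor, the send/receive programming model can be sketched minimally in Python using threads with queues as mailboxes. This is only a caricature: real Erlang processes are far lighter-weight and span machines transparently.

```python
# A minimal actor-style sketch: threads with queues as mailboxes. Real
# Erlang processes are far lighter-weight and span machines transparently;
# this only illustrates the send/receive programming model.
import threading
import queue

def echo_process(mailbox, reply_to):
    # Receive messages until told to stop, echoing each one back.
    while True:
        msg = mailbox.get()
        if msg == "stop":
            break
        reply_to.put(("echo", msg))

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=echo_process, args=(inbox, outbox))
worker.start()

inbox.put("hello")
result = outbox.get()
print(result)                 # -> ('echo', 'hello')
inbox.put("stop")
worker.join()
```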

3. First Class Environments

Leaders in this area: T, MIT-Scheme — mostly academia
Honorable Mentions: Python, Common Lisp

First-class environments are the beginning and end of many problems, but I feel that having this facility would be useful for debugging, for implementing creative namespacing, and for many other important features. Opaque environments can sometimes still be handled with mostly reasonable performance, but as far as I know, nice, transparent environments (i.e., things that look like property or assoc lists straight out of SICP) are just an absolute killer for performance and make compiler optimizations nigh impossible. But that's OK, because there's nothing more annoying than shying away from thunks or currying, when these techniques are the simplest and most expressive solution, because you are afraid it will become a chore to poke at the environment later to debug those anonymous function instances. By contrast, locals() in Python, for example, can be a godsend for special tasks and quick debugging, even if it only returns the local (and generally most useful) environment.
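A tiny illustration of the Python side of that point: a frame's local bindings snapshotted into an ordinary, inspectable dict (the function and names here are mine, for illustration):

```python
# locals() as a poor man's first-class environment: the local bindings of
# a frame become an ordinary, inspectable dict.
def make_adder(n):
    increment = n            # a binding we expect to see in the environment
    env = dict(locals())     # snapshot the local environment at this point
    def adder(x):
        return x + increment
    return adder, env

add3, env = make_adder(3)
print(add3(4))               # -> 7
print(env["increment"])      # -> 3
```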

First-class environments also help in "fixing" tricky issues that crop up and are the motivation for Scheme's hygienic macros, famous for being hellishly picky to get right (Lisp-2 fans always seem to harp on this point, although what I'm suggesting may be something more like a dynamic-lisp-N). I still feel that the quasiquote, despite its occasional ugliness, is the right primitive model to follow. And, in fact, since there seem to be hygienic macro packages built on top of the primitive variants, one could get those almost for free. Perhaps hygienic macros could also be idiomatic; I know not.
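One way to see the hygiene problem, and the classic gensym workaround that unhygienic quasiquote-style macros lean on, is a toy macro expander. This sketch is in Python and expands to source text for simplicity (real Lisp macros operate on s-expressions; the names are invented):

```python
# The classic gensym workaround for variable capture: every macro
# expansion gets a fresh temporary name that user code cannot collide
# with. Here the "macro" expands to Python source text for simplicity.
import itertools

_counter = itertools.count()

def gensym(prefix="tmp"):
    # Return a name guaranteed not to clash with user variables.
    return "__{}{}".format(prefix, next(_counter))

def expand_swap(a, b):
    # Expand a SWAP "macro", using a gensym for the temporary so that
    # swapping user variables literally named 'tmp' would still work.
    tmp = gensym()
    return "{t} = {a}; {a} = {b}; {b} = {t}".format(t=tmp, a=a, b=b)

code = expand_swap("x", "y")
ns = {"x": 1, "y": 2}
exec(code, ns)
print(ns["x"], ns["y"])      # -> 2 1
```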

In conclusion, the goal is to break down some of the final barriers between code and data and allow for some interesting if unorthodox transformations and redefinitions at run-time and compile-time. It’s also important to have this functionality if one wants to dynamically redistribute computations across machines or perform run-time metaprogramming, which may be a great way to introduce new compiler features that can be toggled on and off.

4. An External Native-Code Generator

Leaders in this area: LLVM, JVM
Honorable Mentions: Parrot (if only because of relative vaporwareness), Mono, C–, Bit-C

More important than the individual merits of any of these specific VMs is that they are maintained separately, by Other People™. It is high time to stop reinventing architecture-specific code generators and local optimizers over and over. With the JVM catching up with or passing language-specific native code generators (Java is now more or less tied with OCaml on the Alioth language shootout, and Scala is doing well), and LLVM recently showing on-par and sometimes better performance than vanilla GCC for some C code, I am buoyed with hope that one can generate relatively high-level (or at least architecture-independent-ish) bytecode and still get respectable or even good performance. JIT compilation, ironically enough, may be better suited to the Lisp world than the Java one (although it's instrumental in the Java world, for sure), considering that it's pretty common to go in and rebind definitions in Lisp while a system is running. One might argue that changing declaim/proclaim statements and re-evaluating code is in fact better than JIT, and I could see a case for that, but it just seems that a lot of work is being poured into run-time code generators that could be leveraged.
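The shape of the idea, a front end emitting architecture-independent instructions for a separately maintained back end, can be caricatured in a few lines. In this Python toy (entirely my own construction), a trivial stack interpreter stands in for a real target like LLVM bitcode or JVM bytecode:

```python
# A toy front end targeting a generic stack "bytecode": the compiler emits
# architecture-independent instructions, and a separately maintained
# back end (here, a trivial interpreter) does the rest.
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def compile_expr(expr):
    # expr is a nested tuple such as ("+", 1, ("*", 2, 3)).
    if not isinstance(expr, tuple):
        return [("push", expr)]
    op, lhs, rhs = expr
    return compile_expr(lhs) + compile_expr(rhs) + [("binop", op)]

def run(bytecode):
    stack = []
    for instr, arg in bytecode:
        if instr == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[arg](a, b))
    return stack.pop()

print(run(compile_expr(("+", 1, ("*", 2, 3)))))   # -> 7
```

The point is the separation of concerns: the front end above knows nothing about registers or instruction selection, which is exactly the work one would like Other People to maintain.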

One interesting idea is compiling to Bit-C, which has support for low-level manipulations and type verification, yet is also a Lisp.

5. Optionally Exposed Type Inference and Static Typing

Leaders in this area: Epigram, Qi, Haskell, the ML family
Honorable Mentions: CMUCL and descendant SBCL

Static typing with type inference is all the rage these days, with claims of increased execution speed and correctness. I'm all for that, and Qi is an excellent example of doing a considerable amount with standard Common Lisp facilities. Qi has the extremely sensible goal of remaining within Common Lisp, thus ensuring that it has a measurable chance of gaining traction in my lifetime.

I'm not sure that the interface to typing I would expose is necessarily (or necessarily not) Qi's, but I do want my compiler to tell me what it thinks about the various tokens littered throughout my code, allowing my editor to do things like red-flagging unsafe operations and type disagreements, or offering a mode that shows expensive dynamic operations or inferred types when I'm seeking optimization. Ultimately, not all of my code will fit neatly into the pure-functional paradigm, and some may be better served by the occasional side effect or global state; I would like type rigor to extend as far as possible, but not become a burden. Sometimes I just want a heterogeneous hash table of elements without any baggage. I think it makes sense to rigorously type nuggets of code, but the Lisp in question should not be fascist about maintaining 'perfect' consistency throughout an entire program. Epigram and Qi have this model exactly right: pay as you go. Flexibility when you need it, rigor when you want it. In the future, it would be nice to see efficiency benefits from compiler awareness of carefully statically typed nuggets of code that would not otherwise be possible, such as eliminating some bounds checking.

Finally, CMUCL and SBCL already do quite a bit of type inference; it's just not exposed to the user very nicely in SLIME, except through warnings blown out to stdout. Even then, they can be very useful. Ideally I could simply ask SLIME for the type of a given symbol, and (CMU|SB)CL could tell me what it thinks.
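As a toy of the kind of query I have in mind, here the "compiler" walks an expression, reports the inferred type, and surfaces disagreements directly rather than burying them in warnings (the representation and names are invented for illustration; real inference in CMUCL/SBCL is vastly more sophisticated):

```python
# A toy of the kind of query I want exposed: infer the type of an
# expression and surface disagreements directly, instead of burying
# them in compiler warnings on stdout.
def infer(expr):
    if isinstance(expr, bool):   # bool before int: bool is an int subtype
        return "boolean"
    if isinstance(expr, int):
        return "integer"
    if isinstance(expr, str):
        return "string"
    op, lhs, rhs = expr          # e.g. ("+", 1, 2)
    lt, rt = infer(lhs), infer(rhs)
    if lt != rt:
        return "type disagreement: {} vs {}".format(lt, rt)
    return lt

print(infer(("+", 1, 2)))        # -> integer
print(infer(("+", 1, "two")))    # -> type disagreement: integer vs string
```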

6. Pattern Matching

Leaders in this area: Many. MLs, Haskell, Erlang, lisp macros for Scheme and CL.

This is an amenity that should become a standard part of the Lisper's idiom, for convenience if nothing else. It's just that there are a number of pattern matchers, and none that I am aware of has become the idiomatic one.
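As a sketch of what is meant, here is a minimal structural matcher in Python in the spirit of the classic Lisp pattern-matching macros (the design and the leading-'?' convention for pattern variables are mine, for illustration):

```python
# A minimal structural matcher: patterns are nested tuples, and strings
# beginning with '?' are pattern variables that bind to whatever they
# align with, much like an Erlang receive clause destructuring a message.
def match(pattern, value, bindings=None):
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern.startswith("?"):
        bindings[pattern[1:]] = value           # pattern variable: bind it
        return bindings
    if isinstance(pattern, tuple) and isinstance(value, tuple) \
            and len(pattern) == len(value):
        for p, v in zip(pattern, value):
            if match(p, v, bindings) is None:   # sub-pattern failed
                return None
        return bindings
    return bindings if pattern == value else None

result = match(("point", "?x", "?y"), ("point", 3, 4))
print(result)                 # -> {'x': 3, 'y': 4}
```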

7. Continuations and Dynamic-Wind

Leaders in this area: Scheme, almost exclusively, and an implementation: STALIN

Scheme is probably the canonical implementation of continuations and dynamic-wind. Implementation is subtle and the performance impact can be significant, but the generality is pleasant for Schemers designing new control constructs. Combined with first-class environments, one could do quite a few interesting things, such as saving the entire program state as an environment-continuation pair. Unfortunately, implementation is incredibly painful. Yet it has been done, and with a pay-as-you-go model it need not hurt most code's performance very much (few people ought to be writing code riddled with continuations). See the STALIN compiler, which does all sorts of rather insane things, along with requiring insane compilation times. It's mostly intended for numerical codes, though.
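The unwinding half of dynamic-wind can be approximated with try/finally, as in this Python sketch (names mine). Note that real dynamic-wind also re-runs the before/after thunks when a captured continuation re-enters the extent, which plain try/finally cannot express:

```python
# dynamic-wind's unwinding behavior, approximated with try/finally:
# BEFORE runs on entry to the dynamic extent and AFTER on exit, even
# when the exit is non-local (here, an exception).
trace = []

def dynamic_wind(before, thunk, after):
    before()
    try:
        return thunk()
    finally:
        after()

def body():
    trace.append("body")
    raise RuntimeError("non-local exit")

try:
    dynamic_wind(lambda: trace.append("before"),
                 body,
                 lambda: trace.append("after"))
except RuntimeError:
    pass

print(trace)                  # -> ['before', 'body', 'after']
```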

8. Pretty and easy (but optional) Laziness

Leaders in this area: Haskell, Python, Ruby, Common Lisp Iterate package, many others
Honorable Mentions: Anything with closures, Screamer, and Scheme, whose call/cc allows even stranger general flow control.

I like Python's yield statement, which transforms a normal function into a lazy one expressed as an iterator. In particular, I like to avoid specifying representation formats for a sequence or set of things when they aren't strictly necessary. With continuations one can get very nice-looking implementations of generator-type functions, although this may have an undesirable performance impact. As such, most languages with laziness or generators implement special restricted-case behavior to get good performance, and that should probably sit pretty high on the optimization list for this Lisp.
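The Python idiom referred to above, for the record: a function containing yield becomes a lazy iterator, with no container representation committed to up front.

```python
# yield turns an ordinary-looking function into a lazy sequence: no list,
# array, or other representation is committed to up front.
import itertools

def fibs():
    a, b = 0, 1
    while True:       # conceptually infinite; consumers take what they need
        yield a
        a, b = b, a + b

first = list(itertools.islice(fibs(), 8))
print(first)                  # -> [0, 1, 1, 2, 3, 5, 8, 13]
```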

9. Convenient and Pervasive Tail Recursion

Leaders in this area: Erlang, Scheme
Honorable mentions: anything with tail recursion elimination and optional arguments

I like using tail recursion to express loops, as I find it more flexible, easier to debug, and more understandable than loops and mutation. Combined with pattern matching, it is fiendishly convenient at times, and it can in some circumstances greatly assist a compiler if assignments go unused. I don't mean to say this should be the only means, but I would like to see it be idiomatic and terse to write; Scheme's named let and Erlang's pattern matching both assist this process. One key priority is making it easy to hide the extra arguments often required in a tail-recursive function to hold state from the outside world, and Scheme's named let, I think, handles this rather beautifully for common cases.
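Where a compiler does not eliminate tail calls, the same constant-stack loop shape can be recovered by hand with a trampoline. A Python sketch (Python itself performs no tail-call optimization, which makes it a good place to show the trick; the names are mine):

```python
# A trampoline recovers constant-stack tail-recursive loops by hand:
# the "tail call" returns a thunk, and a driver loop invokes thunks
# until a non-callable result appears.
def trampoline(thunk):
    while callable(thunk):
        thunk = thunk()
    return thunk

def loop(n, acc=1):
    # acc is the hidden accumulator a Scheme named let would introduce.
    if n == 0:
        return acc
    return lambda: loop(n - 1, acc * n)   # tail call as a thunk

print(trampoline(lambda: loop(5)))        # -> 120
```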


Written by fdr

August 4, 2007 at 3:08 am

Posted in languages, lisp, projects

14 Responses


  1. re: 7. You might want to check out the latest from PLT on continuations, described in this (academic) publication http://people.cs.uchicago.edu/~robby/pubs/papers/icfp2007-fyff.pdf and also in the docs that come with PLT Scheme.

    Robby Findler

    August 4, 2007 at 5:21 am

  2. Thanks! It looks very interesting; it’ll make for some good evening reading…

    fdr

    August 4, 2007 at 12:10 pm

  3. re “Finally, CMUCL and SBCL already do quite a bit of type-inference, it’s just not exposed to the user so nicely in SLIME except through warnings blown out of stdout.”

    Just to note, warnings from file compiles are “hot” in slime – if you hit enter on them in the “compiler notes” buffer, slime should jump to the code that caused the warning (obviously in a repl the code that caused the warning is the code you just entered, so there’s little point there, but for file compiles it’s useful).

    You might need to use “M-x slime-list-compiler-notes” in cases where there were very few warnings (there seems to be some sort of heuristic where slime decides if there’s only a couple of warnings, you probably don’t want the compiler notes buffer popping up. Wonder if you can turn that “feature” off – I always want the compiler-notes buffer for nonzero warning/note count).

    Anonymous Moocow

    August 4, 2007 at 1:09 pm

  4. Ah. You can Customize “Slime Compilation Finished Hook” in Customization Group “Slime Mode” , but actually the rule seems to be “if all the warnings are highlighted in the currently visible compiled file buffer anyway (remembering that slime highlights by underlining warning/note positions and shows the warnings in the tooltip for the underlined position), don’t bother popping up the compiler notes ‘hyperlinky’ buffer” – which I guess kind of makes sense, maybe I should just get used to it.

    Anonymous Moocow

    August 4, 2007 at 1:45 pm

  5. Thanks, very good post indeed!

    Dimitri Turbiner

    August 4, 2007 at 2:50 pm

  6. Squeak (and to some extent Smalltalk generally) deserves an honourable mention for First Class Environments. Things like closures and stack frames are decomposable to the extent that efficient first-class continuations are implemented in Squeak as a 60-line (approx) library. Squeak’s excellent debugger is also implemented entirely at the library level.

    Because so little in Squeak is left to voodoo beneath the surface (“turtles all the way down”), I’ve found it a more interesting environment for experimentation with continuations than Scheme.

    Patrick Collison

    August 4, 2007 at 3:33 pm

  7. […] The Lisp Before the End of My Lifetime Many wax poetic on the virtues of Lisp, and I would say for good reason: it was a language and philosophy that was (and […] […]

  8. Not sure if you mentioned this, but addressing the Von Neumann bottleneck would be a good one to list too. Currently you need to be exceptionally smart about it, like the guys from Naughty Dog, for example, with how they used Common Lisp and targeted an embedded video game system. A language that helps you with this would be ideal. Con Kolivas, from his kernel scheduling work, has vented on this topic in a way, pointing out that it might be a good idea for the "framework" overhead to be factored out; presumably by engineering it as such, though perhaps with tools that actively promote such optimization as a primary goal.

    Some might suggest that any overhead from one library calling into another and into another is small potatoes compared to the amount of bandwidth needed to transport mass data such as audio or video. Ultimately, this has to be measured. However, I point you to the jack, oss, alsa, and portaudio style libraries, which in some application cases call into each other to a very deep level. Audio skipping, anyone?

    LV

    August 4, 2007 at 8:38 pm

  9. Be careful with calling BitC a Lisp. I know there is hardly a good definition of what constitutes a Lisp, but just having Lisp syntax is hardly sufficient. They only adopted this syntax because quoting is a solved problem in Lisp; they are actually considering switching to some infix syntax, since after all BitC is targeted at systems programmers.

    Thomas

    August 5, 2007 at 6:07 am

  10. […] 5, 2007 at 12:31 pm · Filed under Programming The Lisp Before the End of My Lifetime « Metalinguistic Abstraction: an interesting look at various programming language […]

  11. @Thomas:

    Is it? I would consider Lisp’s soul to lie in the syntax and the unity between code and data representation. I thought Bit-C was interesting because as long as they support prefix syntax one could emit Bit-C from just about any lisp rather conveniently.

    In any case, I think a low-level Lisp that can be conveniently generated from a higher-level Lisp is what I'm getting at; this way, those looking to extend Lisp can employ the low-level constructs to avoid overhead. For example: by some machination (like dependent types) to avoid bounds checking and type annotation, or just because one is writing, say, a numeric code where I pretty much want something with most of the properties of C.

    Bit-C sort of gets at this core idea for me, as does the HLVM for LLVM. Sadly, the former seems to be somewhat dead at the moment, but I've heard something in the past about it starting back up as part of the LLVM project sometime after 2.0 (which became available fairly recently).

    So we’ll see.

    fdr

    August 5, 2007 at 12:16 pm

  12. good post.

    But I disagree on the amount of time you think is required. Lisp took something like 20 years of computer experience. Scheme came a decade and some years later, and Common Lisp about the same amount after that. So, even though we haven't seen any new Lisp-like language in some years, it means either that the ones we have are good enough or extensible enough to incorporate new features (which they are), or that Lisp-like languages have stopped appealing (which is somewhat true).

    But it does not mean development of a new language is completely stalled. And if you take things into account, with the amount of knowledge we have today, trial-and-error experience, bad examples, and greatly faster hardware, I don't see how a driven group (or individual) with significant knowledge of computer science (language design is no small thing) could not come up with a language similar to the one you proposed in a few years. It just takes what it always takes for inventions: necessity and knowledge. The latter we have; the former is arguable.

    Diogo

    August 5, 2007 at 2:03 pm

  13. @Diogo

    Yes, you could be right; we may have the expertise, but as you said, the will may not be there. That’s the hardest part to predict, and that’s why I hope it’ll be there before “the end of my lifetime.”

    If it means anything, the relatively strong reception of this post (somewhere over 6000 or 7000 hits, pretty good for my simple little blog) has inspired me to play with writing an interpreter that incorporates some of these design elements. In particular those that don’t involve static code generation: I want to have a really friendly interpreter that will make the process of rewriting a program into a standard representation non-magical and understandable.

    If anyone cares then perhaps we can worry about efficiency. Until then it is just a toy for my own benefit, and possibly others that may be interested in this sort of stuff.

    fdr

    August 9, 2007 at 10:21 pm

  14. Hi! I've long had a similar project (cf. tunes.org), but it has slipped into vapourware. Nevertheless, if you're doing this as open source, I'd be interested to join in.

    Hints: Yurii A. Rashkovskii’s Xi, Ian Piumarta’s COLAs.

    (And obligatory self-plug for my thesis if you want Lisp threads that are as interruptible as Erlang threads.)

    Faré

    September 6, 2007 at 4:20 pm

