Metalinguistic Abstraction

Computer Languages, Programming, and Free Software

A Case for Erlang

with 17 comments

Brief prelude on how I got to this topic for flavor and context: earlier this week I was having discussion from an old friend from Berkeley about methodologies in scaling, set off by a discussion rooted in a set of Motorola slides[0] comparing an Erlang, C plus Erlang, and C(++) telecom equipment code that I had forwarded to him. He was aware of Erlang and its general properties since I had been talking about Erlang some time back when I had been playing around with it as a way to coordinate and cluster nodes for KM[1].

He then remarked that he was working on some stuff that needed to be parallelized and support high concurrency and referred me to the SEDA research project as a guiding influence on his budding architecture. SEDA emphasized using event queues to send messages between parallel parts of the architecture with feedback loops for control and updates. I took a look at this and felt there were a few problems:

  1. SEDA is a mothballed research project, so there’s no up to date code
  2. No project I know of maintains a well proven, high quality implementation to abstract a comfortable amount of the mundane details away from me.
  3. Sometimes the model you are working with calls for a thread and not events.

Qualms one and two are strictly at practical and not at a ideological level. SEDA has some high level ideas that I have no strong crystallized negative reaction to and are probably good reading…but at a nuts and bolts level I am left unenthusiastic by the general prospect of not having powerful abstractions readily available to achieve these aims.

Qualm three is much more philosophical and meaty. I have seen a paper that pretty much sums up some of my feelings on the matter: sometimes, you don’t want an event-driven abstraction. The paper even mentions SEDA by name as an example of an event-oriented tool used by the authors to set up high-concurrency servers. Despite the fact that SEDA tried to make this easy (or easier) the authors felt that sometimes a thread was really what one wanted and the event-driven model was just not as clear or easy to write as the thread-oriented one. Not being as clear or easy to write means more bugs. However, not all is lost: the paper concludes that there’s no fundamental reason why events should be the only way to achieve high concurrency. There is some passing mention of Erlang, but nothing substantial. But what have we gained here? Validation of threads? Aren’t threads the road to madness? We can probably do better than just threads and synchronization constructs which themselves pose a substantial risk to program reliability.

With this background information it’s easy to imagine a relatively annoying scenario with a {C, Cpp, C#, Java, Python, Ruby, Lisp, Haskell, damn-near-anything} program: What happens when you write part of your system in a threaded manner (because it was natural feeling, and that’s not a necessarily a dirty instinct, as supported by the paper in qualm three) but then need to extract this threaded functionality because it needs to handle more concurrency or be made network-accessible to work over multiple machines? Generally you get to rewrite a lot of code to fit into the SEDA diagram, including re-writing threaded code to be event-driven and network-accessible, which also means having to take care of the network and protocol issues. Don’t forget to having to update your old code to use the event-driven version, a painful affair if you used synchronization constructs. Your only alternative to these rewrites is to defensively program everything with the intention of being event-driven which will only waste a lot of time and make your program less efficient unless you provide even more code to do shared-memory interactions as a secondary mode; otherwise, you will be stuck doing lots of serialization/serialization to interact with stuff on the same machine. Let’s not even mention that code that could have been handled more gracefully with context switches rather than event handling will end up being maddening to write and more opaque than one feels it should be. Welcome to Tartarus, enjoy your stay.

So now we finally talk about Erlang. Erlang attacks a lot of these problems on many fronts including in its implementation, syntax, and semantics. Yet, people seem to be unfazed by the idea of re-inventing Erlang’s wheels when it comes to Erlang’s choice application domain, and I suspect a large part of the reason for this is that most people who are vaguely aware of Erlang and its reputation don’t know what wheels have already been invented in Erlang. Included in those are some wheels that they probably haven’t thought of yet when starting out and could use to assist implementation, others are wheels (some quite elaborate) that they’d be forced to implement, test, and maintain on their own otherwise or suffer from something painful.

Here is a list of some of the more important things that came to my mind that you get “for free” for using Erlang:

  • A generally expressive syntax that reduces the amount of code from somewhere between one tenth and one forth of the roughly equivalent code in C/C++[3]. Error density was seen to be about the same, so that also means about a fourth to a tenth of the number of bugs.
  • A virtual machine that itself supports kernel event polling (at least under Linux) to allow you to easily handle tens of thousands of persistent connections (such as TCP streams) modeled with a simple one-context-per-connection abstraction[4]. This is not the default and can be enabled with “+K true” when starting the “erl” interpreter.
  • The overhead of a process is 318 machine words.
  • A virtual machine that can efficiently automatically handle SMP (at least under Linux) and distribute processes between nodes
  • Semantically simple process-oriented concurrency (which avoids a lot of bugs seen with shared-state threads) with high-speed message passing and process scheduling (how else could it handle 80,000 processes?), thanks in large part to no-mutate language semantics
  • Extensive heap sharing between processes to avoid message copying, once again from no-mutate language semantics. (used the “-shared” switch)
    • This is not the default behavior: otherwise, per-process heaps and full-copies are used to maintain short GC pauses for real-time applications
  • Network-transparency between processes, even when passing higher-order objects like closures(!)
    • Some of the more “leaky” abstractions made for performance such as ETS or process tables that allow for mutation can have opaque continuation returns that cannot be serialized in this way. In any case, it’s in a very small minority.
  • Node and process monitoring and restart strategies to allow you to write robust, idiomatic, fault-tolerant programs
  • Automatic transitive closing of Erlang nodes for maximum reliablity/message passing speeds as well as (when that full-connectivity is impossible) message routing between intermediate nodes.
  • Pattern-matching of several data types. Not only the obvious tuples, but also for binary streams, largely eliminating temptations to use inefficient ASCII-based wire protocols.
  • Distributed synchronization constructs
  • Safe local and global name registration
  • A powerful distributed database, MNesia
  • Code hot-swapping
  • Application versioning and packaging support
  • Metaprogramming using the AST, so Lispish-style macros exist. NOTE: in Erlang parlance, this is called the “parse_transform” procedure. “Macros” in Erlang lexicon refer to something more like the C preprocessor.
  • Generic constructs for common tasks: finite state machines, event-driven (for when they are the most natural model), and the ever-useful and flexible “generic server” (gen_server) behavior.
  • A community that is focused on reliability, performance, and distributed applications
  • More community that is trying to give Erlang some more tools with “general appeal” such as a web application framework
  • Heavy documentation, both of the libraries and of methodology refined by twenty years of language development, research, and application

Let’s revisit the scenario above, where you were stuck in Tartarus re-writing your threads into events and making them accessible via network, but now starting out with an Erlang code.

Once again, as before, some of your code which you had written using a process-oriented model has outgrown its britches and needs to be made scalable and concurrent. The latter part of that is mostly taken care of for you: simply spawn a process for every work item, as you were before. Thanks to the Erlang VM, your eight-processes turned eight-thousand are having no problem handling the flood of work and utilizing as many machine resources as as possible. You don’t need to coerce yourself into writing an event-based server, introducing bugs and obfuscating code as you go, a huge win already.

Now you get to worry about making things distributed, which is a little more complicated. Your first attempt is to allocate some dedicated machines to running the code in question for more power. Since message-passing is network-transparent, the changes to your code elsewhere in your application is minimal. A send is still a send, whether across a network or on the same node. You write some code to decide how to allocate those queries across these machine resources, which themselves may dynamically reallocate work, carrying along with it the process name to send the return message to to avoid centralized response multiplexing overhead[5]. Ultimately some node in the cluster sends the response or the requester times out. In many cases you are now done, you can just constantly add machines to this glob to get more power.

To spice up the story, let us suppose that you notice that there’s a lot of state-passing going on to synchronize nodes that’s just too network-intensive that wasn’t a problem when this was a single-node solution, so you rewrite some of your code to pass a closure that contains instructions on how to update the node’s state. This means you just avoided having to write some sort of fancy differential state transfer procedure; you’ve simply told the node at the other end how to compute the new state from an old one by sending the procedure itself on the wire instead of the finished product.

Finally, if you had followed some of the OTP design principles[6] to begin with (which is not uncommon, even when working in a single-node environment, they are exceedingly convenient abstractions) and used the gen_server (or likewise) behavior you can get (or might have already had) a supervision tree going that’ll make your application serving this stuff fight down till the last man. And so ends our tale.

Don’t make this glowing report make you think there aren’t difficulties here; there definitely are real tangible downsides to Erlang, not the least of which is recruiting programmers, questionable string handling, and the somewhat-warty records. It’s also considerably slower than C when it comes to sequential computation. However, consider that it is clearly not a toy, and that groups of programmers — not all of them Gods, I’m sure — have employed it to process unfathomable quantities of our telephony data with nine nines of reliability[7], all in two million lines of code[8]. Erlang is an artifact designed with a purpose and a vision and will be difficult to best in technical merit in its chosen problem domain without embarking on a project of similar or greater scope.

Footnotes:

[0]: Gist: Erlang is more terse, has better degradation under high load conditions, better reliability. You know, what you might expect against hand-rolled C++ that’s significantly less complex and tested than Erlang’s implementation itself. (See Greenspun’s Tenth Rule, except with the obvious reapplication)

[1]: Yet unfinished. Actually, stalled for some other priorities. The clustering part of it is done, the missing section is writing the appropriate bindings between the Erlang nodes and the Lisp process, as well as a job allocator to decide on how to allocate work to the nodes. We also don’t yet have a lot of machines to run this thing on, a circumstance that may change in coming months. In the off chance that you are interested in contributing to a parallel, clustered knowledge base, let me know in the comments.

[2]: As opposed to full-blown Prolog style unification where everything is in terms of rules. That is, you can say sum(3, 4, A) and conclude A is 7, but you can also ask Prolog sum(A, B, 7), and constantly ask it for legitimate values of A and B, which it’ll happily return for as long as you’d like. This is why a simplistic fixed-size Sodoku solvers in Prolog look just like articulating the rules instead of actually finding the answers.

[4]: An oft cited “benchmark” of sorts showing an Erlang web server, YAWS, vs. Apache, which uses pthreads. Both of these servers are handing a trivial load — a slow “GET” request — so the main determination of who wins here is who is least choked by the massive number of requests. Since Apache uses a pthreads based server it is largely limited by the operating system’s threading implementation.

[5]: Notice the security problem here? Erlang by default will only communicate with other nodes with the same magic cookie, a simple and robust security mechanism to prevent “messages from strangers.” In case you were wondering: message encryption in Erlang is supported in case you don’t trust your link. It’s not as straightforward, but someone has written something about it.

[6]: See OTP Design Principles which discuss supervision trees, monitors, and links, among many other things. Also see Joe Armstrong’s Thesis; it’s easy to read and extremely informative as a perspective on writing reliable software, even if you are not writing Erlang.

[7]: An article written by Armstrong. He links to his thesis, but this is a little bit more conversational and brings out some highlights. I have excerpted the relevant portion for lazy clickers:

Does it work?

Yes. Erlang is used all over the world in high-tech projects where reliability counts. The Erlang flagship project (built by Ericsson, the Swedish telecom company) is the AXD301. This has over 2 million lines of Erlang.

The AXD301 has achieved a NINE nines reliability (yes, you read that right, 99.9999999%). Let’s put this in context: 5 nines is reckoned to be good (5.2 minutes of downtime/year). 7 nines almost unachievable … but we did 9.

Why is this? No shared state, plus a sophisticated error recovery model. You can read all the details in my PhD thesis.

[8]: In case you thought this was a piddly amount of code, a paper pegs the equivalent amount of C/C++ code at somewhere between four to ten times as much code to get the same stuff done. This is not a extraordinary claim considering Erlang’s advantages in automatic memory management and functional programming constructs such as the ever-useful map(). This is a huge win, despite what some people try to tell me…to be explored in a future blog post which I have tentatively named in my head “Verbosity is a valid complaint!,” or something like that.

About these ads

Written by fdr

July 4, 2007 at 4:33 am

Posted in erlang, languages

17 Responses

Subscribe to comments with RSS.

  1. Great article. Nothing to add.

    Rick

    July 4, 2007 at 6:04 am

  2. I was at a Motorola research lab in the early 90s when the Erlang crew pitched their language to us. The problem with Erlang was the overall performance was too slow, requiring more expensive hardware to handle our load. Even Ericsson only used Erlang on a low-end telecom product, not their high-end stuff. We developed a similar programming model (a proprietary high-level language) that used aggressive partial evaluation to vastly improve performance. Our HLL performance was much better than Erlang and nearly equal to existing code in C.

    For the very large software projects at Motorola, writing code is a small part of the overall engineering effort. The roadblock for our HLL and Erlang is a social issue: engineers don’t want to learn an obscure language that won’t look good on their resume. They would rather use C++ or Java because their skills will be portable to a new employer. After Motorola’s huge layoffs in the late 90s, it looks like they were right!

    projectshave

    July 4, 2007 at 7:47 am

  3. I’m not sure if Erlang really supports Lispish-Style macros
    > Metaprogramming using the AST, so Lispish-style macros exist.
    The macro faclility in Erlang is more like C define.
    What you are refering to may be parse_transform/2
    See e.g. http://chlorophil.blogspot.com/2007/04/erlang-macro-processor-v1-part-i.html
    how it can be used.

    Burkhard Neppert

    July 4, 2007 at 8:46 am

  4. @Rick: Thanks for the kudos
    @projectshave:
    I am aware of these other issues, but in the early 90s much less computing power was available. Keep in mind that in the early 90s no one (relatively) would use Java, partially because there was no JIT and computers will still slow. “Garbage collection? It’ll never work!” Just as times before this one, higher level abstractions were shunned for lack of performance. I would argue that at this point that the world is ready for Erlang.

    The fact that Motorola still decided to build a HLL speaks to this:
    “Erlang is an artifact designed with a purpose and a vision and will be difficult to best in technical merit in its chosen problem domain without embarking on a project of similar or greater scope.”

    Finally, I think one of the reasons Erlang went open-source some time ago was to address some of these social issues. It is a problem.

    @Burkhard Neppert
    It supports AST manipulation, which is my bar for lispish-style macros. Perhaps you’d care to elaborate on why they aren’t? It also happens to support a preprocessor ala C, but this may be a terminology thing. I’m editing my post so people don’t go down the wrong track, thanks.

    fdr

    July 4, 2007 at 11:26 am

  5. Speaking of rewriting threaded computations in an event-driven manner, there’s an interesting paper which proposes a model (and proof-of-concept implementation) for unifying the event-driven model and the multithreaded model (in Haskell):

    http://www.seas.upenn.edu/~lipeng/homepage/unify.html

    Enlightening at best … when are we unifying Erlang and Haskell ? ;-)

    Ricardo Herrmann

    July 4, 2007 at 3:50 pm

  6. […] A Case for Erlang Brief prelude on how I got to this topic for flavor and context: earlier this week I was having discussion from an old […] […]

  7. […] A Case for Erlang « Metalinguistic Abstraction (tags: erlang) […]

  8. > support a preprocessor ala C, but this may be a terminology thing. I’m
    > editing my post so people don’t go down the wrong track, thanks.
    Indeed, one of my objections was that the stuff the Erlang doc calls macro really does not look lispish.
    With parse_transform it seems you can achieve most (all? I haven’t used it myself seriously enough to judge it) of the things you can do with Lisp macros. So my post was just to make sure that people don’t get frustrated when searching for “Erlang macro”.

    I think Erlang is one of the more interesting languages I’ve seen, so I’m very happy about blogs that remind us that there is more than Java, Ruby ….

    Burkhard Neppert

    July 4, 2007 at 10:41 pm

  9. @Buckhard Neppert:
    I’m glad you enjoyed the article and I thank you for your advice, which as you may have noticed has been acted upon…
    @Ricardo Herrmann:
    Thanks for the paper, I will read it with interest soon…

    FYI: I don’t know why, but you got caught in the spam filter. Luckily I noticed and fished you out.

    fdr

    July 4, 2007 at 10:51 pm

  10. […] A Case for Erlang […]

  11. Coming from a Java background, I am seriously considering Scala instead of Erlang. (or any of the other functional languages for that matter).

    http://www.artima.com/scalazine/articles/steps.html

    One reason you might want to use Scala is that it allows you to increase your productivity compared to Java while leveraging JVM execution speed, your existing investments in Java code, knowledge, and the vast array of APIs available for the JVM. It has the conciseness of a dynamic language like Ruby or Python, but it is statically typed, like Java. Another reason is that Scala comes with an Erlang-like Actors library that greatly simplifies concurrent programming, but runs on the JVM.

    Any thoughts? Can Scala really match Erlang for concurrent programming?

    Peter Thomas

    July 5, 2007 at 6:40 am

  12. @Peter Thomas
    I think the post you linked to makes a claim for correctness and ease of use, not necessarily the ability to have tens of thousands of processes, but instead more easily and correctly deal with dozens or low thousands. Increased correctness and ease of use is still a great property (it also makes concurrent programming less painful, so you can get more out of your cores because people will use more threads with less fear), but may still fall flat if you want to handle huge numbers of concurrent processes. But with actors and pattern matching I think you got most of what you need to get Erlang-y levels of concurrency correctness. All that is missing is being able to do a huge amount of it.

    My verdict on this: as we have seen on the paper I cited one can get this event-like performance from just standard old threads and the authors conclude that we basically need better threading implementations, so certainly it is not out of the question for Scala or even regular old Java. I somehow doubt that this is the reality of the implementation today (and if it does support stuff like this there is little doubt in my mind that performance would suffer greatly), but maybe I should run an experiment. That’s probably involved enough to justify another full post in the immediate to intermediate future. A lot of my doubts come from the fact that some of this stuff is pretty low-level and that Scala would have a hard time maintaining competitive performance while actively making use of some features the JVM may not provide. But I am no JVM expert, so it’s fuzzy…the best way to find out, I think, is try.

    fdr

    July 5, 2007 at 4:27 pm

  13. […] 5, 2007 at 11:02 pm · Filed under Programming A Case for Erlang « Metalinguistic Abstraction: a very nice article about […]

  14. Nice to see a thorough write up, on this. Enjoyed reading it and look forward to more.

    I remain unconvinced. If nothing, purely from adoption perspective in at least half a dozen domains I am interested in. That and seeing that similar error rate and productivity is there while certainly bringing different type of risk.

    That the ‘Erlang DM’ is twice as fast as C++ ‘DM’ means nothing in my view, ie. nothing can beat native code and C++ compilers (humans constantly try to outsmart it and consistently fail). Sure it is more work, but libraries are out there, it is not 90s in C++ world any more.

    170% greater memory residency is a huge turn-off, enough to invalidate many modern caches.

    And naturally, this is just ‘bare’ C++ xx we are talking about, and there are far too many tools and commercial implementations demonstrating its adoption, power and longevity to simply ignore or dump it for another Haskell, Lisp or Smalltalk, Java or CLR or ML or any other newcomer.

    Your research project pointer is great example.

    Not that I am against proven ideas and I see that Erlang has done some great works out there, or that I am sticking to my zealot guns, but compiler, and compiler only is what makes all great work possible.

    I believe all you are observing is that C++ is incorrectly assumed to run in a classical huge stack, process or thread oriented model whilst that is far, far from reality. Even FGPA people are showing it is just a misconception, and on scientific front there is nothing out there apart from Fortan that matches it in performance and library support.

    And I strongly believe it is all just a start as people move from legacy of C.

    Cheers

    sixyears.wordpress.com

    sixyears

    July 6, 2007 at 1:15 pm

  15. @sixyears
    I don’t understand quite what you are saying here. There is far greater productivity under Erlang (huge code size reduction, lower complexity, and a a similar error density which means a similarly reduced overall number of bugs) than under C++. Hardware is relatively cheap and programmers and their mistakes are relatively expensive. (This is not always the case if you are say, writing software for a wristwatch, but for servers and telephone switches it often is)

    Let’s not forget that the number of bugs was of similar density, but the C++ implementation was longer, more complicated, and yet slower because it wasn’t complicated enough to deal with Erlang’s well-oiled machinery in handling (in a nutshell) network I/O. We’re not dealing with complex simulations or computations here, but rather switching stuff around, something Erlang was more or less designed to do. Memory residence is probably large there because Erlang overcommits memory (much like many garbage-collecting runtimes do) and never shrinks. (Which itself can be a rather hard to predict OS-specific behavior, even in a C program)

    I think this adherence you have to compiler-only is not even existent in the rigorous sense. I can make a Python binary, I just load up my code into a big data block and run the interpreter, all in one binary. It’s all code and data segment, right? Everything is. Java is a newcomer? Java2 is ten years old, and its overwhelming adoption on the server side has more or less eliminated an entire class security bugs as well as greatly increased reliability. (Garbage collection, array bounds checking).

    Even more convincingly, I can take an entire SBCL core image including my code and dump it to a binary. There is no obvious hard line between something that is “not compiler-only” and “compiler-only,” only compilers that generate different code. This isn’t even “cheating” in the sense that I’m just interpreting a big data block; it compiles it into honest to goodness x86 byte code with stack pointer manipulations and everything. Just try calling the disassemble function on a symbol.

    You really should be making an argument like “I’m against arrays bound checking, garbage collection because they’re too slow,” or whatever it is you really mean. Otherwise…I found a lot of your comment very hard to understand.

    fdr

    July 6, 2007 at 4:43 pm

  16. Productivity argument against C++ is mute nowadays, but I can understand many are unaware of it. It goes very much along the lines of how Java was touted to solve all the problems.

    The C++ interop tools out there are nothing short of outstanding and a lot of (compiler research academia and community) effort went into it.

    What Java and GC brought was raised levels of abstractions and hand-holding and with it a huge framework of bloat. Elrang would go no different if we took that route, but as you say it is oriented towards network I/O not complex simulations so ok I can agree it that sense it would be appropriate for a limited domain.

    It is simply a fact that EJB (I worked on it 10 years back and see it used to date sure), J2EE and various bloatware has lost an enormous audience because of its complexity while gaining nothing on its productivity hype. Another proof of that is uptake of Ruby.

    So I can agree ‘for servers and telephone switches’ it often is more sense to use it. Unfortunatelly, many people need a bit more than that.

    “itself can be a rather hard to predict OS-specific behavior, even in a C program”.

    I cannot agree here, because it is very stable and predictable for a large number of Linux, QNX and even NT deployments.

    Embedded and device field demonstrate what you say is not the case very well.

    ” think this adherence you have to compiler-only is not even existent in the rigorous sense”.

    I am surprised you say this considering your blog title, specifically the word ‘meta’.

    In rigorous sense it is very real and leading the entire show with generics and MPL for quite some time (around 5 years). It would be foolish to assume it is going to change any time soon. Just look at what impact it had on CLR and how it now outperforms Java (with help of MS’s own OS mechanisms or not) by a light year and these are few simple examples.

    ” I can make a Python binary, I just load up my code into a big data block and run the interpreter, all in one binary.”

    That is fine but you are missing my point that fighting the ‘compiler adherence’ is assuming a programmer is more smarter. And he/she usually isn’t. And no longer can they even approach the machine code generation even with a bargpole, hand-writing or generating x86 is as slow and inefficient and unpredictable as interpreted and GC-like hackery.

    As for Java ” overwhelming “, everyone I see and speak too perceives it as Underwhelming in many instances (in terms of cost, productivity and much more) if you compare it to alternatives.

    Also I do not think you should believe benchmarks (especially one as weak as you presented) and look at a bigger picture, practical bits, you know, like what Google infrastructure is written in, like what all those runtimes are written in, like what boots your box and rooter and more etc.

    Of course that you can glue Python, hack even JavaScript or whatever you like. But the metal is always going to support garbage collection induced damage if any performance is a concern.

    I hope this is a bit easier to understand.

    sixyears

    July 9, 2007 at 10:38 am

  17. I also wanted to subscribe to to the argument that it is ‘incorrect’ to measure the Erlang productivity vs C++ by comparing Erlang with bare C++. Modern C++ is employed together with Boost library (and/or Qt). Boost provides type safe and relatively easy implementation of many constructs:
    asyncronious io servers,
    parser and lexers,
    various list operation functions,
    multi-dimentional index, maps,
    implementation independent IO streaming (for example I can be using regular io function to read and write to zip files),
    complex structure serialization,
    abstracted most of the operation system speficific services in platform independent manner

    and many other things.

    So comparing Boost C++ and Erlang may not be that advantageous for Erlang.

    vsp

    March 2, 2010 at 1:47 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: