The Next Lisp: Back to the Future

An Invited Talk given in Milan to the European Lisp Symposium

Dr Mark Tarver, 2009

OK; from the title we're obviously going to talk about Lisp. Actually it’s a bit tricky because I’m also going to talk about open source and open education later too. It might seem even trickier if I start off by debating exactly what we mean by ‘Lisp’. But I'm going to do just that because it will shed a light on what I'm going to say. And I'm going to start off with a biological metaphor, and there will be a few of them in this talk. I hope any biologists out there will forgive me.

In biology there is a distinction between genotype and phenotype. The genotype of a person is the genetic legacy that the person carries. The phenotype is the way that genetic legacy expresses itself. So, for example, you might find that you carry a gene from your father for blue eyes and one from your mother for brown eyes. Your genotype is blue-brown, but your phenotype is the way your eyes appear which is brown.

Applied to languages the genotype of the language is the DNA of the language. It is the essential ideas of the language. The phenotype is the actualization of those ideas on a platform.

If we ask ourselves, ‘What is the genotype of Lisp?’, what is its DNA, its genetic legacy, I'd suggest that it is composed of the following five ideas.

1. Recursion as the primary mean of expressing procedure calling.
2. Heterogenous lists as the way of composing and dissecting complex data.
3. Programming by defining functions.
4. Garbage collection.
5. Programs as data.

So having defined Lisp as a genotype, we can now ask what we mean by ‘Lisp’. Are we talking about a genotype - a particular set of ideas, or a phenotype, the way those ideas are represented on the computer? If we're talking about a genotype then Lisp is doing very well. Python, Ruby, NewLisp, Clojure and even TCL/tk are all inheritors of this genetic legacy. But I guess if I stood here and talked about Python for an hour nobody would be too pleased. So I have to talk about phenotypes, and the main one is Common Lisp and that is not doing so well.

What Went Wrong with Common Lisp?

To understand Common Lisp and what has gone wrong with it, we have to begin with Lisp 1.5. Lisp 1.5 was the brainchild of John McCarthy and his group and it was the first practical Lisp to incorporate all the ideas of the Lisp genotype. Lisp 1.5 was maybe 30 years ahead of the game in terms of programming language design. A lot of very bright guys ran with the idea and it mutated into a polyphony of Lisp dialects - Zeta Lisp, DEC-10 Lisp, InterLisp-D and so on. So Common Lisp was devised as an acceptable political solution for unifying the many tongues of Lisp. But now, 25 years on, it is not the dominant phenotype of the Lisp genotype. Why?

The answer is that quite a few mistakes were made with Common Lisp. The first was that it was too big. The drive was to make the language compatible with the previous dialects and the easiest way was to union these features together. This is generally what happens when you get a committee of guys each with their own input. The language specification explodes.

But for all the size of the CL spec, there are things that are missing from it. The foreign language interface is not defined. Calling the OS is possible in CL but every platform does it differently. Threads and concurrency are nowhere in sight (remember this is 1984).

The Common Lisp community has tried to deal with this through libraries. But these libraries are unsupported, and either undocumented or poorly so. I regard this as partly a weakness of the open source model, but of this later. But the result is that CL has become unpopular with developers who want a good solution which is free and well supported. For that they turn to Python.

On the academic front CL is not popular as an introduction to functional programming and is the victim of early standardization. To a degree, CL is a snapshot of functional programming in the ’70s. Pattern-matching and static typing are not there. There is much in CL that is heavily procedural, too much so for teachers trying to convey the art of functional programming. The syntax is viewed as unattractive, Larry Wall’s comment that ‘Lisp has the visual appeal of oatmeal with finger nail clippings mixed in.’ is funny and irreverent, but largely true. It’s a great language but it doesn’t look great.

I guess a review of past mistakes would be incomplete without a look at Lisp machines. In ‘The Bipolar Lisp Programmer’ I talked about brilliant failures and the Lisp machines were exactly that. The basic idea was simple: make Lisp run fast by making it run on special hardware. The problem was that in the mid-80s standard hardware was doubling in performance every 18 months. So a Lisp machine that promised at the design stage to be 10x faster than conventional hardware might be 5x faster by the time it appeared and that advantage would last only 2 years or so. 

Symbolics woke up far too late that the real strength of the company lay in its ideas - in Genera. In our biological terms, they had got stuck on obsessing over the phenotype, and missed the significance of the genotype. So the DNA of Genera got trapped on a dying phenotype and today all that work is a lost investment. Imagine what the Lisp community would be like today if Symbolics had done what Torvalds did later and moved their ideas onto a PC!

When the CL community is exposed to criticism of CL their attitude is confrontational and conservative. For example, nearly every other newbie who hits CL for the first time complains about the brackets. And when they do they are told to shut up because they don't understand Lisp. Actually that response shows that the shushers don’t understand Lisp, because the Lisp genotype is not hung on the idea of brackets. Guido van Rossum realised that in creating Python. Python is not new; its just a sugaring on the Lisp genotype in favour of a syntax that people feel comfortable with. And look how far that idea went.

Along with all these ills went another - macro laziness. CL programmers think that macros are absolutely central whereas they are not. In Qi, the defmacro construction could be eliminated from the source without remainder by using the programmable reader. I've left them in because of laziness. But macro laziness arises when a challenge to CL is met by ‘I could always write a macro to do that’. It's the same response of the slob on the sofa watching TV and saying ‘I can always get into shape’. Actually being in shape and being able to get in shape are not the same. And since CL has stayed on the same sofa for 25 years, getting this fat specification from the supine position and into shape is not easy.

Lessons to Learn and my Work with Qi

So what lessons do we learn from all this? I make about 7, which is a nice magical number.

1. Don’t bet the farm on a cumbersome language standard that is difficult to change and replace.
2. Create a standard approach to common problems like concurrency, threads and OS calling.
3. Hive off stuff into libraries.
4. Absorb the lessons of programming development going on around you.
5. Don’t get trapped in a phenotype - whether machine or language.
6. Listen to your critics - particularly newbies.
7. Learn to communicate.

Now I want to talk a bit about my own work on Qi. By the way, I have pronounced it as ‘Q Eye’ for years which is wrong. I got the habit from my students. It really should be ‘chee’ which is the Taoist concept of life force. I'm a Taoist and hope to return to the mountains one day. But right now I'm here in the world of men and so I'll continue talking for a while.

Qi began life in the Laboratory for the Foundations of Computer Science in 1989. It owes its character from observing the activity with Standard ML, since the LFCS was the origin of that particular language. Essentially Qi is a harmonious fusion of three different paradigms; a functional one based on pattern-directed list handling and using sequent calculus to define types; a logic programming component based on Prolog and a language-oriented component based on compiler-compiler technology. 

It’s a powerful combination and it’s not an accidental one. Each component has an important part to play in the coding of Qi. The compiler-compiler is used to frame the reader and writer, the Prolog compiles the sequent calculus to code and the functional language does nearly everything else. The result is that the source is amazingly small - about 2000 lines, but packs a punch like Bruce Lee who was built on the same scale.

The idea of Qi was to provide a programming environment based on CL but with the features of a modern functional programming environment. In fact, Qi is ahead of functional programming design in many of its features. Its generally efficient, being faster in execution than hand-coded native CL and has all the features of the Lisp genotype while being about 3x more compact for programming purposes than CL. It comes with a book and a collection of correctness proofs that nobody reads.

Qi is very powerful, but it is not computationally adequate as it stands. ‘Computational adequacy’ is a term I’ve coined to describe the characteristics that a language has when it is adequate for the needs of the programmers of the day. CL was computationally adequate in 1984 but is not today. Computational adequacy today means finding good answers to questions of how to provide threads, concurrency, FFI, GUI support and so on. Qi is not computationally adequate because it is written in CL which is itself not computationally adequate. I’m going to talk about what I’ve been doing to fix that, because it reflects back on the way that Lisp itself should go.

The general strategy of Qi development can be summed up in 6 aphorisms.

1. Grab the easy stuff.
2. Featurise out differences.
3. Infiltrate don't confront.
4. Isolate the core.
5. The genotype is always more important than the phenotype.
6. Educate people.

Since Qi is written in CL, grabbing the easy stuff meant grabbing the CL libraries and bits of the CL specification. CLStringsCLMaths and CLErrors are all part of the Qi library and they bring type security to CL maths, string and error handling. CLHash and ClStreams are on the way. They are easy pickings. The combined sum is a lazy week's work.

Parts of the Qi library are not part of CL. Qi/tk is the biggest. The type theory of Qi/tk is 511 lines of Qi that generates over 200 rules and axioms of sequent calculus. This in turn generates over 27,000 lines of Common Lisp. But it’s quick. A game program of 561 lines typechecks and compiles in 5 seconds under CLisp (1-2 seconds under SBCL). It includes constraint propagation that allows you to dynamically link the properties of widgets to each other.

I’ve begun to featurise out the differences between CL implementations; there is one way to quit, one way to save an image and hopefully one day, one way to call the OS. Our goal is to make the Lisp platform invisible; and by ‘Lisp’ I'm talking about the Lisp genotype not the phenotype.

‘CL is great and Blub is crap’ has not worked as a strategy for selling Common Lisp. People invest years of their life learning Blub and don’t want to hear this message. A strategy of infiltation is better. The Q machines are intended as a collection of machines that cross-compile Qi into any member of the Lisp genotype. Qi compiles into CL, but Quip compiles Qi into Python and that already exists. Clojure, Ruby, Perl are later targets. Qi can be used to provide optional type security for untyped languages like Python. The proper destination for Qi is as a universal metalanguage. Write once, run anywhere.

The Kernel Lisp Project: Back to the Future

When I started thinking of porting Qi to other members of the Lisp genotype I looked at the CL sources for Qi. They are amazingly small. About 8,500 lines from 2,000 lines of Qi. Because I'm a conservative coder nearly all of Qi is running on the Lisp 1.5 core. In all there are about 68 primitive functions in Qi and many of them like CADR are not really primitive. If we talk about ‘kernel functions’ meaning the stuff that is really primitive then we’re down to about 50, if that. 

This kernel defines a Lisp, call it KLambda, that is the kernel necessary to run Qi. Now the interesting bit is that Qi is mapped to Python etc. through a mapping from the Qi-generated Lisp source to the Python language. So by mapping Kl to Blub, you get a version of Qi that runs under Blub. 

Based on this idea, this opens the prospect of a new Lisp that would be effectively a virtual machine for the entire Lisp genotype. Kl would and should be an enhanced version of Lisp 1.5, getting back to the roots of Lisp. By defining Kl and running Qi and the Q machine cross-compilers on Kl, we get a unified platform for the entire Lisp genotype with Qi functionality; state of the art thinking in functional programming across the genotype spectrum. 

This process I call kernelisation is going to change the shape of a lot of computing in the next decade, in the way that RISC changed hardware in the ’80s. In particular the Babel-like profusion of different languages we have at the moment will be tamed through this technique. Kernelisation is really the computing analogue of formalisation - that expression of the mathematician's hunger for simplicity in the expression of formal systems; the fewest primitives and the fewest axioms needed to derive what we take as true. Formalisation issued in a rich developments of mathematical logic like Zermelo-Frankel set theory and it was used to tame the wildness of C19 mathematics. Today I see us as poised to tame computing in the C21 in the same way.

Back to Lisp. What should Kl look like? I've already given one answer relative to Qi; that it should incorporate all the functions necessary to run Qi. But Qi is itself not complete as a language specification. For example in Functional Programming in Qi I don’t deal with how to interact with streams. The reason I don’t deal with this and other such issues is that CL already deals with them and I assumed those people who wanted to learn how to do this would learn CL. But if Qi becomes portable across the Lisp genotype then that aspect will have to be addressed. 

Hence the shape and the form of Kl is not determined by Qi even though it should support Qi. But the general shape and form are, I think, reasonably clear. There has to be several components of Kl.

1. A Turing-equivalent subset based on classical functional programming. This subset should be really really small. For instance you can build a decent functional model using if, = (generic equality), basic arithmetic and l (for abstractions). Cons, car and cdr are all l definable. All the utility functions go into a standard library.
2. It should be dynamically typed but support detailed type annotations. Qi can provide a wealth of detail about the types of Lisp expressions which compilers can eat up. Kernel Lisp needs to be able to use this.
3. It should support arrays, streams, concurrency, error handling and threads in a straightforward way. CLErrors is tiny but does what I need. One objection is that we still don't know how best to handle concurrency. But that does not matter, we can decide conventions on calling concurrency and leave it to developers to decide how to implement it.
4. There should be an open source, freely available Kl library for CL. CL survives in library form. CL developers can load this library. Kl would be an enormous help for implementers of CL. Right now there is too much reduplicated effort going on in a small community for CL to prosper. Putting all the stuff in a library would leave developers free to optimize Kl and would improve Lisp performance no end.

To continue with the biological metaphor, Qi has wonderful viral characteristics; it is small and easy to move and because of its conservative encoding it will fit comfortably into Kl. Once established on top of any instance of the Lisp genotype it forms a symbiotic relation with the host language improving the general well-being of the language it infects.