Lush: my favorite small programming language
I meant to write about this when I started my blog in 2009. Eventually Lush kind of faded out of my consciousness, as it was a lot easier to get work doing stuff in R or Matlab or whatever. The guy who was maintaining the code moved on to other things. The guys who wrote most of the code were getting famous because of the German Traffic Sign results. I moved on to other things. I had a thought bubble the other day that I’d try to compile it and see what happened. The binutils guys have been busy for the last decade and changed all manner of things: I couldn’t even find documentation for the old binutils the last version of Lush2 was linked against. Then, poking around the old sourceforge site, I noticed that Leon Bottou had done some recent check-ins fixing (more effectively than me) the same problems in the Lush1 branch. I stuck the subversion repo, with history, up on github so you can marvel at it. I may try to revive a few of the demos I remember as being cool.
I call it a small language; compared to contemporary Python or R it is quite small, and it had a small number of developers. The developers were basically Yann LeCun and Leon Bottou and some of their students (there are other names in the source, like Yoshua Bengio). This tool is where they developed what became Deep Learning: lenet5. The first version of Torch was in here too (as I recall it was more oriented to HMMs at the time). Since it’s a Lisp, it’s easy to add macros and such to make it do your bidding and fit your needs. Unlike anything else I’ve ever used, Lush is a real ergonomic fit for the programmer. It has a self-documenting feature which is incredibly useful: sort of like what R does, it takes comments in code and makes them into documentation. Unlike R documentation, there is a way of viewing it in a nice gui and linking it to other documentation. So you have a nice manual for the system and whatever you built in it, almost automagically. Remember “literate programming?” It was always a sort of aspiration: this is a real implementation of it, and it’s so easy to use you’d have to be actively malicious or in a pretty big hurry not to do it. Here’s a screen I made for myself so I could remember how to use some code I built 15 years ago (it still works, BTW). You can update it at the CLI, just like everything else in a Lisp.
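From memory, the doc comments look something like this (the #? header marker and exact layout are as I remember them, so treat this as illustrative rather than gospel):

```lisp
#? (hypot <x> <y>)
;; Returns sqrt(x^2 + y^2), the length of the hypotenuse.
;; This comment is exactly what the helptool scrapes into the manual.
(de hypot (x y)
  (sqrt (+ (* x x) (* y y))))
```

That’s the whole ceremony: the same two sentences you’d write anyway become a browsable, cross-linked manual entry.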
As a Lisp, you have access to macros, which allow you to do magic things that make Paul Graham happy. I am smooth brain: I only wrote a couple of them. I’ve written considerably more C macros than Lisp macros and plan on keeping it that way. The Lush authors also don’t use them very often; mostly in the compiler, which is how it should be. “A word to the wise: don’t get carried away with macros,” as Peter Norvig told us in PAIP. There is a nice object system, and a very useful set of GUI tooling. Not just the help gizmo; there’s a full-fledged GUI toolkit (ogre). Imagine that; something to develop old-fashioned graphical user interfaces without importing two gigabytes of Electron and javascript baloney. The helptool uses this; it is not an HTML browser. The documentation format looks a bit like markdown with a few quirks; I never had to look at a manual to write the stuff. Essentially it looks like the standard two-sentence comments you put in to remind yourself what a complicated function does. The object system the GUI is written in is, I assume, something like CLOS: whatever it is, there are no surprises, and anyone who knows about namespaces and objects can use it. I found it particularly useful for its encapsulation of raw FFI pointers and other tooling which is best trapped in a namespace where it can’t hurt anything.
Since it is oriented around developing 80s-90s era cutting-edge machine learning, one of the core types is the array. The arrays are real APL-style arrays: rank 0 to rank 4, which is probably one rank higher than most sane people use (most people use rank 2, aka matrices). It looks like it supported up to rank 7 at one point: I have no idea what you’d do with that. APLs such as J often have rank-whatever, so someone somewhere has probably done something with such structures. Lush2 had an interesting APL-like sublanguage for operating on the arrays, which looked pretty handy, but which I never quite got into (most of my work was in Lush1).
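From memory (so the constructor names are approximate), creating and poking at arrays of various ranks looks something like this:

```lisp
;; arrays of doubles, rank 0 through rank 2
(setq s (double-matrix))        ; rank 0: a scalar in a box
(setq v (double-matrix 5))      ; rank 1: a 5-element vector
(setq m (double-matrix 3 4))    ; rank 2: a 3x4 matrix

(m 1 2 42)             ; set element (1,2) to 42
(print (m 1 2))        ; read it back
(print (idx-dim m 0))  ; number of rows
```

Arrays are callable objects: call one with indices to read, with indices plus a value to write. It’s terse in the good, APL-ish way.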
All this is cool, but I suppose other small programming languages promise things like this. The really cool thing about it is the layers. You get a high-level interpreted Lisp. You also have a compilable subset of Lisp, mostly oriented around numerics, just as one would expect in a domain-specific language one might develop early convolutional net/deep learning algorithms in. Even better, if you want to call some C, including calling libraries, you can enclose your C in a Lisp macro and compile it right into the interpreter. Most of the interesting and useful code in the world still sits behind a C API. With a tool like this, suddenly you have a useful interpreter where you can vacuum in all the DLLs you want, and they’ll be available at the command line.
Most interpreters have some FFI facility for doing this; none to my knowledge are this easy to use or powerfully agglomerative. The memory management happens for free, more or less. In, say, R’s repl, you can do something called dyn.load on libraries with R-compatible types. If it’s more complex than that, you might have to write significant wrapper code, and this is a hack: it might just leak memory all over the place. You have to work pretty hard to encapsulate C libraries in a proper R package, compiling against the R sources. J, same story; you can use the 15!:0 foreign to load a dll and wrap up J structures to send, with some tooling to deallocate or copy memory locations (very carefully). In Lush, you call the C functions directly, in C, on C’s terms (or C++). You can write a couple of lines of C wrapper, or a couple of pages; whatever: it’s all part of the Lush source. If you look at examples of well-wrapped dlls in R on CRAN, you’ll see they’re festooned with all manner of ugly R structure casts, mysterious R #defines, and all kinds of badness and quasi-memory management; you’d have to read a 300-page manual to make sense of what’s going on. Having done this a few times, I’m exaggerating a tiny bit, but it is tedious and fiddly and takes a fair amount of work; a couple of days if you’ve never done it before, versus a couple of minutes. In Lush you just stick a dollar sign in front of variables you allocated in Lush in your C function calls, and after the file has been compiled into the interpreter (which happens when you “libload” it), you call the functions, and variables appear where they’re supposed to. No memory leaks. It usually doesn’t take down the interpreter when something goes wrong, though of course if you send something weird to a raw pointer it will probably segfault and die. Here’s an image grab of a simple method for instantiating a KD-tree using LibANN (a bleeding-edge nearest neighbor library circa 2009):
The first lines are the documentation; inside the defmethod we try to make a new kdtree; the stuff between #{ and }# is normal C++. You can see the $ in front of $out: this tells the Lush compiler to pull the result back into the interpreter. This method gets compiled and loaded and accessed like any other method in Lush. idx2 is a matrix type; the other stuff does what you think it does.
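Since the screenshot won’t reproduce here, here is a from-memory sketch of the general mechanism with plain C instead of LibANN (the type declarations, default float element type, and the IDX_PTR accessor macro are as I recall them; double check against the manual):

```lisp
#? (c-dot <a> <b>)
;; Dot product of two idx1 vectors, computed by inline C
;; compiled into the interpreter when the file is libloaded.
(de c-dot (a b)
  ((-idx1- a b))                  ; compiler type declarations
  (let ((out 0))
    ((-double- out))
    #{
       /* plain C; $a, $b, $out refer to the Lush variables */
       int i;
       for (i = 0; i < $a->dim[0]; i++)
         $out += IDX_PTR($a, float)[i] * IDX_PTR($b, float)[i];
    #}
    out))
```

The dollar-sign variables are the whole FFI: no marshalling code, no separate build step, no manual deallocation.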
Lush dates from 1987: I don’t even remember what kind of computers people used back then. I assume something like a 68020 Sun workstation or a VAX. Even when I was using it in 2009, a “multicore” system might have two cores, so it wasn’t really designed with that sort of thing in mind either (though you could link to BLAS, which covers most numerics cases, and it has tooling to use it on a cluster). Some of the intestines of the thing probably reflect this. I’m pretty sure Lush1 is not completely 64-bit clean: when I was using it in 2009 it was 32-bit binaries only, which was fine, as nobody had 256G of RAM back then. Other stuff will seem unfamiliar to contemporary people: it’s for talking to local libraries. There is no provision for a package manager over the interbutts, or much other network stuff I noticed beyond sockets. No JSON (didn’t exist; s-exprs are better anyway), no SQL interfaces (that was exotic pay-for technology), and none of the stuff modern code sloppers are used to having. It was mostly a tool for developing more Lush code which links to locally installed libraries: this is what R&D on machine learning algorithms had to be back then. As a tool for building your own little universe of new numerics-oriented algorithms it is almost incomparably cozy and nice. You get the high-level stuff to move bits around in style. You get the typedefed sublanguage to compile hot chunks to the metal, and you get C/C++ APIs for adding new functions written in C/C++ as a natural part of the system. Extremely cozy system to use. While it’s not the Lisp machine enthusiasts like Stas are always telling us about, it’s probably about as close as you’re going to get to that experience using a contemporary operating system and hardware. Yes, you have to deal with the C API: I’m sorry about that, but it’s just current-year reality. Nobody is going to rewrite BLAS in Haskell or CMU-CL to make you happy. Purity is folly.
As a tool, if I had to fault it for anything, it’s a few small things which I could probably fix. For example, in Kubuntu anyway, you can’t copy/paste examples from the helptool. This is probably something that could be repaired if I dug down into whatever X library the ogre package calls to do this. It’s no big deal; it’s not a very wordy language anyway, and I should be reading the docs and typing code I’m about to run in emacs rather than copypasta. Another slightly annoying thing is the lack of a built-in pretty-printer for results. Many languages have this problem: in Lush it’s easy to write one, and I have one around somewhere. Some of the packages aren’t well documented and some don’t work because of various forms of bitrot: this is to be expected in something this old. Other than that, no faults. Very cozy programming language. The coziest.
The C insides are fairly understandable, modulo the glowing crystal dldbfd.c gizmo at the center that does the binutils incantations that make the dynamic linking magic happen. Even that looks like it could be understood if you were familiar with binutils. In Lush1 there are a number of odd pieces that were planned to be sawn off, which you can sort of infer from their absence in Lush2, which had a redesigned VM. However, Lush1 compiles and runs the old code, and Lush2 doesn’t.
While this programming language could (and really should) be revived, even in its present state it can be marveled at: both for its historical importance in developing machine learning algorithms, and for its wonderful “programmer first” utility. I don’t know what exigencies caused them to move the Torch neural net library to Lua; probably whiny wimps who were intimidated by parentheses. I can guess why it ended up in Python (the Visual Basic of current year). It’s one of those things where, had things worked out a little differently, machine learning people would be typing lots of parentheses in a vastly more futuristic Lush instead of drearily plodding along with spaghetti in Jupyter. It represents a very clear vision of how software development should work. No bureaucracies or committees were involved in its design: just people who needed a good tool to invent the future. I suspect the committees and social pressures involved in larger programming languages are why they’re often so awful. Lush is all designed and built by makers, not bureaucrats and “product managers.” It feels purposeful. It also feels incomplete, which is as it should be, as these guys were too talented to spend their careers maintaining programming languages. Like an unfinished da Vinci painting, you can see the grandeur of the artist’s vision.
I’ve always been a fan of these guys; as I pointed out in my article on DjVu, there is much to admire beyond their good taste in algorithms and dogged determination to continue working on them at a time when only eccentrics were interested in neural nets. All the cool kids of the era were doing SVMs… because… researchers are mostly trend-following rather than thinking. Hopefully I don’t cheese them off too much by bringing it up, though as an American it is arguably a sovereign duty to piss off the French. For myself, I have a shitload of work to do in coming months. I sort of hope I can find an excuse to fiddle around with it some more, or maybe even use it in production in some small way. If I do, I’ll write about it. I encourage others to give it a try and ponder how cool 2024 would have been if we had used this tool instead of the trashfire Python slop you’re all doomed to use in your day job.
Wow, it’s really surprising that you show so much respect for these guys, given their active and interested role in the current AI nonsense hype. Also, working at Meta does not seem to me like something for a scientist to be proud of. But I didn’t know most of this story, so I may have to reshape my mind a little bit…
They’re absolutely not hyping current year AI; Yann regularly goes on tirades against numskulls promising too much, or thinking neural nets are going to make a singularity.
That is really not true. Amongst the “big names”, Yann is certainly one of the guys that still makes sense. He predicted the plateau of the scaling laws, and he also predicted that the deep-reinforcement-learning breakthroughs are actually a big nothingburger for control problems in the wild (I was much more bullish on deep RL than I am on the LLM stuff).
Also, while Facebook might be a place from hell, their research and engineering are quite solid, also on the infrastructure/hardware side of things. I have heard interesting stories about their data center in Sweden.
I hate Facebook, and visiting their former Sun Microsystems campus was like visiting one of the rungs of hell (but with nice food), but they have decent engineers (including hardware) and treat their people well. They also are better run with less bloat than most of the FAANGs.
It’s also my impression that LLaMa is as good or better than anything else out there on useful problems, and they open sourced it. Not sure how involved Yann and Leon were in that, but I bet they’re at least partially responsible for giving it to the world.
Scott,
I have been a reader of yours for a short time only; it was your piece on UNIX utilities (cut, sort, uniq, etc.) that I was directed to, which I very much enjoyed; and I must say I find these sorts of articles absolutely fascinating.
I have very little idea about these sorts of languages/environments, so am glad that somebody takes the time to dig them out and re-review them. There’s certainly a lot of beauty in a lot of older stuff. It’s also quite comical to me that many people often hype new things that some old white guy wrote a paper about fifty years ago… then forgot about.
I am a ‘programmer’ by trade, which basically means code-monkey at some corp or other, and it holds no real joy for me. I had of course thought that it would.
Turns out I get much more joy reading through Niklaus Wirth’s Algorithms + Data Structures = Programs, and actually thinking about the things I’m doing (currently implementing a JSON parser in Pascal, by hand, on paper, using Wirth’s techniques) – as it seems to me that most day jobs don’t allow for thinking at all. More of a case of: “Look, this library can help you! Just use that – no need to understand what’s going on!”. Guess from a work perspective, they’re right.
Anyhow – I very much enjoy these articles, and just wanted to say so, really.
Cheers from England.
Thanks for the kind words. Funny, I hated Pascal when I was taught it in college (the only computard course I took): I had to write code on cut-up brown paper bags for lack of computer resource time. It’s a great language though: I assume you’ve seen Delphi. Very powerful programming environment.
There is a danger in letting engineers run amok in tech companies. Most of them want to do cool stuff; rewrite the stack, make it perfect, that kind of thing. Maintaining the thing that works is almost always the right thing to do, even if it is dreary work: that’s why you get paid for it. If you ever go off and do your own thing, you’ll find this out pretty quickly. Also you will begin to hate people who want to use “advanced” ideas.
Yeah, for sure. Engineers of that ilk are certainly a bane – I don’t really mind the dreary aspect of the work – you know, write a little script to glue some components together; document it. It’s nice. Just always found it funny that I used more brain power on hobbyist work than I did in me day job.
At the moment, I seem to be fending off two forms of attacks: let’s move over to a Kubernetes stack! And let’s integrate “AI” (whatever that means) with our software at all costs! Probably a good business reason for them to jump on the second bandwagon, but I can’t see any benefit to the first!
As for Delphi, I’ve only really heard of it, but never really used it. Worked for an aerospace company some years back that used it for part of their helicopter simulation software; but I was on C++ duty at that time – no Delphi for me.
The thing about Pascal (earlier flavours) that I liked was its simplicity. Granted, it has plenty of limitations in certain areas, but I found it quite nice to use. Plus, Wirth’s book is a treasure trove of information on data structures, hashing, compiler design and recursion, so I think maybe I care more for Pascal because of Wirth’s book.
Kubernetes is a way of managing poor software design and broken DLL design. See this rant for my opinions on such things:
https://scottlocklin.wordpress.com/2022/02/19/managerial-failings-complification/
I hate it too, but it’s probably a necessary monkeypatch in current conditions.
Fiddling with “AI” APIs can be profitable; dude who sits next to me in the office makes a nice living just doing that. If it’s more classic machine learning, that’s actually fun: like cooking with bits.
Wirth was a genius for sure. I think I’m still traumatized by my shift from BASIC and assembler to Pascal though.
Heh.
That’s a nice article, indeed. Made me think of some code review I was subjected to a little more than two hours ago, whereby I needed to create more classes to make other classes look neater. Still did the same stuff. To me, it seems to be more obfuscation.
Then again, this is for an in-house PHP project I’m working on. I’ve not had the time to test anything, but with the amount of object wrappers, third party libraries and PHP’s interpreted nature, it must be orders upon orders of magnitude slower than any equivalent written in C.
Sometimes, you want software that challenges (from a learning perspective) you, but doesn’t involve shed loads of bloat and a billion dependencies. Douglas Comer’s XINU OS is pretty neat in this regard. A good book he wrote on it, too.
Comically, I used to work on a realtime telemetry system written in C. That was five years ago, and the thing was almost thirty years old when I was there. It was still going strong… after I had left, I learnt that they had decided upon a migration to AWS. Never found out how that went, but the code was rock solid, really.
Can’t ever see a PHP application having that staying power without going through a billion upgrades.
PHP ran fakebook for years. I’ve always been surprised at how much performance they can squeeze out of such things. It was cheaper for them to invest in the PHP runtime to make it better than it was to rewrite the code. I’m sure it’s slower than C, but it’s faster to market.
Xinu looks neat. I had good experiences with VxWorks and especially QNX in grad school.
Haha, I’m the reason behind Léon’s recent bug fixes. We’ve co-authored a few blog posts (see https://atcold.github.io/blog). Nice write-up, by the way!
Thank you for your service (also for the kind words). It’s funny, I mostly ignored the .sn files back when I was using this thing. I figured they wouldn’t work, though I do remember the NetTool somehow. I will fiddle with them some more.
These two are particularly good at showing the capabilities of the thing:
https://atcold.github.io/2024/09/27/SN-GUI.html
https://atcold.github.io/2024/08/05/SN.html
I allocated 120GB in Lush this afternoon; it’s in better shape than I thought.
One of the problems is that it was written in 32-bit times. Computation on indices is done with ints (32 bits); this limits the number of array elements to 2 billion. This is one of the many little reasons that pushed Ronan to write the Lua-based Torch5, copying what he liked from the Lush tensor and backprop libraries.
Yes, I remember: I think even Ralf’s redesign of index.c had some 32-bit issues. Though such limitations are severe these days for neural nets, even 15 years ago a 32-bit array was pretty big.
Lush seems like a pretty good language for numerical work, definitely appears friendlier than Python and R.
The FFI capabilities look like evalCpp from Rcpp (linker magic instead of template magic).
The GUI is a nice bonus. Python also has a minimal GUI with Tcl/Tk, but I haven’t seen much use of that. Perhaps a minimized Python 2.7 could be like Lush.
Somehow I’m not a big fan of Lisp things for numerics. Maybe R has set a bad example. I guess the wonderful thing about Lush is that it has all the pieces within reach, and isn’t too annoying about anything. Nowadays we have GPUs, but that works just fine by calling through a library.
Nice post, thanks for sharing the repo.
Yep, it’s all at your fingertips in one nice tool. R has some GUI doodads with shiny and so on, but the overhead of setting them up is unpleasant and requires you to think in at least two programming languages. There are lots of conveniences in R, and of course you have everything under the sun on CRAN, but I always end up feeling abused, having to read a 200-page manual for each one. The genius of R is in making a package manager that allows statistics experts to contribute code without requiring them to be software experts. The downside is a lot of badly designed (but still useful) stuff encapsulated in packages.
Lisp mostly didn’t concern itself with the floating-point end of the machine. Supposedly CMU-CL and Franz can do it (Richard Fateman asserted so in at least one paper), but the fact that nobody actually has done all the work needed to make that happen makes me dubious. I tried to get Clojure to do it; a futile exercise, as was Incanter. Lush and maybe Xlisp(stat) are the only ones I know of that got that part right. As I said above, it would have been wonderful if Lush had continued development into GPU land. An alternate, better future that never happened.
The last time I looked into Lush and Xlisp-stat as a substitute for R and Matlab, I found Xlisp-stat had been partly ported over to Common Lisp: https://github.com/Lisp-Stat/lisp-stat I also thought that Chicken Scheme looked like a possible Lush replacement, because it has good C interoperability and speed.
lisp-stat doesn’t look like it has much in it. Neither tool is really a substitute for R or Matlab. The great strength of R is the package manager which allows software impaired statisticians to contribute their latest ideas without blowing anything up. Xlispstat required a lot of the developer/contributor.
I don’t remember why I didn’t fool around with Chicken or PLT/Racket. I think my idea at the time was that the JVM had a lot of cool stuff built into it, and so Clojure might provide a decent front end to it. Of course the realization that one had to use JNI-ed Fortran to do anything matrix-related was a stopping point for me. Also, pulling JARs in was non-trivial and nothing like the Lush $ experience.
Personal question, Scott. After programming for so long, are you starting to get tired of it? At 55, I just hit a wall and can’t even come up with hobbyist ideas. My interest in putting in the effort for new languages is nearly dead, after failing to learn Haskell. The rest just seem like variations of each other.
Well, I was happy to find this old one I’ve used rather than trying to learn OCaml or whatever. They’re all basically doing the same shit, and they’re all broken, so it’s a means to an end.
Learn “art piece” languages that showcase a style, not production languages. Nothing related to ML, but look at Oz, for example. It started as a logic-programming Scheme, and they have their own SICP (Concepts, Techniques and Models of Computer Programming; PDFs lying around on the interwebs). Every variable is a promise of a constraint (having an exact value is a constraint). Just using it means waiting on the constraint becoming fit enough to continue the thread. Compare to JS .then() and other continuation-passing-looking bullshit.
You can also learn something else, like biology, to see a radically different way to design things. Alan Kay is a mathematician and a biologist, and his original vision for OOP (closer to Erlang than Java) makes it evident. Consider his proposal for a word processor: each letter an object, flocking together like birds following locally simple rules, with complex overall behavior.
And of course you may simply suffer from deep mental exhaustion, just like too much pussy will produce men who subtly lack motivation too early in their lives. That’s more of a spiritual problem.