Regexps Used to Be AI
We know that “AI is whatever doesn’t work yet”. We also know that people often contrast AI (or DL, or LLMs specifically) derogatorily with classic forms of software, such as regexps: why use an LLM, wasting gigaflops of compute, to do what a few good regexps could…?
So I was amused to discover recently, by accident, while wondering ‘what does the “regular” in “regular expression” mean, anyway?’, that it turns out that regexps are AI.
In fact, they are not even the GOFAI symbolic AI you immediately assumed on hearing that; they were originally connectionist AI research!
Huh? “Gwern, how could a regular expression or regular language, clearly intended for processing text, like in my text editor, possibly be a neural network in any way, shape, or form?”
Well, see for yourself—it turns out, per Wikipedia, that ‘regular events’ were introduced by Kleene himself with the justification of modeling McCulloch-Pitts neural nets! (Which are then modeled by ‘regular languages’ and conveniently written down as ‘regular expressions’, abbreviated to ‘regexps’ or ‘regexes’, and then extended/bastardized in countless ways since.) To quote from his 1951 RAND paper introducing the term for the first time (and conveniently available online):
It thus appears that the appropriate mathematical abstraction for us now is to treat the problem of explaining behavior as though organisms and machines were immortal, having an infinite future though a finite past. We want to provide for behavior that could be used ad infinitum, if merely the nerve net and effector mechanisms were immortal.
By trying to provide for behavior over an infinity of time by a finite mechanism, we have a model for the real problem of providing for complex behavior over a long finite lifetime by a relatively small mechanism.
…7. Regular Events:
7.1 “Regular events” defined: We shall presently describe a class of events which we will call “regular events”. (We would welcome any suggestions as to a more descriptive term. McCulloch and Pitts use a term “prehensible”, introduced rather differently; but since we did not understand their definition, we are hesitant to adopt the term.)…Our class of regular events will be defined inductively, starting with the definite events, and using 3 operations E ∨ F, E F, E ✱ F; ie. it shall be the last class containing the definite events, and closed under these operations.
The ‘regular’ here is not well-defined: as Kleene concedes, it is only a gesture towards modeling ‘regularly occurring events’ (which the neural net automaton must process and respond to). He admitted “regular” was a poor term and asked for a better one; apparently, no one ever came up with anything better, and so we’re stuck with it.
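Kleene’s inductive definition is easy to make concrete: regular events are the smallest class containing the atoms and closed under union, concatenation, and iteration. Here is a minimal sketch in Python (my own illustrative class and function names, not Kleene’s notation; I also use the modern unary star rather than Kleene’s original binary E ✱ F), with a deliberately naive matcher that just tries every split:

```python
from dataclasses import dataclass

# Kleene's 3 operations: union (E ∨ F), concatenation (E F), iteration (star).
@dataclass(frozen=True)
class Sym:
    ch: str        # an atomic event: a single symbol

@dataclass(frozen=True)
class Alt:
    e: object      # E ∨ F
    f: object

@dataclass(frozen=True)
class Cat:
    e: object      # E F
    f: object

@dataclass(frozen=True)
class Star:
    e: object      # E*

def matches(r, s: str) -> bool:
    """Does the whole string s belong to the regular event r?"""
    if isinstance(r, Sym):
        return s == r.ch
    if isinstance(r, Alt):
        return matches(r.e, s) or matches(r.f, s)
    if isinstance(r, Cat):
        # try every split of s: a prefix for e, the remainder for f
        return any(matches(r.e, s[:i]) and matches(r.f, s[i:])
                   for i in range(len(s) + 1))
    if isinstance(r, Star):
        if s == "":
            return True
        # peel off a nonempty prefix matching e, recurse on the rest
        return any(matches(r.e, s[:i]) and matches(r, s[i:])
                   for i in range(1, len(s) + 1))
    raise TypeError(r)

# (ab ∨ c)*: matches "abcab" and "", but not "abb"
r = Star(Alt(Cat(Sym("a"), Sym("b")), Sym("c")))
```

Real regexp engines compile such an expression down to a finite automaton instead of splitting strings recursively, but the inductive structure—atoms closed under Kleene’s 3 operations—is the same.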
(This historical background does make it less surprising that you can do neural net-like things with regular expressions: viewed as sparsely-connected discrete perceptrons operating on symbols rather than floating-point numbers, it makes sense that you can have differentiable finite state machines, evolve accurate neural nets for grammar tasks, train cellular automata, or encode minimax-search chess engines into regexes. It also suggests that perhaps the dominance of regular expressions as we know them has not been a good thing, and has obscured the existence of many alternative ways to do these tasks, like in forgotten early programming languages such as SNOBOL or alternative paradigms like parser combinators or pushdown automata.)
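The ‘discrete perceptron’ view above can be made concrete with a toy sketch (my own illustrative names and wiring, not from Kleene or McCulloch–Pitts directly): a McCulloch-Pitts unit fires when its excitatory inputs reach a threshold and no inhibitor fires, and a single self-exciting unit already implements a 2-state automaton for a regular event:

```python
def mp_neuron(inputs, weights, threshold, inhibitors=()):
    """A McCulloch-Pitts threshold unit: 0/1 in, 0/1 out."""
    if any(inhibitors):          # any active inhibitory input vetoes firing
        return 0
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def seen_a_one(bits):
    """One self-exciting neuron = 1 bit of memory: the 2-state automaton
    for the regular event 'the input has contained a 1', i.e. (0|1)*1(0|1)*."""
    state = 0                    # the neuron's previous output
    for b in bits:
        # fires if the current input is 1 OR it fired last step (recurrence)
        state = mp_neuron([b, state], [1, 1], threshold=1)
    return state

seen_a_one([0, 0, 1, 0])  # → 1: a 1 occurred, and the loop latched it
seen_a_one([0, 0, 0])     # → 0
```

The recurrent connection is what lets a finite net handle Kleene’s “infinite future”: the loop remembers one bit forever, which is exactly the kind of behavior the star operation abstracts.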