`find` + `mkdir` is Turing complete

Update history

2024-08-02 Fixed a proof error that existed in the first version. The first version claimed to have shown Turing completeness by implementing Rule 110, but there was an issue that it cannot handle an unbounded number of iterations. The current version implements the Tag system instead of Rule 110, and the issue should have been resolved.

Overview

We show that a system that has only GNU's find and mkdir commands is Turing-complete.

It is well known that the sed and awk commands are Turing complete on their own, but I could not find any reference of find + mkdir being Turing-complete.

The proof is done by implementing a tag system.

We will look at the implementation of a loop, FizzBuzz, and a tag system in this order.

References

Implementation

Loop construction

The following code recursively creates directories and enters an infinite loop:

mkdir x
find x -execdir mkdir x/x \;

find x lists files under x, including x. When x is listed, it runs mkdir to create x/x, and then it's included in the next find iteration, leading to the creation of x/x/x, and so on.

To limit the depth of directory creation, we can employ the -maxdepth option:

mkdir x
find x -maxdepth 3 -execdir mkdir x/x \;

This code will terminate after creating x/x/x/x/x. Replacing $3$ with $N$ will result in the $N+2$ levels of x directories.

FizzBuzz

The -regex option of find allows you to filter filenames that will be subject to subsequent actions. Using this, we can filter out the x/s with multiples of 3, 5, and 15, and by combining this with a loop, we can implement FizzBuzz. In the following code, -regextype posix-extended is used for readability, but the same thing should be possible with any regular expression syntax.

mkdir -p d/x
find d/x -regextype posix-extended -regex 'd(/x){0,29}' -execdir mkdir x/x \;
find d -regextype posix-extended \
-regex 'd((/x){15})+' -printf "FizzBuzz\n" -o \
-regex 'd((/x){5})+' -printf "Buzz\n" -o \
-regex 'd((/x){3})+' -printf "Fizz\n" -o \
-regex 'd(/x)+' -printf "%d\n"

Execution result

On the second line, it creates x/... /x (30 xs) below d, and then, on the third line, it outputs FizzBuzz if the number of xs is a multiple of 15, Buzz if it is a multiple of 5, Fizz if it is a multiple of 3, and otherwise the depth of the file from d, i.e. the number of xs.

Implementing a tag system

A tag system is a triplet $(m, A, P)$ , where

$m$ is a positive integer.
$A$ is a finite alphabet containing the halting symbol H.
$P$ is the production rule for each alphabet.

Given the initial string the system repeats

If the string's length is less than $m$ or its leading character $x$ is H, halt. Otherwise, add $P(x)$ to the end.
Delete the first $m$ characters.

and outputs the string when it halts.

It's known that there is a tag system with $m=2, |A|=576$ that is a universal Turing machine (De Mol, L., 2008), and thus a system that can run a tag system of this size is Turing complete.

Here we use an example of a tag system with $m=2, |A|=4$ from Wikipedia and show its implementation.

The core idea is to iteratively append the file path representing the next state to the file path that represents a state using _ as a separator. For example, the state _/b/a/a/_ will be followed by the next state a/c/c/a resulting in the file path _/b/a/a/_/a/c/c/a/_, and this process is repeated until the system halts. During the execution, no more than one file is created in a directory.

# Demonstrate 2-tag system can be implemented with `find` and `mkdir` only,
# using an example from Wikipedia.
# en.wikipedia.org/wiki/Tag_system#Example:_A_simple_2-tag_illustration

# _'s are used as a separator between states.
mkdir _

# Production rules for an M-tag system given as constants.
M=2
PROD_A="c/c/b/a/H"
PROD_B="c/c/a"
PROD_C="c/c"

# Initial state (surrounded by _'s).
mkdir -p _/b/a/a/_

# Alphabets (constant size)
S="(/[abcH])"

# Keep appending the next state to the end of the state until seeing the halting symbol. e.g.
# _/b/a/a/_ -> _/b/a/a/_/a/c/c/a/_ -> _/b/a/a/_/a/c/c/a/_/c/a/c/c/b/a/H/_ -> ... -> .../_/H/c/c/c/c/c/c/a/_ (halt)
#
# Line 2:    Halting condition
# Line 3-6:  Copy the previous state removing M characters from the start, utilizing back-references.
# Line 7-29: Apply the production rule for a, b, c.
find _ -regextype posix-extended -empty \( \
    -regex ".*_/H.*_|.*_(/[^/]?){,$M}_" -prune -o \
    -regex ".*_$S{$M}($S*)/a$S*/_\2" \( -execdir mkdir a/a b/a c/a H/a _/a \; -o -true \) -o \
    -regex ".*_$S{$M}($S*)/b$S*/_\2" \( -execdir mkdir a/b b/b c/b H/b _/b \; -o -true \) -o \
    -regex ".*_$S{$M}($S*)/c$S*/_\2" \( -execdir mkdir a/c b/c c/c H/c _/c \; -o -true \) -o \
    -regex ".*_$S{$M}($S*)/H$S*/_\2" \( -execdir mkdir a/H b/H c/H H/H _/H \; -o -true \) -o \
    \( \
        -regex ".*_/a$S*/_$S*" \( \
            -execdir find a \; -execdir mkdir -p a/$PROD_A/_ \; -o \
            -execdir find b \; -execdir mkdir -p b/$PROD_A/_ \; -o \
            -execdir find c \; -execdir mkdir -p c/$PROD_A/_ \; -o \
            -execdir find H \; -execdir mkdir -p H/$PROD_A/_ \; -o \
            -execdir find _ \; -execdir mkdir -p _/$PROD_A/_ \; \
        \) -o \
        -regex ".*_/b$S*/_$S*" \( \
            -execdir find a \; -execdir mkdir -p a/$PROD_B/_ \; -o \
            -execdir find b \; -execdir mkdir -p b/$PROD_B/_ \; -o \
            -execdir find c \; -execdir mkdir -p c/$PROD_B/_ \; -o \
            -execdir find H \; -execdir mkdir -p H/$PROD_B/_ \; -o \
            -execdir find _ \; -execdir mkdir -p _/$PROD_B/_ \; \
        \) -o \
        -regex ".*_/c$S*/_$S*" \( \
            -execdir find a \; -execdir mkdir -p a/$PROD_C/_ \; -o \
            -execdir find b \; -execdir mkdir -p b/$PROD_C/_ \; -o \
            -execdir find c \; -execdir mkdir -p c/$PROD_C/_ \; -o \
            -execdir find H \; -execdir mkdir -p H/$PROD_C/_ \; -o \
            -execdir find _ \; -execdir mkdir -p _/$PROD_C/_ \; \
        \) \
    \) \
\) &> /dev/null

# Output the result. (_/H/c/c/c/c/c/c/a/_)
find _ -depth ! -empty -name _ -execdir find _ -empty \; -quit

Execution result

You can see that the expected result _/H/c/c/c/c/c/c/a/_ is indeed printed. Whereas the FizzBuzz implementation used ordinary regular expressions, notice here we use back-references (\2), which make it possible to copy the previous state without the first $m$ characters.

This construct can be obviously extended to a larger (constant) alphabet size. (The lack of characters can be easily addressed by making each file have more than one character.)

Therefore, find + mkdir is Turing complete.

Expected questions and answers

Is it possible that you cannot execute automata of arbitrary size due to the length limit of the file path?
- I think it is probably not the case. mkdir will fail if you pass it a file path of a certain length or longer, but the code above carefully avoids passing arbitrary-length file paths directly to mkdir. As far as I've tested, find works even with paths longer than 30000, and I haven't found a limit.
Is the above code guaranteed to work according to the POSIX specification?
- Unfortunately, no. Rather, it is clearly stated that the behavior is unspecified if a file is added in the directory being searched during execution (source). I haven't checked how tools other than GNU behave.
- The versions of the tools I used are shown below.

-> % find --version | head -1 && mkdir --version | head -1 && uname -a
find (GNU findutils) 4.8.0
mkdir (GNU coreutils) 8.32
Linux DESKTOP-5JU1LI7 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Acknowledgements

Thanks to deredede for pointing out a proof error that existed in the first version and reviewing the correction.
Thanks to niederman for the comment about tag systems

find + mkdir is Turing complete