The Turing test for artificial intelligence is widely accepted, but is subjective, qualitative, non-repeatable, and difficult to implement. An alternative test without these drawbacks is to insert a machine’s language model into a predictive encoder and compress a corpus of natural language text. A ratio of 1.3 bits per character or less indicates that the machine has AI.
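The bits-per-character measure used throughout can be sketched as follows. This is a hypothetical illustration, not the paper's encoder: `zlib` stands in for the predictive compressor, and bpc is the compressed size in bits divided by the number of characters in the original text.

```python
import zlib

def bits_per_character(text: str) -> float:
    """Compress `text` with zlib (a stand-in compressor) and return
    compressed output bits per original input character."""
    data = text.encode("ascii")
    compressed = zlib.compress(data, 9)  # maximum compression level
    return 8 * len(compressed) / len(data)

# Highly repetitive text compresses far below the 1.3 bpc threshold;
# real English narrative under a general-purpose compressor does not.
sample = "the quick brown fox jumps over the lazy dog " * 200
print(f"{bits_per_character(sample):.2f} bpc")
```

A machine language model passing the test would bring ordinary English text, not just repetitive samples, under the 1.3 bpc line.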
Three pieces of evidence support this claim. First, text compression is shown to be more stringent than the Turing test under reasonable assumptions. Second, humans use high-level knowledge in character-prediction tests. Third, compression, like AI, is unsolved: under conditions in which human text-prediction tests show an entropy of 1.3 bits per character or less, the best compression algorithm known achieves 1.87 bits per character.
No compression program has achieved this. Seven programs, including those top-rated by Gilchrist (1998) and Bell (1998), were used to compress English narrative: Alice in Wonderland (alice30.txt from Project Gutenberg, minus the header) and Far from the Madding Crowd by Thomas Hardy (book1 from the Calgary corpus), after reducing both texts to a 27-character alphabet. The best compression was achieved by rkive 1.91b1: 1.86 bpc on alice and 1.94 bpc on book1. The others tested (from worst to best) were compress 4.3d, pkzip 2.04e, gzip 1.2.4, ha 0.98, szip 1.05x, and boa 0.58b. All program options were set for maximum compression.
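The 27-character reduction can be sketched as below. The exact mapping is an assumption: the text says only that both books were reduced to 27 characters, which a plausible reading takes to be the lowercase letters a-z plus space, with everything else folded into spaces.

```python
import re

def reduce_to_27(text: str) -> str:
    """Reduce text to an assumed 27-character alphabet: a-z plus space.
    Uppercase is folded to lowercase; runs of any other characters
    (digits, punctuation, newlines) collapse to a single space."""
    text = text.lower()
    text = re.sub(r"[^a-z]+", " ", text)
    return text.strip()

print(reduce_to_27("Alice was beginning to get VERY tired..."))
# -> alice was beginning to get very tired
```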
Better compressors “learn”, using prior input to improve compression on subsequent input. szip was the best learner, compressing book1 to about 95% of the size of the two halves compressed separately. The first figure below shows the correlation between compression and learning. Similar results were obtained for alice.
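The learning measure described here can be sketched as a ratio: the compressed size of the whole file divided by the summed sizes of its two halves compressed independently. `zlib` again stands in for the compressors actually tested; a ratio below 1.0 means statistics learned on the first half helped compress the second.

```python
import zlib

def learning_ratio(text: str) -> float:
    """Compressed size of the whole input divided by the summed sizes of
    its two halves compressed independently. Lower values indicate more
    'learning' carried across the midpoint."""
    data = text.encode("ascii")
    mid = len(data) // 2
    whole = len(zlib.compress(data, 9))
    halves = len(zlib.compress(data[:mid], 9)) + len(zlib.compress(data[mid:], 9))
    return whole / halves

text = "it was the best of times it was the worst of times " * 100
print(f"learning ratio: {learning_ratio(text):.3f}")
```

By this measure, szip's figure above corresponds to a ratio of about 0.95 on book1.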
It was also found that better compressors make greater use of the syntactic and semantic constraints of English. Lexical, syntactic, and semantic constraints were selectively broken by swapping pairs of letters within words, pairs of words, or pairs of phrases, respectively. Results for the original text of book1 are shown in the second figure, with similar results for alice. The swapping transforms are reversible and do not change file size or information content.
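One of the constraint-breaking transforms can be sketched as below: swapping adjacent letter pairs within each word to break lexical constraints. The deterministic pairing is an assumption (the text does not say how pairs were chosen); it has the stated properties of being reversible, here self-inverse, and preserving file size.

```python
import re

def swap_letter_pairs(text: str) -> str:
    """Break lexical constraints: swap letters at positions (0,1), (2,3),
    ... within each word. Applying the transform twice restores the
    original, and the output length always equals the input length."""
    def swap_word(m: "re.Match[str]") -> str:
        w = list(m.group(0))
        for i in range(0, len(w) - 1, 2):
            w[i], w[i + 1] = w[i + 1], w[i]
        return "".join(w)
    return re.sub(r"[a-z]+", swap_word, text)

s = "far from the madding crowd"
t = swap_letter_pairs(s)
print(t)
assert swap_letter_pairs(t) == s  # reversible
assert len(t) == len(s)          # size unchanged
```

Swapping whole words or whole phrases instead of letters would break the syntactic and semantic constraints in the same reversible, size-preserving way.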