“Fast Text Compression With Neural Networks”, Matthew V. Mahoney, 2000:

Neural networks have the potential to extend data compression algorithms beyond the character level n-gram models now in use, but have usually been avoided because they are too slow to be practical.

We introduce a model that produces better compression than popular Lempel-Ziv compressors (zip, gzip, compress), and is competitive in time, space, and compression ratio with PPM and Burrows-Wheeler algorithms, currently the best known.

The compressor, a bit-level predictive arithmetic encoder using a 2-layer, 4 × 10⁶ by 1 network, is fast (about 10⁴ characters/second) because only 4–5 connections are simultaneously active and because it uses a variable learning rate optimized for one-pass training.
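
The idea of a sparse, bit-level predictor can be illustrated with a minimal sketch. It is not the paper's implementation. The hash function, the table size, the context orders, and the fixed learning rate are all assumptions chosen for brevity (the abstract describes a variable learning rate and a 4 × 10⁶ input layer). The names `SparseBitPredictor`, `active_inputs`, and `code_length_bits` are hypothetical. Instead of running an arithmetic coder, the sketch sums the ideal code length, −log₂ p, of each bit, which is roughly the size an arithmetic coder would approach.

```python
import math


class SparseBitPredictor:
    """Single-output network over a large hashed input layer.

    Only one input per context order is active for each bit, so a
    prediction touches a handful of weights regardless of table size.
    """

    def __init__(self, table_bits=16, orders=(1, 2, 3, 4), rate=0.2):
        self.size = 1 << table_bits      # assumed size, far below 4 x 10^6
        self.weights = [0.0] * self.size
        self.orders = orders             # context lengths in bytes (assumed)
        self.rate = rate                 # fixed rate: a simplification

    def active_inputs(self, history, partial):
        """Hash each context (last n bytes + bits seen so far) to one index."""
        indices = []
        for n in self.orders:
            h = (n * 0x9E3779B1) & 0xFFFFFFFF
            for b in history[-n:]:
                h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
            h = ((h ^ partial) * 0x01000193) & 0xFFFFFFFF
            indices.append(h % self.size)
        return indices

    def predict(self, indices):
        """Probability that the next bit is 1 (logistic of summed weights)."""
        s = sum(self.weights[i] for i in indices)
        s = max(-30.0, min(30.0, s))     # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, indices, bit, p1):
        """One online gradient step on the active weights only."""
        step = self.rate * (bit - p1)
        for i in indices:
            self.weights[i] += step


def code_length_bits(data, model):
    """Ideal compressed size in bits after a single predict-then-train pass."""
    total = 0.0
    history = []
    keep = max(model.orders)
    for byte in data:
        partial = 1                      # leading 1 marks how many bits are known
        for k in range(7, -1, -1):
            bit = (byte >> k) & 1
            idx = model.active_inputs(history, partial)
            p1 = model.predict(idx)
            p = p1 if bit else 1.0 - p1
            total += -math.log2(max(p, 1e-9))
            model.update(idx, bit, p1)
            partial = (partial << 1) | bit
        history = (history + [byte])[-keep:]
    return total
```

Calling `code_length_bits(data, SparseBitPredictor())` on a byte string returns the estimated size in bits, which can be compared with `8 * len(data)`. A decoder can mirror the same predict-then-update loop, because each prediction depends only on bits already coded.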