“Zip Files: History, Explanation and Implementation”, Hans Wennborg, 2020-02-26:

I have been curious about data compression and the Zip file format in particular for a long time. At some point I decided to address that by learning how it works and writing my own Zip program. The implementation turned into an exciting programming exercise; there is great pleasure to be had from creating a well-oiled machine that takes data apart, jumbles its bits into a more efficient representation, and puts it all back together again. Hopefully it is interesting to read about too.

This article explains how the Zip file format and its compression scheme work in great detail: LZ77 compression, Huffman coding, Deflate and all. It tells some of the history, and provides a reasonably efficient example implementation written from scratch in C… It is fascinating how the evolution of technology is both fast and slow. The Zip format was created 30 years ago based on technology from the fifties and seventies, and while much has changed since then, Zip files are essentially the same and more prevalent than ever. I think it is useful to have a good understanding of how they work.

[Thorough and well-illustrated descriptions of how Lempel-Ziv compression & Huffman coding work.]