Bibliography:

  1. ‘information theory’ tag

  2. ‘AI’ tag

  3. ‘NN sparsity’ tag

  4. ‘compressed Transformers’ tag

  5. ‘autoencoder NN’ tag

  6. ‘language’ tag

  7. Research Ideas

  8. Umineko: The Hopium Of The Magics

  9. The sort --key Trick

  10. Against Copyright

  11. The Complexity Dynamics of Grokking

  12. WebP: The WebPage Compression Format

  14. Investigating learning-independent abstract reasoning in artificial neural networks

  15. SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

  16. Training LLMs over Neurally Compressed Text

  17. Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens

  18. Language Modeling Is Compression

  19. Bayesian Flow Networks

  20. Gzip versus bag-of-words for text classification with k-NN

  21. High-Fidelity Audio Compression with Improved RVQGAN

  22. White-Box Transformers via Sparse Rate Reduction

  23. How to enumerate trees from a context-free grammar

  24. DIRAC: Neural Image Compression with a Diffusion-Based Decoder

  25. Less is More: Parameter-Free Text Classification with Gzip

  26. Low-Bitrate Redundancy Coding of Speech Using a Rate-Distortion-Optimized Variational Autoencoder

  27. RGB no more: Minimally-decoded JPEG Vision Transformers

  28. High Fidelity Neural Audio Compression

  29. T2CI-GAN: Text to Compressed Image generation using Generative Adversarial Network

  30. DiffC: Lossy Compression with Gaussian Diffusion

  31. MuZero with Self-competition for Rate Control in VP9 Video Compression

  32. A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution

  33. Palette: Image-to-Image Diffusion Models

  34. Autoregressive Diffusion Models

  35. Variational Diffusion Models

  36. Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis

  37. Why are tar.xz files 15× smaller when using Python’s tar library compared to macOS tar?

  38. Generating Images with Sparse Representations

  39. Rip van Winkle’s Razor: A Simple Estimate of Overfit to Test Data

  40. Generative Speech Coding with Predictive Variance Regularization

  41. 1-bit Adam: Communication Efficient Large-Scale Training with Adam’s Convergence Speed

  42. Scaling Laws for Autoregressive Generative Modeling

  43. not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution

  44. Zip Files: History, Explanation and Implementation

  45. The 1-Bit Instrument: The Fundamentals of 1-Bit Synthesis, Their Implementational Implications, and Instrumental Possibilities

  46. People Prefer Simpler Content When There Are More Choices: A Time Series Analysis of Lyrical Complexity in Six Decades of American Popular Music

  47. Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables

  48. Unraveling the JPEG: JPEG images are everywhere in our digital lives, but behind the veil of familiarity lie algorithms that remove details that are imperceptible to the human eye. This produces the highest visual quality with the smallest file size—but what does that look like? Let’s see what our eyes can’t see!

  49. Practical Lossless Compression with Latent Variables using Bits Back Coding

  50. signSGD: Compressed Optimization for Non-Convex Problems

  51. Lempel-Ziv: a ‘1-bit catastrophe’ but not a tragedy

  52. BBhash: Fast and scalable minimal perfect hashing for massive key sets

  53. Full Resolution Image Compression with Recurrent Neural Networks

  54. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

  55. Compress and Control

  56. A really simple approximation of smallest grammar

  57. One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

  58. The thermodynamics of prediction

  59. Notes on a New Philosophy of Empirical Science

  60. Universal Entropy of Word Ordering Across Linguistic Families

  61. Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers

  62. New Strategy of Lossy Text Compression

  63. A Monte Carlo AIXI Approximation

  64. A Machine Learning Perspective on Predictive Coding With PAQ8 and New Applications

  66. The Bayesian brain: the role of uncertainty in neural coding and computation

  67. Clustering by compression

  68. Data Compression and Entropy Estimates by Non-sequential Recursive Pair Substitution

  69. Compression and Information Leakage of Plaintext

  70. Estimating and Comparing Entropy across Written Natural Languages Using PPM Compression

  71. Language Trees and Zipping

  72. Redundancy reduction revisited

  73. Fast Text Compression with Neural Networks

  74. Text Compression as a Test for Artificial Intelligence

  75. An Information-Theoretic Model for Steganography

  76. The Art of Computer Programming, Volume 3: Sorting & Searching § Chapter 6, Searching: Hashing: History

  77. The Entropy of English Using PPM-Based Models (Data Compression Conference, 1996: DCC '96 Proceedings)

  78. Measuring the complexity of writing systems

  79. Entropy of natural languages: Theory and experiment

  80. Possible Principles Underlying the Transformations of Sensory Messages

  81. Prediction and Entropy of Printed English

  82. About the Test Data

  83. Timm S. Mueller

  85. Codec2: a Whole Podcast on a Floppy Disk

  87. Finding Near-Duplicates With Jaccard Similarity and MinHash

  89. How We Shrank Our Trip Planner till It Didn’t Need Data.

  90. The Complexity Dynamics of Grokking [Blog]

  91. Statistical Inference Through Data Compression

  93. ChessPositionRanking/img/2389704906374985477664262349386869232706664089.png at Main · Tromp/ChessPositionRanking

  94. Relation of Word Order and Compression Ratio and Degree of Structure

  95. King James Programming

  96. That Alien Message

  97. design#future-tag-features

  98. 2010-stevesouder-forcinggzipcompression.html

  99. 2004-ryannorth-dinosaurcomics-391.png

  100. 1999-mahoney-figure1-compressorbenchmarksonenglishtextanddegradationbyshuffling.jpg

  101. http://brokenbytes.blogspot.com/2015/04/the-making-of-p0-snake-part-3-audio.html

  103. http://ed-von-schleck.github.io/shoco/

  105. http://james.fabpedigree.com/bwt.htm

  107. http://keyj.emphy.de/mp3-for-image-compression/

  109. http://neoscientists.org/~tmueller/binsort/

  111. http://penduin.blogspot.com/2006/10/pi-compression.html

  113. http://prize.hutter1.net/

  114. http://slightlynew.blogspot.com/2011/05/who-writes-wikipedia-information.html

  116. http://sub.blue/

  117. http://thevirtuosi.blogspot.com/2011/08/tweet-is-worth-at-least-140-words.html

  119. http://timbaumann.info/svd-image-compression-demo/

  120. http://www.byronknoll.com/cmix.html

  122. http://www.daemonology.net/papers/thesis.pdf

  124. http://www.scholarpedia.org/article/Algorithmic_probability

  125. https://ai.facebook.com/blog/deepfovea-using-deep-learning-for-foveated-reconstruction-in-ar-vr

  126. https://bellard.org/nncp/

  127. https://blog.andrewcantino.com/blog/2012/06/15/compressing-code/

  128. https://blog.cloudflare.com/brotli-compression-using-a-reduced-dictionary/

  130. https://blog.cloudflare.com/improving-compression-with-preset-deflate-dictionary/

  132. https://blog.cloudflare.com/results-experimenting-brotli/

  134. https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-part-1/

  135. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=716a587f2852bb8134454143868e860cabdfe84f

  136. https://clemenswinter.com/2024/04/07/the-simple-beauty-of-xor-floating-point-compression/

  138. https://cloudinary.com/blog/a_one_color_image_is_worth_two_thousand_words#the_most_predictable_image

  140. https://cloudinary.com/blog/jpeg-xl-and-the-pareto-front

  142. https://code.flickr.net/2015/09/25/perceptual-image-compression-at-flickr/

  144. https://code.google.com/archive/p/paqclass

  146. https://code4k.blogspot.com/2010/12/crinkler-secrets-4k-intro-executable.html

  147. https://dbohdan.com/jpeg-xl

  148. https://fastcompression.blogspot.com/2018/02/when-to-use-dictionary-compression.html

  150. https://frankforce.com/city-in-a-bottle-a-256-byte-raycasting-system/

  152. https://gafferongames.com/post/snapshot_compression/

  154. https://gist.github.com/munificent/b1bcd969063da3e6c298be070a22b604

  155. https://github.com/Blosc/c-blosc

  157. https://github.com/WICG/compression-dictionary-transport/blob/main/examples#static-resource-flow-results

  158. https://github.com/albertz/png-db

  159. https://github.com/ckolivas/lrzip

  160. https://github.com/facebook/zstd#the-case-for-small-data-compression

  161. https://github.com/mafm/HashLife

  162. https://github.com/mhx/dwarfs?tab=readme-ov-file#comparison

  164. https://github.com/mhx/dwarfs?tab=readme-ov-file#overview

  165. https://github.com/sasakiassociates/png-db

  166. https://github.com/thomasahle/ziplm

  167. https://hackaday.io/project/5689-lossy-text-compression

  169. https://intapi.sciendo.com/pdf/10.2478/ijasitels-2020-0003

  171. https://kenschutte.com/gzip-knn-paper2/

  172. https://kevincox.ca/2022/03/01/dictionary-compression/

  174. https://killedbyapixel.github.io/TinyCode/games/CrossMyHeart/

  175. https://kylehovey.github.io/blog/automata-nebula

  176. https://laurmaedje.github.io/posts/hypher/

  178. https://lichess.org/@/lichess/blog/developer-update-275-improved-game-compression/Wqa7GiAA

  179. https://mailinator.blogspot.com/2012/02/how-mailinator-compresses-email-by-90.html

  181. https://matradomski.com/posts/data_compression/

  183. https://mattmahoney.net/dc/dce.html

  184. https://maxhalford.github.io/blog/text-classification-by-compression/

  185. https://opus-codec.org/demo/opus-1.5/

  187. https://research.google/blog/lyra-a-new-very-low-bitrate-codec-for-speech-compression/

  188. https://rs.io/creativity-literature-compression/#methods

  189. https://samwho.dev/bloom-filters/

  190. https://shkspr.mobi/blog/2024/01/compressing-text-into-images/

  192. https://spectrum.ieee.org/hans-peter-luhn-and-the-birth-of-the-hashing-algorithm

  194. https://terrytao.wordpress.com/2007/04/13/compressed-sensing-and-single-pixel-cameras/

  196. https://timepedia.blogspot.com/2009/08/on-reducing-size-of-compressed.html

  197. https://timepedia.blogspot.com/2009/11/traveling-salesman-problem-and.html

  198. https://triplehappy.wordpress.com/2015/10/26/chess-move-compression/

  200. https://web.archive.org/web/20100924002346/http://blog.podly.tv/the-lost-quarter-century-in-data-compression

  202. https://web.archive.org/web/20140918110745/http://friggeri.net/blog/a-genetic-approach-to-css-compression/

  203. https://wiki.archlinux.org/title/Lrzip

  204. https://wrap.warwick.ac.uk/61087/7/WRAP_cs-rr-360.pdf#page=2

  205. https://www.abortretry.fail/p/lz-compression

  207. https://www.antoniomallia.it/sorted-integers-compression-with-elias-fano-encoding.html

  209. https://www.chromium.org/developers/design-documents/software-updates-courgette/

  211. https://www.filfre.net/2013/12/elite/

  212. https://www.jodybruchon.com/2010/11/27/sort-compressed-tar-archives-to-make-them-smaller-20-percent-smaller/

  214. https://www.lofibucket.com/articles/64k_intro.html

  216. https://www.stavros.io/posts/compressing-images-with-stable-diffusion/

  217. https://x.com/patriciogv/status/1443931444292866063
