Bibliography:

  1. CLIP: Connecting Text and Images: We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized, similar to the ‘zero-shot’ capabilities of GPT-2 and GPT-3

  2. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  3. Contrastive Representation Learning: A Framework and Review

  4. WaveNet: A Generative Model for Raw Audio

  5. Malware Detection by Eating a Whole EXE

  6. Multi-trait analysis of genome-wide association summary statistics using MTAG

  7. Speech2Face: Learning the Face Behind a Voice

  8. Variance Components Beyond Genetics

  9. Assessing the Big Five personality traits using real-life static facial images

  10. LipNet: End-to-End Sentence-level Lipreading

  11. LipNet: How Easy Do You Think Lipreading Is?

  12. Absolute Unit NNs: Regression-Based MLPs for Everything

  13. https://www.lesswrong.com/posts/K7AyY8LMrcKhwfbyj/no-really-attention-is-all-you-need-attention-can-do

  14. Scaling MLPs: A Tale of Inductive Bias

  15. Technology Forecasting: The Garden of Forking Paths

  16. Scaling Laws for Neural Language Models

  17. Chinchilla: Training Compute-Optimal Large Language Models

  18. Attention Is All You Need

  19. The Scaling Hypothesis

  20. Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

  21. Bayesian Optimization in AlphaGo

  22. InvertOrNot.com Proposal

  23. Abandoning Objectives: Evolution Through the Search for Novelty Alone

  24. Towards a Human-like Open-Domain Chatbot

  25. Scaling Laws for Reward Model Overoptimization

  26. Timeghost

  27. crop#aspect-ratio-training


  28. SDXL § Micro-Conditioning: Conditioning the Model on Image Size

  29. Choose-Your-Own-Adventure AI Dungeon Games

  30. GPT-2 Preference Learning for Music Generation § Optimization by Backprop, Not Blackbox

  31. Visual Autoregressive Modeling (VAR): Scalable Image Generation via Next-Scale Prediction

  32. Progressive Growing of GANs for Improved Quality, Stability, and Variation

  33. PixelRNN: Pixel Recurrent Neural Networks

  34. Parallel Multiscale Autoregressive Density Estimation

  35. not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution

  36. Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors

  37. MAE: Masked Autoencoders Are Scalable Vision Learners

  38. https://arxiv.org/pdf/2307.01952.pdf#page=3

  39. Image GPT (iGPT): We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples

  40. Nenex: A Neural Personal Wiki Idea

  41. dynamic-evaluation#scaling-laws


  42. LLM Applications I Want To See

  43. ‘AI mode collapse’ tag

  44. Virtual comments: idea for LLM support for writing LessWrong posts

  45. Hierarchical Embeddings for Text Search

  46. Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

  47. Recursively Summarizing Books with Human Feedback

  48. Co-Writing Screenplays and Theatre Scripts with Language Models (Dramatron): An Evaluation by Industry Professionals

  49. https://arxiv.org/pdf/2209.14958#page=5&org=deepmind

  50. design#future-tag-features


  51. The Curious Case of Neural Text Degeneration

  52. ‘discrete diffusion model’ tag

  53. resorter#noisy-sorting


  54. https://beta.openai.com/docs/guides/classifications

  55. Text and Code Embeddings by Contrastive Pre-Training

  56. 01#gzip


  57. Calculating The Gaussian Expected Maximum § Probability of Bivariate Maximum

  58. The Relationship Of Validity Coefficients To The Practical Effectiveness Of Tests In Selection: Discussion And Tables

  59. Number Search Engine via NN Embeddings

  60. littlewood#media


  61. Websim, Worldsim, and The Summer of Simulative AI

  62. Some Evidence of Bees and Honey in Ancient Egypt

  63. leprechaun#miscitation


  64. Leprechaun Hunting & Citogenesis

  65. Chaff Bugs: Deterring Attackers by Making Software Buggier

  66. ROME: Locating and Editing Factual Associations in GPT

  67. Activation Addition: Steering Language Models Without Optimization

  68. Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

  69. ‘truesight (stylometrics)’ tag

  70. The Art of the Shadow: How Painters Have Gotten It Wrong for Centuries [From The Visual World of Shadows]

  71. Three Months in Monte Carlo

  72. Analytic and Algorithmic Solution of Random Satisfiability Problems

  73. Anime Crop Datasets: Faces, Figures, & Hands § Hands

  74. DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

  75. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

  76. Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise

  77. gpt-2-preference-learning#differentiable-sorting


  78. Unsupervised Neural Machine Translation with Generative Language Models Only

  79. https://www.crosslabs.org/blog/diffusion-with-offset-noise

  80. Progressive Distillation for Fast Sampling of Diffusion Models

  81. Consistency Models

  82. Problem 14 Dynamic Programming Solutions

  83. mixup: Beyond Empirical Risk Minimization

  84. DataMUX: Data Multiplexing for Neural Networks

  85. Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

  86. Rectified Flow: A Marginal Preserving Approach to Optimal Transport

  87. InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

  88. UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

  89. TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation

  90. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

  91. ‘recurrent Transformers’ tag

  92. Diffusion Is Spectral Autoregression

  93. Progressive Growing of GANs for Improved Quality, Stability, and Variation

  94. Text Embeddings Reveal (Almost) As Much As Text

  95. Absolute Unit NNs: Regression-Based MLPs for Everything § Memorize All The Things


  96. DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language

  97. GANs Didn’t Fail, They Were Abandoned

  98. StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

  99. GigaGAN: Scaling up GANs for Text-to-Image Synthesis

  100. BigGAN: Consistency Regularization (SimCLR-Style) Loss

  101. A Simple Framework for Contrastive Learning of Visual Representations

  102. Training GANs with Stronger Augmentations via Contrastive Discriminator (ContraD)

  103. Self-conditioned Image Generation via Generating Representations

  104. The Unusual Effectiveness of Averaging in GAN Training

  105. Stochastic Weight Averaging and the Ornstein-Uhlenbeck Process

  106. Connecting Generative Adversarial Networks and Actor-Critic Methods

  107. How AI Training Scales

  108. face#minibatch-retrieval


  109. Making Anime Faces With StyleGAN § Reversing StyleGAN To Control & Modify Images

  110. face#biggan-latent-space


  111. Net2Net: Accelerating Learning via Knowledge Transfer

  112. The Cost of Imbalance in Clinical Trials

  113. The Power of Twins: The Scottish Milk Experiment

  114. Policy Learning and Evaluation with Randomized Quasi-Monte Carlo

  115. Small-GAN: Speeding Up GAN Training Using Core-sets

  116. Top-K Training of GANs: Improving GAN Performance by Throwing Away Bad Samples

  117. https://algorithmsbook.com/files/dm.pdf#page=246

  118. Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

  119. MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks

  120. Generator Knows What Discriminator Should Learn in Unconditional GANs

  121. Simple statistical gradient-following algorithms for connectionist reinforcement learning

  122. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

  123. Distilling the Knowledge in a Neural Network

  124. https://x.com/mere_mortise/status/934932000796020736

  125. Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

  126. Sem-GAN: Semantically-Consistent Image-to-Image Translation

  127. Improving Shape Deformation in Unsupervised Image-to-Image Translation

  128. Detecting GAN generated errors

  129. A U-Net Based Discriminator for Generative Adversarial Networks

  130. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

  131. ImageNet: A Large-Scale Hierarchical Image Database

  132. Novelty Nets: Classifier Anti-Guidance

  133. [D] RL: GANs As MCTS Environment Simulator for Deep Model-Based Planning?

  134. The Shattered Gradients Problem: If resnets are the answer, then what is the question?

  135. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

  136. Data-dependent Initializations of Convolutional Neural Networks

  137. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

  138. Deep Information Propagation

  139. On weight initialization in deep neural networks

  140. Convolution Aware Initialization

  141. HyperNetworks

  142. Using Fast Weights to Attend to the Recent Past

  143. SMASH: One-Shot Model Architecture Search through HyperNetworks

  144. https://www.lesswrong.com/posts/2JJtxitp6nqu6ffak/basic-facts-about-language-models-during-training-1#M3wsmwiGBCxd4dHHW

  145. GPT-2 Preference Learning for Music Generation § Bradley-Terry Preference Learning

  146. GPT-2 Preference Learning for Music Generation § Decision Transformers: Preference Learning As Simple As Possible

  147. Gato: A Generalist Agent

  148. Learning to summarize from human feedback

  149. ‘AlphaStar’ tag

  150. Player of Games

  151. Diversifying AI: Towards Creative Chess with AlphaZero (AZdb)

  152. MLMAC

  153. Better Language Models and Their Implications

  154. Bigscience/bloom

  155. XLNet: Generalized Autoregressive Pretraining for Language Understanding

  156. https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation

  157. SolidGoldMagikarp II: Technical Details and More Recent Findings

  158. https://www.lesswrong.com/posts/8viQEp8KBg2QSW4Yc/solidgoldmagikarp-iii-glitch-token-archaeology

  159. GPT-3 Creative Fiction § BPEs

  160. scaling-hypothesis#blessings-of-scale


  161. $2023

  162. On Being The Right Size

  163. Computer Optimization: Your Computer Is Faster Than You Think § DL


  164. Motion Planning for Dynamic Knotting of a Flexible Rope with a High-speed Robot Arm

  165. Motion Planning for Dynamic Folding of a Cloth with Two High-Speed Robot Hands and Two High-Speed Sliders

  166. Free-Play Periods for RL Agents

  167. Brit-Pick

  168. The Surprising Number of American Adults Who Think Chocolate Milk Comes from Brown Cows

  169. https://ru.wikipedia.org/wiki/%D0%92%D1%8F%D0%B7%D1%8C

  170. ‘A Font Inspired by Square Word Calligraphy’, Pomdepin

  171. https://fontsinuse.com/typefaces/40498/ed-interlock

  172. Utext: Rich Unicode Documents

  173. XKCD #941: Depth Perception

  174. Depth Perception

  175. Speculative Loading

  176. Prerender Pages in Chrome for Instant Page Navigations

  177. Speculation Rules API - Web APIs

  178. Banner Ads Considered Harmful

  179. Cat itecture: Better Cat Window Boxes

  180. LAION-Aesthetics

  181. Sandspiel

  182. State-Space of Drug Effects: Results

  183. Darknet Market Archives (2013–2015)

  184. Acne: a good Quantified Self topic

  185. anime#battle-angel-alita


  186. movie#ready-player-one


  187. https://www.juliansanchez.com/2009/12/08/the-redactors-dilemma/

  188. https://www.fastcompany.com/90692176/chinese-wikipedia

  189. Nucleus Genomics

  190. Formal Theory of Creativity & Fun & Intrinsic Motivation (1990–2010)

  191. Magic, Explanations, and Evil: The Origins and Design of Witches and Sorcerers [and replies]

  192. Wikipedia Bibliography:

    1. Gravatar

    2. Word Embedding

    3. Perceptual Hashing

    4. Sparklines

    5. Recognition Memory

    6. Principal Component Analysis

    7. T-Distributed Stochastic Neighbor Embedding

    8. Fourier Transform

    9. Single-Nucleotide Polymorphism

    10. Linkage Disequilibrium

    11. Polygenic Score

    12. Lasso (statistics)

    13. Recurrent Neural Network

    14. Big Five Personality Traits

    15. Empirical Bayes Method

    16. Stochastic Gradient Descent

    17. Shrinkage (statistics)

    18. Winner's Curse

    19. Knowledge Distillation

    20. Centroid

    21. N-Sphere

    22. K-Means Clustering

    23. Rejection Sampling

    24. Active Learning (machine learning)

    25. Autoregressive Model

    26. Mipmap

    27. Image Segmentation

    28. SHRDLU

    29. Graphomanic

    30. TV Tropes

    31. Brownian Bridge

    32. AI Dungeon

    33. Choose Your Own Adventure

    34. Kullback-Leibler Divergence

    35. Pareto Front

    36. The Lottery in Babylon

    37. Jorge Luis Borges

    38. Digital Watermarking

    39. Analogue Hole

    40. ‘Tlön, Uqbar, Orbis Tertius’

    41. John Drewe § Career As a Forger

    42. Honeytoken

    43. Trap Street

    44. 555 (telephone number)

    45. Error-Correcting Code

    46. Probabilistically Checkable Proofs

    47. PCP Theorem

    48. Chaffing and Winnowing

    49. Ising Model

    50. Reinforcement Learning

    51. Brownian Bridge

    52. Random Walk

    53. Diffusion Model

    54. JPEG

    55. Embeddings

    56. Simulated Annealing

    57. Variance

    58. Law of Large Numbers

    59. Stratified Sampling

    60. Blocking (statistics)

    61. Coresets

    62. Quasi-Monte Carlo Method

    63. Low-Discrepancy Sequence

    64. Antithetic Variates

    65. Order Statistic § Order Statistics Sampled from a Uniform Distribution

    66. Order Statistic

    67. U-Net

    68. AlphaGo

    69. Monte Carlo Tree Search

    70. Edit Distance

    71. ‘There’s Plenty of Room at the Bottom’

    72. Dollhouse

    73. RoboCup

    74. Meccano

    75. K'Nex

    76. Experience Curve Effects § Reasons for the Effect

    77. Delta Robot

    78. General Social Survey

    79. Hangul

    80. Ligature (writing)

    81. Constructed Writing System

    82. Xu Bing § Square Word Calligraphy

    83. Display Typeface

    84. Ed Benguiat

    85. Tiki Culture

    86. Stereoscopy

    87. Unmanned Aerial Vehicle

    88. Global Positioning System

    89. Geo-Fence

    90. Natural Experiment

    91. Regression Discontinuity Design

    92. The Market for Lemons

    93. Home Inspection

    94. Delivery Drone

    95. Zipline (drone delivery company)

    96. Amazon Prime Air

    97. Webcomic

    98. Girl Genius

    99. Web Fiction § Web Serial

    100. Greasemonkey

    101. WordPress

    102. Zen Sand Gardens

    103. Sokoban

    104. Stephen's Sausage Roll

    105. Ichi-Go Ichi-E

    106. The Witness (2016 video game)

    107. Boustrophedon

    108. Ōkami

    109. Particle System

    110. Falling-Sand Game

    111. Sandbox Game

    112. The Powder Toy

    113. Regression toward the Mean

    114. Alita: Battle Angel

    115. Battle Angel Alita

    116. Ready Player One

    117. Ready Player One (film)

    118. Wikidata

    119. Chinese Wikipedia

    120. 2021 Wikimedia Foundation Actions on the Chinese Wikipedia

    121. Baidu Baike

    122. Weibo

    123. English Wikipedia

    124. 23andMe

    125. Nominative Determinism

    126. Schizophrenia

    127. Pareidolia

    128. Hygiene Hypothesis

    129. Escape Room

    130. Predictive Coding

    131. Gang Stalking

    132. Helminthic Therapy

    133. Isolation Tank

    134. Nicotine

    135. Diplomacy (game)

    136. Social Deduction Games

    137. Mafia (party game)

    138. Dan Rather § ‘Kenneth, What Is the Frequency?’