CLIP: Connecting Text and Images: We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized, similar to the ‘zero-shot’ capabilities of GPT-2 and GPT-3
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Contrastive Representation Learning: A Framework and Review
WaveNet: A Generative Model for Raw Audio
Malware Detection by Eating a Whole EXE
Multi-trait analysis of genome-wide association summary statistics using MTAG
Speech2Face: Learning the Face Behind a Voice
‘variance components’ directory
Assessing the Big Five personality traits using real-life static facial images
LipNet: End-to-End Sentence-level Lipreading
LipNet: How Easy Do You Think Lipreading Is?
Absolute Unit NNs: Regression-Based MLPs for Everything
https://www.lesswrong.com/posts/K7AyY8LMrcKhwfbyj/no-really-attention-is-all-you-need-attention-can-do
Scaling MLPs: A Tale of Inductive Bias
Technology Forecasting: The Garden of Forking Paths
Scaling Laws for Neural Language Models
Chinchilla: Training Compute-Optimal Large Language Models
Attention Is All You Need
The Scaling Hypothesis
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers
Bayesian Optimization in AlphaGo
InvertOrNot.com Proposal
Abandoning Objectives: Evolution Through the Search for Novelty Alone
Towards a Human-like Open-Domain Chatbot
Scaling Laws for Reward Model Overoptimization
Timeghost
crop#aspect-ratio-training
SDXL § Micro-Conditioning: Conditioning the Model on Image Size
Choose-Your-Own-Adventure AI Dungeon Games
GPT-2 Preference Learning for Music Generation § Optimization by Backprop, Not Blackbox
Visual Autoregressive Modeling (VAR): Scalable Image Generation via Next-Scale Prediction
Progressive Growing of GANs for Improved Quality, Stability, and Variation
PixelRNN: Pixel Recurrent Neural Networks
Parallel Multiscale Autoregressive Density Estimation
not-so-BigGAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution
Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
CM3: A Causal Masked Multimodal Model of the Internet
MAE: Masked Autoencoders Are Scalable Vision Learners
https://arxiv.org/pdf/2307.01952.pdf#page=3
Claude Plays Pokemon
Image GPT (iGPT): We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples
Nenex: A Neural Personal Wiki Idea
index#scaling-laws
LLM Applications I Want To See
‘AI mode collapse’ directory
Virtual comments: LLM idea
Hierarchical Embeddings for Text Search
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Recursively Summarizing Books with Human Feedback
Co-Writing Screenplays and Theatre Scripts with Language Models (Dramatron): An Evaluation by Industry Professionals
https://arxiv.org/pdf/2209.14958#page=5&org=deepmind
design#future-tag-features
The Curious Case of Neural Text Degeneration
‘discrete diffusion model’ directory
resorter#noisy-sorting
https://beta.openai.com/docs/guides/classifications
Text and Code Embeddings by Contrastive Pre-Training
01#gzip
Calculating The Gaussian Expected Maximum § Probability of Bivariate Maximum
The Relationship Of Validity Coefficients To The Practical Effectiveness Of Tests In Selection: Discussion And Tables
Number Search Engine via NN Embeddings
littlewood#media
Websim, Worldsim, and The Summer of Simulative AI
Some Evidence of Bees and Honey in Ancient Egypt
leprechaun#miscitation
Leprechaun Hunting & Citogenesis
Chaff Bugs: Deterring Attackers by Making Software Buggier
ROME: Locating and Editing Factual Associations in GPT
Activation Addition: Steering Language Models Without Optimization
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
‘truesight (stylometry)’ directory
The Art of the Shadow: How Painters Have Gotten It Wrong for Centuries [From The Visual World of Shadows]
Three Months in Monte Carlo
Analytic and Algorithmic Solution of Random Satisfiability Problems
Anime Crop Datasets: Faces, Figures, & Hands § Hands
DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
gpt-2-preference-learning#differentiable-sorting
Unsupervised Neural Machine Translation with Generative Language Models Only
https://www.crosslabs.org/blog/diffusion-with-offset-noise
Progressive Distillation for Fast Sampling of Diffusion Models
Consistency Models
Problem 14 Dynamic Programming Solutions
mixup: Beyond Empirical Risk Minimization
DataMUX: Data Multiplexing for Neural Networks
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Rectified Flow: A Marginal Preserving Approach to Optimal Transport
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
‘recurrent Transformer’ directory
Diffusion Is Spectral Autoregression
Text Embeddings Reveal (Almost) As Much As Text
Absolute Unit NNs: Regression-Based MLPs for Everything § Memorize All The Things
DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language
GANs Didn’t Fail, They Were Abandoned
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
GigaGAN: Scaling up GANs for Text-to-Image Synthesis
BigGAN: Consistency Regularization (SimCLR-Style) Loss
A Simple Framework for Contrastive Learning of Visual Representations
Training GANs with Stronger Augmentations via Contrastive Discriminator (ContraD)
Self-conditioned Image Generation via Generating Representations
The Unusual Effectiveness of Averaging in GAN Training
Stochastic Weight Averaging and the Ornstein-Uhlenbeck Process
Connecting Generative Adversarial Networks and Actor-Critic Methods
How AI Training Scales
face#minibatch-retrieval
Making Anime Faces With StyleGAN § Reversing StyleGAN To Control & Modify Images
face#biggan-latent-space
Net2Net: Accelerating Learning via Knowledge Transfer
The Cost of Imbalance in Clinical Trials
The Power of Twins: The Scottish Milk Experiment
Policy Learning and Evaluation with Randomized Quasi-Monte Carlo
Small-GAN: Speeding Up GAN Training Using Core-sets
Top-K Training of GANs: Improving GAN Performance by Throwing Away Bad Samples
https://algorithmsbook.com/files/dm.pdf#page=246
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks
Generator Knows What Discriminator Should Learn in Unconditional GANs
Simple statistical gradient-following algorithms for connectionist reinforcement learning
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
Distilling the Knowledge in a Neural Network
https://x.com/mere_mortise/status/934932000796020736
Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks
Sem-GAN: Semantically-Consistent Image-to-Image Translation
Improving Shape Deformation in Unsupervised Image-to-Image Translation
Detecting GAN generated errors
A U-Net Based Discriminator for Generative Adversarial Networks
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
ImageNet: A Large-Scale Hierarchical Image Database
Novelty Nets: Classifier Anti-Guidance
[D] RL: GANs As MCTS Environment Simulator for Deep Model-Based Planning?
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Data-dependent Initializations of Convolutional Neural Networks
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Deep Information Propagation
On weight initialization in deep neural networks
Convolution Aware Initialization
HyperNetworks
Using Fast Weights to Attend to the Recent Past
SMASH: One-Shot Model Architecture Search through HyperNetworks
https://www.lesswrong.com/posts/2JJtxitp6nqu6ffak/basic-facts-about-language-models-during-training-1?commentId=M3wsmwiGBCxd4dHHW
GPT-2 Preference Learning for Music Generation § Bradley-Terry Preference Learning
GPT-2 Preference Learning for Music Generation § Decision Transformers: Preference Learning As Simple As Possible
Gato: A Generalist Agent
Learning to summarize from human feedback
‘AlphaStar’ directory
Player of Games
Diversifying AI: Towards Creative Chess with AlphaZero (AZdb)
MLMAC
Better Language Models and Their Implications
Bigscience/bloom
XLNet: Generalized Autoregressive Pretraining for Language Understanding
https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation
SolidGoldMagikarp II: Technical Details and More Recent Findings
https://www.lesswrong.com/posts/8viQEp8KBg2QSW4Yc/solidgoldmagikarp-iii-glitch-token-archaeology
GPT-3 Creative Fiction § BPEs
scaling-hypothesis#blessings-of-scale
On Being The Right Size
Computer Optimization: Your Computer Is Faster Than You Think § DL
Motion Planning for Dynamic Knotting of a Flexible Rope with a High-speed Robot Arm
Motion Planning for Dynamic Folding of a Cloth with Two High-Speed Robot Hands and Two High-Speed Sliders
Free-Play Periods for RL Agents
Brit-Pick
The Surprising Number of American Adults Who Think Chocolate Milk Comes from Brown Cows
https://ru.wikipedia.org/wiki/%D0%92%D1%8F%D0%B7%D1%8C
‘A Font Inspired by Square Word Calligraphy’, Pomdepin
https://fontsinuse.com/typefaces/40498/ed-interlock
Utext: Rich Unicode Documents
XKCD #941: Depth Perception
Depth Perception
Speculative Loading
Prerender Pages in Chrome for Instant Page Navigations
Web APIs: Speculation Rules API
Banner Ads Considered Harmful
Catitecture: Better Cat Window Boxes
LAION-Aesthetics
Sandspiel
State-Space of Drug Effects: Results
Darknet Market Archives (2013–2015)
Acne: a good Quantified Self topic
anime#battle-angel-alita
movie#ready-player-one
https://www.juliansanchez.com/2009/12/08/the-redactors-dilemma/
https://www.fastcompany.com/90692176/chinese-wikipedia
Nucleus Genomics
Formal Theory of Creativity & Fun & Intrinsic Motivation (1990–2010)
Magic, Explanations, and Evil: The Origins and Design of Witches and Sorcerers [and replies]