Bibliography:

  1. ‘AI’ tag

  2. ‘adversarial examples (human)’ tag

  3. ‘adversarial examples (AI)’ tag

  4. ‘Anthropic’ tag

  5. ‘CNN’ tag

  6. ‘discrete diffusion model’ tag

  7. ‘diffusion model’ tag

  8. ‘black sun sigil’ tag

  9. ‘Dropcat dropcaps’ tag

  10. ‘Gene Wolfe dropcaps’ tag

  11. ‘dropcaps (AI typography)’ tag

  12. /doc/ai/nn/diffusion/midjourney/dropcap/ninit

  13. ‘Midjourney’ tag

  14. /doc/ai/nn/diffusion/midjourney/landscape

  15. ‘dynamic evaluation (NN)’ tag

  16. ‘MLP NN’ tag

  17. ‘BigGAN’ tag

  18. ‘data-augmented GANs’ tag

  19. ‘GAN’ tag

  20. ‘StyleGAN anime’ tag

  21. ‘StyleGAN’ tag

  22. ‘ProGAN’ tag

  23. ‘retrieval AI’ tag

  24. ‘RNN’ tag

  25. ‘NN sampling’ tag

  26. ‘NN sparsity’ tag

  27. ‘knowledge distillation’ tag

  28. ‘reduced-precision NNs’ tag

  29. ‘NN pruning’ tag

  30. ‘LM tokenization’ tag

  31. ‘AlphaFold’ tag

  32. ‘compressed Transformers’ tag

  33. ‘multi-scale Transformers’ tag

  34. ‘self-attention’ tag

  35. ‘Transformer matrix optimizations’ tag

  36. ‘recurrent Transformers’ tag

  37. ‘sparse Transformers’ tag

  38. ‘CLIP’ tag

  39. ‘CLIP samples’ tag

  40. ‘GPT-2 fiction’ tag

  41. ‘GPT-2’ tag

  42. ‘GPT-2 nonfiction’ tag

  43. /doc/ai/nn/transformer/gpt/2/poetry

  44. ‘GPT-3 fiction’ tag

  45. ‘GPT-3 humor’ tag

  46. ‘GPT-3’ tag

  47. ‘GPT-3 nonfiction’ tag

  48. ‘GPT-3 poetry’ tag

  49. ‘GPT-4 fiction’ tag

  50. ‘GPT-4’ tag

  51. ‘GPT-4 nonfiction’ tag

  52. ‘GPT-4 poetry’ tag

  53. ‘Sydney (AI)’ tag

  54. ‘GPT-5’ tag

  55. ‘GPT calibration’ tag

  56. ‘Claude AI’ tag

  57. ‘Codex’ tag

  58. ‘DALL·E 1’ tag

  59. ‘DALL·E 2’ tag

  60. ‘DALL·E 3’ tag

  61. ‘DALL·E’ tag

  62. ‘GPT fiction’ tag

  63. ‘GPT’ tag

  64. ‘inner monologue (AI)’ tag

  65. ‘instruct-tuning LLMs’ tag

  66. ‘Jukebox’ tag

  67. ‘LaMDA’ tag

  68. ‘GPT non-fiction’ tag

  69. ‘PaLM 2’ tag

  70. ‘PaLM’ tag

  71. ‘GPT poetry’ tag

  72. ‘Whisper NN’ tag

  73. ‘Transformer’ tag

  74. ‘T5 Transformer’ tag

  75. ‘autoencoder NN’ tag

  76. ‘masked autoencoder’ tag

  77. ‘cellular automata’ tag

  78. 2019 News

  79. Research Ideas

  80. The Neural Net Tank Urban Legend

  81. Surprisingly Turing-Complete

  82. Evolution as Backstop for Reinforcement Learning

  83. ARPA and SCI: Surfing AI

  84. Computer Optimization: Your Computer Is Faster Than You Think

  85. Timing Technology: Lessons From The Media Lab

  86. Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World

  87. Why concepts are (probably) vectors

  88. Robin Hanson: Prediction Markets, the Future of Civilization, and Polymathy—#66 § Opposition to DL

  89. Memorization in Machine Learning: A Survey of Results

  90. Simultaneous linear connectivity of neural networks modulo permutation

  91. The boundary of neural network trainability is fractal

  92. Tweets to Citations: Unveiling the Impact of Social Media Influencers on AI Research Visibility

  93. Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization

  94. Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

  95. How deep is the brain? The shallow brain hypothesis

  96. Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture

  97. Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition

  98. Efficient Video and Audio processing with Loihi 2

  99. Latent State Models of Training Dynamics

  100. Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

  101. Combining Human Expertise with Artificial Intelligence: Experimental Evidence from Radiology

  102. The Architecture of a Biologically Plausible Language Organ

  103. Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training

  104. Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?

  105. Symbolic Discovery of Optimization Algorithms

  106. The Forward-Forward Algorithm: Some Preliminary Investigations

  107. Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability

  108. Do Current Multi-Task Optimization Methods in Deep Learning Even Help?

  109. Selective neutralization and deterring of cockroaches with laser automated by machine vision

  110. Git Re-Basin: Merging Models modulo Permutation Symmetries

  111. Learning with Differentiable Algorithms

  112. Normalized Activation Function: Toward Better Convergence

  113. Bugs in the Data: How ImageNet Misrepresents Biodiversity

  114. The Value of Out-of-Distribution Data

  115. AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images

  116. Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training

  117. Adaptive Gradient Methods at the Edge of Stability

  118. Learning with Combinatorial Optimization Layers: a Probabilistic Approach

  119. What Do We Maximize in Self-Supervised Learning?

  120. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

  121. High-performing neural network models of visual cortex benefit from high latent dimensionality

  122. Perceptein: A synthetic protein-level neural network in mammalian cells

  123. Predicting Word Learning in Children from the Performance of Computer Vision Systems

  124. Wav2Vec-Aug: Improved self-supervised training with limited data

  125. The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

  126. An Improved One millisecond Mobile Backbone

  127. Greedy Bayesian Posterior Approximation with Deep Ensembles

  128. Generating Scientific Claims for Zero-Shot Scientific Fact Checking

  129. Deep Lexical Hypothesis: Identifying personality structure in natural language

  130. Gradients without Backpropagation

  131. Towards Scaling Difference Target Propagation by Learning Backprop Targets

  132. M5 accuracy competition: Results, findings, and conclusions

  133. Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models

  134. Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow

  135. Artificial intelligence ‘sees’ split electrons

  136. Pushing the frontiers of density functionals by solving the fractional electron problem

  137. Word Golf

  138. Deep learning enables genetic analysis of the human thoracic aorta

  139. Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks

  140. Achieving Human Parity on Visual Question Answering

  141. BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning

  142. Learning in High Dimension Always Amounts to Extrapolation

  143. The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks

  144. The structure of genotype-phenotype maps makes fitness landscapes navigable

  145. Deep Neural Networks and Tabular Data: A Survey

  146. Learning through atypical "phase transitions" in overparameterized neural networks

  147. RAFT: A Real-World Few-Shot Text Classification Benchmark

  148. PPT: Pre-trained Prompt Tuning for Few-shot Learning

  149. DART: Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

  150. ETA Prediction with Graph Neural Networks in Google Maps

  151. Neural Operator: Learning Maps Between Function Spaces

  152. Introducing Triton: Open-Source GPU Programming for Neural Networks

  153. Predictive Coding: a Theoretical and Experimental Review

  154. A connectivity-constrained computational account of topographic organization in primate high-level visual cortex

  155. A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers

  156. Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

  157. Randomness In Neural Network Training: Characterizing The Impact of Tooling

  158. Revisiting Deep Learning Models for Tabular Data

  159. BEiT: BERT Pre-Training of Image Transformers

  160. Revisiting Model Stitching to Compare Neural Representations

  161. Artificial intelligence in China’s revolution in military affairs

  162. The Geometry of Concept Learning

  163. VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

  164. The Modern Mathematics of Deep Learning

  165. Understanding by Understanding Not: Modeling Negation in Language Models

  166. Entailment as Few-Shot Learner

  167. PAWS: Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples

  168. Epistemic Autonomy: Self-supervised Learning in the Mammalian Hippocampus

  169. Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

  170. Contrasting Contrastive Self-Supervised Representation Learning Models

  171. Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations

  172. GWAS in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color

  173. BERTese: Learning to Speak to BERT

  174. Predictive Coding Can Do Exact Backpropagation on Any Neural Network

  175. Barlow Twins: Self-Supervised Learning via Redundancy Reduction

  176. WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning

  177. The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima

  178. Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability

  179. Rip van Winkle’s Razor: A Simple Estimate of Overfit to Test Data

  180. Image Completion via Inference in Deep Generative Models

  181. Contrastive Learning Inverts the Data Generating Process

  182. DirectPred: Understanding self-supervised Learning Dynamics without Contrastive Pairs

  183. MLGO: a Machine Learning Guided Compiler Optimizations Framework

  184. Facial recognition technology can expose political orientation from naturalistic facial images

  185. Solving Mixed Integer Programs Using Neural Networks

  186. Sixteen facial expressions occur in similar contexts worldwide

  187. PiRank: Learning To Rank via Differentiable Sorting

  188. Real-time Synthesis of Imagined Speech Processes from Minimally Invasive Recordings of Neural Activity

  189. Generalization bounds for deep learning

  190. Selective Eye-gaze Augmentation To Enhance Imitation Learning In Atari Games

  191. SimSiam: Exploring Simple Siamese Representation Learning

  192. Recent advances in neurotechnologies with broad potential for neuroscience research

  193. Voting for Authorship Attribution Applied to Dark Web Data

  194. Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too

  195. Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

  196. Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

  197. Guys and Dolls

  198. Open-Domain Question Answering Goes Conversational via Question Rewriting

  199. Digital Voicing of Silent Speech

  200. Rank-Smoothed Pairwise Learning In Perceptual Quality Assessment

  201. Implicit Gradient Regularization

  202. Large Associative Memory Problem in Neurobiology and Machine Learning

  203. AdapterHub: A Framework for Adapting Transformers

  204. Identifying Regulatory Elements via Deep Learning

  205. Is SGD a Bayesian sampler? Well, almost

  206. Bootstrap your own latent (BYOL): A new approach to self-supervised Learning

  207. SCAN: Learning to Classify Images without Labels

  208. Politeness Transfer: A Tag and Generate Approach

  209. Supervised Contrastive Learning

  210. Backpropagation and the brain

  211. Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills

  212. Topology of deep neural networks

  213. Improved Baselines with Momentum Contrastive Learning

  214. The large learning rate phase of deep learning: the catapult mechanism

  215. Fast Differentiable Sorting and Ranking

  216. The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence

  217. Quantifying Independently Reproducible Machine Learning

  218. The Secret History of Facial Recognition: Sixty years ago, a sharecropper’s son invented a technology to identify faces. Then the record of his role all but vanished. Who was Woody Bledsoe, and who was he working for?

  219. Can the Brain Do Backpropagation? Exact Implementation of Backpropagation in Predictive Coding Networks

  220. Learning Neural Activations

  221. 2019 AI Alignment Literature Review and Charity Comparison

  222. Libri-Light: A Benchmark for ASR with Limited or No Supervision

  223. Connecting Vision and Language with Localized Narratives

  224. 12-in-1: Multi-Task Vision and Language Representation Learning

  225. A Deep Learning Framework for Neuroscience

  226. Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules

  227. KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning

  228. Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions

  229. Best practices for the human evaluation of automatically generated text

  230. RandAugment: Practical automated data augmentation with a reduced search space

  231. Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs

  232. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

  233. Engineering a Less Artificial Intelligence

  234. Neural networks are a priori biased towards Boolean functions with low entropy

  235. Simple, Scalable Adaptation for Neural Machine Translation

  236. Emergent Tool Use From Multi-Agent Autocurricula

  237. A Step Toward Quantifying Independently Reproducible Machine Learning Research

  238. Does Machine Translation Affect International Trade? Evidence from a Large Digital Platform

  239. Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?

  240. Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

  241. Deep Set Prediction Networks

  242. Optimizing color for camouflage and visibility using deep learning: the effects of the environment and the observer’s visual system

  243. Speech2Face: Learning the Face Behind a Voice

  244. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

  245. Universal quantum control through deep reinforcement learning

  246. Analysing Mathematical Reasoning Abilities of Neural Models

  247. Reinforcement Learning for Recommender Systems: A Case Study on Youtube

  248. Stochastic Optimization of Sorting Networks via Continuous Relaxations

  249. Surprises in High-Dimensional Ridgeless Least Squares Interpolation

  250. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

  251. Theories of Error Back-Propagation in the Brain

  252. A Replication Study: Machine Learning Models Are Capable of Predicting Sexual Orientation From Facial Images

  253. Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

  254. What makes a good conversation? How controllable attributes affect human judgments

  255. The Evolved Transformer

  256. Forecasting Transformative AI: An Expert Survey

  257. Human few-shot learning of compositional instructions

  258. Evaluation and Accurate Diagnoses of Pediatric Diseases Using Artificial Intelligence

  259. Why Is There No Successful Whole Brain Simulation (Yet)?

  260. High-Performance Medicine: the Convergence of Human and Artificial Intelligence

  261. Identifying Facial Phenotypes of Genetic Disorders Using Deep Learning

  262. Reinventing the Wheel: Discovering the Optimal Rolling Shape With PyTorch

  263. An Empirical Study of Example Forgetting during Deep Neural Network Learning

  264. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

  265. Depth with Nonlinearity Creates No Bad Local Minima in ResNets

  266. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  267. Interpretable Textual Neuron Representations for NLP

  268. Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

  269. Machine Learning to Predict Osteoporotic Fracture Risk from Genotypes

  270. Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

  271. Searching Toward Pareto-Optimal Device-Aware Neural Architectures

  272. A Study of Reinforcement Learning for Neural Machine Translation

  273. Modeling Visual Context is Key to Augmenting Object Detection Datasets

  274. Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search

  275. Automatically Composing Representation Transformations as a Means for Generalization

  276. Differentiable Learning-to-Normalize via Switchable Normalization

  277. On the Spectral Bias of Neural Networks

  278. Neural Tangent Kernel: Convergence and Generalization in Neural Networks

  279. Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning

  280. Do CIFAR-10 Classifiers Generalize to CIFAR-10?

  281. Zero-Shot Dual Machine Translation

  282. Do Better ImageNet Models Transfer Better?

  283. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

  284. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

  285. Averaging Weights Leads to Wider Optima and Better Generalization

  286. SentEval: An Evaluation Toolkit for Universal Sentence Representations

  287. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

  288. Analyzing Uncertainty in Neural Machine Translation

  289. End-to-end deep image reconstruction from human brain activity

  290. Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari

  291. signSGD: Compressed Optimization for Non-Convex Problems

  292. Differentiable Dynamic Programming for Structured Prediction and Attention

  293. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

  294. Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

  295. Panoptic Segmentation

  296. Clinically Applicable Deep Learning for Diagnosis and Referral in Retinal Disease

  297. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning

  298. Three-dimensional visualization and a deep-learning model reveal complex fungal parasite networks in behaviorally manipulated ants

  299. Decoupled Weight Decay Regularization

  300. Automatic differentiation in PyTorch

  301. Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

  302. mixup: Beyond Empirical Risk Minimization

  303. Malware Detection by Eating a Whole EXE

  304. AlphaGo Zero: Mastering the game of Go without human knowledge

  305. Swish: Searching for Activation Functions

  306. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

  307. Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

  308. Emergence of Locomotion behaviors in Rich Environments

  309. The Persistence and Transience of Memory

  310. Verb Physics: Relative Physical Knowledge of Actions and Objects

  311. Driver Identification Using Automobile Sensor Data from a Single Turn

  312. StreetStyle: Exploring world-wide clothing styles from millions of photos

  313. Deep Voice 2: Multi-Speaker Neural Text-to-Speech

  314. WebVision Challenge: Visual Learning and Understanding With Web Data

  315. Inferring and Executing Programs for Visual Reasoning

  316. Visual Attribute Transfer through Deep Image Analogy

  317. On weight initialization in deep neural networks

  318. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

  319. RACE: Large-scale ReAding Comprehension Dataset From Examinations

  320. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation

  321. Prototypical Networks for Few-shot Learning

  322. Meta Networks

  323. Understanding Synthetic Gradients and Decoupled Neural Interfaces

  324. Adaptive Neural Networks for Efficient Inference

  325. Deep Voice: Real-time Neural Text-to-Speech

  326. Machine Learning Predicts Laboratory Earthquakes

  327. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks

  328. Dermatologist-Level Classification of Skin Cancer With Deep Neural Networks

  329. Child machines

  330. Machine Learning for Systems and Systems for Machine Learning

  331. Feedback Networks

  332. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

  333. Towards Information-Seeking Agents

  334. Spatially Adaptive Computation Time for Residual Networks

  335. Deep Learning Reinvents the Hearing Aid: Finally, wearers of hearing aids can pick out a voice in a crowded room

  336. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

  337. Learning to reinforcement learn

  338. Lip Reading Sentences in the Wild

  339. Could a Neuroscientist Understand a Microprocessor?

  340. A Neural Network Playground

  341. Homotopy Analysis for Tensor PCA

  342. Why does deep and cheap learning work so well?

  343. SGDR: Stochastic Gradient Descent with Warm Restarts

  344. Concrete Problems in AI Safety

  345. SQuAD: 100,000+ Questions for Machine Comprehension of Text

  346. Matching Networks for One Shot Learning

  347. Convolutional Sketch Inversion

  348. Unifying Count-Based Exploration and Intrinsic Motivation

  349. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

  350. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

  351. "Why Should I Trust You?": Explaining the Predictions of Any Classifier

  352. Mastering the game of Go with deep neural networks and tree search

  353. Learning to Compose Neural Networks for Question Answering

  354. How a Japanese Cucumber Farmer Is Using Deep Learning and TensorFlow

  356. Random Gradient-Free Minimization of Convex Functions

  357. Data-dependent Initializations of Convolutional Neural Networks

  358. Online Batch Selection for Faster Training of Neural Networks

  359. Neural Module Networks

  360. Deep DPG (DDPG): Continuous control with deep reinforcement learning

  361. A Neural Algorithm of Artistic Style

  362. VQA: Visual Question Answering

  363. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

  364. Probabilistic Line Searches for Stochastic Optimization

  365. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

  366. Neural Networks and Deep Learning

  367. Neural Networks and Deep Learning § Ch6 Deep Learning

  368. Qualitatively characterizing neural network optimization problems

  369. Freeze-Thaw Bayesian Optimization

  370. Microsoft COCO: Common Objects in Context

  371. Deep Learning in Neural Networks: An Overview

  372. Neural Networks, Manifolds, and Topology

  373. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

  374. Distributed Representations of Words and Phrases and their Compositionality

  375. Whatever next? Predictive brains, situated agents, and the future of cognitive science

  376. Deep Gaussian Processes

  377. Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting

  378. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

  379. Large-scale deep unsupervised learning using graphics processors

  380. A free energy principle for the brain

  381. Understanding the nature of the general factor of intelligence: The role of individual differences in neural plasticity as an explanatory mechanism

  382. Starfish § Bulrushes

  383. Exponentiated Gradient versus Gradient Descent for Linear Predictors

  384. Optimality in Biological and Artificial Networks?

  385. A Sociological Study of the Official History of the Perceptrons Controversy

  386. Turing patterns in CNNs, I: Once over lightly

  387. Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonvenkis dimension

  388. A Sociological Study of the Official History of the Perceptrons Controversy [1993]

  389. The statistical mechanics of learning a rule

  390. On Learning the Past Tenses of English Verbs

  391. Statistical mechanics of learning from examples

  392. Memorization Without Generalization in a Multilayered Neural Network

  393. Symbolic and neural learning algorithms: An experimental comparison

  394. Backpropagation Learning For Multilayer Feed-Forward Neural Networks Using The Conjugate Gradient Method

  395. Artificial Neural Networks, Back Propagation, and the Kelley-Bryson Gradient Procedure

  396. Exhaustive Learning

  397. International Joint Conference on Neural Networks, January 15–19, 1990: Volume 1: Theory Track, Neural and Cognitive Sciences Track

  398. International Joint Conference on Neural Networks, January 15–19, 1990: Volume 2: Applications Track

  399. Explanatory coherence

  400. Parallel Distributed Processing: Implications for Cognition and Development

  401. Cellular neural networks: theory

  402. Cellular neural networks: applications

  403. The Brain As Template

  404. Observation of Phase Transitions in Spreading Activation Networks

  405. Learning representations by backpropagating errors

  406. Storing Infinite Numbers of Patterns in a Spin-Glass Model of Neural Networks

  407. Toward An Interactive Model Of Reading

  408. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences

  409. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms

  410. Speculations on Perceptrons and Other Automata

  411. Pandemonium: A Paradigm for Learning

  412. Some AI Koans (http://www.catb.org/esr/jargon/html/koans.html#id3141241)

  415. The Age of Em, A Book

  416. gsutil config: Obtain credentials and create configuration file

  417. Why Momentum Really Works

  418. Code for Reproducing Results in "Glow: Generative Flow With Invertible 1×1 Convolutions"

  419. Differentiable Finite State Machines

  421. About Sam Greydanus

  422. Contrastive Representation Learning

  424. The Internet’s AI Slop Problem Is Only Going to Get Worse

  425. Glow: Better Reversible Generative Models

  426. Differentiable Programming from Scratch

  427. Deep Reinforcement Learning Doesn't Work Yet

  428. [Commonsense Media Survey on US Generative Media Use]

  430. Gourmand Cat Fence

  431. Simple versus Short: Higher-Order Degeneracy and Error-Correction

  433. Inferring Neural Activity Before Plasticity As a Foundation for Learning beyond Backpropagation

  434. Reddit: Reinforcement Learning subreddit

  435. AI and the Indian Election

  436. Lip Reading Sentences in the Wild [Video]

  437. design#future-tag-features

  438. 2022-12-02-gwern-meme-itsafraid-googlereluctancetoproductizedeeplearningresearch.jpg

  439. 2022-grand-figure2-semanticprojectionpredictionshumanjudgmentsexamplesofdangersizewitnessanimalscitiesmythologicalcreatures.jpg

  440. 2021-arora-descriptionlengthofresnet152.png

  441. 2021-santospata-figure1-hippocampusselfsupervisionlearning.jpg

  442. 2019-lecun-isscctalk-cake.png

  443. 2008-03-03-jonahlehrer-outofthebluecanathinkingrememberingdecisionmakingbiologicallyaccuratebrainbebuiltfromasupercomputer.html

  444. 2000-cartwright-intelligentdataanalysisinscience.pdf

  445. 1997-dhar-intelligentdecisionsupportmethods.pdf

  446. 1993-harth-thecreativeloop.pdf

  447. 1991-sethi-artificialneuralnetworksandstatisticalpatternrecognition.pdf

  448. 1977-agrawala-machinerecognitionofpatterns.pdf

  449. http://unremediatedgender.space/2018/Jan/blame-me-for-trying/

  451. https://aclanthology.org/D10-1115.pdf

  453. https://aleph.se/andart2/math/weird-probability-distributions/

  454. https://fleuret.org/francois/lbdl.html

  456. https://juretriglav.si/compressing-global-illumination-with-neural-networks/

  457. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4553431

  458. https://people.idsia.ch/~juergen/DanNet-triggers-deep-CNN-revolution-2011.html

  460. https://plato.stanford.edu/entries/language-thought/

  461. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37648.pdf

  462. https://vitalik.eth.limo/general/2024/01/30/cryptoai.html

  463. https://web.archive.org/web/20220927022638/https://nautil.us/the-man-who-tried-to-redeem-the-world-with-logic-235253/

  464. https://www.chinalawtranslate.com/overview-of-draft-measures-on-generative-ai/

  465. https://www.kaggle.com/code/andy8744/predict-anime-face-using-pre-trained-model/data

  466. https://www.lesswrong.com/posts/QNQuWB3hS5FrGp5yZ/programmatic-backdoors-dnns-can-use-sgd-to-run-arbitrary

  467. https://www.lesswrong.com/posts/RKDQCB6smLWgs2Mhr/multi-component-learning-and-s-curves

  468. https://www.lesswrong.com/posts/XpCnhaAQrssq8tJBG/an-interactive-introduction-to-grokking-and-mechanistic

  469. https://www.mosaicml.com/blog/mosaic-resnet-deep-dive

  471. https://www.neelnanda.io/mechanistic-interpretability/favourite-papers

  473. https://www.newyorker.com/magazine/1981/12/14/a-i

  474. https://www.protocol.com/china/i-built-bytedance-censorship-machine

  476. https://www.quantamagazine.org/to-be-energy-efficient-brains-predict-their-perceptions-20211115/

  478. https://www.recraft.ai/

  479. https://www.vox.com/future-perfect/23775650/ai-regulation-openai-gpt-anthropic-midjourney-stable

  480. https://www.youtube.com/watch?v=mlXzufEk-2E

  481. https://www.youtube.com/watch?v=ze5i_e_ryTk

  482. https://x.com/RobertMMetcalfe/status/1839703601108664348

  483. https://x.com/scottbelsky/status/1582748549388783617

  484. Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture

  485. https%253A%252F%252Farxiv.org%252Fabs%252F2310.12109.html

  486. Combining Human Expertise with Artificial Intelligence: Experimental Evidence from Radiology

  487. https%253A%252F%252Fwww.nber.org%252Fpapers%252Fw31422.html

  488. Symbolic Discovery of Optimization Algorithms

  489. https%253A%252F%252Farxiv.org%252Fabs%252F2302.06675%2523google.html

  490. Selective neutralization and deterring of cockroaches with laser automated by machine vision

  491. https%253A%252F%252Fwww.tandfonline.com%252Fdoi%252Ffull%252F10.1080%252F00305316.2022.2121777.html

  492. AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images

  493. https%253A%252F%252Farxiv.org%252Fabs%252F2208.11012.html

  494. Towards Scaling Difference Target Propagation by Learning Backprop Targets

  495. https://mila.quebec/en/person/blake-richards/

  496. https://arxiv.org/abs/2201.13415

  497. M5 accuracy competition: Results, findings, and conclusions

  498. https://www.sciencedirect.com/science/article/pii/S0169207021001874

  499. Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow

  500. https://arxiv.org/abs/2112.13314

  501. Pushing the frontiers of density functionals by solving the fractional electron problem

  502. David Pfau

  503. /doc/ai/nn/2021-kirkpatrick.pdf#deepmind

  504. Word Golf

  505. https://www.word.golf/

  506. Learning through atypical "phase transitions" in overparameterized neural networks

  507. https://arxiv.org/abs/2110.00683

  508. BEiT: BERT Pre-Training of Image Transformers

  509. Furu Wei

  510. https://arxiv.org/abs/2106.08254#microsoft

  511. PAWS: Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples

  512. https://arxiv.org/abs/2104.13963#facebook

  513. Contrasting Contrastive Self-Supervised Representation Learning Models

  514. https://arxiv.org/abs/2103.14005

  515. Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations

  516. https://arxiv.org/abs/2103.12719#facebook

  517. DirectPred: Understanding self-supervised Learning Dynamics without Contrastive Pairs

  518. https://arxiv.org/abs/2102.06810#facebook

  519. AdapterHub: A Framework for Adapting Transformers

  520. Kyunghyun Cho

  521. https://arxiv.org/abs/2007.07779

  522. SCAN: Learning to Classify Images without Labels

  523. https://arxiv.org/abs/2005.12320

  524. Supervised Contrastive Learning

  525. https://arxiv.org/abs/2004.11362#google

  526. 2019 AI Alignment Literature Review and Charity Comparison

  527. https://www.lesswrong.com/posts/SmDziGM9hBjW9DKmf/2019-ai-alignment-literature-review-and-charity-comparison

  528. Connecting Vision and Language with Localized Narratives

  529. https://arxiv.org/abs/1912.03098#google

  530. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

  531. https://arxiv.org/abs/1909.11942#google

  532. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

  533. Alex Wang—Personal Site

  534. Nikita Nangia

  535. Amanpreet Singh

  536. Julian Michael

  537. Language Understanding Grounded in Perception and Action

  538. Omer Levy

  539. Sam Bowman

  540. https://arxiv.org/abs/1905.00537

  541. Differentiable Learning-to-Normalize via Switchable Normalization

  542. https://arxiv.org/abs/1806.10779

  543. Averaging Weights Leads to Wider Optima and Better Generalization

  544. https://arxiv.org/abs/1803.05407

  545. Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari

  546. Ilya Loshchilov

  547. Profile – Machine Learning Lab

  548. https://arxiv.org/abs/1802.08842

  549. AlphaGo Zero: Mastering the game of Go without human knowledge

  550. Julian Schrittwieser

  551. Karen Simonyan

  552. Lucas Baker (B.S. ’11)

  553. /doc/reinforcement-learning/model/alphago/2017-silver.pdf#deepmind

  554. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

  555. https://arxiv.org/abs/1708.07120

  556. The Persistence and Transience of Memory

  557. https://www.sciencedirect.com/science/article/pii/S0896627317303653

  558. WebVision Challenge: Visual Learning and Understanding With Web Data

  559. https://arxiv.org/abs/1705.05640

  560. Spatially Adaptive Computation Time for Residual Networks

  561. https://arxiv.org/abs/1612.02297

  562. Large-scale deep unsupervised learning using graphics processors

  563. /doc/ai/scaling/hardware/2009-raina.pdf

  564. Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonenkis dimension

  565. /doc/ai/nn/1994-opper.pdf

  566. A Sociological Study of the Official History of the Perceptrons Controversy [1993]

  567. Mikel Olazaran

  568. /doc/ai/nn/1993-olazaran.pdf

  569. Statistical mechanics of learning from examples

  570. /doc/ai/nn/1992-seung.pdf

  571. Memorization Without Generalization in a Multilayered Neural Network

  572. /doc/ai/nn/1992-hansel.pdf

  573. Parallel Distributed Processing: Implications for Cognition and Development

  574. /doc/ai/nn/1989-mcclelland.pdf

  575. Storing Infinite Numbers of Patterns in a Spin-Glass Model of Neural Networks

  576. /doc/ai/nn/1985-amit.pdf