See Also

Links
- “How Deep Is the Brain? The Shallow Brain Hypothesis”, Suzuki et al 2023
- “Grokking Beyond Neural Networks: An Empirical Exploration With Model Complexity”, Miller et al 2023
- “Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture”, Fu et al 2023
- “Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition”, Chen et al 2023
- “Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”, Agarwal et al 2023
- “Adam Accumulation to Reduce Memory Footprints of Both Activations and Gradients for Large-scale DNN Training”, Zhang et al 2023
- “Protecting Society from AI Misuse: When Are Restrictions on Capabilities Warranted?”, Anderljung & Hazell 2023
- “Symbolic Discovery of Optimization Algorithms”, Chen et al 2023
- “ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition”, Gandhi et al 2022
- “Do Current Multi-Task Optimization Methods in Deep Learning Even Help?”, Xin et al 2022
- “Selective Neutralization and Deterring of Cockroaches With Laser Automated by Machine Vision”, Rakhmatulin et al 2022
- “Git Re-Basin: Merging Models modulo Permutation Symmetries”, Ainsworth et al 2022
- “Learning With Differentiable Algorithms”, Petersen 2022
- “Normalized Activation Function: Toward Better Convergence”, Peiwen & Changsheng 2022
- “Bugs in the Data: How ImageNet Misrepresents Biodiversity”, Luccioni & Rolnick 2022
- “AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images”, Naftali et al 2022
- “The Value of Out-of-Distribution Data”, Silva et al 2022
- “Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training”, You et al 2022
- “Learning With Combinatorial Optimization Layers: a Probabilistic Approach”, Dalle et al 2022
- “What Do We Maximize in Self-Supervised Learning?”, Shwartz-Ziv et al 2022
- “Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit”, Barak et al 2022
- “High-performing Neural Network Models of Visual Cortex Benefit from High Latent Dimensionality”, Elmoznino & Bonner 2022
- “Predicting Word Learning in Children from the Performance of Computer Vision Systems”, Rane et al 2022
- “Wav2Vec-Aug: Improved Self-supervised Training With Limited Data”, Sriram et al 2022
- “The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon”, Thilak et al 2022
- “An Improved One Millisecond Mobile Backbone”, Vasu et al 2022
- “Greedy Bayesian Posterior Approximation With Deep Ensembles”, Tiulpin & Blaschko 2022
- “Generating Scientific Claims for Zero-Shot Scientific Fact Checking”, Wright et al 2022
- “Model Soups: Averaging Weights of Multiple Fine-tuned Models Improves Accuracy without Increasing Inference Time”, Wortsman et al 2022
- “Deep Lexical Hypothesis: Identifying Personality Structure in Natural Language”, Cutler & Condon 2022
- “Gradients without Backpropagation”, Baydin et al 2022
- “Towards Scaling Difference Target Propagation by Learning Backprop Targets”, Ernoult et al 2022
- “M5 Accuracy Competition: Results, Findings, and Conclusions”, Makridakis et al 2022
- “Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models”, Kim et al 2022
- “Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow”, Tambon et al 2021
- “Artificial Intelligence ‘sees’ Split Electrons”, Perdew 2021
- “Pushing the Frontiers of Density Functionals by Solving the Fractional Electron Problem”, Kirkpatrick et al 2021
- “ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction”, Santhanam et al 2021
- “Word Golf”, Xia 2021
- “Deep Learning Enables Genetic Analysis of the Human Thoracic Aorta”, Pirruccello et al 2021
- “Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks”, Ericsson et al 2021
- “Achieving Human Parity on Visual Question Answering”, Yan et al 2021
- “BC-Z: Zero-Shot Task Generalization With Robotic Imitation Learning”, Jang et al 2021
- “Learning in High Dimension Always Amounts to Extrapolation”, Balestriero et al 2021
- “TWIST: Self-Supervised Learning by Estimating Twin Class Distributions”, Wang et al 2021
- “The Structure of Genotype-phenotype Maps Makes Fitness Landscapes Navigable”, Greenbury et al 2021
- “The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks”, Entezari et al 2021
- “Deep Neural Networks and Tabular Data: A Survey”, Borisov et al 2021
- “Learning through Atypical ‘Phase Transitions’ in Overparameterized Neural Networks”, Baldassi et al 2021
- “RAFT: A Real-World Few-Shot Text Classification Benchmark”, Alex et al 2021
- “PPT: Pre-trained Prompt Tuning for Few-shot Learning”, Gu et al 2021
- “Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners”, Zhang et al 2021
- “ETA Prediction With Graph Neural Networks in Google Maps”, Derrow-Pinion et al 2021
- “Predictive Coding: a Theoretical and Experimental Review”, Millidge et al 2021
- “Neuroprosthesis for Decoding Speech in a Paralyzed Person With Anarthria”, Moses et al 2021
- “A Connectivity-constrained Computational Account of Topographic Organization in Primate High-level Visual Cortex”, Blauch et al 2021
- “A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers”, Miao et al 2021
- “Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation”, James et al 2021
- “Revisiting Deep Learning Models for Tabular Data”, Gorishniy et al 2021
- “Randomness In Neural Network Training: Characterizing The Impact of Tooling”, Zhuang et al 2021
- “BEiT: BERT Pre-Training of Image Transformers”, Bao et al 2021
- “Revisiting Model Stitching to Compare Neural Representations”, Bansal et al 2021
- “Artificial Intelligence in China’s Revolution in Military Affairs”, Kania 2021
- “The Geometry of Concept Learning”, Sorscher et al 2021
- “VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning”, Bardes et al 2021
- “Understanding by Understanding Not: Modeling Negation in Language Models”, Hosseini et al 2021
- “Entailment As Few-Shot Learner”, Wang et al 2021
- “PAWS: Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments With Support Samples”, Assran et al 2021
- “Computer Optimization: Your Computer Is Faster Than You Think”, Gwern 2021
- “Epistemic Autonomy: Self-supervised Learning in the Mammalian Hippocampus”, Santos-Pata et al 2021
- “Rip Van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis”, Arora & Zhang 2021
- “Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization”, Xie et al 2021
- “Contrasting Contrastive Self-Supervised Representation Learning Models”, Kotar et al 2021
- “Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations”, Ryali et al 2021
- “GWAS in Almost 195,000 Individuals Identifies 50 Previously Unidentified Genetic Loci for Eye Color”, Simcoe et al 2021
- “BERTese: Learning to Speak to BERT”, Haviv et al 2021
- “Predictive Coding Can Do Exact Backpropagation on Any Neural Network”, Salvatori et al 2021
- “Barlow Twins: Self-Supervised Learning via Redundancy Reduction”, Zbontar et al 2021
- “WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning”, Srinivasan et al 2021
- “Rip Van Winkle’s Razor: A Simple Estimate of Overfit to Test Data”, Arora & Zhang 2021
- “Image Completion via Inference in Deep Generative Models”, Harvey et al 2021
- “Contrastive Learning Inverts the Data Generating Process”, Zimmermann et al 2021
- “DirectPred: Understanding Self-supervised Learning Dynamics without Contrastive Pairs”, Tian et al 2021
- “MLGO: a Machine Learning Guided Compiler Optimizations Framework”, Trofin et al 2021
- “Facial Recognition Technology Can Expose Political Orientation from Naturalistic Facial Images”, Kosinski 2021
- “Solving Mixed Integer Programs Using Neural Networks”, Nair et al 2020
- “Sixteen Facial Expressions Occur in Similar Contexts Worldwide”, Cowen et al 2020
- “PiRank: Learning To Rank via Differentiable Sorting”, Swezey et al 2020
- “Real-time Synthesis of Imagined Speech Processes from Minimally Invasive Recordings of Neural Activity”, Angrick et al 2020
- “Generalization Bounds for Deep Learning”, Valle-Pérez & Louis 2020
- “Selective Eye-gaze Augmentation To Enhance Imitation Learning In Atari Games”, Thammineni et al 2020
- “Inductive Biases for Deep Learning of Higher-Level Cognition”, Goyal & Bengio 2020
- “Exploring Simple Siamese Representation Learning”, Chen & He 2020
- “Recent Advances in Neurotechnologies With Broad Potential for Neuroscience Research”, Vázquez-Guardado et al 2020
- “Voting for Authorship Attribution Applied to Dark Web Data”, Samreen & Alalfi 2020
- “Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding”, Roberts et al 2020
- “Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too”, Hernández-Orallo 2020
- “Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary With Width and Depth”, Nguyen et al 2020
- “Open-Domain Question Answering Goes Conversational via Question Rewriting”, Anantha et al 2020
- “Digital Voicing of Silent Speech”, Gaddy & Klein 2020
- “Rank-Smoothed Pairwise Learning In Perceptual Quality Assessment”, Talebi et al 2020
- “Implicit Gradient Regularization”, Barrett & Dherin 2020
- “Large Associative Memory Problem in Neurobiology and Machine Learning”, Krotov & Hopfield 2020
- “AdapterHub: A Framework for Adapting Transformers”, Pfeiffer et al 2020
- “On Linear Identifiability of Learned Representations”, Roeder et al 2020
- “Identifying Regulatory Elements via Deep Learning”, Barshai et al 2020
- “Is SGD a Bayesian Sampler? Well, Almost”, Mingard et al 2020
- “Bootstrap Your Own Latent (BYOL): A New Approach to Self-supervised Learning”, Grill et al 2020
- “SCAN: Learning to Classify Images without Labels”, Van Gansbeke et al 2020
- “Open-Retrieval Conversational Question Answering”, Qu et al 2020
- “Politeness Transfer: A Tag and Generate Approach”, Madaan et al 2020
- “Supervised Contrastive Learning”, Khosla et al 2020
- “Can You Put It All Together: Evaluating Conversational Agents’ Ability to Blend Skills”, Smith et al 2020
- “Backpropagation and the Brain”, Lillicrap et al 2020
- “TREC CAsT 2019: The Conversational Assistance Track Overview”, Dalton et al 2020
- “Improved Baselines With Momentum Contrastive Learning”, Chen et al 2020
- “The Large Learning Rate Phase of Deep Learning: the Catapult Mechanism”, Lewkowycz et al 2020
- “Fast Differentiable Sorting and Ranking”, Blondel et al 2020
- “The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence”, Marcus 2020
- “Quantifying Independently Reproducible Machine Learning”, Raff 2020
- “The Secret History of Facial Recognition: Sixty Years Ago, a Sharecropper’s Son Invented a Technology to Identify Faces. Then the Record of His Role All but Vanished. Who Was Woody Bledsoe, and Who Was He Working For?”, Raviv 2020
- “ImageNet-A: Natural Adversarial Examples”, Hendrycks et al 2020
- “Can the Brain Do Backpropagation? Exact Implementation of Backpropagation in Predictive Coding Networks”, Song et al 2020
- “Learning Neural Activations”, Minhas & Asif 2019
- “2019 AI Alignment Literature Review and Charity Comparison”, Larks 2019
- “Libri-Light: A Benchmark for ASR With Limited or No Supervision”, Kahn et al 2019
- “Connecting Vision and Language With Localized Narratives”, Pont-Tuset et al 2019
- “12-in-1: Multi-Task Vision and Language Representation Learning”, Lu et al 2019
- “2019 News”, Gwern 2019
- “A Deep Learning Framework for Neuroscience”, Richards et al 2019
- “Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules”, Sanchez-Lengeling et al 2019
- “KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition With Deep Learning”, Clanuwat et al 2019
- “Approximate Inference in Discrete Distributions With Monte Carlo Tree Search and Value Functions”, Buesing et al 2019
- “Best Practices for the Human Evaluation of Automatically Generated Text”, Lee et al 2019
- “RandAugment: Practical Automated Data Augmentation With a Reduced Search Space”, Cubuk et al 2019
- “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”, Lan et al 2019
- “Large-scale Pretraining for Neural Machine Translation With Tens of Billions of Sentence Pairs”, Meng et al 2019
- “Neural Networks Are a Priori Biased towards Boolean Functions With Low Entropy”, Mingard et al 2019
- “Engineering a Less Artificial Intelligence”, Sinz et al 2019
- “Simple, Scalable Adaptation for Neural Machine Translation”, Bapna et al 2019
- “Emergent Tool Use From Multi-Agent Autocurricula”, Baker et al 2019
- “A Step Toward Quantifying Independently Reproducible Machine Learning Research”, Raff 2019
- “Does Machine Translation Affect International Trade? Evidence from a Large Digital Platform”, Brynjolfsson et al 2019b
- “Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?”, Kleinfeld et al 2019
- “Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges”, Arivazhagan et al 2019
- “Deep Set Prediction Networks”, Zhang et al 2019
- “Optimizing Color for Camouflage and Visibility Using Deep Learning: the Effects of the Environment and the Observer’s Visual System”, Fennell et al 2019
- “Cold Case: The Lost MNIST Digits”, Yadav & Bottou 2019
- “Speech2Face: Learning the Face Behind a Voice”, Oh et al 2019
- “SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems”, Wang et al 2019
- “Universal Quantum Control through Deep Reinforcement Learning”, Niu et al 2019
- “Analysing Mathematical Reasoning Abilities of Neural Models”, Saxton et al 2019
- “Reinforcement Learning for Recommender Systems: A Case Study on Youtube”, Chen 2019
- “Stochastic Optimization of Sorting Networks via Continuous Relaxations”, Grover et al 2019
- “Surprises in High-Dimensional Ridgeless Least Squares Interpolation”, Hastie et al 2019
- “DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs”, Dua et al 2019
- “Theories of Error Back-Propagation in the Brain”, Whittington & Bogacz 2019
- “A Replication Study: Machine Learning Models Are Capable of Predicting Sexual Orientation From Facial Images”, Leuner 2019
- “Unmasking Clever Hans Predictors and Assessing What Machines Really Learn”, Lapuschkin et al 2019
- “What Makes a Good Conversation? How Controllable Attributes Affect Human Judgments”, See et al 2019
- “The Evolved Transformer”, So et al 2019
- “Forecasting Transformative AI: An Expert Survey”, Gruetzemacher et al 2019
- “Human Few-shot Learning of Compositional Instructions”, Lake et al 2019
- “Identifying Facial Phenotypes of Genetic Disorders Using Deep Learning”, Gurovich et al 2019
- “High-performance Medicine: the Convergence of Human and Artificial Intelligence”, Topol 2019
- “Why Is There No Successful Whole Brain Simulation (Yet)?”, Stiefel 2019
- “Evaluation and Accurate Diagnoses of Pediatric Diseases Using Artificial Intelligence”, Liang et al 2019
- “Reinventing the Wheel: Discovering the Optimal Rolling Shape With PyTorch”, Wiener 2019
- “An Empirical Study of Example Forgetting during Deep Neural Network Learning”, Toneva et al 2018
- “Evolution As Backstop for Reinforcement Learning”, Gwern 2018
- “CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge”, Talmor et al 2018
- “Depth With Nonlinearity Creates No Bad Local Minima in ResNets”, Kawaguchi & Bengio 2018
- “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, Devlin et al 2018
- “Interpretable Textual Neuron Representations for NLP”, Poerner et al 2018
- “Machine Learning to Predict Osteoporotic Fracture Risk from Genotypes”, Forgetta et al 2018
- “Searching for Efficient Multi-Scale Architectures for Dense Image Prediction”, Chen et al 2018
- “Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction”, Hashimoto & Tsuruoka 2018
- “Searching Toward Pareto-Optimal Device-Aware Neural Architectures”, Cheng et al 2018
- “A Study of Reinforcement Learning for Neural Machine Translation”, Wu et al 2018
- “Neural Arithmetic Logic Units”, Trask et al 2018
- “Modeling Visual Context Is Key to Augmenting Object Detection Datasets”, Dvornik et al 2018
- “Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search”, Zela et al 2018
- “Automatically Composing Representation Transformations As a Means for Generalization”, Chang et al 2018
- “ARPA and SCI: Surfing AI”, Gwern 2018
- “Differentiable Learning-to-Normalize via Switchable Normalization”, Luo et al 2018
- “On the Spectral Bias of Neural Networks”, Rahaman et al 2018
- “Neural Tangent Kernel: Convergence and Generalization in Neural Networks”, Jacot et al 2018
- “Faster SGD Training by Minibatch Persistency”, Fischetti et al 2018
- “Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning”, Pang et al 2018
- “Do CIFAR-10 Classifiers Generalize to CIFAR-10?”, Recht et al 2018
- “Zero-Shot Dual Machine Translation”, Sestorain et al 2018
- “Do Better ImageNet Models Transfer Better?”, Kornblith et al 2018
- “GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding”, Wang et al 2018
- “Adafactor: Adaptive Learning Rates With Sublinear Memory Cost”, Shazeer & Stern 2018
- “Think You Have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge”, Clark et al 2018
- “SentEval: An Evaluation Toolkit for Universal Sentence Representations”, Conneau & Kiela 2018
- “Averaging Weights Leads to Wider Optima and Better Generalization”, Izmailov et al 2018
- “Analyzing Uncertainty in Neural Machine Translation”, Ott et al 2018
- “End-to-end Deep Image Reconstruction from Human Brain Activity”, Shen et al 2018
- “Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari”, Chrabaszcz et al 2018
- “SignSGD: Compressed Optimisation for Non-Convex Problems”, Bernstein et al 2018
- “Differentiable Dynamic Programming for Structured Prediction and Attention”, Mensch & Blondel 2018
- “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction”, McInnes et al 2018
- “Semantic Projection: Recovering Human Knowledge of Multiple, Distinct Object Features from Word Embeddings”, Grand et al 2018
- “Panoptic Segmentation”, Kirillov et al 2018
- “Prediction of Cardiovascular Risk Factors from Retinal Fundus Photographs via Deep Learning”, Poplin et al 2018
- “Clinically Applicable Deep Learning for Diagnosis and Referral in Retinal Disease”, Fauw et al 2018
- “Deep Image Reconstruction from Human Brain Activity”, Shen et al 2017
- “The NarrativeQA Reading Comprehension Challenge”, Kočiský et al 2017
- “China’s A.I. Advances Help Its Tech Industry, and State Security”, Mozur & Bradsher 2017
- “Three-dimensional Visualization and a Deep-learning Model Reveal Complex Fungal Parasite Networks in Behaviorally Manipulated Ants”, Fredericksen et al 2017
- “Decoupled Weight Decay Regularization”, Loshchilov & Hutter 2017
- “Unsupervised Machine Translation Using Monolingual Corpora Only”, Lample et al 2017
- “Automatic Differentiation in PyTorch”, Paszke et al 2017
- “Rethinking Generalization Requires Revisiting Old Ideas: Statistical Mechanics Approaches and Complex Learning Behavior”, Martin & Mahoney 2017
- “Malware Detection by Eating a Whole EXE”, Raff et al 2017
- “Mixup: Beyond Empirical Risk Minimization”, Zhang et al 2017
- “AlphaGo Zero: Mastering the Game of Go without Human Knowledge”, Silver et al 2017
- “Swish: Searching for Activation Functions”, Ramachandran et al 2017
- “Online Learning of a Memory for Learning Rates”, Meier et al 2017
- “Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”, Smith & Topin 2017
- “Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection”, Dwibedi et al 2017
- “Emergence of Locomotion Behaviors in Rich Environments”, Heess et al 2017
- “Six Challenges for Neural Machine Translation”, Koehn & Knowles 2017
- “Verb Physics: Relative Physical Knowledge of Actions and Objects”, Forbes & Choi 2017
- “Driver Identification Using Automobile Sensor Data from a Single Turn”, Hallac et al 2017
- “StreetStyle: Exploring World-wide Clothing Styles from Millions of Photos”, Matzen et al 2017
- “Deep Voice 2: Multi-Speaker Neural Text-to-Speech”, Arik et al 2017
- “WebVision Challenge: Visual Learning and Understanding With Web Data”, Li et al 2017
- “Inferring and Executing Programs for Visual Reasoning”, Johnson et al 2017
- “Visual Attribute Transfer through Deep Image Analogy”, Liao et al 2017
- “On Weight Initialization in Deep Neural Networks”, Kumar 2017
- “A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference”, Williams et al 2017
- “RACE: Large-scale ReAding Comprehension Dataset From Examinations”, Lai et al 2017
- “Data-efficient Deep Reinforcement Learning for Dexterous Manipulation”, Popov et al 2017
- “Research Ideas”, Gwern 2017
- “Prototypical Networks for Few-shot Learning”, Snell et al 2017
- “Meta Networks”, Munkhdalai & Yu 2017
- “Understanding Synthetic Gradients and Decoupled Neural Interfaces”, Czarnecki et al 2017
- “Deep Voice: Real-time Neural Text-to-Speech”, Arik et al 2017
- “Adaptive Neural Networks for Efficient Inference”, Bolukbasi et al 2017
- “Machine Learning Predicts Laboratory Earthquakes”, Rouet-Leduc et al 2017
- “Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks”, Katz et al 2017
- “Dermatologist-level Classification of Skin Cancer With Deep Neural Networks”, Esteva et al 2017
- “Machine Learning for Systems and Systems for Machine Learning”, Dean 2017
- “Feedback Networks”, Zamir et al 2016
- “CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning”, Johnson et al 2016
- “Towards Information-Seeking Agents”, Bachman et al 2016
- “Spatially Adaptive Computation Time for Residual Networks”, Figurnov et al 2016
- “Deep Learning Reinvents the Hearing Aid: Finally, Wearers of Hearing Aids Can Pick out a Voice in a Crowded Room”, Wang 2016b
- “MS MARCO: A Human Generated MAchine Reading COmprehension Dataset”, Bajaj et al 2016
- “Learning to Reinforcement Learn”, Wang et al 2016
- “Lip Reading Sentences in the Wild”, Chung et al 2016
- “Could a Neuroscientist Understand a Microprocessor?”, Jonas & Kording 2016
- “A Neural Network Playground”, Smilkov & Carter 2016
- “Deep Information Propagation”, Schoenholz et al 2016
- “Homotopy Analysis for Tensor PCA”, Anandkumar et al 2016
- “Language As a Latent Variable: Discrete Generative Models for Sentence Compression”, Miao & Blunsom 2016
- “Why Does Deep and Cheap Learning Work so Well?”, Lin et al 2016
- “SGDR: Stochastic Gradient Descent With Warm Restarts”, Loshchilov & Hutter 2016
- “Concrete Problems in AI Safety”, Amodei et al 2016
- “SQuAD: 100,000+ Questions for Machine Comprehension of Text”, Rajpurkar et al 2016
- “Matching Networks for One Shot Learning”, Vinyals et al 2016
- “Convolutional Sketch Inversion”, Güçlütürk et al 2016
- “Unifying Count-Based Exploration and Intrinsic Motivation”, Bellemare et al 2016
- “Synthesizing the Preferred Inputs for Neurons in Neural Networks via Deep Generator Networks”, Nguyen et al 2016
- “Improving Information Extraction by Acquiring External Evidence With Reinforcement Learning”, Narasimhan et al 2016
- “Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity”, Daniely et al 2016
- “"Why Should I Trust You?": Explaining the Predictions of Any Classifier”, Ribeiro et al 2016
- “Mastering the Game of Go With Deep Neural Networks and Tree Search”, Silver et al 2016
- “Learning to Compose Neural Networks for Question Answering”, Andreas et al 2016
- “How a Japanese Cucumber Farmer Is Using Deep Learning and TensorFlow”, Sato 2016
- “Random Gradient-Free Minimization of Convex Functions”, Nesterov & Spokoiny 2015
- “Data-dependent Initializations of Convolutional Neural Networks”, Krähenbühl et al 2015
- “Online Batch Selection for Faster Training of Neural Networks”, Loshchilov & Hutter 2015
- “Neural Module Networks”, Andreas et al 2015
- “Deep DPG (DDPG): Continuous Control With Deep Reinforcement Learning”, Lillicrap et al 2015
- “A Neural Algorithm of Artistic Style”, Gatys et al 2015
- “VQA: Visual Question Answering”, Agrawal et al 2015
- “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”, Weston et al 2015
- “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”, He et al 2015
- “Freeze-Thaw Bayesian Optimization”, Swersky et al 2014
- “Microsoft COCO: Common Objects in Context”, Lin et al 2014
- “Deep Learning in Neural Networks: An Overview”, Schmidhuber 2014
- “Neural Networks, Manifolds, and Topology”, Olah 2014
- “Exact Solutions to the Nonlinear Dynamics of Learning in Deep Linear Neural Networks”, Saxe et al 2013
- “Distributed Representations of Words and Phrases and Their Compositionality”, Mikolov et al 2013
- “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science”, Clark 2013
- “Surprisingly Turing-Complete”, Gwern 2012
- “Deep Gaussian Processes”, Damianou & Lawrence 2012
- “Timing Technology: Lessons From The Media Lab”, Gwern 2012
- “Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting”, Xie et al 2012
- “The Neural Net Tank Urban Legend”, Gwern 2011
- “HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent”, Niu et al 2011
- “How Complex Are Individual Differences?”, Gwern 2010
- “A Free Energy Principle for the Brain”, Friston et al 2006
- “Understanding the Nature of the General Factor of Intelligence: The Role of Individual Differences in Neural Plasticity As an Explanatory Mechanism”, Garlick 2002
- “DARPA and the Quest for Machine Intelligence, 1983–1993”, Roland & Shiman 2002
- “Starfish § Bulrushes”, Watts 1999
- “Optimality in Biological and Artificial Networks?”, Levine & Elsberry 1997
- “A Sociological Study of the Official History of the Perceptrons Controversy”, Olazaran 1996
- “Statistical Mechanics of Generalization”, Opper & Kinzel 1996
- “Learning and Generalization in a Two-layer Neural Network: The Role of the Vapnik-Chervonenkis Dimension”, Opper 1994
- “A Sociological Study of the Official History of the Perceptrons Controversy [1993]”, Olazaran 1993
- “The Statistical Mechanics of Learning a Rule”, Watkin et al 1993
- “On Learning the Past Tenses of English Verbs”, Rumelhart & McClelland 1993
- “Statistical Mechanics of Learning from Examples”, Seung et al 1992
- “Memorization Without Generalization in a Multilayered Neural Network”, Hansel et al 1992
- “Symbolic and Neural Learning Algorithms: An Experimental Comparison”, Shavlik et al 1991
- “Backpropagation Learning For Multilayer Feed-Forward Neural Networks Using The Conjugate Gradient Method”, Johansson et al 1991
- “Exhaustive Learning”, Schwartz et al 1990
- “Artificial Neural Networks, Back Propagation, and the Kelley-Bryson Gradient Procedure”, Dreyfus 1990
- “International Joint Conference on Neural Networks, January 15–19, 1990: Volume 2: Applications Track”, Caudill 1990
- “International Joint Conference on Neural Networks, January 15–19, 1990: Volume 1: Theory Track, Neural and Cognitive Sciences Track”, Caudill 1990
- “Explanatory Coherence”, Thagard 1989
- “Parallel Distributed Processing: Implications for Cognition and Development”, McClelland 1989
- “The Brain As Template”, Finkbeiner 1988
- “Observation of Phase Transitions in Spreading Activation Networks”, Shrager et al 1987
- “Learning Representations by Backpropagating Errors”, Rumelhart et al 1986b
- “Toward An Interactive Model Of Reading”, Rumelhart 1985
- “Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms”, Rosenblatt 1962
- “Speculations on Perceptrons and Other Automata”, Good 1959
- “Pandemonium: A Paradigm for Learning”, Selfridge 1959
- “Gsutil Config—Obtain Credentials and Create Configuration File”, Google 2023
- “Why Momentum Really Works”
- “Glow: Better Reversible Generative Models”
- “Deep Reinforcement Learning Doesn't Work Yet”
- “Reddit: Reinforcement Learning Subreddit”, Reddit 2023
- Wikipedia
- Miscellaneous
- Link Bibliography
See Also
Links
“How Deep Is the Brain? The Shallow Brain Hypothesis”, Suzuki et al 2023
“Grokking Beyond Neural Networks: An Empirical Exploration With Model Complexity”, Miller et al 2023
“Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity”
“Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture”, Fu et al 2023
“Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture”
“Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition”, Chen et al 2023
“Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition”
“Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”, Agarwal et al 2023
“Combining Human Expertise with Artificial Intelligence: Experimental Evidence from Radiology”
“Adam Accumulation to Reduce Memory Footprints of Both Activations and Gradients for Large-scale DNN Training”, Zhang et al 2023
“Protecting Society from AI Misuse: When Are Restrictions on Capabilities Warranted?”, Anderljung & Hazell 2023
“Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?”
“Symbolic Discovery of Optimization Algorithms”, Chen et al 2023
“ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition”, Gandhi et al 2022
“ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition”
“Do Current Multi-Task Optimization Methods in Deep Learning Even Help?”, Xin et al 2022
“Do Current Multi-Task Optimization Methods in Deep Learning Even Help?”
“Selective Neutralization and Deterring of Cockroaches With Laser Automated by Machine Vision”, Rakhmatulin et al 2022
“Selective neutralization and deterring of cockroaches with laser automated by machine vision”
“Git Re-Basin: Merging Models modulo Permutation Symmetries”, Ainsworth et al 2022
“Git Re-Basin: Merging Models modulo Permutation Symmetries”
“Learning With Differentiable Algorithms”, Petersen 2022
“Normalized Activation Function: Toward Better Convergence”, Peiwen & Changsheng 2022
“Bugs in the Data: How ImageNet Misrepresents Biodiversity”, Luccioni & Rolnick 2022
“AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images”, Naftali et al 2022
“AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images”
“The Value of Out-of-Distribution Data”, Silva et al 2022
“Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training”, You et al 2022
“Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training”
“Learning With Combinatorial Optimization Layers: a Probabilistic Approach”, Dalle et al 2022
“Learning with Combinatorial Optimization Layers: a Probabilistic Approach”
“What Do We Maximize in Self-Supervised Learning?”, Shwartz-Ziv et al 2022
“Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit”, Barak et al 2022
“Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit”
“High-performing Neural Network Models of Visual Cortex Benefit from High Latent Dimensionality”, Elmoznino & Bonner 2022
“High-performing neural network models of visual cortex benefit from high latent dimensionality”
“Predicting Word Learning in Children from the Performance of Computer Vision Systems”, Rane et al 2022
“Predicting Word Learning in Children from the Performance of Computer Vision Systems”
“Wav2Vec-Aug: Improved Self-supervised Training With Limited Data”, Sriram et al 2022
“Wav2Vec-Aug: Improved self-supervised training with limited data”
“The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon”, Thilak et al 2022
“The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon”
“An Improved One Millisecond Mobile Backbone”, Vasu et al 2022
“Greedy Bayesian Posterior Approximation With Deep Ensembles”, Tiulpin & Blaschko 2022
“Greedy Bayesian Posterior Approximation with Deep Ensembles”
“Generating Scientific Claims for Zero-Shot Scientific Fact Checking”, Wright et al 2022
“Generating Scientific Claims for Zero-Shot Scientific Fact Checking”
“Model Soups: Averaging Weights of Multiple Fine-tuned Models Improves Accuracy without Increasing Inference Time”, Wortsman et al 2022
“Deep Lexical Hypothesis: Identifying Personality Structure in Natural Language”, Cutler & Condon 2022
“Deep Lexical Hypothesis: Identifying personality structure in natural language”
“Gradients without Backpropagation”, Baydin et al 2022
“Towards Scaling Difference Target Propagation by Learning Backprop Targets”, Ernoult et al 2022
“Towards Scaling Difference Target Propagation by Learning Backprop Targets”
“M5 Accuracy Competition: Results, Findings, and Conclusions”, Makridakis et al 2022
“M5 accuracy competition: Results, findings, and conclusions”
“Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models”, Kim et al 2022
“Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models”
“Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow”, Tambon et al 2021
“Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow”
“Artificial Intelligence ‘sees’ Split Electrons”, Perdew 2021
“Pushing the Frontiers of Density Functionals by Solving the Fractional Electron Problem”, Kirkpatrick et al 2021
“Pushing the frontiers of density functionals by solving the fractional electron problem”
“ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction”, Santhanam et al 2021
“ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction”
“Word Golf”, Xia 2021
“Deep Learning Enables Genetic Analysis of the Human Thoracic Aorta”, Pirruccello et al 2021
“Deep learning enables genetic analysis of the human thoracic aorta”
“Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks”, Ericsson et al 2021
“Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks”
“Achieving Human Parity on Visual Question Answering”, Yan et al 2021
“BC-Z: Zero-Shot Task Generalization With Robotic Imitation Learning”, Jang et al 2021
“BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning”
“Learning in High Dimension Always Amounts to Extrapolation”, Balestriero et al 2021
“Learning in High Dimension Always Amounts to Extrapolation”
“TWIST: Self-Supervised Learning by Estimating Twin Class Distributions”, Wang et al 2021
“TWIST: Self-Supervised Learning by Estimating Twin Class Distributions”
“The Structure of Genotype-phenotype Maps Makes Fitness Landscapes Navigable”, Greenbury et al 2021
“The structure of genotype-phenotype maps makes fitness landscapes navigable”
“The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks”, Entezari et al 2021
“The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks”
“Deep Neural Networks and Tabular Data: A Survey”, Borisov et al 2021
“Learning through Atypical "phase Transitions" in Overparameterized Neural Networks”, Baldassi et al 2021
“Learning through atypical "phase transitions" in overparameterized neural networks”
“RAFT: A Real-World Few-Shot Text Classification Benchmark”, Alex et al 2021
“PPT: Pre-trained Prompt Tuning for Few-shot Learning”, Gu et al 2021
“Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners”, Zhang et al 2021
“Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners”
“ETA Prediction With Graph Neural Networks in Google Maps”, Derrow-Pinion et al 2021
“Predictive Coding: a Theoretical and Experimental Review”, Millidge et al 2021
“Neuroprosthesis for Decoding Speech in a Paralyzed Person With Anarthria”, Moses et al 2021
“Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria”
“A Connectivity-constrained Computational Account of Topographic Organization in Primate High-level Visual Cortex”, Blauch et al 2021
“A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers”, Miao et al 2021
“A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers”
“Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation”, James et al 2021
“Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation”
“Revisiting Deep Learning Models for Tabular Data”, Gorishniy et al 2021
“Randomness In Neural Network Training: Characterizing The Impact of Tooling”, Zhuang et al 2021
“Randomness In Neural Network Training: Characterizing The Impact of Tooling”
“BEiT: BERT Pre-Training of Image Transformers”, Bao et al 2021
“Revisiting Model Stitching to Compare Neural Representations”, Bansal et al 2021
“Revisiting Model Stitching to Compare Neural Representations”
“Artificial Intelligence in China’s Revolution in Military Affairs”, Kania 2021
“Artificial intelligence in China’s revolution in military affairs”
“The Geometry of Concept Learning”, Sorscher et al 2021
“VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning”, Bardes et al 2021
“VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning”
“Understanding by Understanding Not: Modeling Negation in Language Models”, Hosseini et al 2021
“Understanding by Understanding Not: Modeling Negation in Language Models”
“Entailment As Few-Shot Learner”, Wang et al 2021
“PAWS: Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments With Support Samples”, Assran et al 2021
“Computer Optimization: Your Computer Is Faster Than You Think”, Gwern 2021
“Computer Optimization: Your Computer Is Faster Than You Think”
“Epistemic Autonomy: Self-supervised Learning in the Mammalian Hippocampus”, Santos-Pata et al 2021
“Epistemic Autonomy: Self-supervised Learning in the Mammalian Hippocampus”
“Rip Van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis”, Arora & Zhang 2021
“Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis”
“Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization”, Xie et al 2021
“Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization”
“Contrasting Contrastive Self-Supervised Representation Learning Models”, Kotar et al 2021
“Contrasting Contrastive Self-Supervised Representation Learning Models”
“Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations”, Ryali et al 2021
“GWAS in Almost 195,000 Individuals Identifies 50 Previously Unidentified Genetic Loci for Eye Color”, Simcoe et al 2021
“BERTese: Learning to Speak to BERT”, Haviv et al 2021
“Predictive Coding Can Do Exact Backpropagation on Any Neural Network”, Salvatori et al 2021
“Predictive Coding Can Do Exact Backpropagation on Any Neural Network”
“Barlow Twins: Self-Supervised Learning via Redundancy Reduction”, Zbontar et al 2021
“Barlow Twins: Self-Supervised Learning via Redundancy Reduction”
“WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning”, Srinivasan et al 2021
“WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning”
“Rip Van Winkle’s Razor: A Simple Estimate of Overfit to Test Data”, Arora & Zhang 2021
“Rip van Winkle’s Razor: A Simple Estimate of Overfit to Test Data”
“Image Completion via Inference in Deep Generative Models”, Harvey et al 2021
“Contrastive Learning Inverts the Data Generating Process”, Zimmermann et al 2021
“DirectPred: Understanding Self-supervised Learning Dynamics without Contrastive Pairs”, Tian et al 2021
“DirectPred: Understanding self-supervised Learning Dynamics without Contrastive Pairs”
“MLGO: a Machine Learning Guided Compiler Optimizations Framework”, Trofin et al 2021
“MLGO: a Machine Learning Guided Compiler Optimizations Framework”
“Facial Recognition Technology Can Expose Political Orientation from Naturalistic Facial Images”, Kosinski 2021
“Facial recognition technology can expose political orientation from naturalistic facial images”
“Solving Mixed Integer Programs Using Neural Networks”, Nair et al 2020
“Sixteen Facial Expressions Occur in Similar Contexts Worldwide”, Cowen 2020
“Sixteen facial expressions occur in similar contexts worldwide”
“PiRank: Learning To Rank via Differentiable Sorting”, Swezey et al 2020
“Real-time Synthesis of Imagined Speech Processes from Minimally Invasive Recordings of Neural Activity”, Angrick et al 2020
“Generalization Bounds for Deep Learning”, Valle-Pérez & Louis 2020
“Selective Eye-gaze Augmentation To Enhance Imitation Learning In Atari Games”, Thammineni et al 2020
“Selective Eye-gaze Augmentation To Enhance Imitation Learning In Atari Games”
“Inductive Biases for Deep Learning of Higher-Level Cognition”, Goyal & Bengio 2020
“Inductive Biases for Deep Learning of Higher-Level Cognition”
“Exploring Simple Siamese Representation Learning”, Chen & He 2020
“Recent Advances in Neurotechnologies With Broad Potential for Neuroscience Research”, Vázquez-Guardado et al 2020
“Recent advances in neurotechnologies with broad potential for neuroscience research”
“Voting for Authorship Attribution Applied to Dark Web Data”, Samreen & Alalfi 2020
“Voting for Authorship Attribution Applied to Dark Web Data”
“Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding”, Roberts et al 2020
“Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding”
“Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too”, Hernández-Orallo 2020
“Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too”
“Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary With Width and Depth”, Nguyen et al 2020
“Open-Domain Question Answering Goes Conversational via Question Rewriting”, Anantha et al 2020
“Open-Domain Question Answering Goes Conversational via Question Rewriting”
“Digital Voicing of Silent Speech”, Gaddy & Klein 2020
“Rank-Smoothed Pairwise Learning In Perceptual Quality Assessment”, Talebi et al 2020
“Rank-Smoothed Pairwise Learning In Perceptual Quality Assessment”
“Implicit Gradient Regularization”, Barrett & Dherin 2020
“Large Associative Memory Problem in Neurobiology and Machine Learning”, Krotov & Hopfield 2020
“Large Associative Memory Problem in Neurobiology and Machine Learning”
“AdapterHub: A Framework for Adapting Transformers”, Pfeiffer et al 2020
“On Linear Identifiability of Learned Representations”, Roeder et al 2020
“Identifying Regulatory Elements via Deep Learning”, Barshai et al 2020
“Is SGD a Bayesian Sampler? Well, Almost”, Mingard et al 2020
“Bootstrap Your Own Latent (BYOL): A New Approach to Self-supervised Learning”, Grill et al 2020
“Bootstrap your own latent (BYOL): A new approach to self-supervised Learning”
“SCAN: Learning to Classify Images without Labels”, Gansbeke et al 2020
“Open-Retrieval Conversational Question Answering”, Qu et al 2020
“Politeness Transfer: A Tag and Generate Approach”, Madaan et al 2020
“Supervised Contrastive Learning”, Khosla et al 2020
“Can You Put It All Together: Evaluating Conversational Agents’ Ability to Blend Skills”, Smith et al 2020
“Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills”
“Backpropagation and the Brain”, Lillicrap et al 2020
“TREC CAsT 2019: The Conversational Assistance Track Overview”, Dalton et al 2020
“TREC CAsT 2019: The Conversational Assistance Track Overview”
“Improved Baselines With Momentum Contrastive Learning”, Chen et al 2020
“The Large Learning Rate Phase of Deep Learning: the Catapult Mechanism”, Lewkowycz et al 2020
“The large learning rate phase of deep learning: the catapult mechanism”
“Fast Differentiable Sorting and Ranking”, Blondel et al 2020
“The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence”, Marcus 2020
“The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence”
“Quantifying Independently Reproducible Machine Learning”, Raff 2020
“The Secret History of Facial Recognition: Sixty Years Ago, a Sharecropper’s Son Invented a Technology to Identify Faces. Then the Record of His Role All but Vanished. Who Was Woody Bledsoe, and Who Was He Working For?”, Raviv 2020
“ImageNet-A: Natural Adversarial Examples”, Hendrycks et al 2020
“Can the Brain Do Backpropagation? -Exact Implementation of Backpropagation in Predictive Coding Networks”, Song et al 2020
“Learning Neural Activations”, Minhas & Asif 2019
“2019 AI Alignment Literature Review and Charity Comparison”, Larks 2019
“2019 AI Alignment Literature Review and Charity Comparison”
“Libri-Light: A Benchmark for ASR With Limited or No Supervision”, Kahn et al 2019
“Libri-Light: A Benchmark for ASR with Limited or No Supervision”
“Connecting Vision and Language With Localized Narratives”, Pont-Tuset et al 2019
“12-in-1: Multi-Task Vision and Language Representation Learning”, Lu et al 2019
“12-in-1: Multi-Task Vision and Language Representation Learning”
“A Deep Learning Framework for Neuroscience”, Richards et al 2019
“Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules”, Sanchez-Lengeling et al 2019
“Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules”
“KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition With Deep Learning”, Clanuwat et al 2019
“KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning”
“Approximate Inference in Discrete Distributions With Monte Carlo Tree Search and Value Functions”, Buesing et al 2019
“Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions”
“Best Practices for the Human Evaluation of Automatically Generated Text”, Lee et al 2019
“Best practices for the human evaluation of automatically generated text”
“RandAugment: Practical Automated Data Augmentation With a Reduced Search Space”, Cubuk et al 2019
“RandAugment: Practical automated data augmentation with a reduced search space”
“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”, Lan et al 2019
“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”
“Large-scale Pretraining for Neural Machine Translation With Tens of Billions of Sentence Pairs”, Meng et al 2019
“Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs”
“Neural Networks Are a Priori Biased towards Boolean Functions With Low Entropy”, Mingard et al 2019
“Neural networks are a priori biased towards Boolean functions with low entropy”
“Engineering a Less Artificial Intelligence”, Sinz et al 2019
“Simple, Scalable Adaptation for Neural Machine Translation”, Bapna et al 2019
“Simple, Scalable Adaptation for Neural Machine Translation”
“Emergent Tool Use From Multi-Agent Autocurricula”, Baker et al 2019
“A Step Toward Quantifying Independently Reproducible Machine Learning Research”, Raff 2019
“A Step Toward Quantifying Independently Reproducible Machine Learning Research”
“Does Machine Translation Affect International Trade? Evidence from a Large Digital Platform”, Brynjolfsson et al 2019b
“Does Machine Translation Affect International Trade? Evidence from a Large Digital Platform”
“Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?”, Kleinfeld et al 2019
“Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?”
“Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges”, Arivazhagan et al 2019
“Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges”
“Deep Set Prediction Networks”, Zhang et al 2019
“Optimizing Color for Camouflage and Visibility Using Deep Learning: the Effects of the Environment and the Observer’s Visual System”, Fennell et al 2019
“Cold Case: The Lost MNIST Digits”, Yadav & Bottou 2019
“Speech2Face: Learning the Face Behind a Voice”, Oh et al 2019
“SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems”, Wang et al 2019
“SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems”
“Universal Quantum Control through Deep Reinforcement Learning”, Niu et al 2019
“Universal quantum control through deep reinforcement learning”
“Analysing Mathematical Reasoning Abilities of Neural Models”, Saxton et al 2019
“Analysing Mathematical Reasoning Abilities of Neural Models”
“Reinforcement Learning for Recommender Systems: A Case Study on Youtube”, Chen 2019
“Reinforcement Learning for Recommender Systems: A Case Study on Youtube”
“Stochastic Optimization of Sorting Networks via Continuous Relaxations”, Grover et al 2019
“Stochastic Optimization of Sorting Networks via Continuous Relaxations”
“Surprises in High-Dimensional Ridgeless Least Squares Interpolation”, Hastie et al 2019
“Surprises in High-Dimensional Ridgeless Least Squares Interpolation”
“DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs”, Dua et al 2019
“DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs”
“Theories of Error Back-Propagation in the Brain”, Whittington & Bogacz 2019
“A Replication Study: Machine Learning Models Are Capable of Predicting Sexual Orientation From Facial Images”, Leuner 2019
“Unmasking Clever Hans Predictors and Assessing What Machines Really Learn”, Lapuschkin et al 2019
“Unmasking Clever Hans Predictors and Assessing What Machines Really Learn”
“What Makes a Good Conversation? How Controllable Attributes Affect Human Judgments”, See et al 2019
“What makes a good conversation? How controllable attributes affect human judgments”
“The Evolved Transformer”, So et al 2019
“Forecasting Transformative AI: An Expert Survey”, Gruetzemacher et al 2019
“Human Few-shot Learning of Compositional Instructions”, Lake et al 2019
“Identifying Facial Phenotypes of Genetic Disorders Using Deep Learning”, Gurovich et al 2019
“Identifying facial phenotypes of genetic disorders using deep learning”
“High-performance Medicine: the Convergence of Human and Artificial Intelligence”, Topol 2019
“High-performance medicine: the convergence of human and artificial intelligence”
“Why Is There No Successful Whole Brain Simulation (Yet)?”, Stiefel 2019
“Evaluation and Accurate Diagnoses of Pediatric Diseases Using Artificial Intelligence”, Liang et al 2019
“Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence”
“Reinventing the Wheel: Discovering the Optimal Rolling Shape With PyTorch”, Wiener 2019
“Reinventing the Wheel: Discovering the Optimal Rolling Shape with PyTorch”
“An Empirical Study of Example Forgetting during Deep Neural Network Learning”, Toneva et al 2018
“An Empirical Study of Example Forgetting during Deep Neural Network Learning”
“Evolution As Backstop for Reinforcement Learning”, Gwern 2018
“CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge”, Talmor et al 2018
“CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge”
“Depth With Nonlinearity Creates No Bad Local Minima in ResNets”, Kawaguchi & Bengio 2018
“Depth with Nonlinearity Creates No Bad Local Minima in ResNets”
“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, Devlin et al 2018
“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”
“Interpretable Textual Neuron Representations for NLP”, Poerner et al 2018
“Machine Learning to Predict Osteoporotic Fracture Risk from Genotypes”, Forgetta et al 2018
“Machine Learning to Predict Osteoporotic Fracture Risk from Genotypes”
“Searching for Efficient Multi-Scale Architectures for Dense Image Prediction”, Chen et al 2018
“Searching for Efficient Multi-Scale Architectures for Dense Image Prediction”
“Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction”, Hashimoto & Tsuruoka 2018
“Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction”
“Searching Toward Pareto-Optimal Device-Aware Neural Architectures”, Cheng et al 2018
“Searching Toward Pareto-Optimal Device-Aware Neural Architectures”
“A Study of Reinforcement Learning for Neural Machine Translation”, Wu et al 2018
“A Study of Reinforcement Learning for Neural Machine Translation”
“Neural Arithmetic Logic Units”, Trask et al 2018
“Modeling Visual Context Is Key to Augmenting Object Detection Datasets”, Dvornik et al 2018
“Modeling Visual Context is Key to Augmenting Object Detection Datasets”
“Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search”, Zela et al 2018
“Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search”
“Automatically Composing Representation Transformations As a Means for Generalization”, Chang et al 2018
“Automatically Composing Representation Transformations as a Means for Generalization”
“ARPA and SCI: Surfing AI”, Gwern 2018
“Differentiable Learning-to-Normalize via Switchable Normalization”, Luo et al 2018
“Differentiable Learning-to-Normalize via Switchable Normalization”
“On the Spectral Bias of Neural Networks”, Rahaman et al 2018
“Neural Tangent Kernel: Convergence and Generalization in Neural Networks”, Jacot et al 2018
“Neural Tangent Kernel: Convergence and Generalization in Neural Networks”
“Faster SGD Training by Minibatch Persistency”, Fischetti et al 2018
“Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning”, Pang et al 2018
“Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning”
“Do CIFAR-10 Classifiers Generalize to CIFAR-10?”, Recht et al 2018
“Zero-Shot Dual Machine Translation”, Sestorain et al 2018
“Do Better ImageNet Models Transfer Better?”, Kornblith et al 2018
“GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding”, Wang et al 2018
“GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding”
“Adafactor: Adaptive Learning Rates With Sublinear Memory Cost”, Shazeer & Stern 2018
“Adafactor: Adaptive Learning Rates with Sublinear Memory Cost”
“Think You Have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge”, Clark et al 2018
“Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge”
“SentEval: An Evaluation Toolkit for Universal Sentence Representations”, Conneau & Kiela 2018
“SentEval: An Evaluation Toolkit for Universal Sentence Representations”
“Averaging Weights Leads to Wider Optima and Better Generalization”, Izmailov et al 2018
“Averaging Weights Leads to Wider Optima and Better Generalization”
“Analyzing Uncertainty in Neural Machine Translation”, Ott et al 2018
“End-to-end Deep Image Reconstruction from Human Brain Activity”, Shen et al 2018
“End-to-end deep image reconstruction from human brain activity”
“Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari”, Chrabaszcz et al 2018
“Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari”
“SignSGD: Compressed Optimisation for Non-Convex Problems”, Bernstein et al 2018
“Differentiable Dynamic Programming for Structured Prediction and Attention”, Mensch & Blondel 2018
“Differentiable Dynamic Programming for Structured Prediction and Attention”
“UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction”, Lel et al 2018
“UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction”
“Semantic Projection: Recovering Human Knowledge of Multiple, Distinct Object Features from Word Embeddings”, Grand et al 2018
“Panoptic Segmentation”, Kirillov et al 2018
“Prediction of Cardiovascular Risk Factors from Retinal Fundus Photographs via Deep Learning”, Poplin et al 2018
“Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning”
“Clinically Applicable Deep Learning for Diagnosis and Referral in Retinal Disease”, Fauw et al 2018
“Clinically applicable deep learning for diagnosis and referral in retinal disease”
“Deep Image Reconstruction from Human Brain Activity”, Shen et al 2017
“The NarrativeQA Reading Comprehension Challenge”, Kočiský et al 2017
“China’s A.I. Advances Help Its Tech Industry, and State Security”, Mozur & Bradsher 2017
“Three-dimensional Visualization and a Deep-learning Model Reveal Complex Fungal Parasite Networks in Behaviorally Manipulated Ants”, Fredericksen et al 2017
“Decoupled Weight Decay Regularization”, Loshchilov & Hutter 2017
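The point of Loshchilov & Hutter’s paper is that folding L2 regularization into the gradient is not the same as true weight decay for adaptive optimizers such as Adam; the decay should be applied to the weights directly, outside the adaptive step. A minimal NumPy sketch of one such decoupled step follows; the hyperparameter defaults and the toy loss are illustrative assumptions, not values from the paper.

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One Adam step with decoupled weight decay: the decay term multiplies
    the weights directly instead of being added to the gradient."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v

# Toy usage on the quadratic loss 0.5*||w||^2, whose gradient is w itself.
w, m, v = np.ones(3), np.zeros(3), np.zeros(3)
for t in range(1, 101):
    w, m, v = adamw_step(w, grad=w, m=m, v=v, t=t)
```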
“Unsupervised Machine Translation Using Monolingual Corpora Only”, Lample et al 2017
“Automatic Differentiation in PyTorch”, Paszke et al 2017
“Rethinking Generalization Requires Revisiting Old Ideas: Statistical Mechanics Approaches and Complex Learning Behavior”, Martin & Mahoney 2017
“Malware Detection by Eating a Whole EXE”, Raff et al 2017
“Mixup: Beyond Empirical Risk Minimization”, Zhang et al 2017
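Mixup trains on convex combinations of pairs of examples and of their labels, x̃ = λxᵢ + (1−λ)xⱼ and ỹ = λyᵢ + (1−λ)yⱼ, with λ drawn from a Beta(α, α) distribution. A short sketch of the batch-level version; the α default and the array shapes are illustrative assumptions.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Mix a batch with a shuffled copy of itself (mixup).
    x: (batch, features); y: (batch, num_classes) one-hot labels."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

# Toy usage with 8 examples, 4 features, 3 classes.
x = np.random.randn(8, 4)
y = np.eye(3)[np.random.randint(0, 3, size=8)]
x_mix, y_mix = mixup_batch(x, y)
```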
“AlphaGo Zero: Mastering the Game of Go without Human Knowledge”, Silver et al 2017
“Swish: Searching for Activation Functions”, Ramachandran et al 2017
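The activation that search converged on, Swish, is simply x·σ(βx) (the sigmoid-weighted linear unit when β = 1). For reference, a one-line definition; treating β as a fixed hyperparameter here is an illustrative simplification (the paper also considers learning it).

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x / (1.0 + np.exp(-beta * x))
```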
“Online Learning of a Memory for Learning Rates”, Meier et al 2017
“Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”, Smith & Topin 2017
“Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection”, Dwibedi et al 2017
“Emergence of Locomotion Behaviors in Rich Environments”, Heess et al 2017
“Six Challenges for Neural Machine Translation”, Koehn & Knowles 2017
“Verb Physics: Relative Physical Knowledge of Actions and Objects”, Forbes & Choi 2017
“Driver Identification Using Automobile Sensor Data from a Single Turn”, Hallac et al 2017
“StreetStyle: Exploring World-wide Clothing Styles from Millions of Photos”, Matzen et al 2017
“Deep Voice 2: Multi-Speaker Neural Text-to-Speech”, Arik et al 2017
“WebVision Challenge: Visual Learning and Understanding With Web Data”, Li et al 2017
“Inferring and Executing Programs for Visual Reasoning”, Johnson et al 2017
“Visual Attribute Transfer through Deep Image Analogy”, Liao et al 2017
“On Weight Initialization in Deep Neural Networks”, Kumar 2017
“A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference”, Williams et al 2017
“RACE: Large-scale ReAding Comprehension Dataset From Examinations”, Lai et al 2017
“Data-efficient Deep Reinforcement Learning for Dexterous Manipulation”, Popov et al 2017
“Research Ideas”, Gwern 2017
“Prototypical Networks for Few-shot Learning”, Snell et al 2017
“Meta Networks”, Munkhdalai & Yu 2017
“Understanding Synthetic Gradients and Decoupled Neural Interfaces”, Czarnecki et al 2017
“Deep Voice: Real-time Neural Text-to-Speech”, Arik et al 2017
“Adaptive Neural Networks for Efficient Inference”, Bolukbasi et al 2017
“Machine Learning Predicts Laboratory Earthquakes”, Rouet-Leduc et al 2017
“Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks”, Katz et al 2017
“Dermatologist-level Classification of Skin Cancer With Deep Neural Networks”, Esteva et al 2017
“Machine Learning for Systems and Systems for Machine Learning”, Dean 2017
“Feedback Networks”, Zamir et al 2016
“CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning”, Johnson et al 2016
“Towards Information-Seeking Agents”, Bachman et al 2016
“Spatially Adaptive Computation Time for Residual Networks”, Figurnov et al 2016
“Deep Learning Reinvents the Hearing Aid: Finally, Wearers of Hearing Aids Can Pick out a Voice in a Crowded Room”, Wang 2016b
“MS MARCO: A Human Generated MAchine Reading COmprehension Dataset”, Bajaj et al 2016
“Learning to Reinforcement Learn”, Wang et al 2016
“Lip Reading Sentences in the Wild”, Chung et al 2016
“Could a Neuroscientist Understand a Microprocessor?”, Jonas & Kording 2016
“A Neural Network Playground”, Smilkov & Carter 2016
“Deep Information Propagation”, Schoenholz et al 2016
“Homotopy Analysis for Tensor PCA”, Anandkumar et al 2016
“Language As a Latent Variable: Discrete Generative Models for Sentence Compression”, Miao & Blunsom 2016
“Why Does Deep and Cheap Learning Work so Well?”, Lin et al 2016
“SGDR: Stochastic Gradient Descent With Warm Restarts”, Loshchilov & Hutter 2016
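SGDR anneals the learning rate with a half-cosine from η_max down to η_min over a cycle of T_i steps, then “warm restarts” back to η_max, usually with each cycle longer than the last. A minimal sketch of that schedule; the defaults and the cycle-length multiplier are illustrative, not the paper’s recommended settings.

```python
import math

def sgdr_lr(step, eta_min=1e-5, eta_max=0.1, t_initial=100, t_mult=2):
    """Cosine-annealed learning rate with warm restarts: cycle i lasts
    t_initial * t_mult**i steps, and the rate jumps back to eta_max at each restart."""
    t_cur, t_i = step, t_initial
    while t_cur >= t_i:              # locate the position within the current cycle
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# Toy usage: a few steps spanning the first restart.
for s in (0, 50, 99, 100, 150):
    print(s, round(sgdr_lr(s), 4))
```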
“Concrete Problems in AI Safety”, Amodei et al 2016
“SQuAD: 100,000+ Questions for Machine Comprehension of Text”, Rajpurkar et al 2016
“Matching Networks for One Shot Learning”, Vinyals et al 2016
“Convolutional Sketch Inversion”, Güçlütürk et al 2016
“Unifying Count-Based Exploration and Intrinsic Motivation”, Bellemare et al 2016
“Synthesizing the Preferred Inputs for Neurons in Neural Networks via Deep Generator Networks”, Nguyen et al 2016
“Improving Information Extraction by Acquiring External Evidence With Reinforcement Learning”, Narasimhan et al 2016
“Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity”, Daniely et al 2016
“"Why Should I Trust You?": Explaining the Predictions of Any Classifier”, Ribeiro et al 2016
“"Why Should I Trust You?": Explaining the Predictions of Any Classifier”
“Mastering the Game of Go With Deep Neural Networks and Tree Search”, Silver et al 2016
“Learning to Compose Neural Networks for Question Answering”, Andreas et al 2016
“How a Japanese Cucumber Farmer Is Using Deep Learning and TensorFlow”, Sato 2016
“Random Gradient-Free Minimization of Convex Functions”, Nesterov & Spokoiny 2015
“Data-dependent Initializations of Convolutional Neural Networks”, Krähenbühl et al 2015
“Online Batch Selection for Faster Training of Neural Networks”, Loshchilov & Hutter 2015
“Neural Module Networks”, Andreas et al 2015
“Deep DPG (DDPG): Continuous Control With Deep Reinforcement Learning”, Lillicrap et al 2015
“A Neural Algorithm of Artistic Style”, Gatys et al 2015
“VQA: Visual Question Answering”, Agrawal et al 2015
“Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”, Weston et al 2015
“Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”, He et al 2015
“Freeze-Thaw Bayesian Optimization”, Swersky et al 2014
“Microsoft COCO: Common Objects in Context”, Lin et al 2014
“Deep Learning in Neural Networks: An Overview”, Schmidhuber 2014
“Neural Networks, Manifolds, and Topology”, Olah 2014
“Exact Solutions to the Nonlinear Dynamics of Learning in Deep Linear Neural Networks”, Saxe et al 2013
“Distributed Representations of Words and Phrases and Their Compositionality”, Mikolov et al 2013
“Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science”, Clark 2013
“Surprisingly Turing-Complete”, Gwern 2012
“Deep Gaussian Processes”, Damianou & Lawrence 2012
“Timing Technology: Lessons From The Media Lab”, Gwern 2012
“Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting”, Xie et al 2012
“The Neural Net Tank Urban Legend”, Gwern 2011
“HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent”, Niu et al 2011
“How Complex Are Individual Differences?”, Gwern 2010
“A Free Energy Principle for the Brain”, Friston et al 2006
“Understanding the Nature of the General Factor of Intelligence: The Role of Individual Differences in Neural Plasticity As an Explanatory Mechanism”, Garlick 2002
“DARPA and the Quest for Machine Intelligence, 1983–1993”, Roland & Shiman 2002
“Starfish § Bulrushes”, Watts 1999
“Optimality in Biological and Artificial Networks?”, Levine & Elsberry 1997
“A Sociological Study of the Official History of the Perceptrons Controversy”, Olazaran 1996
“Statistical Mechanics of Generalization”, Opper & Kinzel 1996
“Learning and Generalization in a Two-layer Neural Network: The Role of the Vapnik-Chervonenkis Dimension”, Opper 1994
“A Sociological Study of the Official History of the Perceptrons Controversy [1993]”, Olazaran 1993
“The Statistical Mechanics of Learning a Rule”, Watkin et al 1993
“On Learning the Past Tenses of English Verbs”, Rumelhart & McClelland 1993
“Statistical Mechanics of Learning from Examples”, Seung et al 1992
“Memorization Without Generalization in a Multilayered Neural Network”, Hansel et al 1992
“Symbolic and Neural Learning Algorithms: An Experimental Comparison”, Shavlik et al 1991
“Backpropagation Learning For Multilayer Feed-Forward Neural Networks Using The Conjugate Gradient Method”, Johansson et al 1991
“Exhaustive Learning”, Schwartz et al 1990
“Artificial Neural Networks, Back Propagation, and the Kelley-Bryson Gradient Procedure”, Dreyfus 1990
“International Joint Conference on Neural Networks, January 15–19, 1990: Volume 2: Applications Track”, Caudill 1990
“International Joint Conference on Neural Networks, January 15–19, 1990: Volume 1: Theory Track, Neural and Cognitive Sciences Track”, Caudill 1990
“Explanatory Coherence”, Thagard 1989
“Parallel Distributed Processing: Implications for Cognition and Development”, McClelland 1989
“The Brain As Template”, Finkbeiner 1988
“Observation of Phase Transitions in Spreading Activation Networks”, Shrager et al 1987
“Learning Representations by Backpropagating Errors”, Rumelhart et al 1986b
“Toward An Interactive Model Of Reading”, Rumelhart 1985
“Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms”, Rosenblatt 1962
“Speculations on Perceptrons and Other Automata”, Good 1959
“Pandemonium: A Paradigm for Learning”, Selfridge 1959
“Gsutil Config—Obtain Credentials and Create Configuration File”, Google 2023
“Why Momentum Really Works”
“Glow: Better Reversible Generative Models”
“Deep Reinforcement Learning Doesn't Work Yet”
“Reddit: Reinforcement Learning Subreddit”, Reddit 2023
Wikipedia
Miscellaneous
- /doc/ai/nn/2022-12-02-gwern-meme-itsafraid-googlereluctancetoproductizedeeplearningresearch.png
- /doc/ai/nn/2021-santospata-figure1-hippocampusselfsupervisionlearning.jpg
- /doc/ai/nn/2000-cartwright-intelligentdataanalysisinscience.pdf
- /doc/ai/nn/1991-sethi-artificialneuralnetworksandstatisticalpatternrecognition.pdf
- https://aleph.se/andart2/math/weird-probability-distributions/
- https://juretriglav.si/compressing-global-illumination-with-neural-networks/
- https://people.idsia.ch/~juergen/DanNet-triggers-deep-CNN-revolution-2011.html
- https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37648.pdf
- https://www.kaggle.com/code/andy8744/predict-anime-face-using-pre-trained-model/data
- https://www.lesswrong.com/posts/RKDQCB6smLWgs2Mhr/multi-component-learning-and-s-curves
- https://www.neelnanda.io/mechanistic-interpretability/favourite-papers
- https://www.protocol.com/china/i-built-bytedance-censorship-machine
- https://www.quantamagazine.org/to-be-energy-efficient-brains-predict-their-perceptions-20211115/
- https://www.vox.com/future-perfect/23775650/ai-regulation-openai-gpt-anthropic-midjourney-stable
Link Bibliography
- https://arxiv.org/abs/2310.12109: “Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture”
- https://www.nber.org/papers/w31422: “Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”, Nikhil Agarwal, Alex Moehring, Pranav Rajpurkar, Tobias Salz
- https://arxiv.org/abs/2302.06675#google: “Symbolic Discovery of Optimization Algorithms”
- https://arxiv.org/abs/2210.13352#huggingface: “ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition”, Sanchit Gandhi, Patrick von Platen, Alexander M. Rush
- https://www.tandfonline.com/doi/full/10.1080/00305316.2022.2121777: “Selective Neutralization and Deterring of Cockroaches With Laser Automated by Machine Vision”, Ildar Rakhmatulin, Mathieu Lihoreau, Jose Pueyo
- https://arxiv.org/abs/2208.11012: “AniWho: A Quick and Accurate Way to Classify Anime Character Faces in Images”, Martinus Grady Naftali, Jason Sebastian Sulistyawan, Kelvin Julian, Felix Indra Kurniadi
- https://arxiv.org/abs/2203.05482: “Model Soups: Averaging Weights of Multiple Fine-tuned Models Improves Accuracy without Increasing Inference Time”
- https://arxiv.org/abs/2201.13415: “Towards Scaling Difference Target Propagation by Learning Backprop Targets”
- https://www.sciencedirect.com/science/article/pii/S0169207021001874: “M5 Accuracy Competition: Results, Findings, and Conclusions”, Spyros Makridakis, Evangelos Spiliotis, Vassilios Assimakopoulos
- https://arxiv.org/abs/2112.13314: “Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow”, Florian Tambon, Amin Nikanjam, Le An, Foutse Khomh, Giuliano Antoniol
- 2021-kirkpatrick.pdf#deepmind: “Pushing the Frontiers of Density Functionals by Solving the Fractional Electron Problem”
- https://www.word.golf/: “Word Golf”, Eric Xia
- https://arxiv.org/abs/2110.07402#bytedance: “TWIST: Self-Supervised Learning by Estimating Twin Class Distributions”, Feng Wang, Tao Kong, Rufeng Zhang, Huaping Liu, Hang Li
- 2021-moses.pdf: “Neuroprosthesis for Decoding Speech in a Paralyzed Person With Anarthria”
- https://arxiv.org/abs/2106.08254#microsoft: “BEiT: BERT Pre-Training of Image Transformers”, Hangbo Bao, Li Dong, Furu Wei
- https://arxiv.org/abs/2104.13963#facebook: “PAWS: Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments With Support Samples”, Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Armand Joulin, Nicolas Ballas, Michael Rabbat
- https://www.offconvex.org/2021/04/07/ripvanwinkle/: “Rip Van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis”, Sanjeev Arora, Yi Zhang
- https://arxiv.org/abs/2103.14005: “Contrasting Contrastive Self-Supervised Representation Learning Models”, Klemen Kotar, Gabriel Ilharco, Ludwig Schmidt, Kiana Ehsani, Roozbeh Mottaghi
- https://arxiv.org/abs/2103.12719#facebook: “Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations”, Chaitanya K. Ryali, David J. Schwab, Ari S. Morcos
- https://arxiv.org/abs/2102.06810#facebook: “DirectPred: Understanding Self-supervised Learning Dynamics without Contrastive Pairs”, Yuandong Tian, Xinlei Chen, Surya Ganguli
- https://arxiv.org/abs/2007.07779: “AdapterHub: A Framework for Adapting Transformers”
- https://arxiv.org/abs/2005.12320: “SCAN: Learning to Classify Images without Labels”, Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans, Luc Van Gool
- https://arxiv.org/abs/2004.11362#google: “Supervised Contrastive Learning”
- https://www.lesswrong.com/posts/SmDziGM9hBjW9DKmf/2019-ai-alignment-literature-review-and-charity-comparison: “2019 AI Alignment Literature Review and Charity Comparison”, Larks
- https://arxiv.org/abs/1912.03098#google: “Connecting Vision and Language With Localized Narratives”, Jordi Pont-Tuset, Jasper Uijlings, Soravit Changpinyo, Radu Soricut, Vittorio Ferrari
- 13: “2019 News”, Gwern
- https://arxiv.org/abs/1909.11942#google: “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”, Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
- https://arxiv.org/abs/1905.00537: “SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems”
- backstop: “Evolution As Backstop for Reinforcement Learning”, Gwern
- arpa: “ARPA and SCI: Surfing AI”, Gwern
- https://arxiv.org/abs/1806.10779: “Differentiable Learning-to-Normalize via Switchable Normalization”, Ping Luo, Jiamin Ren, Zhanglin Peng, Ruimao Zhang, Jingyu Li
- https://arxiv.org/abs/1803.05407: “Averaging Weights Leads to Wider Optima and Better Generalization”, Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson
- https://arxiv.org/abs/1802.08842: “Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari”, Patryk Chrabaszcz, Ilya Loshchilov, Frank Hutter
- https://www.nytimes.com/2017/12/03/business/china-artificial-intelligence.html: “China’s A.I. Advances Help Its Tech Industry, and State Security”, Paul Mozur, Keith Bradsher
- 2017-silver.pdf#deepmind: “AlphaGo Zero: Mastering the Game of Go without Human Knowledge”
- https://arxiv.org/abs/1708.07120: “Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”, Leslie N. Smith, Nicholay Topin
- https://arxiv.org/abs/1705.05640: “WebVision Challenge: Visual Learning and Understanding With Web Data”, Wen Li, Limin Wang, Wei Li, Eirikur Agustsson, Jesse Berent, Abhinav Gupta, Rahul Sukthankar, Luc Van Gool
- idea: “Research Ideas”, Gwern
- https://arxiv.org/abs/1612.02297: “Spatially Adaptive Computation Time for Residual Networks”, Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov
- turing-complete: “Surprisingly Turing-Complete”, Gwern
- timing: “Timing Technology: Lessons From The Media Lab”, Gwern
- tank: “The Neural Net Tank Urban Legend”, Gwern
- difference: “How Complex Are Individual Differences?”, Gwern
- 1994-opper.pdf: “Learning and Generalization in a Two-layer Neural Network: The Role of the Vapnik-Chervonenkis Dimension”, Manfred Opper
- 1993-olazaran.pdf: “A Sociological Study of the Official History of the Perceptrons Controversy [1993]”, Mikel Olazaran
- 1992-seung.pdf: “Statistical Mechanics of Learning from Examples”, H. S. Seung, H. Sompolinsky, N. Tishby
- 1992-hansel.pdf: “Memorization Without Generalization in a Multilayered Neural Network”, D. Hansel, G. Mato, C. Meunier
- 1989-mcclelland.pdf: “Parallel Distributed Processing: Implications for Cognition and Development”, James L. McClelland