Bibliography (318):

  1. Oracle AI

  2. Tool AI

  3. 2011-05-10-givewell-holdenkarnofskyjaantallinn.doc

  4. Thoughts on the Singularity Institute (SI)

  5. Tools versus Agents

  6. Learning to summarize from human feedback

  7. GPT-3 Creative Fiction

  8. Decision Transformer: Reinforcement Learning via Sequence Modeling

  9. Some Moral and Technical Consequences of Automation: As machines learn they may develop unforeseen strategies at rates that baffle their programmers

  10. Deep Neural Networks for YouTube Recommendations

  11. Top-K Off-Policy Correction for a REINFORCE Recommender System

  12. Reinforcement Learning for Recommender Systems: A Case Study on Youtube

  13. note#advanced-chess-obituary

  14. The Season of Burning Trucks

  15. What If the City Ran Waze and You Had to Obey It? Could This Cure Congestion?

  16. We Traveled to Holloman Air Force Base for a Glimpse of the Future of War—And the Future of Work

  17. Developers Predict That Pilotless Devices Will Join Planes in Civilian Airspace—And Dream of Electric Robots Counting Sheep

  18. Sam Altman’s Manifest Destiny: Is the head of Y Combinator fixing the world, or trying to take over Silicon Valley?

  19. The United States Has Put Artificial Intelligence at the Center of Its Defense Strategy, With Weapons That Can Identify Targets and Make Decisions.

  20. Forget about Drones, Forget about Dystopian Sci-Fi—A Terrifying New Generation of Autonomous Weapons Is Already Here. Meet the Small Band of Dedicated Optimists Battling Nefarious Governments and Bureaucratic Tedium to Stop the Proliferation of Killer Robots And, Just Maybe, Save Humanity from Itself.

  21. https://media.defense.gov/2019/Oct/31/2002204458/-1/-1/0/DIB_AI_PRINCIPLES_PRIMARY_DOCUMENT.PDF#page=8

  22. https://www.gutenberg.org/cache/epub/16550/pg16550-images.html#Cpage445

  23. Traffic-Weary Homeowners and Waze Are at War, Again. Guess Who’s Winning?

  24. ‘Cut-Through’ Traffic Caused by Waze App Must Stop, L.A. Councilman Says

  25. LA Residents Complain about ‘Waze Craze’

  26. How Should We Critique Research?

  27. Probabilistic Integration: A Role in Statistical Computation?

  28. Pure Exploration for Multi-Armed Bandit Problems

  29. Best Arm Identification in Multi-Armed Bandits

  30. Multi-Bandit Best Arm Identification

  31. Decision Making Using Thompson Sampling

  32. Best-Arm Identification Algorithms for Multi-Armed Bandits in the Fixed Confidence Setting

  33. On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

  34. Optimal Design in Psychological Research

  35. Rerandomization to improve covariate balance in experiments

  36. The Power of Two Random Choices: A Survey of Techniques and Results

  37. Attention and Augmented Recurrent Neural Networks

  38. Modeling Human Reading with Neural Attention

  39. Hierarchical Object Detection With Deep Reinforcement Learning

  40. Generating Images from Captions with Attention

  41. DRAW: A Recurrent Neural Network For Image Generation

  42. Show, Attend and Tell: Neural Image Caption Generation With Visual Attention

  43. Learning to Combine Foveal Glimpses With a Third-Order Boltzmann Machine

  44. Neural Machine Translation by Jointly Learning to Align and Translate

  45. On Learning Where To Look

  46. Recurrent Models of Visual Attention

  47. Iterative Alternating Neural Attention for Machine Reading

  48. Can Active Memory Replace Attention?

  49. Attention Is All You Need

  50. Foveation-based Mechanisms Alleviate Adversarial Examples

  51. Character-Level Language Modeling with Deeper Self-Attention

  52. Adaptive Computation Time for Recurrent Neural Networks

  53. Spatially Adaptive Computation Time for Residual Networks

  54. Feedback Networks

  55. Multi-Scale Dense Networks for Resource Efficient Image Classification

  56. RAM: Dynamic Computational Time for Visual Attention

  57. IDK Cascades: Fast Deep Learning by Learning not to Overthink

  58. BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks

  59. Learning Policies for Adaptive Tracking with Deep Feature Cascades

  60. Learning with Rethinking: Recurrently Improving Convolutional Neural Networks through Feedback

  61. Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

  62. Deciding How to Decide: Dynamic Routing in Artificial Neural Networks

  63. Adaptive Neural Networks for Efficient Inference

  64. BlockDrop: Dynamic Inference Paths in Residual Networks

  65. Neural Speed Reading via Skim-RNN

  66. Learning to select computations

  67. Universal Transformers

  68. Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions

  69. PonderNet: Learning to Ponder

  70. Neural Ordinary Differential Equations

  71. Differentiable Neural Computers

  72. Reinforcement Learning Neural Turing Machines—Revised

  73. Hybrid computing using a neural network with dynamic external memory

  74. Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

  75. Simple statistical gradient-following algorithms for connectionist reinforcement learning

  76. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

  77. Bidirectional Attention Flow for Machine Comprehension

  78. Towards Information-Seeking Agents

  79. Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

  80. Learning to Organize Knowledge and Answer Questions with N-Gram Machines

  81. Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications

  82. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

  83. Machine Learning for Systems and Systems for Machine Learning

  84. A View on Deep Reinforcement Learning in System Optimization

  85. Device Placement Optimization with Reinforcement Learning

  86. A Hierarchical Model for Device Placement

  87. The Case for Learned Index Structures

  88. Meta-Learning Neural Bloom Filters

  89. SageDB: A Learned Database System

  90. GAP: Generalizable Approximate Graph Partitioning Framework

  91. Optimizing Query Evaluations using Reinforcement Learning for Web Search

  92. Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation

  93. AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning

  94. MLGO: a Machine Learning Guided Compiler Optimizations Framework

  95. Universal quantum control through deep reinforcement learning

  96. MuZero with Self-competition for Rate Control in VP9 Video Compression

  97. Reinforcement Learning for Datacenter Congestion Control

  98. Software-Defined Far Memory in Warehouse-Scale Computers

  99. Learning Memory Access Patterns

  100. Learning-based Memory Allocation for C++ Server Workloads

  101. Learning to Perform Local Rewriting for Combinatorial Optimization

  102. Solving Mixed Integer Programs Using Neural Networks

  103. Learning a Large Neighborhood Search Algorithm for Mixed Integer Programs

  104. Placement Optimization with Deep Reinforcement Learning

  105. Chip Placement with Deep Reinforcement Learning

  106. A Full-stack Accelerator Search Technique for Vision Applications

  107. The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design

  108. Tuning Recurrent Neural Networks with Reinforcement Learning

  109. Reward Augmented Maximum Likelihood for Neural Structured Prediction

  110. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

  111. Sequence Level Training with Recurrent Neural Networks

  112. Deep Reinforcement Learning for Dialogue Generation

  113. Mastering the game of Go with deep neural networks and tree search

  114. AlphaGo Zero: Mastering the game of Go without human knowledge

  115. The Predictron: End-To-End Learning and Planning

  116. Deep Reinforcement Learning for Mention-Ranking Coreference Models

  117. Language as a Latent Variable: Discrete Generative Models for Sentence Compression

  118. Self-critical Sequence Training for Image Captioning

  119. Dual Learning for Machine Translation

  120. Neural Combinatorial Optimization with Reinforcement Learning

  121. Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets

  122. End-to-end optimization of goal-driven and visually grounded dialogue systems

  123. Adversarial Neural Machine Translation

  124. Zero-Shot Dual Machine Translation

  125. Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting

  126. Stochastic Constraint Programming as Reinforcement Learning

  127. A Deep Reinforced Model for Abstractive Summarization

  128. Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models

  129. Deal or No Deal? End-To-End Learning for Negotiation Dialogues

  130. Grammatical Error Correction with Neural Reinforcement Learning

  131. Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning

  132. Reinforced Video Captioning with Entailment Rewards

  133. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

  134. Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection

  135. Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarization

  136. Automatically Composing Representation Transformations as a Means for Generalization

  137. Improving Abstraction in Text Summarization

  138. A Study of Reinforcement Learning for Neural Machine Translation

  139. Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

  140. Learning to Optimize Join Queries With Deep Reinforcement Learning

  141. OCD: Optimal Completion Distillation for Sequence Learning

  142. Better Rewards Yield Better Summaries: Learning to Summarise Without References

  143. Fine-Tuning Language Models from Human Preferences

  144. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models

  145. Generative Adversarial Imitation Learning

  146. Connecting Generative Adversarial Networks and Actor-Critic Methods

  147. Generative Adversarial Parallelization

  148. NIPS 2016 Tutorial: Generative Adversarial Networks

  149. 6.6 Actor-Critic Methods

  150. Asynchronous Network Architecture for Semi-Supervised Learning

  151. Decoupled Neural Interfaces using Synthetic Gradients

  152. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks

  153. Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks

  154. Unsupervised Machine Translation Using Monolingual Corpora Only

  155. Learning to Optimize

  156. Learning to Optimize Neural Nets

  157. Learning to learn by gradient descent by gradient descent

  158. Neural Optimizer Search With Reinforcement Learning

  159. Deep Reinforcement Learning for Accelerating the Convergence Rate

  160. An Actor-critic Algorithm for Learning Rate Learning

  161. Learned Optimizers that Scale and Generalize

  162. Metacontrol for Adaptive Imagination-Based Optimization

  163. Reinforcement Learning for Learning Rate Control

  164. Online Learning of a Memory for Learning Rates

  165. Rover Descent: Learning to optimize by learning to navigate on prototypical loss surfaces

  166. Backprop Evolution

  167. Understanding and correcting pathologies in the training of learned optimizers

  168. LHOPT: A Generalizable Approach to Learning Optimizers

  169. Biased Importance Sampling for Deep Neural Network Training

  170. Online Batch Selection for Faster Training of Neural Networks

  171. Neural Data Filter for Bootstrapping Stochastic Gradient Descent

  172. Stochastic Optimization with Bandit Sampling

  173. ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks

  174. Differentiable Learning-to-Normalize via Switchable Normalization

  175. AutoAugment: Learning Augmentation Policies from Data

  176. Bayesian Active Learning for Classification and Preference Learning

  177. Active Learning for High Dimensional Inputs Using Bayesian Convolutional Neural Networks

  178. Uncertainty in Deep Learning

  179. Teaching Machines to Describe Images via Natural Language Feedback

  180. Deep reinforcement learning from human preferences

  181. Active Learning for Convolutional Neural Networks: A Core-Set Approach

  182. Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification

  183. Classification with Costly Features using Deep Reinforcement Learning

  184. Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning

  185. Learning to Learn from Noisy Web Videos

  186. Brief Summary of the Panel Discussion at DL Workshop @ICML 2015

  187. Active Learning Literature Survey

  188. Machine Teaching for Bayesian Learners in the Exponential Family

  189. Dataset Distillation

  190. "Less Than One"-Shot Learning: Learning n Classes From M < N Samples

  191. A Solvable Model of Neural Scaling Laws

  192. https://x.com/fchollet/status/1082347142830743552

  193. Freeze-Thaw Bayesian Optimization

  194. Neural Architecture Search with Reinforcement Learning

  195. Designing Neural Network Architectures using Reinforcement Learning

  196. Learning to Learn without Gradient Descent by Gradient Descent

  197. RL2: Fast Reinforcement Learning via Slow Reinforcement Learning

  198. Learning to reinforcement learn

  199. Approximate Bayes Optimal Policy Search Using Neural Networks

  200. HyperNetworks

  201. PathNet: Evolution Channels Gradient Descent in Super Neural Networks

  202. Optimization as a Model for Few-Shot Learning

  203. Efficient K-shot Learning with Regularized Deep Networks

  204. DeepArchitect: Automatically Designing and Training Deep Architectures

  205. CoDeepNEAT: Evolving Deep Neural Networks

  206. Large-Scale Evolution of Image Classifiers

  207. Learning to Reason: End-to-End Module Networks for Visual Question Answering

  208. Inferring and Executing Programs for Visual Reasoning

  209. Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks

  210. Meta Networks

  211. Efficient Architecture Search by Network Transformation

  212. Learning Transferable Architectures for Scalable Image Recognition

  213. SMASH: One-Shot Model Architecture Search through HyperNetworks

  214. Practical Block-wise Neural Network Architecture Generation

  215. N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning

  216. Gradient-free Policy Architecture Search and Adaptation

  217. Swish: Searching for Activation Functions

  218. Intriguing Properties of Adversarial Examples

  219. Finding Competitive Network Architectures Within a Day Using UCT

  220. A Flexible Approach to Automated RNN Architecture Generation

  221. Learning to Prune Filters in Convolutional Neural Networks

  222. Regularized Evolution for Image Classifier Architecture Search

  223. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

  224. Efficient Multi-objective Neural Architecture Search via Lamarckian Evolution

  225. Learning to Optimize Tensor Programs

  226. Resource-Efficient Neural Architect

  227. Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search

  228. MnasNet: Platform-Aware Neural Architecture Search for Mobile

  229. Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

  230. Searching Toward Pareto-Optimal Device-Aware Neural Architectures

  231. Evolutionary-Neural Hybrid Agents for Architecture Search

  232. InstaNAS: Instance-aware Neural Architecture Search

  233. IRLAS: Inverse Reinforcement Learning for Architecture Search

  234. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

  235. The Evolved Transformer

  236. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

  237. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

  238. StyleNAS: An Empirical Study of Neural Architecture Search to Uncover Surprisingly Fast End-to-End Universal Style Transfer Networks

  239. EfficientNet-EdgeTPU: Creating Accelerator-Optimized Neural Networks With AutoML

  240. Evolving Space-Time Neural Architectures for Videos

  241. Google Vizier: A Service for Black-Box Optimization

  242. Introducing FBLearner Flow: Facebook's AI Backbone

  243. Efficient Reductions for Imitation Learning

  244. Unifying Count-Based Exploration and Intrinsic Motivation

  245. Complexity no Bar to AI

  246. Candy Japan’s new box A/B test

  247. https://news.ycombinator.com/item?id=13231808

  248. https://www.reddit.com/r/ControlProblem/comments/5jlkgi/why_tool_ais_want_to_be_agent_ais/

  249. Risks from Learned Optimization: Introduction

  250. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

  251. One Big Net For Everything

  252. Sutton & Barto Book: Reinforcement Learning: An Introduction

  253. Reddit: Reinforcement Learning subreddit

  254. https://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/

  255. https://hci.iwr.uni-heidelberg.de/system/files/private/downloads/1848175122/schmitt_kunstliche-motivation-report.pdf

  256. https://philarchive.org/archive/TURMAA-6v2

  257. Deep Reinforcement Learning Doesn’t Work Yet

  258. The Ethics of Reward Shaping

  259. 2018-07-26-synced-googleaichiefjeffdeansmlsystemarchitectureblueprint.html

  260. Solving the Mystery of Link Imbalance: A Metastable Failure State at Scale

  261. Reflective Oracles: A Foundation for Classical Game Theory

  262. https://www.fhi.ox.ac.uk/wp-content/uploads/Reframing_Superintelligence_FHI-TR-2019-1.1-1.pdf

  263. The Bitter Lesson

  264. AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence

  265. ‘end-to-end’ directory

  266. There’s plenty of room at the Top: What will drive computer performance after Moore’s law?

  267. ‘tech economics’ directory

  268. Modeling the Human Trajectory

  269. Superexponential [Modeling the Human Trajectory]

  270. Wikipedia Bibliography:

    1. Reinforcement learning

    2. Google Maps

    3. Waze

    4. Nick Bostrom

    5. Superintelligence: Paths, Dangers, Strategies

    6. Amdahl’s law

    7. High-frequency trading

    8. Knight Capital Group § 2012 stock trading disruption

    9. Autopilot

    10. Death by GPS

    11. William Wordsworth

    12. Samuel Taylor Coleridge

    13. Extended mind thesis

    14. A/B testing

    15. Partially observable Markov decision process

    16. Braess’s paradox

    17. Design of experiments

    18. Sequential analysis

    19. Response surface methodology

    20. Adaptive design (medicine)

    21. Multi-armed bandit

    22. Optimal experimental design

    23. Variance

    24. Latin square

    25. Blocking (statistics)

    26. Queueing theory

    27. Deep learning

    28. Long short-term memory

    29. Committee machine

    30. Jeff Dean

    31. Virtual memory compression

    32. Linear programming § Integer unknowns

    33. Greedy algorithm

    34. Beam search

    35. Monte Carlo tree search

    36. Stochastic gradient descent

    37. Markov decision process

    38. Importance sampling

    39. Oversampling and undersampling in data analysis

    40. Active learning (machine learning)

    41. Decision boundary

    42. Hyperparameter optimization

    43. Gaussian process

    44. Dilution (neural networks)

    45. Type system

    46. AlphaGo

    47. KGS Go Server

    48. Weak supervision § Semi-supervised learning

    49. Montezuma’s Revenge (video game)