Bibliography:

  1. Machine Learning Scaling

  2. scaling-hypothesis#blessings-of-scale

  3. https://cse-robotics.engr.tamu.edu/dshell/cs689/papers/anderson72more_is_different.pdf

  4. Appendix F: Personal Observations on the Reliability of the Shuttle

  5. unseeing#confirmation-bias

  6. AutoML-Zero: Evolving Machine Learning Algorithms From Scratch

  7. BMT: Binarized Neural Machine Translation

  8. Absolute Unit NNs: Regression-Based MLPs for Everything

  9. This Section Presents an Expanded (but Still Quite Compact) Version of the Terse ConvMixer Implementation That We Presented in the Paper. The Code Is given in **Figure 7**. We Also Present an Even More Terse Implementation in **Figure 8**, Which to the Best of Our Knowledge Is the First Model That Achieves the Elusive Dual Goals of 80%+ ImageNet Top-1 Accuracy While Also Fitting into a Tweet.

  10. Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis

  11. Reward is enough

  12. backstop#clune-2019

  13. ‘meta-learning’ tag

  14. Meta Learning Backpropagation And Improving It

  15. BLUR: Meta-Learning Bidirectional Update Rules

  16. PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management

  17. Pathways: Asynchronous Distributed Dataflow for ML

  18. DeepSpeed: Accelerating Large-Scale Model Inference and Training via System Optimizations and Compression

  19. ZeRO-Infinity and DeepSpeed: Unlocking Unprecedented Model Scale for Deep Learning Training

  20. GSPMD: General and Scalable Parallelization for ML Computation Graphs

  21. Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines

  22. Efficient Large-Scale Language Model Training on GPU Clusters

  23. TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models

  24. There’s plenty of room at the Top: What will drive computer performance after Moore’s law?

  25. Moore’s Law, AI, and the Pace of Progress

  26. Effect of scale on catastrophic forgetting in neural networks

  27. Slowing Moore’s Law: How It Could Happen

  28. Pony Preservation Project Panel 2021—FULL

  29. SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

  30. Distributed Deep Learning in Open Collaborations

  31. DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications

  32. ‘experience curves’ tag

  33. AI and Efficiency: We’re releasing an analysis showing that since 2012 the amount of compute needed to train a neural net to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months

  34. Measuring the Algorithmic Efficiency of Neural Networks

  35. Robert Oppenheimer

  36. DeepMind and Google: the battle to control artificial intelligence. Demis Hassabis founded a company to build the world’s most powerful AI. Then Google bought him out. Hal Hodson asks who is in charge

  37. An Empirical Model of Large-Batch Training

  38. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

  39. https://www.lesswrong.com/posts/65qmEJHDw3vw69tKm/proposal-scaling-laws-for-rl-generalization#bdzbeD9YvarEEopCq

  40. One Big Net For Everything

  41. Scaling Laws for Language Transfer Learning

  42. Scaling Laws for Transfer

  43. Why Tool AIs Want to Be Agent AIs

  44. Risks from Learned Optimization in Advanced Machine Learning Systems

  45. index#decisiontransformer-blog-section

  46. https://x.com/arankomatsuzaki/status/1399471244760649729

  47. Codex: Evaluating Large Language Models Trained on Code: Figure 14: When the Prompt Includes Subtle Bugs, Codex Tends to Produce Worse Code Than It Is Capable of Producing. This Gap Increases With Model Size. Including an Instruction to Write Correct Code Helps a Little but Does Not Fix the Problem. Even With No Examples in the Context, Codex Produces Substantially Worse Code Than It Is Capable Of.

  48. gpt-3#roleplaying

  49. The Basic AI Drives

  50. Human-level performance in 3D multiplayer games with population-based reinforcement learning

  51. Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads

  52. Gato: A Generalist Agent

  53. GPT-3: Language Models are Few-Shot Learners

  54. DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language

  55. BEiT: BERT Pre-Training of Image Transformers

  56. WebGPT: Browser-assisted question-answering with human feedback

  57. Boosting Search Engines with Interactive Agents

  58. A data-driven approach for learning to control computers

  59. Player of Games

  60. Open-Ended Learning Leads to Generally Capable Agents

  61. From Motor Control to Team Play in Simulated Humanoid Football

  62. https://deepmind.google/discover/blog/learning-robust-real-time-cultural-transmission-without-human-data/

  63. Grounded Language Learning Fast and Slow

  64. Imitating Interactive Intelligence

  65. Learning to Ground Multi-Agent Communication with Autoencoders

  66. Collaborating with Humans without Human Data

  67. Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria

  68. Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination

  69. Off-Belief Learning

  70. Multitasking Inhibits Semantic Drift

  71. What Are Bayesian Neural Network Posteriors Really Like?

  72. ‘Codex’ tag

  73. Competitive Programming With AlphaCode

  74. ‘MuZero’ tag

  75. Evolving Normalization-Activation Layers

  76. fully-connected#mlp-mixer-why-now

  77. NVAE: A Deep Hierarchical Variational Autoencoder

  78. Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images

  79. R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning

  80. Cores that don’t count

  81. Microsoft researchers win ImageNet computer vision challenge

  82. Deep Residual Learning for Image Recognition

  83. Learning To Tell Two Spirals Apart

  84. A Recipe for Training Neural Networks

  85. Fine-Tuning GPT-2 from Human Preferences § Bugs can optimize for bad behavior

  86. Grokking: Generalization Beyond Overfitting On Small Algorithmic Datasets

  87. The Shape of Learning Curves: a Review

  88. The Phase Transition In Human Cognition

  89. https://www.reddit.com/r/mlscaling/comments/sjzvl0/d_instances_of_nonlog_capability_spikes_or/

  90. In-Context Learning and Induction Heads

  91. The Bitter Lesson

  92. Fully-Connected Neural Nets

  93. The Brain as a Universal Learning Machine

  94. Magna Alta Doctrina

  95. Ray Interference: a Source of Plateaus in Deep Reinforcement Learning

  96. ‘inner monologue (AI)’ tag

  97. HyperNetworks

  98. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

  99. https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned#AAC8jKeDp6xqsZK2K

  100. Shaking the foundations: delusions in sequence models for interaction and control

  101. https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post

  102. ‘preference learning’ tag

  103. tank#alternative-examples

  104. 2015-01-28-spidermanandthexmen-vol1-no2-sauron-cancerdinosaurs.jpg

  105. Friendship Is Optimal (Fanfic)

  106. Mathematics on a Distant Planet

  107. Don’t Worry—It Can’t Happen

  108. Now You Can (Try to) Serve Five Terabytes, Too

  109. I Just Want to Serve 5 Terabytes

  110. L2L: Training Large Neural Networks with Constant Memory using a New Execution Algorithm

  111. Blowing the Lid off the CryptoNote/Bytecoin Scam (with the Exception of Monero)

  112. Rekt - Value DeFi

  113. Really Stupid ‘Smart Contract’ Bug Let Hackers Steal $31 Million in Digital Coin

  114. Crypto Firm Nomad Loses Nearly $200 Million in Bridge Hack

  115. Today’s LiFi Hack Happened Because Its Internal Swap() Function Would Call out to Any Address Using Whatever Message the Attacker Passed In. This Allowed the Attacker to Have the Contract TransferFrom() out the Funds from Anyone Who Had Approved the Contract. Since the Contract Was Designed to Make Multiple Swaps in a Single Transaction, the Attacker Sent a Single Huge Transaction With a Wall of TransferFrom’s for the Contract to Send, Each Moving Money from a User That Had Approved the Contract, to the Attacker. · `// Solhint-Disable-Next-Line Avoid-Low-Level-Calls` That’s Really Putting Salt in the Wound ._. · Should Not Ignore Warnings. ^_^

  116. https://milksad.info/disclosure.html

  117. AI Accelerators, Part IV: The Very Rich Landscape

  118. TensorFlow Research Cloud (TRC): Accelerate your cutting-edge machine learning research with free Cloud TPUs

  119. It’s All about the Benjamins: An Empirical Study on Incentivizing Users to Ignore Security Advice

  120. A Style-Based Generator Architecture for Generative Adversarial Networks

  121. Scammers Created an AI Hologram of Me to Scam Unsuspecting Projects

  122. A Field Guide to Federated Optimization

  123. Net2Net: Accelerating Learning via Knowledge Transfer

  124. M6–10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining

  125. Dota 2 With Large Scale Deep Reinforcement Learning § Pg11

  126. Scaling Scaling Laws with Board Games

  127. https://openai.com/research/formal-math

  128. A Universal Law of Robustness via Isoperimetry

  129. The Dirty Pipe Vulnerability

  130. Surprisingly Turing-Complete

  131. CVE-2022-21449: Psychic Signatures in Java

  132. It Is Nevertheless Funny That There Is a Wycheproof Test for This Bug (of Course There Is, It’s the Most Basic Implementation Check in ECDSA) and Nobody Bothered to Run It against One of the Most Important ECDSA’s Until Now.

  133. dnm-archive#logout

  134. https://x.com/rombulow/status/990684453734203392

  135. How Many Computers Are In Your Computer?

  136. https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2022-34718

  137. Indefinite survival through backup copies

  138. 2004-perry.html

  139. ‘NN sparsity’ tag

  140. On the Predictability of Pruning Across Scales

  141. Knowledge distillation: A good teacher is patient and consistent

  142. https://x.com/thiteanish/status/1635188333705043969

  143. Community Alert: Ronin Validators Compromised

  144. complexity#control

  145. July 2020 News § ‘Modeling the Human Trajectory’

  146. https://bullfrogreview.substack.com/p/honey-i-hacked-the-empathy-machine

  147. Hackers Gaining Power of Subpoena Via Fake ‘Emergency Data Requests’

  148. Apple and Meta Gave User Data to Hackers Who Used Forged Legal Requests: Hackers compromised the emails of law enforcement agencies; Data was used to enable harassment, may aid financial fraud

  149. GPT-3 Creative Fiction § Literary Parodies

  150. Uber Apparently Hacked by Teen, Employees Thought It Was a Joke: ‘I Think IT Would Appreciate Less Memes While They Handle the Breach’

  151. The Radicalization Risks of GPT-3 and Advanced Neural Language Models

  152. Computer Optimization: Your Computer Is Faster Than You Think

  153. OpenAI Five: 2016–2019

  154. Grandmaster level in StarCraft II using multi-agent reinforcement learning

  155. scaling-hypothesis#meta-learning

  156. The Billion Dollar AI Problem That Just Keeps Scaling

  157. Fermi Estimate of Future Training Runs

  158. https://cset.georgetown.edu/wp-content/uploads/AI-and-Compute-How-Much-Longer-Can-Computing-Power-Drive-Artificial-Intelligence-Progress.pdf

  159. Factored Cognition

  160. CycleGAN, a Master of Steganography

  161. The Toxoplasma Of Rage

  162. Duty Calls

  163. Sort By Controversial

  164. Specialist Ukrainian Drone Unit Picks off Invading Russian Forces As They Sleep

  165. https://www.amazon.com/Genius-Makers-Mavericks-Brought-Facebook/dp/1524742678

  166. $2012

  167. CoreWeave

  168. Target Hackers Broke in Via HVAC Company

  169. Chinese Spies Hacked a Livestock App to Breach US State Networks: Vulnerabilities in Animal Tracking Software USAHERDS and Log4j Gave the Notorious APT41 Group a Foothold in Multiple Government Systems.

  170. Supply chain attacks

  171. China Has Already Reached Exascale—On Two Separate Systems

  172. https://x.com/ID_AA_Carmack/status/1300280139717189640

  173. NYU Accidentally Exposed Military Code-Breaking Computer Project to Entire Internet

  174. Is Programmable Overhead Worth The Cost? How much do we pay for a system to be programmable? It depends upon who you ask

  175. Fast Stencil-Code Computation on a Wafer-Scale Processor

  176. Unifying Language Learning Paradigms

  177. Chinchilla: Training Compute-Optimal Large Language Models

  178. Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

  179. Scaling Laws for Deep Learning

  180. Extrapolating GPT-N performance

  181. https://www.quantamagazine.org/computer-scientists-achieve-crown-jewel-of-cryptography-20201110/

  182. ‘tech economics’ tag

  183. AI and Compute

  184. https://arxiv.org/pdf/2108.07686.pdf#page=85

  185. PILCO: A Model-Based and Data-Efficient Approach to Policy Search

  186. Agile Locomotion via Model-Free Learning

  187. Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

  188. Solving Rubik’s Cube with a Robot Hand

  189. Learning agile and dynamic motor skills for legged robots

  190. Learning robust perceptive locomotion for quadrupedal robots in the wild

  191. Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam

  192. Decoupled Neural Interfaces using Synthetic Gradients

  193. Cerebro-cerebellar networks facilitate learning through feedback decoupling

  194. ‘end-to-end’ tag

  195. Russia Will Probably Legalize Some Software Piracy to Mitigate Sanctions

  196. Russian Government Rolls Back Intellectual Property Rights in Response to Western Sanctions

  197. Complexity no Bar to AI

  198. Cellular automata as convolutional neural networks

  199. Differentiable Self-Organizing Systems

  200. Self-Organising Textures

  201. Growing Neural Cellular Automata: Differentiable Model of Morphogenesis

  202. Adversarial Reprogramming of Neural Cellular Automata

  203. Regenerating Soft Robots through Neural Cellular Automata

  204. Growing 3D Artefacts and Functional Machines with Neural Cellular Automata

  205. Texture Generation with Neural Cellular Automata

  206. Variational Neural Cellular Automata

  207. The Future of Artificial Intelligence Is Self-Organizing and Self-Assembling

  208. 𝜇NCA: Texture Generation with Ultra-Compact Neural Cellular Automata

  209. Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine

  210. On Having No Head: Cognition throughout Biological Systems

  211. https://www.quantamagazine.org/flying-fish-and-aquarium-pets-yield-secrets-of-evolution-20220105/

  212. Synthetic living machines: A new window on life

  213. Fundamental behaviors emerge from simulations of a living minimal cell

  214. An Account of Electricity and the Body, Reviewed

  215. Is Bioelectricity the Key to Limb Regeneration?

  216. ‘Amazing Science’: Researchers Find Xenobots Can Give Rise to Offspring

  217. Perceptein: A synthetic protein-level neural network in mammalian cells

  218. Living Robots Made from Frog Cells Can Replicate Themselves in a Dish

  219. Cells Form Into ‘Xenobots’ on Their Own: Embryonic cells can self-assemble into new living forms that don’t resemble the bodies they usually generate, challenging old ideas of what defines an organism

  220. 9 Missile Commanders Fired, Others Disciplined In Air Force Scandal

  221. Security Troops on US Nuclear Missile Base Took LSD

  222. Amazing Details from the Drunken Moscow Bender That Got an Air Force General Fired

  223. Joan Rohlfing on how to avoid catastrophic nuclear blunders: The interaction between nuclear weapons and cybersecurity

  224. Hacking the Bomb: Cyber Threats and Nuclear Weapons

  225. The Curious Case of the Accidental Indian Missile Launch

  226. ‘illusion-of-depth bias’ tag

  227. https://arxiv.org/pdf/2109.01517#page=12

  228. Colab Notebook: HQU-V3.4-Light (Jax TPU)

  229. https://www.aleph.se/papers/Spamming%20the%20universe.pdf

  230. Advantages of Artificial Intelligences, Uploads, and Digital Minds

  231. Intelligence Explosion Microeconomics

  232. There is plenty of time at the bottom: the economics, risk and ethics of time compression

  233. AI Takeoff

  234. Fiction Relevant to AI Futurism

  235. Understand: A Novelette by Ted Chiang

  236. Slow Tuesday Night

  237. That Alien Message

  238. https://www.ssec.wisc.edu/~billh/g/mcnrsts.html

  239. AI Takeoff Story: a Continuation of Progress by Other Means

  240. Optimality Is the Tiger, and Agents Are Its Teeth

  241. https://press.asimov.com/resources/tinker

  242. https://x.com/robbensinger/status/1503220020175769602

  243. AGI Ruin: A List of Lethalities

  244. Without Specific Countermeasures, the Easiest Path to Transformative AI Likely Leads to AI Takeover

  245. http://skynetsimulator.com/

  246. ML Scaling subreddit

  247. It Looks Like You’re Trying To Take Over The World

  248. Shah and Yudkowsky on Alignment Failures

  249. https://www.reddit.com/r/slatestarcodex/comments/tag4lm/it_looks_like_youre_trying_to_take_over_the_world/

  250. https://www.reddit.com/r/rational/comments/ta57ag/it_looks_like_youre_trying_to_take_over_the_world/

  251. https://news.ycombinator.com/item?id=30818895

  252. https://news.ycombinator.com/item?id=34808718#34809360

  253. Wikipedia Bibliography:

    1. Defamiliarization

    2. All Your Base Are Belong to Us

    3. Variance

    4. Backpropagation

    5. System Accident

    6. Common Crawl

    7. Office Assistant

    8. Universal Paperclips

    9. Evidential Decision Theory

    10. Wirehead (science fiction)

    11. Expected Value

    12. Logit

    13. SQL

    14. SQL Injection

    15. Metasploit

    16. Tornado Cash

    17. Panama Papers

    18. Paradise Papers

    19. Pandora Papers

    20. IOTA (technology)

    21. Social Engineering (security)

    22. Federated Learning

    23. Diminishing Returns

    24. Elo Rating System

    25. Log4Shell

    26. JASBUG

    27. Spectre (security vulnerability)

    28. Unreachable Code § Goto Fail Bug

    29. Random Number Generator Attack § Debian OpenSSL

    30. Heartbleed

    31. Shellshock (software bug)

    32. Teleprinter § Teleprinters in Computing

    33. Phishing § Spear Phishing

    34. Great Oxidation Event

    35. Human Evolution

    36. Neolithic Revolution

    37. Industrial Revolution

    38. Warhol Worm

    39. Storm Oil

    40. Brandolini’s Law

    41. Mirai (malware)

    42. 2020 Twitter Account Hijacking

    43. Amdahl’s Law

    44. Iran–U.S. RQ-170 Incident

    45. Supply Chain Attack

    46. Fluorinert

    47. Seymour Cray

    48. Chudnovsky Brothers

    49. Renaissance Technologies

    50. Jim Simons (mathematician)

    51. Flatiron Institute

    52. Cerebras § Technology

    53. Static Random-Access Memory

    54. Entropy (information theory)

    55. Experience Curve Effects

    56. Autonomous System (Internet)

    57. Operation Barbarossa § Soviet Preparations

    58. Decentralized Autonomous Organization

    59. Decentralized Finance

    60. Elden Ring

    61. Demoscene

    62. Fat Leonard Scandal

    63. Stuxnet

    64. Strava § Privacy Concerns

    65. Permissive Action Link

    66. Nuclear Close Calls § 25 October 1962

    67. Russian Invasion of Ukraine

    68. Nuclear Close Calls § 9 November 1979

    69. 1983 Soviet Nuclear False Alarm Incident

    70. 2018 Hawaii False Missile Alert § The Alert

    71. 2017–2018 North Korea Crisis

    72. Launch on Warning § History

    73. Dead Hand

    74. Self-Replicating Spacecraft

    75. R. A. Lafferty

    76. Accelerando

    77. The Last Question