scaling-hypothesis#blessings-of-scale
https://cse-robotics.engr.tamu.edu/dshell/cs689/papers/anderson72more_is_different.pdf
Appendix F: Personal Observations on the Reliability of the Shuttle
unseeing#confirmation-bias
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
BMT: Binarized Neural Machine Translation
Absolute Unit NNs: Regression-Based MLPs for Everything
This section presents an expanded (but still quite compact) version of the terse ConvMixer implementation that we presented in the paper. The code is given in **Figure 7**. We also present an even more terse implementation in **Figure 8**, which to the best of our knowledge is the first model that achieves the elusive dual goals of 80%+ ImageNet top-1 accuracy while also fitting into a tweet.
Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis
Reward is enough
backstop#clune-2019
‘meta-learning’ directory
Meta Learning Backpropagation And Improving It
BLUR: Meta-Learning Bidirectional Update Rules
PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management
Pathways: Asynchronous Distributed Dataflow for ML
DeepSpeed: Accelerating Large-Scale Model Inference and Training via System Optimizations and Compression
ZeRO-Infinity and DeepSpeed: Unlocking Unprecedented Model Scale for Deep Learning Training
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Efficient Large-Scale Language Model Training on GPU Clusters
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
There’s plenty of room at the Top: What will drive computer performance after Moore’s law?
Moore’s Law, AI, and the pace of Progress
Effect of scale on catastrophic forgetting in neural networks
Slowing Moore’s Law: How It Could Happen
Pony Preservation Project Panel 2021—FULL
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Distributed Deep Learning in Open Collaborations
DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications
‘experience curve’ directory
AI and Efficiency: We’re releasing an analysis showing that since 2012 the amount of compute needed to train a neural net to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months
Measuring the Algorithmic Efficiency of Neural Networks
Robert Oppenheimer
DeepMind and Google: the battle to control artificial intelligence. Demis Hassabis founded a company to build the world’s most powerful AI. Then Google bought him out. Hal Hodson asks who is in charge
An Empirical Model of Large-Batch Training
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
https://www.lesswrong.com/posts/65qmEJHDw3vw69tKm/proposal-scaling-laws-for-rl-generalization#bdzbeD9YvarEEopCq
One Big Net For Everything
Scaling Laws for Language Transfer Learning
Scaling Laws for Transfer
Why Tool AIs Want to Be Agent AIs
Risks from Learned Optimization in Advanced Machine Learning Systems
index#decisiontransformer-blog-section
https://x.com/arankomatsuzaki/status/1399471244760649729
Codex: Evaluating Large Language Models Trained on Code: Figure 14: When the prompt includes subtle bugs, Codex tends to produce worse code than it is capable of producing. This gap increases with model size. Including an instruction to write correct code helps a little but does not fix the problem. Even with no examples in the context, Codex produces substantially worse code than it is capable of.
gpt-3#roleplaying
The Basic AI Drives
Human-level performance in 3D multiplayer games with population-based reinforcement learning
Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads
Gato: A Generalist Agent
GPT-3: Language Models are Few-Shot Learners
DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language
BEiT: BERT Pre-Training of Image Transformers
WebGPT: Browser-assisted question-answering with human feedback
Boosting Search Engines with Interactive Agents
A data-driven approach for learning to control computers
Player of Games
Open-Ended Learning Leads to Generally Capable Agents
From Motor Control to Team Play in Simulated Humanoid Football
https://deepmind.google/discover/blog/learning-robust-real-time-cultural-transmission-without-human-data/
Grounded Language Learning Fast and Slow
Imitating Interactive Intelligence
Learning to Ground Multi-Agent Communication with Autoencoders
Collaborating with Humans without Human Data
Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria
Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination
Off-Belief Learning
Multitasking Inhibits Semantic Drift
What Are Bayesian Neural Network Posteriors Really Like?
‘Codex’ directory
Competitive Programming With AlphaCode
‘MuZero’ directory
Evolving Normalization-Activation Layers
index#mlp-mixer-why-now
NVAE: A Deep Hierarchical Variational Autoencoder
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images
R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
Cores that don’t count
Microsoft researchers win ImageNet computer vision challenge
Deep Residual Learning for Image Recognition
Learning To Tell Two Spirals Apart
A Recipe for Training Neural Networks
Fine-Tuning GPT-2 from Human Preferences § Bugs can optimize for bad behavior
Grokking: Generalization Beyond Overfitting On Small Algorithmic Datasets
The Shape of Learning Curves: a Review
The Phase Transition In Human Cognition
https://www.reddit.com/r/mlscaling/comments/sjzvl0/d_instances_of_nonlog_capability_spikes_or/
In-Context Learning and Induction Heads
The Bitter Lesson
‘MLP NN’ directory
The Brain as a Universal Learning Machine
Magna Alta Doctrina
Ray Interference: a Source of Plateaus in Deep Reinforcement Learning
‘inner monologue (AI)’ directory
HyperNetworks
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models
https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K
Shaking the foundations: delusions in sequence models for interaction and control
https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post
‘preference learning’ directory
tank#alternative-examples
2015-01-28-spidermanandthexmen-vol1-no2-sauron-cancerdinosaurs.jpg
Friendship Is Optimal (Fanfic)
Mathematics on a Distant Planet
Don’t Worry—It Can’t Happen
Now You Can (Try To) Serve Five Terabytes, Too
I Just Want to Serve 5 Terabytes
L2L: Training Large Neural Networks with Constant Memory using a New Execution Algorithm
Blowing the Lid off the CryptoNote/Bytecoin Scam (With the Exception of Monero)
Rekt - Value DeFi
Really Stupid ‘Smart Contract’ Bug Let Hackers Steal $31 Million in Digital Coin
Crypto Firm Nomad Loses Nearly $200 Million in Bridge Hack
Today’s LiFi hack happened because its internal swap() function would call out to any address using whatever message the attacker passed in
https://milksad.info/disclosure.html
AI Accelerators, Part IV: The Very Rich Landscape
TensorFlow Research Cloud (TRC): Accelerate your cutting-edge machine learning research with free Cloud TPUs
It’s All about the Benjamins: An Empirical Study on Incentivizing Users to Ignore Security Advice
A Style-Based Generator Architecture for Generative Adversarial Networks
Scammers Created an AI Hologram of Me to Scam Unsuspecting Projects
A Field Guide to Federated Optimization
Net2Net: Accelerating Learning via Knowledge Transfer
M6–10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Dota 2 With Large Scale Deep Reinforcement Learning § Pg11
Scaling Scaling Laws with Board Games
https://openai.com/research/formal-math
A Universal Law of Robustness via Isoperimetry
The Dirty Pipe Vulnerability
Surprisingly Turing-Complete
CVE-2022-21449: Psychic Signatures in Java
It is nevertheless funny that there is a Wycheproof test for this bug (of course there is, it’s the most basic implementation check in ECDSA) and nobody bothered to run it against one of the most important ECDSA’s until now.
dnm-archive#logout
https://x.com/rombulow/status/990684453734203392
How Many Computers Are In Your Computer?
https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2022-34718
Indefinite survival through backup copies
2004-perry.html
‘NN sparsity’ directory
On the Predictability of Pruning Across Scales
Knowledge distillation: A good teacher is patient and consistent
https://x.com/thiteanish/status/1635188333705043969
Community Alert: Ronin Validators Compromised
complexity#control
July 2020 News § ‘Modeling the Human Trajectory’
https://bullfrogreview.substack.com/p/honey-i-hacked-the-empathy-machine
Hackers Gaining Power of Subpoena Via Fake ‘Emergency Data Requests’
Apple and Meta Gave User Data to Hackers Who Used Forged Legal Requests: Hackers compromised the emails of law enforcement agencies; Data was used to enable harassment, may aid financial fraud
GPT-3 Creative Fiction § Literary Parodies
Uber Apparently Hacked by Teen, Employees Thought It Was a Joke: ‘I Think IT Would Appreciate Less Memes While They Handle the Breach’
The Radicalization Risks of GPT-3 and Advanced Neural Language Models
Computer Optimization: Your Computer Is Faster Than You Think
OpenAI Five: 2016–2019
Grandmaster level in StarCraft II using multi-agent reinforcement learning
scaling-hypothesis#meta-learning
The Billion Dollar AI Problem That Just Keeps Scaling
Fermi Estimate of Future Training Runs
https://cset.georgetown.edu/wp-content/uploads/AI-and-Compute-How-Much-Longer-Can-Computing-Power-Drive-Artificial-Intelligence-Progress.pdf
Factored Cognition
CycleGAN, a Master of Steganography
The Toxoplasma Of Rage
Duty Calls
Sort By Controversial
Specialist Ukrainian Drone Unit Picks off Invading Russian Forces As They Sleep
https://www.amazon.com/Genius-Makers-Mavericks-Brought-Facebook/dp/1524742678
CoreWeave
Target Hackers Broke in Via HVAC Company
Chinese Spies Hacked a Livestock App to Breach US State Networks: Vulnerabilities in Animal Tracking Software USAHERDS and Log4j Gave the Notorious APT41 Group a Foothold in Multiple Government Systems.
Supply chain attacks
China Has Already Reached Exascale—On Two Separate Systems
https://x.com/ID_AA_Carmack/status/1300280139717189640
NYU Accidentally Exposed Military Code-Breaking Computer Project to Entire Internet
Is Programmable Overhead Worth The Cost? How much do we pay for a system to be programmable? It depends upon who you ask
Fast Stencil-Code Computation on a Wafer-Scale Processor
UL2: Unifying Language Learning Paradigms
Chinchilla: Training Compute-Optimal Large Language Models
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Scaling Laws for Deep Learning
Extrapolating GPT-N performance
https://www.quantamagazine.org/computer-scientists-achieve-crown-jewel-of-cryptography-20201110/
‘tech economics’ directory
AI and Compute
https://arxiv.org/pdf/2108.07686.pdf#page=85
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
Agile Locomotion via Model-Free Learning
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Solving Rubik’s Cube with a Robot Hand
Learning agile and dynamic motor skills for legged robots
Learning robust perceptive locomotion for quadrupedal robots in the wild
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Decoupled Neural Interfaces using Synthetic Gradients
Cerebro-cerebellar networks facilitate learning through feedback decoupling
‘end-to-end’ directory
Russia Will Probably Legalize Some Software Piracy to Mitigate Sanctions
Russian Government Rolls Back Intellectual Property Rights in Response to Western Sanctions
Complexity no Bar to AI
Cellular automata as convolutional neural networks
Differentiable Self-Organizing Systems
Self-Organising Textures
Growing Neural Cellular Automata: Differentiable Model of Morphogenesis
Adversarial Reprogramming of Neural Cellular Automata
Regenerating Soft Robots through Neural Cellular Automata
Growing 3D Artefacts and Functional Machines with Neural Cellular Automata
Texture Generation with Neural Cellular Automata
Variational Neural Cellular Automata
The Future of Artificial Intelligence Is Self-Organizing and Self-Assembling
𝜇NCA: Texture Generation with Ultra-Compact Neural Cellular Automata
Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine
On Having No Head: Cognition throughout Biological Systems
https://www.quantamagazine.org/flying-fish-and-aquarium-pets-yield-secrets-of-evolution-20220105/
Synthetic living machines: A new window on life
Fundamental behaviors emerge from simulations of a living minimal cell
An Account of Electricity and the Body, Reviewed
Is Bioelectricity the Key to Limb Regeneration?
‘Amazing Science’: Researchers Find Xenobots Can Give Rise to Offspring
Perceptein: A synthetic protein-level neural network in mammalian cells
Living Robots Made from Frog Cells Can Replicate Themselves in a Dish
Cells Form Into ‘Xenobots’ on Their Own: Embryonic cells can self-assemble into new living forms that don’t resemble the bodies they usually generate, challenging old ideas of what defines an organism
9 Missile Commanders Fired, Others Disciplined In Air Force Scandal
Security Troops on US Nuclear Missile Base Took LSD
Amazing Details from the Drunken Moscow Bender That Got an Air Force General Fired
Joan Rohlfing on how to avoid catastrophic nuclear blunders: The interaction between nuclear weapons and cybersecurity
Hacking the Bomb: Cyber Threats and Nuclear Weapons
The Curious Case of the Accidental Indian Missile Launch
‘illusion-of-depth bias’ directory
https://arxiv.org/pdf/2109.01517#page=12
Colab Notebook: HQU-V3.4-Light (Jax TPU)
Clippy Desktop Assistant
https://www.aleph.se/papers/Spamming%20the%20universe.pdf
Advantages of Artificial Intelligences, Uploads, and Digital Minds
Intelligence Explosion Microeconomics
There is plenty of time at the bottom: the economics, risk and ethics of time compression
AI Takeoff Tag
Fiction Relevant to AI Futurism
Understand
Slow Tuesday Night
That Alien Message
https://www.ssec.wisc.edu/~billh/g/mcnrsts.html
AI Takeoff Story: a Continuation of Progress by Other Means
Optimality Is the Tiger, and Agents Are Its Teeth
https://press.asimov.com/resources/tinker
https://x.com/robbensinger/status/1503220020175769602
AGI Ruin: A List of Lethalities
Without Specific Countermeasures, the Easiest Path to Transformative AI Likely Leads to AI Takeover
http://skynetsimulator.com/
How AI Takeover Might Happen in 2 Years § Pandora’s 1 GW Box
ML Scaling subreddit
It Looks Like You’re Trying To Take Over The World
Shah and Yudkowsky on Alignment Failures
https://www.reddit.com/r/slatestarcodex/comments/tag4lm/it_looks_like_youre_trying_to_take_over_the_world/
https://www.reddit.com/r/rational/comments/ta57ag/it_looks_like_youre_trying_to_take_over_the_world/
https://news.ycombinator.com/item?id=30818895
https://news.ycombinator.com/item?id=34808718#34809360