scaling-hypothesis#blessings-of-scale
https://cse-robotics.engr.tamu.edu/dshell/cs689/papers/anderson72more_is_different.pdf
Appendix F: Personal Observations on the Reliability of the Shuttle
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
This section presents an expanded (but still quite compact) version of the terse ConvMixer implementation that we presented in the paper. The code is given in **Figure 7**. We also present an even more terse implementation in **Figure 8**, which to the best of our knowledge is the first model that achieves the elusive dual goals of 80%+ ImageNet top-1 accuracy while also fitting into a tweet.
Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis
PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management
DeepSpeed: Accelerating Large-Scale Model Inference and Training via System Optimizations and Compression
ZeRO-Infinity and DeepSpeed: Unlocking Unprecedented Model Scale for Deep Learning Training
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Efficient Large-Scale Language Model Training on GPU Clusters
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
There’s plenty of room at the Top: What will drive computer performance after Moore’s law?
Effect of scale on catastrophic forgetting in neural networks
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications
AI and Efficiency: We’re releasing an analysis showing that since 2012 the amount of compute needed to train a neural net to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months
DeepMind and Google: the battle to control artificial intelligence. Demis Hassabis founded a company to build the world’s most powerful AI. Then Google bought him out. Hal Hodson asks who is in charge
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
https://www.lesswrong.com/posts/65qmEJHDw3vw69tKm/proposal-scaling-laws-for-rl-generalization#bdzbeD9YvarEEopCq
Risks from Learned Optimization in Advanced Machine Learning Systems
index#decisiontransformer-blog-section
Codex: Evaluating Large Language Models Trained on Code: Figure 14: When the prompt includes subtle bugs, Codex tends to produce worse code than it is capable of producing. This gap increases with model size. Including an instruction to write correct code helps a little but does not fix the problem. Even with no examples in the context, Codex produces substantially worse code than it is capable of.
Human-level performance in 3D multiplayer games with population-based reinforcement learning
Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads
DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language
WebGPT: Browser-assisted question-answering with human feedback
From Motor Control to Team Play in Simulated Humanoid Football
https://deepmind.google/discover/blog/learning-robust-real-time-cultural-transmission-without-human-data/
Learning to Ground Multi-Agent Communication with Autoencoders
Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria
Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination
fully-connected#mlp-mixer-why-now
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images
R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
Microsoft researchers win ImageNet computer vision challenge
Fine-Tuning GPT-2 from Human Preferences § Bugs can optimize for bad behavior
Grokking: Generalization Beyond Overfitting On Small Algorithmic Datasets
https://www.reddit.com/r/mlscaling/comments/sjzvl0/d_instances_of_nonlog_capability_spikes_or/
Ray Interference: a Source of Plateaus in Deep Reinforcement Learning
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models
https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned#AAC8jKeDp6xqsZK2K
Shaking the foundations: delusions in sequence models for interaction and control
https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post
2015-01-28-spidermanandthexmen-vol1-no2-sauron-cancerdinosaurs.jpg
L2L: Training Large Neural Networks with Constant Memory using a New Execution Algorithm
Blowing the Lid off the CryptoNote/Bytecoin Scam (with the Exception of Monero)
Really Stupid ‘Smart Contract’ Bug Let Hackers Steal $31 Million in Digital Coin
Crypto Firm Nomad Loses Nearly $200 Million in Bridge Hack
Today’s LiFi hack happened because its internal swap() function would call out to any address using whatever message the attacker passed in. This allowed the attacker to have the contract transferFrom() out the funds from anyone who had approved the contract. Since the contract was designed to make multiple swaps in a single transaction, the attacker sent a single huge transaction with a wall of transferFrom’s for the contract to send, each moving money from a user that had approved the contract, to the attacker. · `// solhint-disable-next-line avoid-low-level-calls` That’s really putting salt in the wound ._. · Should not ignore warnings. ^_^
TensorFlow Research Cloud (TRC): Accelerate your cutting-edge machine learning research with free Cloud TPUs
It’s All about the Benjamins: An Empirical Study on Incentivizing Users to Ignore Security Advice
A Style-Based Generator Architecture for Generative Adversarial Networks
Scammers Created an AI Hologram of Me to Scam Unsuspecting Projects
M6–10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Dota 2 With Large Scale Deep Reinforcement Learning § Pg11
It is nevertheless funny that there is a Wycheproof test for this bug (of course there is, it’s the most basic implementation check in ECDSA) and nobody bothered to run it against one of the most important ECDSA’s until now.
https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2022-34718
Knowledge distillation: A good teacher is patient and consistent
July 2020 News § ‘Modeling the Human Trajectory’
https://bullfrogreview.substack.com/p/honey-i-hacked-the-empathy-machine
Hackers Gaining Power of Subpoena Via Fake ‘Emergency Data Requests’
Apple and Meta Gave User Data to Hackers Who Used Forged Legal Requests: Hackers compromised the emails of law enforcement agencies; Data was used to enable harassment, may aid financial fraud
GPT-3 Creative Fiction § Literary Parodies
Uber Apparently Hacked by Teen, Employees Thought It Was a Joke: ‘I Think IT Would Appreciate Less Memes While They Handle the Breach’
The Radicalization Risks of GPT-3 and Advanced Neural Language Models
Computer Optimization: Your Computer Is Faster Than You Think
Grandmaster level in StarCraft II using multi-agent reinforcement learning
scaling-hypothesis#meta-learning
https://cset.georgetown.edu/wp-content/uploads/AI-and-Compute-How-Much-Longer-Can-Computing-Power-Drive-Artificial-Intelligence-Progress.pdf
Specialist Ukrainian Drone Unit Picks off Invading Russian Forces As They Sleep
https://www.amazon.com/Genius-Makers-Mavericks-Brought-Facebook/dp/1524742678
Chinese Spies Hacked a Livestock App to Breach US State Networks: Vulnerabilities in Animal Tracking Software USAHERDS and Log4j Gave the Notorious APT41 Group a Foothold in Multiple Government Systems.
China Has Already Reached Exascale—On Two Separate Systems
NYU Accidentally Exposed Military Code-Breaking Computer Project to Entire Internet
Is Programmable Overhead Worth The Cost? How much do we pay for a system to be programmable? It depends upon who you ask
Chinchilla: Training Compute-Optimal Large Language Models
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
https://www.quantamagazine.org/computer-scientists-achieve-crown-jewel-of-cryptography-20201110/
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Learning robust perceptive locomotion for quadrupedal robots in the wild
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Cerebro-cerebellar networks facilitate learning through feedback decoupling
Russia Will Probably Legalize Some Software Piracy to Mitigate Sanctions
Russian Government Rolls Back Intellectual Property Rights in Response to Western Sanctions
Growing Neural Cellular Automata: Differentiable Model of Morphogenesis
Growing 3D Artefacts and Functional Machines with Neural Cellular Automata
The Future of Artificial Intelligence Is Self-Organizing and Self-Assembling
𝜇NCA: Texture Generation with Ultra-Compact Neural Cellular Automata
Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine
On Having No Head: Cognition throughout Biological Systems
https://www.quantamagazine.org/flying-fish-and-aquarium-pets-yield-secrets-of-evolution-20220105/
Fundamental behaviors emerge from simulations of a living minimal cell
‘Amazing Science’: Researchers Find Xenobots Can Give Rise to Offspring
Perceptein: A synthetic protein-level neural network in mammalian cells
Living Robots Made from Frog Cells Can Replicate Themselves in a Dish
Cells Form Into ‘Xenobots’ on Their Own: Embryonic cells can self-assemble into new living forms that don’t resemble the bodies they usually generate, challenging old ideas of what defines an organism
9 Missile Commanders Fired, Others Disciplined In Air Force Scandal
Amazing Details from the Drunken Moscow Bender That Got an Air Force General Fired
Joan Rohlfing on how to avoid catastrophic nuclear blunders: The interaction between nuclear weapons and cybersecurity
Advantages of Artificial Intelligences, Uploads, and Digital Minds
There is plenty of time at the bottom: the economics, risk and ethics of time compression
AI Takeoff Story: a Continuation of Progress by Other Means
Without Specific Countermeasures, the Easiest Path to Transformative AI Likely Leads to AI Takeover
https://www.reddit.com/r/slatestarcodex/comments/tag4lm/it_looks_like_youre_trying_to_take_over_the_world/
https://www.reddit.com/r/rational/comments/ta57ag/it_looks_like_youre_trying_to_take_over_the_world/
Wikipedia Bibliography: