- Gwern.net newsletter (Substack subscription page)
- March 2021 News
- ‘newsletter’ directory
- Changelog
- Gwern Branwen Creating Essays on Gwern.net
- Rare Greek Variables
- Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks
- Perceiver: General Perception with Iterative Attention
- Attention Is All You Need
- Do Transformer Modifications Transfer Across Implementations and Applications?
- Predictive Coding Can Do Exact Backpropagation on Any Neural Network
- Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
- Grokking: Generalization Beyond Overfitting On Small Algorithmic Datasets
- The large learning rate phase of deep learning: the catapult mechanism
- https://www.reddit.com/r/MachineLearning/comments/ba1wg5/d_thoughts_about_superconvergence_and/
- Rip van Winkle’s Razor, a Simple New Estimate for Adaptive Data Analysis
- Ambigrammatic Figures: 55 Grotesque Ambigrams
- Making Anime Faces With StyleGAN § Reversing StyleGAN To Control & Modify Images
- ML Scaling subreddit
- The Akronomicon: an Extreme-Scale Leaderboard
- Naver unveils first ‘hyperscale’ AI platform
- PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
- PCL-Platform.Intelligence/PanGu-Alpha: [200-billion-parameter open-source Chinese pretrained language model]
- ChinAI #141: The PanGu Origin Story: Notes from an informative Zhihu Thread on PanGu
- LaMDA: Our Breakthrough Conversation Technology
- MUM: A New AI Milestone for Understanding Information
- [Ali released PLUG: 27 billion parameters, the largest pre-trained language model in the Chinese community]
- StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
- PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation
- CogView: Mastering Text-to-Image Generation via Transformers
- DALL·E 1: Creating Images from Text: We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language
- M6: A Chinese Multimodal Pretrainer
- VideoGPT: Video Generation using VQ-VAE and Transformers
- GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
- Efficient Large-Scale Language Model Training on GPU Clusters
- NVIDIA/Megatron-LM: Ongoing Research Training Transformer Models at Scale
- GSPMD: General and Scalable Parallelization for ML Computation Graphs
- GTC 2021 Keynote With NVIDIA CEO Jensen Huang: NVIDIA CEO Jensen Huang Delivers the #GTC21 Keynote, Where He Introduced Amazing Breakthroughs in Building Virtual Worlds With NVIDIA Omniverse; in Advancing Enterprise Computing With New NVIDIA DGX Systems and Software; in Turning the Data Center into the New Unit of Computing With the New NVIDIA Grace CPU, BlueField-3 DPU, and DOCA 1.0 SDK; in Broadening the Reach of AI to All Companies and Industries With NVIDIA EGX and Aerial 5G; and in Transforming Transportation With NVIDIA DRIVE Orin and Atlan.
- 2021-04-12-jensenhuang-gtc2021keynote-ean_oizwuxa.en.vtt.txt
- Chinese AI lab challenges Google, OpenAI with a model of 1.75 trillion parameters
- China’s GPT-3? BAAI Introduces Superscale Intelligence Model ‘Wu Dao 1.0’: The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model.
- Exploring Sparse Expert Models and Beyond
- MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
- MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model
- 2021-schrittwieser-figure1-mspacmanmuzerologrewardscaling.jpg
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Learning and Planning in Complex Action Spaces
- Continuous Control for Searching and Planning with a Learned Model
- Muesli: Combining Improvements in Policy Optimization
- Visualizing MuZero Models
- Scaling Scaling Laws with Board Games
- Andy Jones
- Computer Optimization: Your Computer Is Faster Than You Think
- Scaling Laws for Language Transfer Learning
- Scaling Laws for Transfer
- Carbon Emissions and Large Neural Network Training
- How to Train BERT with an Academic Budget
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- https://web.archive.org/web/20211101000000/https://bls.gov/news.release/ecec.nr0.htm
- https://bls.gov/news.release/archives/ecec_031986.pdf
- Precision exercise medicine: understanding exercise response variability
- Analysis of genomic DNA from medieval plague victims suggests long-term effect of Yersinia pestis on human immunity genes
- China officially bans CRISPR babies, human clones and animal-human hybrids
- Reflecting Sunlight: Recommendations for Solar Geoengineering Research and Research Governance
- Should We Block the Sun? Scientists Say the Time Has Come to Study It. The National Academies said the United States must study technologies that would artificially cool the planet by reflecting away some sunlight, citing the lack of progress fighting global warming.
- Improving Public Sector Management at Scale? Experimental Evidence on School Governance in India
- Jay-Z’s 99 Problems, Verse 2: A Close Reading with Fourth Amendment Guidance for Cops and Perps
- Oxylipin biosynthesis reinforces cellular senescence and allows detection of senolysis
- Inside the Secret Sting Operations to Expose Celebrity Psychics: Are some celebrity mediums fooling their audience members by reading social media pages in advance? A group of online vigilantes is out to prove it
- If I fits I sits: A citizen science investigation into illusory contour susceptibility in domestic cats (Felis silvestris catus)
- Cetaceans, sex and sea serpents: an analysis of the Egede accounts of a "most dreadful monster" seen off the coast of Greenland in 1734
- Paxo's Pot-Pourri
- Building the perfect curse word: A psycholinguistic investigation of the form and meaning of taboo words
- How Developers Choose Names
- Bringing GNU Emacs to Native Code
- Hosting SQLite databases on Github Pages (or any static file hoster)
- Check out This Demo: I Run the SQL Query select Country_code, Long_name from Wdi_country Order by Rowid Desc Limit 100 and It Fetches Just 54.2KB of New Data (Across 49 Small HTTP Requests) to Return 100 Results—From a Statically Hosted Database File That’s 668.8MB!
- Fontemon
- How I Did Relay Quine
- Surprisingly Turing-Complete
- https://sigbovik.org/2021/proceedings.pdf
- https://sigbovik.org/2021/proceedings.pdf#page=8
- https://sigbovik.org/2021/proceedings.pdf#page=83
- https://sigbovik.org/2021/proceedings.pdf#page=126
- https://sigbovik.org/2021/proceedings.pdf#page=167
- https://sigbovik.org/2021/proceedings.pdf#page=216
- https://sigbovik.org/2021/proceedings.pdf#page=252
- Time Travel and Computing
- https://sigbovik.org/2021/proceedings.pdf#page=282
- The Association for Computational Heresy
- On the Impossibility of Supersized Machines
- https://journals.le.ac.uk/index.php/pst/issue/archive
- BMJ Christmas Issue
- BAHFest
- Possible Girls
- The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market
- The Performance Pay Nobel
- Evolution as Backstop for Reinforcement Learning
- The Ocean’s Hot Dog: The Development of the Fish Stick
- The esthetics of Smelly Art
- The Odor Value Concept in the Formal Analysis of Olfactory Art
- Hedonic Tone, Memetics, Scent, Sex, Spirituality
- Qualia Research Diary: Scents [Consciousness Research, Experiment, Genetics, Memetics, Scent, Valence]
- The Scent of the Nile: Jean-Claude Ellena creates a new perfume
- Mechanisms of scent-tracking in humans
- 2006-porter-humanscenttracking-41593_2007_bfnn1819_moesm2_esm.mp4
- Poor human olfaction is a 19th-century myth
- Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white
- History of Combinatorial Generation (The Art of Computer Programming: Volume 4: Pre-Fascicle 4B: §7.2.1.7) § Pg22
- https://x.com/add_hawk/status/1357071738731814912
- The Best in Fragrance…and More