September 2021 Gwern.net newsletter with links on AI scaling, genetics, forecasting, psychology, and technology
September 2021’s Gwern.net newsletter is now out; previous, August 2021 (archives). This is a collation of links and summary of major changes, overlapping with my Changelog; brought to you by my donors on Patreon.
Writings
- Gwern.net: link bibliographies (eg. for this page); new server
Links
AI
- “DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning”, 2021
- “How Does AI Improve Human Decision-Making? Evidence from the AI-Powered Go Program”, et al 2021 (absolute human error rates: from the AI’s perspective, every move a Go pro makes costs them ~1.2% chance of winning)
- “State of AI 2021”, 2021
- “Achieving Human Parity on Visual Question Answering”, et al 2021
- “Transformers are Meta-Reinforcement Learners”, 2021 (expected, but good to check); “Transformers Can Do Bayesian Inference”, 2021 (meta-learning amortized Bayesian inference inside a Transformer—less expected, but neat)
- “ruDALL-E”: 1.2b-parameter DALL·E 1 trained on n=120m by Sberbank, public trained model (generates everything, including anime)
- Shadow Planet, by The Cotton Modules (album by Jesse Solomon Clark & Robin Sloan, exchanging edits of Jukebox tracks; “It’s basically a tuba! A very… strange… and powerful… tuba… … like wandering in an enormous labyrinth or a dead city”)
- “This Catgirl Does Not Exist”, EdZ543 (StyleGAN2-Ada on Safebooru catgirls, transfer-learned from TWDNEv2)
- “Fictitious Co-Play: Collaborating with Humans without Human Data”, et al 2021 (blessings of scale: diverse populations of agents automatically train more flexible & human-compatible agents, without fancy tricks)
- “A Recipe For Arbitrary Text Style Transfer with Large Language Models”, et al 2021 (improves on Story Centaur; turns out I almost got general text style transfer working earlier; see also this review of older text style transfer research—another Bitter Lesson; how many tens of millions of dollars in researcher/volunteer time, grants, overhead, publication/peer-review etc were, and will be, spent on dataset creation & clever research over the past decade, all to get worse results than politely asking an off-the-shelf large text model? For the flavor of the recipe, see the prompt sketch after this list.)
- “Symbolic Knowledge Distillation: from General Language Models to Commonsense Models”, et al 2021 (GPT-3 can generate commonsense causal knowledge graphs as good as a human-generated graph)
- “Masked Autoencoders Are Scalable Vision Learners”, et al 2021
- “Visible Thoughts Project and Bounty Announcement”, MIRI ($200k prize for a dataset teaching language models to ‘internal monologue’)
- “LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs”, et al 2021
- “CoAtNet: Marrying Convolution and Attention for All Data Sizes”, et al 2021 (90.88% ImageNet SOTA, set by CoAtNet-2.44b pretrained on JFT-3B)
- “Turing Bletchley: A Universal Image Language Representation model by Microsoft” (2.5b-param; n ~ billions + 0.5b CC translation pairs; beats CLIP/ALIGN)
- “Vector-quantized Image Modeling with Improved VQGAN”, 2021 (improving ViT-GAN up to 1.7b-parameters); “HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator”, 2021
- “Effect of scale on catastrophic forgetting in neural networks”, 2021
- “On the Opportunities and Risks of Foundation Models”, et al 2021
- “A Universal Law of Robustness via Isoperimetry”, 2021 (Twitter, talk; blessings of scale—scaling is also magic pixie dust for adversarial attacks, bitter-lessoning an entire academic field of ever more elaborate (yet failed) defenses…? See the paraphrased bound after this list); “Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power”, et al 2022
- “On the Predictability of Pruning Across Scales”, et al 2020 (scaling laws for sparsity: initially free large size reductions, then power-law worsening, then plateau at tiny but bad models)
- “Sparse Is Enough in Scaling Transformers”, et al 2021
- “What Are Bayesian Neural Network Posteriors Really Like?”, et al 2021 (frequentist-trained NNs nevertheless are samples from the posterior, so ensembles—and increasingly-large models?—are fully Bayesian)
- “GHVAE: Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction”, et al 2021
- “DALL·E 1 mini: Generate images from a text prompt”, et al 2021 (project/hackathon writeup of training a small-scale DALL·E 1; while not nearly as good as the full-scale OA model, of course, the results would’ve been striking just 2 years ago, and are trained on merely a TPUv3-8—demonstrating the hardware overhang and the ease of replication once you know what you’re doing.1)
- “Multimodal Few-Shot Learning with Frozen Language Models”, et al 2021; “AudioCLIP: Extending CLIP to Image, Text and Audio”, et al 2021
- “Google demonstrates leading performance in latest MLPerf Benchmarks” using TPUv4s
- “Time-Aware Language Models as Temporal Knowledge Bases”, et al 2021 (a nice use of the inline metadata trick to greatly improve T5 temporal knowledge/reasoning—just include more metadata! No fancy new symbolic architectures or training methods. Just provide more useful data. See the data-formatting sketch after this list.)
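To give the flavor of the “politely asking” recipe: augmented zero-shot style transfer is little more than string formatting plus one API call. A minimal sketch (the exemplar rewrites are invented placeholders of mine, and `complete` is a hypothetical stand-in for whatever large-LM completion API is handy):

```python
# A minimal sketch of augmented zero-shot text style transfer,
# after Reif et al 2021: a few fixed exemplar rewrites in unrelated
# styles, then the real request; the model completes the text inside
# the final '{...}' and we stop at the closing brace.

def style_transfer_prompt(sentence: str, style: str) -> str:
    exemplars = [
        ("The food was good.", "more melodramatic",
         "The food was a symphony of flavor the likes of which I shall never taste again."),
        ("She saw the bird.", "about skiing",
         "She saw the bird glide past as she carved down the slope."),
    ]
    lines = [f"Here is some text: {{{src}}}. "
             f"Here is a rewrite of the text, which is {target}: {{{rewrite}}}"
             for src, target, rewrite in exemplars]
    lines.append(f"Here is some text: {{{sentence}}}. "
                 f"Here is a rewrite of the text, which is {style}: {{")
    return "\n".join(lines)

prompt = style_transfer_prompt("The meeting ran long.", "more formal")
# rewrite = complete(prompt, stop="}")   # hypothetical completion-API call
print(prompt)
```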
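For reference, the robustness result as I paraphrase it from the talk (conditions & constants elided, so treat as approximate): fitting n noisy d-dimensional datapoints below the noise floor with p parameters forces non-smoothness,

```latex
% Universal law of robustness (paraphrased; conditions elided):
\mathrm{Lip}(f) \;\gtrsim\; \sqrt{\frac{nd}{p}}
% So robust (Lipschitz-O(1)) interpolation requires p \gtrsim nd:
% overparameterization by a factor of the input dimension d.
```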
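The metadata trick itself fits in a few lines; a sketch (the exact field format is my illustrative guess, not the paper’s verbatim preprocessing):

```python
# Sketch of the inline-metadata trick: prefix each training document
# with its date so the model learns to condition facts on time; then
# prefix queries the same way at inference. Field names illustrative.

def add_temporal_prefix(text: str, year: int) -> str:
    return f"year: {year} text: {text}"

corpus = [
    add_temporal_prefix("Theresa May is the Prime Minister of the United Kingdom.", 2017),
    add_temporal_prefix("Boris Johnson is the Prime Minister of the United Kingdom.", 2020),
]
for doc in corpus:
    print(doc)
# At query time: "year: 2020 text: The UK Prime Minister is [MASK]."
```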
Genetics
Everything Is Heritable:
- “Genetic risk factors have a substantial impact on healthy life years”, et al 2022
- “Novel disease associations with schizophrenia genetic risk revealed in ~400,000 UK Biobank participants”, et al 2021
- “Does Parental Education Influence Child Educational Outcomes? A Developmental Analysis in a Full-Population Sample and Adoptee Design”, et al 2021 (“no”)
- “The contribution of additive genetic variation to personality variation: heritability of personality”, et al 2015 (“52% of animal personality variation was attributable to additive genetic variation”)
- “Genome-wide association analyses of individual differences in quantitatively assessed reading-related and language-related skills in up to 34,000 people”, et al 2021 (unusually detailed phenotyping)
- “Identifying a living great-grandson of the Lakota Sioux leader Tatanka Iyotake (Sitting Bull)”, et al 2021 (estimating relatedness using 2,259 SNPs extracted from an 1890 arsenic-preserved sample of Sitting Bull’s hair)
- “Global Biobank Meta-Analysis Initiative: powering genetic discovery across human diseases”, Global Biobank Meta-Analysis 2021 (n = 2.1m)
Recent Evolution:
Engineering:
Statistics/Meta-Science
- “Sigmoids behaving badly: why they usually cannot predict the future as well as they seem to promise”, et al 2021 (Older discussions of this forecasting folklore: 1, 2, 3, 4. For a demonstration of why pre-inflection sigmoid fits promise much and pin down little, see the curve-fitting sketch after this list.)
- “Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment”, 2021
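A toy demonstration of the trap (my own sketch, not from the paper): logistic curves with wildly different ceilings fit the same pre-inflection data almost equally well, so the extrapolated ceiling mostly reflects the optimizer’s starting guess:

```python
# Fit a logistic curve to early-ramp data only: the observed points
# are matched about equally well from different starting guesses,
# while the extrapolated ceiling (the interesting forecast!) is
# barely constrained until the inflection is actually in the data.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, ceiling, rate, midpoint):
    return ceiling / (1 + np.exp(-rate * (t - midpoint)))

rng = np.random.default_rng(0)
t = np.arange(20.0)                       # observe only the early ramp
y = logistic(t, 100, 0.25, 30) + rng.normal(0, 0.3, t.shape)

for guess in (50, 200, 1000):             # three starting ceilings
    popt, _ = curve_fit(logistic, t, y,
                        p0=(guess, 0.1, 10),
                        bounds=([1, 0.01, 0], [1e5, 2, 500]))
    print(f"start={guess:5d} -> fitted ceiling ~ {popt[0]:10.1f}")
```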
Politics/Religion
- “Gender Identity, Coworking Spouses and Relative Income within Households”, 2021 (better explanations for the ‘gender cliff’ than misogyny)
- “The Razor Blade in the Apple: The Social Construction of Urban Legends”, Best & Horiuchi 1985 (the poisoned-Halloween-candy thing was never real)
- “Obesity of politicians and corruption in post-Soviet countries”, 2020
- “Rule Enforcement Without Visible Means: Christmas Gift Giving in Middletown”, 1984
Psychology/Biology
- “Remembering immunity: Neuronal ensembles in the insular cortex encode and retrieve specific immune responses”, et al 2020 (media)
- “Erosion of the Epigenetic Landscape and Loss of Cellular Identity as a Cause of Aging in Mammals”, et al 2021; “Making sense of the ageing methylome”, et al 2022
- “Blood-based epigenome-wide analyses of cognitive abilities”, et al 2021 (since the epigenome is further downstream, should be interesting for investigating how genetic & environmental causes of intelligence are, mechanistically, mediated)
- “A collective analysis of lifespan-extending compounds in diverse model organisms, and of species whose lifespan can be extended the most by the application of compounds”, 2021 (the ‘everything works in mice’ problem of life extension research)
- “Echolocating bats rely on an innate speed-of-sound reference”, 2021 (raising bats from birth in a helium environment to test nature vs nurture)
- “Reality shifting: psychological features of an emergent online daydreaming culture”, et al 2021
- “Nothing Ventured, Nothing Gained: [Toxoplasma Gondii] Parasite Infection is Associated with Entrepreneurial Initiation, Engagement, and Performance”, et al 2020 (getting increasingly difficult to explain away as a confound)
- “Escape of hair follicle stem cells causes stem cell exhaustion during aging” (hair follicles stop working with age as genes dysregulate & stem cells ‘leak’ out to be killed by the immune system)
Technology
- “Synthetic fat from petroleum as a resilient food for global catastrophes: Preliminary techno-economic assessment and technology roadmap”, et al 2021; “David Denkenberger on using paper mills and seaweed to feed everyone in a catastrophe”
- “2005: Shades of Doom” (Doom for the blind)
- “Ink traps and pals”, Toshi Omagari (challenges of microtypography & computer displays: ink or light runs)
- “Is this the simplest (and most surprising) sorting algorithm ever?”, 2021 (an even simpler but slower insertion sort, reproduced in the sketch after this list; you can make arbitrarily-slow sorting algorithms because you can work on permutations of arbitrary computable functions, but they won’t be as simple); “Coolex: The coolest way to generate combinations”, 2009
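For the record, the algorithm in question, transcribed into Python (it reads like a buggy descending sort, yet provably sorts ascending):

```python
# The entire algorithm from the paper: after the first outer pass,
# a[0] holds the maximum, and each later pass then acts like one
# step of insertion sort, so the array ends up in increasing order.
def simplest_sort(a):
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

print(simplest_sort([5, 2, 9, 1, 5, 6]))  # -> [1, 2, 5, 5, 6, 9]
```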
Economics
- “Hotelling’s law” (why do all those gas stations/coffee shops/pharmacies—or deep learning models…?—cluster together? See the toy simulation after this list)
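A toy best-response simulation of the classic line model (my own sketch; `share` and `best_response` are illustrative names): two vendors on [0,1] with uniformly distributed customers, each customer buying from the nearer vendor, leapfrog each other inward until both sit at the middle:

```python
# Hotelling's line: repeated best responses drive both vendors to
# the center -- "minimal differentiation", i.e. clustering together.

def share(x, y):
    """Market share of the vendor at x when the rival sits at y."""
    if x == y:
        return 0.5
    boundary = (x + y) / 2        # customers split at the midpoint
    return boundary if x < y else 1 - boundary

def best_response(y, grid=1001):
    positions = [i / (grid - 1) for i in range(grid)]
    return max(positions, key=lambda x: share(x, y))

a, b = 0.1, 0.9
for _ in range(500):              # alternate best responses until stable
    a = best_response(b)
    b = best_response(a)
print(a, b)                       # both ~0.5: clustered together
```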
Philosophy
- “The science of cycology: Failures to understand how everyday objects work”, 2006
- “Donors vastly underestimate differences in charities’ effectiveness”, et al 2020
- “Elephas Anthropogenus”, Uli Westphal 2015 (visualizing the degeneration of medieval European knowledge of what ‘an elephant’ looks like); “Distinguishing Real Vs. Fake Tiger Penises [Identification Guides for Wildlife Law Enforcement No. 6]”, 2005
Fiction
Miscellaneous
- The dastardly career of Emmanuel Barthélemy (shot a cop during a failed coup; was exiled after another failed uprising; was responsible for the last fatal duel in England, which killed a fellow activist; murdered his employer and then a neighbor, after possibly attempting to blackmail the employer with a fraudulent daughter and while plotting to assassinate Napoleon III; finally convicted, he made blasphemous jokes about his execution & requested Paradise Lost to read. And the one good killing Barthélemy might’ve done, that of Karl Marx—for being too conservative—he failed!)
Books
Nonfiction:
Fiction:
Film/TV
Live-action:
Animated:
Music
MLP:
Doujin:
Misc:
1. As Norbert Wiener remarked of the A-bomb, the only nuclear secret worth keeping from Stalin was the secret that it was feasible: knowledge of the implementation details isn’t that important, as much work as they may take, compared to the general idea & the knowledge that it works. I am reminded of the joke about the repairman, but to rewrite it for DL: “Here’s a 10-line diff fixing your AGI; my compute-bill is 10 million petaflop-days.”; “What‽ But it only takes 0.1 million to train it!” “Yes, 0.1m to train it with the diff, and 9.9m to know which 10 lines.”↩︎