September 2021 News
September 2021 Gwern.net newsletter with links on TODO
September 2021’s Gwern.net newsletter is now out; previous, August 2021 (archives). This is a collation of links and summary of major changes, overlapping with my Changelog; brought to you by my donors on Patreon.
Writings
Gwern.net: link bibliographies (eg. for this page); new server
Links
AI
“DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning”, Wang & Lian 2021
“How Does AI Improve Human Decision-Making? Evidence from the AI-Powered Go Program”, Choi et al 2021 (absolute human error rates: from the AI’s perspective, every move a Go pro makes costs them ~1.2% of their chance of winning)
“State of AI Report 2021”, Benaich & Hogarth 2021
“Achieving Human Parity on Visual Question Answering”, Yan et al 2021
“Transformers are Meta-Reinforcement Learners”, Anonymous 2021 (expected, but good to check); “Transformers Can Do Bayesian Inference”, Anonymous 2021 (meta-learning amortized Bayesian inference inside a Transformer—less expected, but neat)
“ruDALL-E”: 1.2b-parameter DALL·E 1 trained on n=120m by Sberbank, public trained model (generates everything, including anime)
Shadow Planet, by The Cotton Modules (album by Jesse Solomon Clark & Robin Sloan, exchanging edits of Jukebox tracks; “It’s basically a tuba! A very… strange… and powerful… tuba… … like wandering in an enormous labyrinth or a dead city”)
“This Catgirl Does Not Exist”, EdZ543 (StyleGAN2-Ada on Safebooru catgirls, transfer-learned from TWDNEv2)
“Fictitious Co-Play: Collaborating with Humans without Human Data”, Strouse et al 2021 (blessings of scale: diverse populations of agents automatically train more flexible & human-compatible agents, without fancy tricks)
“A Recipe For Arbitrary Text Style Transfer with Large Language Models”, Reif et al 2021 (improves on Story Centaur; turns out I almost got general text style transfer working earlier; see also this review of older text style transfer research—another Bitter Lesson; how many tens of millions of dollars in researcher/volunteer time, grants, overhead, publication/peer-review etc. was, and will be, spent on dataset creation & clever research over the past decade, all to get worse results than politely asking an off-the-shelf large text model?)
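The “politely asking” above can be made concrete. Here is a sketch of the augmented zero-shot prompt format in the spirit of Reif et al 2021: a few rewrite demonstrations in unrelated styles, then the target request left open for the model to complete. The template and example wording are illustrative assumptions, not the paper’s verbatim prompt:

```python
def style_prompt(text, style, examples):
    """Build an augmented zero-shot style-transfer prompt: a few
    rewrite demonstrations in *unrelated* styles, then the target
    request, left open for the language model to complete.
    (Template is illustrative, not Reif et al 2021's verbatim one.)"""
    lines = []
    for src, sty, dst in examples:
        lines.append(f"Here is some text: {{{src}}}. "
                     f"Here is a rewrite of the text, which is {sty}: {{{dst}}}")
    lines.append(f"Here is some text: {{{text}}}. "
                 f"Here is a rewrite of the text, which is {style}: {{")
    return "\n".join(lines)
```

Feeding such a string to a large off-the-shelf model and taking the completion up to the closing brace is essentially the whole “method”—no dataset, no training.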
“Symbolic Knowledge Distillation: from General Language Models to Commonsense Models”, West et al 2021 (GPT-3 can generate commonsense causal knowledge graphs as good as a human-generated graph)
“Masked Autoencoders Are Scalable Vision Learners”, He et al 2021
“Visible Thoughts Project and Bounty Announcement”, MIRI ($200k prize for dataset teaching language models to ‘internal monologue’)
“LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs”, Schuhmann et al 2021
“CoAtNet: Marrying Convolution and Attention for All Data Sizes”, Dai et al 2021 (90.88% ImageNet SOTA, set by CoAtNet-2.44b pretrained on JFT-3B)
“Turing Bletchley: A Universal Image Language Representation model by Microsoft” (2.5b-param; n ~ billions + 0.5b CC translation pairs; beats CLIP/ALIGN)
“Vector-quantized Image Modeling with Improved VQGAN”, Anonymous 2021 (improving ViT-GAN up to 1.7b-parameters); “HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator”, Anonymous 2021
“Effect of scale on catastrophic forgetting in neural networks”, Anonymous 2021
“On the Opportunities and Risks of Foundation Models”, Bommasani et al 2021
“A Universal Law of Robustness via Isoperimetry”, Bubeck & Sellke 2021 (Twitter, talk; blessings of scale—scaling is also magic pixie dust for adversarial attacks, bitter-lessoning an entire academic field of ever more elaborate (yet failed) defenses…?); “Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power”, Li et al 2022
“On the Predictability of Pruning Across Scales”, Rosenfeld et al 2020 (scaling laws for sparsity: initially free large size reductions, then power-law worsening, then plateau at tiny but bad models)
“Sparse Is Enough in Scaling Transformers”, Jaszczur et al 2021
“What Are Bayesian Neural Network Posteriors Really Like?”, Izmailov et al 2021 (frequentist-trained NNs nevertheless are samples from the posterior, so ensembles—and increasingly-large models?—are fully Bayesian)
“GHVAE: Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction”, Wu et al 2021
“DALL·E 1 mini: Generate images from a text prompt”, Dayma et al 2021 (project/hackathon writeup of training a small-scale DALL·E 1; while not nearly as good as the full-scale OA model, of course, the results would’ve been striking just 2 years ago, and are trained on merely a TPUv3-8—demonstrating the hardware overhang and the ease of replication once you know what you’re doing.1)
“Multimodal Few-Shot Learning with Frozen Language Models”, Tsimpoukelli et al 2021; “AudioCLIP: Extending CLIP to Image, Text and Audio”, Guzhov et al 2021
“Google demonstrates leading performance in latest MLPerf Benchmarks” using TPUv4s
“Time-Aware Language Models as Temporal Knowledge Bases”, Dhingra et al 2021 (a nice use of the inline metadata trick to greatly improve T5 temporal knowledge/reasoning—just include more metadata! No fancy new symbolic architectures or training methods. Just provide more useful data.)
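The trick itself fits in one line: prepend the document’s timestamp as plain text before training or querying. The exact prefix format below is a paraphrase of the paper’s, so treat it as illustrative:

```python
def add_time_prefix(year, text):
    """Inline-metadata trick: let a language model condition on *when*
    a document was written by prepending its year as plain text
    (prefix format illustrative, after Dhingra et al 2021)."""
    return f"year: {year} text: {text}"
```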
Genetics
Everything Is Heritable:
“Genetic risk factors have a substantial impact on healthy life years”, Jukarainen et al 2022
“Novel disease associations with schizophrenia genetic risk revealed in ~400,000 UK Biobank participants”, Zhang et al 2021
“Does Parental Education Influence Child Educational Outcomes? A Developmental Analysis in a Full-Population Sample and Adoptee Design”, Ludeke et al 2021 (“no”)
“Using genes to explore the effects of cognitive and non-cognitive skills on education and labor market outcomes”, Buser et al 2021
“The contribution of additive genetic variation to personality variation: heritability of personality”, Dochtermann et al 2015 (“52% of animal personality variation was attributable to additive genetic variation”)
“Uncovering the Genetic Architecture of Broad Antisocial Behavior through a Genome-Wide Association Study Meta-analysis”, Tielbeek et al 2021
“Genome-wide association analyses of individual differences in quantitatively assessed reading-related and language-related skills in up to 34,000 people”, Eising et al 2021 (unusually detailed phenotyping)
“Identifying a living great-grandson of the Lakota Sioux leader Tatanka Iyotake (Sitting Bull)”, Moltke et al 2021 (estimating relatedness using 2,259 SNPs extracted from an 1890 arsenic-preserved sample of Sitting Bull’s hair)
“Global Biobank Meta-Analysis Initiative: powering genetic discovery across human diseases”, Global Biobank Meta-Analysis Initiative 2021 (n = 2.1m)
Recent Evolution:
“Natural selection contributes to the myopia epidemic”, Long & Zhang 2020
Engineering:
“Facultative Parthenogenesis in California Condors”, Ryder et al 2021
Statistics/Meta-Science
“Sigmoids behaving badly: why they usually cannot predict the future as well as they seem to promise”, Sandberg et al 2021 (Older discussions of this forecasting folklore: 1, 2, 3, 4)
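A toy numerical illustration of why early sigmoid data underdetermines the ceiling (parameter values here are mine, chosen for the demonstration): in the early exponential regime, the ceiling L and the midpoint t0 trade off almost perfectly, so two curves with 10× different ceilings are nearly indistinguishable until the bend.

```python
import math

def logistic(t, L, k, t0):
    """Logistic curve with ceiling L, growth rate k, midpoint t0."""
    return L / (1 + math.exp(-k * (t - t0)))

# Two curves whose ceilings differ 10-fold, tuned so their early
# (exponential-regime) values nearly coincide: for t << t0,
# L/(1+exp(-k(t-t0))) ~ L*exp(k(t-t0)), so L and t0 are confounded.
f_small = lambda t: logistic(t, L=1.0, k=1.0, t0=0.0)
f_big = lambda t: logistic(t, L=10.0, k=1.0, t0=math.log(10))

# Relative disagreement while still "early" is under 2%...
early = [abs(f_big(t) - f_small(t)) / f_small(t) for t in range(-8, -3)]
# ...yet after the bend the curves differ by nearly the full 10x.
late = abs(f_big(5.0) - f_small(5.0))
```

Any fit to the early points alone therefore cannot distinguish a ceiling of 1 from a ceiling of 10, which is the forecasting folklore in miniature.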
“Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment”, Cortes & Lawrence 2021
Politics/Religion
“Gender Identity, Coworking Spouses and Relative Income within Households”, Zinovyeva & Tverdostup 2021 (better explanations for the ‘gender cliff’ than misogyny)
“The Razor Blade in the Apple: The Social Construction of Urban Legends”, Best & Horiuchi 1985 (the poisoned-Halloween-candy thing was never real)
“Obesity of politicians and corruption in post-Soviet countries”, Blavatskyy 2020
“Rule Enforcement Without Visible Means: Christmas Gift Giving in Middletown”, Caplow 1984
Psychology/Biology
“Remembering immunity: Neuronal ensembles in the insular cortex encode and retrieve specific immune responses”, Koren et al 2020 (media)
“Erosion of the Epigenetic Landscape and Loss of Cellular Identity as a Cause of Aging in Mammals”, Yang et al 2021; “Making sense of the aging methylome”, Seale et al 2022
“Blood-based epigenome-wide analyses of cognitive abilities”, McCartney et al 2021 (since the epigenome is further downstream, should be interesting for investigating how genetic & environmental causes of intelligence are, mechanistically, mediated)
“A collective analysis of lifespan-extending compounds in diverse model organisms, and of species whose lifespan can be extended the most by the application of compounds”, Berkel & Cacan 2021 (the ‘everything works in mice’ problem of life extension research)
“Echolocating bats rely on an innate speed-of-sound reference”, Amichai & Yovel 2021 (raising bats from birth in a helium environment to test nature vs nurture)
“Attractor and integrator networks in the brain”, Khona & Fiete 2021
“Reality shifting: psychological features of an emergent online daydreaming culture”, Somer et al 2021
“Nothing Ventured, Nothing Gained: [Toxoplasma Gondii] Parasite Infection is Associated with Entrepreneurial Initiation, Engagement, and Performance”, Lerner et al 2020 (getting increasingly difficult to explain away as a confound)
“Escape of hair follicle stem cells causes stem cell exhaustion during aging” (hair follicles stop working with age as genes dysregulate & stem cells ‘leak’ out to be killed by the immune system)
Technology
“Synthetic fat from petroleum as a resilient food for global catastrophes: Preliminary techno-economic assessment and technology roadmap”, Martínez et al 2021; “David Denkenberger on using paper mills and seaweed to feed everyone in a catastrophe”
“2005: Shades of Doom” (Doom for the blind)
“Ink traps and pals”, Toshi Omagari (challenges of microtypography & computer displays: ink or light runs)
“Is this the simplest (and most surprising) sorting algorithm ever?”, Fung 2021 (an even simpler slower insertion sort; you can make arbitrarily-slow sorting algorithms because you can work on permutations of arbitrary computable functions, but they won’t be as simple); “Coolex: The coolest way to generate combinations”, Ruskey & Williams 2009
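For reference, Fung’s algorithm really is this short; a direct Python transcription (function name mine):

```python
def simplest_sort(a):
    """Fung 2021's 'I can't believe it can sort': two plain loops and
    a comparison that looks backwards, yet it sorts *ascending*."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a
```

The first outer pass parks the maximum at the front; each later pass then effectively performs an insertion step, which is why the “wrong-way” comparison nevertheless sorts ascending.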
Economics
“Hotelling’s law” (why do all those gas stations/coffee shops/pharmacies—or deep learning models…?—cluster together?)
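The clustering logic is easy to see in a toy best-response simulation (the discrete grid, payoff, and alternating updates are my simplifying assumptions, not Hotelling’s original continuous model): both vendors leapfrog inward until they sit together at the median customer.

```python
def hotelling(a, b, positions=range(21), rounds=20):
    """Toy Hotelling dynamics: customers sit at each integer spot on a
    line and buy from the nearer of two vendors (ties split evenly).
    Each round, each vendor in turn jumps to the spot maximizing its
    customer share, holding the rival fixed."""
    def share(x, y):
        # customers captured by a vendor at x against a rival at y
        return sum(1.0 if abs(p - x) < abs(p - y)
                   else 0.5 if abs(p - x) == abs(p - y)
                   else 0.0
                   for p in positions)
    for _ in range(rounds):
        a = max(positions, key=lambda x: share(x, b))  # a best-responds,
        b = max(positions, key=lambda x: share(x, a))  # then b does
    return a, b
```

Starting from any two positions, both vendors end up side by side at the median spot splitting the market—the “why are all the gas stations next door to each other” equilibrium.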
Philosophy
“The science of cycology: Failures to understand how everyday objects work”, Lawson 2006
“Donors vastly underestimate differences in charities’ effectiveness”, Caviola et al 2020
“Elephas Anthropogenus”, Uli Westphal 2015 (visualizing the degeneration of medieval European knowledge of what ‘an elephant’ looks like); “Distinguishing Real Versus Fake Tiger Penises [Identification Guides for Wildlife Law Enforcement No. 6]”, Yates 2005
Fiction
Miscellaneous
The dastardly career of Emmanuel Barthélemy (shot a cop during a failed coup, was exiled after another failed uprising, fought the last fatal duel in England, which killed a fellow activist, and murdered his employer and then a neighbor, after possibly attempting to blackmail the employer with a fraudulent daughter and while plotting to assassinate Napoleon III; finally convicted, he made blasphemous jokes about his execution & requested Paradise Lost to read. And the one good killing Barthélemy might’ve done, that of Karl Marx—for being too conservative—he failed!)
Books
Nonfiction:
Fiction:
Film/TV
Live-action:
Animated:
Music
MLP:
Doujin:
Misc:
As Norbert Wiener remarked of the A-bomb, the only nuclear secret worth keeping from Stalin was the secret that it was feasible: knowledge of the implementation details isn’t that important, as much work as they may take, compared to the general idea & the knowledge that it works. I am reminded of the joke about the repairman, but to rewrite it for DL: “Here’s a 10-line diff fixing your AGI; my compute-bill is 10 million petaflop-days.”; “What‽ But it only takes 0.1 million to train it!” “Yes, 0.1m to train it with the diff, and 9.90m to know which 10 lines.”↩︎