September 2021 News
September 2021 Gwern.net newsletter with links on TODO
September 2021’s Gwern.net newsletter is now out; previous, August 2021 (archives). This is a collation of links and summary of major changes, overlapping with my Changelog; brought to you by my donors on Patreon.
Writings
Gwern.net: link bibliographies (eg. for this page); new server
Links
AI
“DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning”, Wang & Lian 2021
“How Does AI Improve Human Decision-Making? Evidence from the AI-Powered Go Program”, Choi et al 2021 (absolute human error rates: from the AI’s perspective, every move a Go pro makes costs them ~1.2% of their chance of winning)
“State of AI Report 2021”, Benaich & Hogarth 2021
“Achieving Human Parity on Visual Question Answering”, Yan et al 2021
“Transformers are Meta-Reinforcement Learners”, Anonymous 2021 (expected, but good to check); “Transformers Can Do Bayesian Inference”, Anonymous 2021 (meta-learning amortized Bayesian inference inside a Transformer—less expected, but neat)
“ruDALL-E”: 1.2b-parameter DALL·E 1 trained on n=120m by Sberbank, public trained model (generates everything, including anime)
Shadow Planet, by The Cotton Modules (album by Jesse Solomon Clark & Robin Sloan, exchanging edits of Jukebox tracks; “It’s basically a tuba! A very… strange… and powerful… tuba… … like wandering in an enormous labyrinth or a dead city”)
“This Catgirl Does Not Exist”, EdZ543 (StyleGAN2-Ada on Safebooru catgirls, transfer-learned from TWDNEv2)
“Fictitious Co-Play: Collaborating with Humans without Human Data”, Strouse et al 2021 (blessings of scale: diverse populations of agents automatically train more flexible & human-compatible agents, without fancy tricks)
“A Recipe For Arbitrary Text Style Transfer with Large Language Models”, Reif et al 2021 (improves on Story Centaur; turns out I almost got general text style transfer working earlier; see also this review of older text style transfer research—another Bitter Lesson; how many tens of millions of dollars in researcher/volunteer time, grants, overhead, publication/peer-review etc. was, and will be, spent on dataset creation & clever research over the past decade, all to get worse results than politely asking an off-the-shelf large text model?)
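The “politely asking” above can be made concrete. Here is a sketch of the augmented zero-shot prompt format in the spirit of Reif et al 2021: a few rewrite demonstrations in unrelated styles, then the target request left open for the model to complete. The template and example wording are illustrative assumptions, not the paper’s verbatim prompt:

```python
def style_prompt(text, style, examples):
    """Build an augmented zero-shot style-transfer prompt: a few
    rewrite demonstrations in *unrelated* styles, then the target
    request, left open for the language model to complete.
    (Template is illustrative, not Reif et al 2021's verbatim one.)"""
    lines = []
    for src, sty, dst in examples:
        lines.append(f"Here is some text: {{{src}}}. "
                     f"Here is a rewrite of the text, which is {sty}: {{{dst}}}")
    lines.append(f"Here is some text: {{{text}}}. "
                 f"Here is a rewrite of the text, which is {style}: {{")
    return "\n".join(lines)
```

Feeding such a string to a large off-the-shelf model and taking the completion up to the closing brace is essentially the whole “method”—no dataset, no training.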
“Symbolic Knowledge Distillation: from General Language Models to Commonsense Models”, West et al 2021 (GPT-3 can generate commonsense causal knowledge graphs as good as a human-generated graph)
“Masked Autoencoders Are Scalable Vision Learners”, He et al 2021
“Visible Thoughts Project and Bounty Announcement”, MIRI ($200k prize for dataset teaching language models to ‘internal monologue’)
“LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs”, Schuhmann et al 2021
“CoAtNet: Marrying Convolution and Attention for All Data Sizes”, Dai et al 2021 (90.88% ImageNet SOTA, set by CoAtNet-2.44b pretrained on JFT-3B)
“Turing Bletchley: A Universal Image Language Representation model by Microsoft” (2.5b-param; n ~ billions + 0.5b CC translation pairs; beats CLIP/ALIGN)
“Vector-quantized Image Modeling with Improved VQGAN”, Anonymous 2021 (improving ViT-GAN up to 1.7b-parameters); “HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator”, Anonymous 2021
“Effect of scale on catastrophic forgetting in neural networks”, Anonymous 2021
“On the Opportunities and Risks of Foundation Models”, Bommasani et al 2021
“A Universal Law of Robustness via Isoperimetry”, Bubeck & Sellke 2021 (Twitter, talk; blessings of scale—scaling is also magic pixie dust for adversarial attacks, bitter-lessoning an entire academic field of ever more elaborate (yet failed) defenses…?); “Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power”, Li et al 2022
“On the Predictability of Pruning Across Scales”, Rosenfeld et al 2020 (scaling laws for sparsity: initially free large size reductions, then power-law worsening, then plateau at tiny but bad models)
“Sparse Is Enough in Scaling Transformers”, Jaszczur et al 2021
“What Are Bayesian Neural Network Posteriors Really Like?”, Izmailov et al 2021 (frequentist-trained NNs nevertheless are samples from the posterior, so ensembles—and increasingly-large models?—are fully Bayesian)
“GHVAE: Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction”, Wu et al 2021
“DALL·E 1 mini: Generate images from a text prompt”, Dayma et al 2021 (project/hackathon writeup of training a small-scale DALL·E 1; while not nearly as good as the full-scale OA model, of course, the results would’ve been striking just 2 years ago, and are trained on merely a TPUv3-8—demonstrating the hardware overhang and the ease of replication once you know what you’re doing.1)
“Multimodal Few-Shot Learning with Frozen Language Models”, Tsimpoukelli et al 2021; “AudioCLIP: Extending CLIP to Image, Text and Audio”, Guzhov et al 2021
“Google demonstrates leading performance in latest MLPerf Benchmarks” using TPUv4s
“Time-Aware Language Models as Temporal Knowledge Bases”, Dhingra et al 2021 (a nice use of the inline metadata trick to greatly improve T5 temporal knowledge/reasoning—just include more metadata! No fancy new symbolic architectures or training methods. Just provide more useful data.)
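The trick itself fits in one line: prepend the document’s timestamp as plain text before training or querying. The exact prefix format below is a paraphrase of the paper’s, so treat it as illustrative:

```python
def add_time_prefix(year, text):
    """Inline-metadata trick: let a language model condition on *when*
    a document was written by prepending its year as plain text
    (prefix format illustrative, after Dhingra et al 2021)."""
    return f"year: {year} text: {text}"
```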
Genetics
Everything Is Heritable:
“Genetic risk factors have a substantial impact on healthy life years”, Jukarainen et al 2022
“Novel disease associations with schizophrenia genetic risk revealed in ~400,000 UK Biobank participants”, Zhang et al 2021
“Does Parental Education Influence Child Educational Outcomes? A Developmental Analysis in a Full-Population Sample and Adoptee Design”, Ludeke et al 2021 (“no”)
“Using genes to explore the effects of cognitive and non-cognitive skills on education and labor market outcomes”, Buser et al 2021
“The contribution of additive genetic variation to personality variation: heritability of personality”, Dochtermann et al 2015 (“52% of animal personality variation was attributable to additive genetic variation”)
“Uncovering the Genetic Architecture of Broad Antisocial Behavior through a Genome-Wide Association Study Meta-analysis”, Tielbeek et al 2021
“Genome-wide association analyses of individual differences in quantitatively assessed reading-related and language-related skills in up to 34,000 people”, Eising et al 2021 (unusually detailed phenotyping)
“Identifying a living great-grandson of the Lakota Sioux leader Tatanka Iyotake (Sitting Bull)”, Moltke et al 2021 (estimating relatedness using 2,259 SNPs extracted from an 1890 arsenic-preserved sample of Sitting Bull’s hair)
“Global Biobank Meta-Analysis Initiative: powering genetic discovery across human diseases”, Global Biobank Meta-Analysis Initiative 2021 (n = 2.1m)
Recent Evolution:
“Natural selection contributes to the myopia epidemic”, Long & Zhang 2020
Engineering:
“Facultative Parthenogenesis in California Condors”, Ryder et al 2021
Statistics/Meta-Science
“Sigmoids behaving badly: why they usually cannot predict the future as well as they seem to promise”, Sandberg et al 2021 (Older discussions of this forecasting folklore: 1, 2, 3, 4)
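A toy numerical illustration of why early sigmoid data underdetermines the ceiling (parameter values here are mine, chosen for the demonstration): in the early exponential regime, the ceiling L and the midpoint t0 trade off almost perfectly, so two curves with 10× different ceilings are nearly indistinguishable until the bend.

```python
import math

def logistic(t, L, k, t0):
    """Logistic curve with ceiling L, growth rate k, midpoint t0."""
    return L / (1 + math.exp(-k * (t - t0)))

# Two curves whose ceilings differ 10-fold, tuned so their early
# (exponential-regime) values nearly coincide: for t << t0,
# L/(1+exp(-k(t-t0))) ~ L*exp(k(t-t0)), so L and t0 are confounded.
f_small = lambda t: logistic(t, L=1.0, k=1.0, t0=0.0)
f_big = lambda t: logistic(t, L=10.0, k=1.0, t0=math.log(10))

# Relative disagreement while still "early" is under 2%...
early = [abs(f_big(t) - f_small(t)) / f_small(t) for t in range(-8, -3)]
# ...yet after the bend the curves differ by nearly the full 10x.
late = abs(f_big(5.0) - f_small(5.0))
```

Any fit to the early points alone therefore cannot distinguish a ceiling of 1 from a ceiling of 10, which is the forecasting folklore in miniature.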
“Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment”, Cortes & Lawrence 2021
Politics/Religion
“Gender Identity, Coworking Spouses and Relative Income within Households”, Zinovyeva & Tverdostup 2021 (better explanations for the ‘gender cliff’ than misogyny)
“The Razor Blade in the Apple: The Social Construction of Urban Legends”, Best & Horiuchi 1985 (the poisoned-Halloween-candy thing was never real)
“Obesity of politicians and corruption in post-Soviet countries”, Blavatskyy 2020
“Rule Enforcement Without Visible Means: Christmas Gift Giving in Middletown”, Caplow 1984
Psychology/Biology
“Remembering immunity: Neuronal ensembles in the insular cortex encode and retrieve specific immune responses”, Koren et al 2020 (media)
“Erosion of the Epigenetic Landscape and Loss of Cellular Identity as a Cause of Aging in Mammals”, Yang et al 2021; “Making sense of the aging methylome”, Seale et al 2022
“Blood-based epigenome-wide analyses of cognitive abilities”, McCartney et al 2021 (since the epigenome is further downstream, should be interesting for investigating how genetic & environmental causes of intelligence are, mechanistically, mediated)
“A collective analysis of lifespan-extending compounds in diverse model organisms, and of species whose lifespan can be extended the most by the application of compounds”, Berkel & Cacan 2021 (the ‘everything works in mice’ problem of life extension research)
“Echolocating bats rely on an innate speed-of-sound reference”, Amichai & Yovel 2021 (raising bats from birth in a helium environment to test nature vs nurture)
“Attractor and integrator networks in the brain”, Khona & Fiete 2021
“Reality shifting: psychological features of an emergent online daydreaming culture”, Somer et al 2021
“Nothing Ventured, Nothing Gained: [Toxoplasma Gondii] Parasite Infection is Associated with Entrepreneurial Initiation, Engagement, and Performance”, Lerner et al 2020 (getting increasingly difficult to explain away as a confound)
“Escape of hair follicle stem cells causes stem cell exhaustion during aging” (hair follicles stop working with age as genes dysregulate & stem cells ‘leak’ out to be killed by the immune system)
Technology
“Synthetic fat from petroleum as a resilient food for global catastrophes: Preliminary techno-economic assessment and technology roadmap”, Martínez et al 2021; “David Denkenberger on using paper mills and seaweed to feed everyone in a catastrophe”
“2005: Shades of Doom” (Doom for the blind)
“Ink traps and pals”, Toshi Omagari (challenges of microtypography & computer displays: ink or light runs)
“Is this the simplest (and most surprising) sorting algorithm ever?”, Fung 2021 (an even simpler slower insertion sort; you can make arbitrarily-slow sorting algorithms because you can work on permutations of arbitrary computable functions, but they won’t be as simple); “Coolex: The coolest way to generate combinations”, Ruskey & Williams 2009
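For reference, Fung’s algorithm really is this short; a direct Python transcription (function name mine):

```python
def simplest_sort(a):
    """Fung 2021's 'I can't believe it can sort': two plain loops and
    a comparison that looks backwards, yet it sorts *ascending*."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a
```

The first outer pass parks the maximum at the front; each later pass then effectively performs an insertion step, which is why the “wrong-way” comparison nevertheless sorts ascending.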
Economics
“Hotelling’s law” (why do all those gas stations/coffee shops/pharmacies—or deep learning models…?—cluster together?)
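The clustering logic is easy to see in a toy best-response simulation (the discrete grid, payoff, and alternating updates are my simplifying assumptions, not Hotelling’s original continuous model): both vendors leapfrog inward until they sit together at the median customer.

```python
def hotelling(a, b, positions=range(21), rounds=20):
    """Toy Hotelling dynamics: customers sit at each integer spot on a
    line and buy from the nearer of two vendors (ties split evenly).
    Each round, each vendor in turn jumps to the spot maximizing its
    customer share, holding the rival fixed."""
    def share(x, y):
        # customers captured by a vendor at x against a rival at y
        return sum(1.0 if abs(p - x) < abs(p - y)
                   else 0.5 if abs(p - x) == abs(p - y)
                   else 0.0
                   for p in positions)
    for _ in range(rounds):
        a = max(positions, key=lambda x: share(x, b))  # a best-responds,
        b = max(positions, key=lambda x: share(x, a))  # then b does
    return a, b
```

Starting from any two positions, both vendors end up side by side at the median spot splitting the market—the “why are all the gas stations next door to each other” equilibrium.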
Philosophy
“The science of cycology: Failures to understand how everyday objects work”, Lawson 2006
“Donors vastly underestimate differences in charities’ effectiveness”, Caviola et al 2020
“Elephas Anthropogenus”, Uli Westphal 2015 (visualizing the degeneration of medieval European knowledge of what ‘an elephant’ looks like); “Distinguishing Real Versus Fake Tiger Penises [Identification Guides for Wildlife Law Enforcement No. 6]”, Yates 2005
Fiction
Miscellaneous
The dastardly career of Emmanuel Barthélemy (shot a cop during a failed coup, was exiled after another failed uprising, fought the last fatal duel in England, which killed a fellow activist, and murdered his employer and then a neighbor, after possibly attempting to blackmail the employer with a fraudulent daughter and while plotting to assassinate Napoleon III; finally convicted, he made blasphemous jokes about his execution & requested Paradise Lost to read. And the one good killing Barthélemy might’ve done, that of Karl Marx—for being too conservative—he failed!)
Books
Nonfiction:
Fiction:
Film/TV
Live-action:
Animated:
Music
MLP:
Doujin:
Misc:
As Norbert Wiener remarked of the A-bomb, the only nuclear secret worth keeping from Stalin was the secret that it was feasible: knowledge of the implementation details isn’t that important, as much work as they may take, compared to the general idea & the knowledge that it works. I am reminded of the joke about the repairman, but to rewrite it for DL: “Here’s a 10-line diff fixing your AGI; my compute-bill is 10 million petaflop-days.”; “What‽ But it only takes 0.1 million to train it!” “Yes, 0.1m to train it with the diff, and 9.90m to know which 10 lines.”↩︎