February 2021 News

Gwern

February 2021 Gwern.net newsletter with links on AI scaling, semaglutide, and ethicist ethics.

February 2021’s Gwern.net newsletter is now out; previous, January 2021 (archives). This is a collation of links and summary of major changes, overlapping with my Changelog; brought to you by my donors on Patreon.

Writings

Gwern.net: popups: can now be moved, stickied, and full-screened (another step towards our ambition of Windows-95-in-the-browser!)

Links

AI

“Controllable Neural Text Generation”, Lilian Weng; “Recent Advances in Language Model Fine-tuning”, Sebastian Ruder (review)
- “Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021 (original 10-shot Fr → En translation can be beaten by the better 0-shot prompt: “French: XYZ / English:…”; this is “true of most worst-performing prompts…”); “Calibrate Before Use: Improving Few-Shot Performance of Language Models”, Zhao et al 2021 (huge boost from calibrating unstable prompts; both demonstrate, as always, that “sampling can prove the presence of knowledge but not the absence.”)
“TransGAN: Two Transformers Can Make One Strong GAN”, Jiang et al 2021 (Transformer-only GAN: attention is all you need)
“PACT: Proof Artifact Co-training for Theorem Proving with Language Models”, Han et al 2021 (GPT-f for Lean)
“Towards End-to-End In-Image Neural Machine Translation”, Mansimov et al 2020 (sure why not)
Brains:
- “Artificial Neural Nets Finally Yield Clues to How Brains Learn”; Whittington & Bogacz 2019 (short overview of biologically-plausible backprop: feedback alignment, target propagation, predictive coding, & attentional feedback; also of recent interest, VS-ML; given their increasing success in training while respecting more biological constraints, the increasing power of backprop-trained ANNs and the neurological success of ANNs in predicting & imitating brain signals, it is increasingly clear that brains really do do backprop in some sense)
- “NSD: A massive 7-tesla fMRI dataset to bridge cognitive and computational neuroscience”, Jean et al 2021 (“…The availability of NSD thus opens the door to using brain activity to directly guide the optimization of deep neural networks.”)
- “Brain2Pix: Fully convolutional naturalistic video reconstruction from brain activity”, Le et al 2021 (reconstructing Dr. Who)
- “High-performance brain-to-text communication via imagined handwriting”, Willett et al 2020
- “Brain-computer interface for generating personally attractive images”, Spape et al 2021 (simple EEG-based optimization of ProGAN faces; many ways to improve this…)
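
The two prompting items earlier in this list (Reynolds & McDonell 2021 and Zhao et al 2021) are easier to picture with a toy example. The following is only a sketch: the function names are hypothetical, the “/” in the quoted prompt is assumed to stand for a line break, and the calibration step is one simple form assumed for illustration (divide each label probability by the probability the model assigns under a content-free input, then renormalize), not a reproduction of either paper's code.

```python
def zero_shot_prompt(french: str) -> str:
    """Bare 'French: ... / English:' prompt with no worked examples."""
    return f"French: {french}\nEnglish:"


def few_shot_prompt(examples: list[tuple[str, str]], french: str) -> str:
    """Prepend solved (French, English) pairs before the zero-shot query."""
    shots = "".join(f"French: {fr}\nEnglish: {en}\n\n" for fr, en in examples)
    return shots + zero_shot_prompt(french)


def calibrate(label_probs: dict[str, float],
              content_free_probs: dict[str, float]) -> dict[str, float]:
    """Rescale label probabilities by the model's bias on a content-free input.

    Assumption: bias is estimated from a placeholder input and removed by
    elementwise division followed by renormalization.
    """
    scaled = {label: label_probs[label] / content_free_probs[label]
              for label in label_probs}
    total = sum(scaled.values())
    return {label: value / total for label, value in scaled.items()}


if __name__ == "__main__":
    print(zero_shot_prompt("Bonjour"))
    print(few_shot_prompt([("Merci", "Thank you")], "Bonjour"))
    # Made-up numbers: the model leans "positive" even on a content-free input.
    raw = {"positive": 0.6, "negative": 0.4}
    bias = {"positive": 0.75, "negative": 0.25}
    print(calibrate(raw, bias))
```

With the made-up numbers above, the raw prediction favors “positive”, but after dividing out the content-free bias the calibrated distribution favors “negative”, which is the kind of instability the calibration item describes.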

Matters Of Scale:

“Scaling Laws for Transfer”, Hernandez et al 2021 (“We find that pre-training effectively multiplies the fine-tuning dataset size”; a shot across the bow of anyone floating on a proprietary-dataset moat: large models can drop data requirements by orders of magnitude overnight, even surpassing you)
“ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”, Jia et al 2021 (see also CC-12M; CLIP-like w/EfficientNet trained on 1.8 billion images on a TPUv3-1024; DM argues that fancier cross-modal Transformers are better, nevertheless, ‘TPUs go brrr’. Given DALL·E 1, CLIP, ALIGN, VDVAE, CW-VAE, AIPO, DCTransformer, neural radiance fields et al, are GANs already dead, and just don’t realize it yet? Or at least soon to be relegated to only DRL-like uses as a final finetuning phase to sharpen up a self-supervised model?); “WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training”, Huo et al 2021
“DALL·E 1: Zero-Shot Text-to-Image Generation”, Ramesh et al 2021 (original blog); “M6: A Chinese Multimodal Pretrainer”, Lin et al 2021 (Chinese DALL·E 1: 1.9TB images/0.29TB text for 10b-parameter dense/100b-parameter MoE Transformer; shockingly fast Chinese replication of DALL·E 1/CLIP)
“Explaining Neural Scaling Laws”, Bahri et al 2021/“Learning Curve Theory”, Hutter 2021 (Rohin Shah commentary; more on the manifold hypothesis)
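
The quoted claim from “Scaling Laws for Transfer” (“pre-training effectively multiplies the fine-tuning dataset size”) can be written compactly. This is a sketch only: the symbols are assumptions introduced here, not notation from this newsletter, and the power-law form follows the paper's summary that effective data transferred is a power law in parameter count and fine-tuning dataset size, with the fitted constants left unspecified.

```latex
% D_F: fine-tuning dataset size
% D_T: effective data transferred from pre-training
% D_E: total effective data, N: parameter count
% k, \alpha, \beta: fitted constants (values not given here)
\begin{align}
  D_E &= D_F + D_T \\
  \text{multiplier} &= \frac{D_E}{D_F} = 1 + \frac{D_T}{D_F} \\
  D_T &\approx k \, D_F^{\alpha} \, N^{\beta}
\end{align}
```

Read this way, a large multiplier means a small proprietary fine-tuning set behaves like a much larger one, which is why a data moat erodes as N grows.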

Genetics

Everything Is Heritable:

“Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals”, Kemper et al 2021
“Genetic variation, brain, and intelligence differences”, Deary et al 2021
“Pathfinder: A gamified measure to integrate general cognitive ability into the biological, medical and behavioral sciences”, Malanchini et al 2021 (not the focus, but the IQ PGS is a slight improvement over Allegrini et al 2018 due to less phenotype measurement error?)
“Polygenic burden has broader impact on health, cognition, and socioeconomic outcomes than most rare and high-risk copy number variants”, Saarentaus et al 2021
On candidate-genes & COMT

Recent Evolution:

Engineering:

First Black-Footed Ferret cloned

Statistics/Meta-Science

“Lessons from Gerolamo Cardano’s The Book of My Life” (progress studies; see also Newton’s anthropic argument, Bakewell & inventing progress, The Autobiography of Benvenuto Cellini)
“How Many Microcovids Would You Spend on a Burrito?” (on the microCOVID Project Calculator)
On Piffles
“Artifact and Recording Concepts in EEG”, Tatum et al 2011 (on the EEG signals of Jell-O, or, the importance of negative controls)

Politics/Religion

Fads: “The Logic of Fashion Cycles”, Acerbi et al 2012; “Fashion and art cycles are driven by counter-dominance signals of elite competition: quantitative evidence from music styles”, Klimek et al 2019; “The hipster effect: When anti-conformists all look the same”, Touboul 2019; “Right Is The New Left”, Scott Alexander (see also Han et al 2010, Downs 1972/Gupta & Jenkins-Smith 2015, Lorenz-Spreen et al 2019/Candia et al 2019, Loury 1994)
“What can we learn from the lunar pandemic that never was?” (NASA’s lunar quarantine was a sham intended to mollify the public as they covered up repeated major failures & lab leaks both before & after—had there been any dangerous lunar organisms, they would have escaped easily)
MrBeast (the new aristocracy of prestige? Borrowed plumage, perhaps, but effective…)
“Russia’s new Lysenkoism”, Kolchinsky et al 2017

Psychology/Biology

“Lessons from the host defences of bats, a unique viral reservoir”, Irving et al 2021 (bat-borne viruses; previously, Trevor Klee)
“Beneficial & Detrimental Effects of Reactive Oxygen Species on Lifespan: A Comprehensive Review of Comparative & Experimental Studies”, Shields et al 2021 (antioxidants still aren’t the fountain of youth, and may be harmful; animal studies still frequently inconsistent)
“Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing”, Kaertner et al 2021 (placebo)
“The Effects of Fluoride in Drinking Water”, Aggeborn & Öhman 2021
“Sleep & Sex: What Can Go Wrong? A Review of the Literature on Sleep Related Disorders and Abnormal Sexual Behaviors & Experiences”, Schenck et al 2007

Semaglutide

WP: “Once-Weekly Semaglutide in Adults with Overweight or Obesity”, Wilding et al 2021; “Effect of Subcutaneous Semaglutide vs Placebo as an Adjunct to Intensive Behavioral Therapy on Body Weight in Adults With Overweight or Obesity: The STEP 3 Randomized Clinical Trial”, Wadden et al 2021

A longer-acting version of the insulin/appetite peptide liraglutide, semaglutide greatly reduces weight, fat, blood sugar, cholesterol etc, with an upcoming oral version; background: Kushner et al 2020, Aroda et al 2019, Nauck & Meier 2019, O’Neil et al 2018, Blundell et al 2017, Nauck et al 2016, Lau et al 2015.

Quick-fixes like semaglutide may be our only hope, however unvirtuous they seem, because society is fixed but biology mutable.

Technology

New X-Prize: $100m in prizes for Carbon Removal
Wringing gauge blocks (“With their precisely-flat metal faces, gauge blocks can be stuck together non-magnetically via a process called ‘wringing’, requiring substantial effort to separate. Scientists are still uncertain exactly how wringing works.”)
Armored train

Economics

“Why did renewables become so cheap so fast? And what can we do to use this global opportunity for green growth?”, Max Roser (specifically, why such an extreme experience curve?)
“IQ, trading behavior, and performance”, Grinblatt et al 2012; “Genetic Endowments and Wealth Inequality”, Barth et al 2020 (why, despite notorious setbacks, did Isaac Newton & LTCM’s founders die wealthy? Why, in general, are more intelligent people so much better investors? ‘The indifference of the indicator’: it’s not one thing, it’s everything—more intelligent people have lower discount rates, save more for longer & are less risk-averse, more accurately predict future growth or inflation, are more likely to participate in +EV opportunities like the stock market, to use low-fee rather than high-fee (and thus, underperforming) mutual funds, succumb less to biases like herding as they trade better & at better times, trade less, and harvest losses more efficiently when trading poorly.)