June 2021 News

Gwern

June 2021 Gwern.net newsletter with new site features/refactoring, and links on Codex, embryo selection, and gametogenesis.

June 2021’s ⁠Gwern.net⁠ newsletter⁠ is now out; previous, ⁠May 2021⁠ (archives⁠). This is a collation of links and summary of major changes, overlapping with my ⁠Changelog⁠; brought to you by my donors on Patreon⁠.

Writings

Gwern.net:

⁠LinkAuto.hs⁠: a Pandoc library for automatically turning user-defined regexp-matching strings into links (discussion⁠)
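The auto-linking idea can be sketched in a few lines. This is only an illustration in Python over Markdown strings, not the actual LinkAuto.hs (which is Haskell and operates on Pandoc documents); the function name `auto_link`, the rule format, the example URLs, and the choice to link only the first occurrence of each pattern are all assumptions made for the example.

```python
import re

# Matches an existing Markdown link like [text](url); the capturing group makes
# re.split() return the links themselves at odd indexes.
LINK_RE = re.compile(r"(\[[^\]]*\]\([^)]*\))")

def auto_link(text, rules):
    """Wrap the first unlinked match of each (regex, url) rule in a Markdown link."""
    for pattern, url in rules:
        regex = re.compile(pattern)
        parts = LINK_RE.split(text)
        # Even indexes hold plain text; odd indexes hold already-linked spans,
        # which are skipped so we never nest a link inside a link.
        for i in range(0, len(parts), 2):
            new, count = regex.subn(
                lambda m: f"[{m.group(0)}]({url})", parts[i], count=1
            )
            if count:
                parts[i] = new
                break  # assumption: one link per rule is enough
        text = "".join(parts)
    return text

# Hypothetical rules; the URLs are placeholders.
RULES = [
    (r"GPT-3", "https://example.org/gpt-3"),
    (r"\bMuZero\b", "https://example.org/muzero"),
]
linked = auto_link("GPT-3 and MuZero; GPT-3 again.", RULES)
```

Re-splitting the text after every rule keeps later rules from matching inside a link that an earlier rule just created.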

Refactoring pages (newly split out):

Links

AI

“Vector Quantized Models for Planning”, Ozair et al 2021 (MCTS on VQ-VAE to generalize MuZero to stochastic/hidden-info environments—towards planning over world models…); “Stochastic MuZero: Planning in Stochastic Environments with a Learned Model”, Antonoglou et al 2022; “Playing Nondeterministic Games through Planning with a Learned Model”, Willkens & Pollack 2021
“FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness”, Dao et al 2022
“A graph placement methodology for fast chip design”, Mirhoseini et al 2021 (media; optimizing TPU ‘chip floor planning’ circuit placement)
“The Robot Household Marathon Experiment”, Kazhoyan et al 2020 (media; benchmarking PR2 robot on making & cleaning up breakfast: successful setup, but many failures in cleanup—bad planning & control software continues to bottleneck robotics. “Intelligent matter” doesn’t matter if you lack enough ‘intelligence’ to make the matter, matter.)
“GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)”, Chong & Forsyth 2021 (code/video/Colab)
“Alias-Free GAN (Generative Adversarial Networks)”, Karras et al 2021 (successor to StyleGAN2-ADA introduces a major architectural revamp to enable ultra-smooth interpolations; this should be much more useful for video: AFHQ example, more samples); “Efficient Geometry-aware 3D Generative Adversarial Networks”, Chan et al 2021 (also good AFHQ examples when interpolated or video)
⁠“What A Long, Strange Trip It’s Been: EleutherAI One Year Retrospective”⁠ (⁠AmA⁠); ⁠“We Are Conjecture, A New Alignment Research Startup”⁠ (EAIers moving on)
- “Alien Dreams: An Emerging Art Scene”⁠, Charlie Snell (catching up on hobbyist exploration of CLIP⁠-generated images); ⁠“Tour of the Sacred Library”, Ryan Moulton (curated Dinotopia-style⁠ images)

Matters Of Scale⁠:

video generation⁠:
- “Imagen Video”, Ho et al 2022 (editing); “Phenaki”, Villegas et al 2022 (‘Parti Video’ analogue); “W.A.L.T.: Photorealistic Video Generation with Diffusion Models”, Gupta et al 2023; “VideoPoet: A large language model for zero-shot video generation” (text2video/image/stylizing/audio-generation/inpainting…)
- Make-A-Video, Singer et al 2022
- “MAGVIT: Masked Generative Video Transformer”, Yu et al 2022
- “TECO: Temporally Consistent Video Transformer for Long-Term Video Prediction”, Yan et al 2022
- “NÜWA-∞: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis”, Wu et al 2022 (video using “NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion”, Wu et al 2021)
- “TATS: Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer”, Ge et al 2022
- “CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers”
- “FDM: Flexible Diffusion Modeling of Long Videos”, Harvey et al 2022 (hour-long plausible videos with 1–2 GPU-weeks)/“Video Diffusion Models”, Ho et al 2022
“Large Language Models are Zero-Shot Reasoners”, Kojima et al 2022 (inner monologue in base GPT-3 unlocked by prompt “let’s think step by step”: 3 → 20% zero-shot arithmetic); “CodeT: Code Generation with Generated Tests”, Chen et al 2022 (boosting Codex correctness by half by thinking-aloud tests first); “CodeGen: A Conversational Paradigm for Program Synthesis”, Nijkamp et al 2022 (improving Codex-style gen by step-by-step dialogue); “Minerva: Solving Quantitative Reasoning Problems with Language Models”, Lewkowycz et al 2022; “Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification”, Zhou et al 2023; “Faithful Reasoning Using Large Language Models”, Creswell & Shanahan 2022 (Chinchilla inner-monologue for beam search over arguments); “ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022 (PaLM-540B inner-monologue for accessing live Internet APIs to reason over, beating RL agents); “Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022 (“yes”—inner-monologue reveals flat BIG-Bench scaling curves conceal large capability gains); “Large Language Models Can Self-Improve”, Huang et al 2022 (PaLM self-distillation on inner-monologue scores++; knowledge distillation); “PAL: Program-aided Language Models”, Gao et al 2022 (Codex inner-monologue); “Language Models are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022; “Interactive-Chain-Prompting: Ambiguity Resolution for Crosslingual Conditional Generation with Interaction”, Pilaut et al 2023 (emergent PaLM ability to ask inner-monologue-like clarifying questions at 63b → 540b); “Boosting Theory-of-Mind Performance in Large Language Models via Prompting”, Moghaddam & Honey 2023; “q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023; “Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions”, Mezghani et al 2023 (Decision-Transformer+inner-monologue in game-playing?)
“Teaching Models to Express Their Uncertainty in Words”, Lin et al 2022 (finetuned GPT-3-175b can be calibrated about answer correctness); “Language Models (Mostly) Know What They Know”, Kadavath et al 2022
“GitHub Copilot: Your AI pair programmer” (media); “Codex: Evaluating Large Language Models Trained on Code”, Chen et al 2021 (on small versions)
New GPT-3-based code completion⁠ for GitHub⁠; like TabNine or IntelliCode⁠, but more so, and a tasty lollipop indeed; puzzlingly, OA/GH appear to do no checks like n-grams for possible copying, but copying is rare anyway⁠.
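For concreteness, the kind of n-gram check alluded to here might look like the following minimal sketch. It is a hypothetical illustration, not anything OA/GH are known to run: the function names, whitespace tokenization, and the choice of `n` are assumptions.

```python
def ngrams(tokens, n):
    """Return the set of all length-n token windows (empty if too few tokens)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def copied_fraction(generated, corpus_docs, n=8):
    """Fraction of the completion's n-grams that occur verbatim in the corpus."""
    gen = ngrams(generated.split(), n)
    if not gen:
        return 0.0
    seen = set()
    for doc in corpus_docs:
        seen |= ngrams(doc.split(), n)
    return len(gen & seen) / len(gen)

# Toy corpus standing in for the training data (assumption for illustration).
corpus = ["def add(a, b): return a + b"]
verbatim = copied_fraction("def add(a, b): return a + b", corpus, n=3)
novel = copied_fraction("x = y * 2 - 1", corpus, n=3)
```

A completion scoring near 1.0 is a near-verbatim copy and could be suppressed or flagged; a real system would want a proper code tokenizer and an index rather than rebuilding the set per query.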
It is darkly hilarious to see programmers react little better than artists did in peddling misinformation about Copilot like unlabeled “humor”, instantly turning into substance dualists insisting that “computers can never truly understand code or be creative unlike us humans with souls^Wminds”, Internet IP lawyers who have never heard of the word “transformative”, and infosec experts engaging in histrionics about it “leaking secrets”—from public GitHub repos, y’know, made up of public commits to public repos you really should not be uploading any passwords or keys to because attackers have been actively monitoring in realtime for credentials to steal for over a decade now? In a year, who will remember any of this BS? A few months later (also like TFDNE), it looked like the world hadn’t ended and people moved on. Machiavelli had it right: “…there is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things. Because the innovator has for enemies all those who have done well under the old conditions, and lukewarm defenders in those who may do well under the new. This coolness arises partly from…the incredulity of men, who do not readily believe in new things until they have had a long experience of them.” (What if there was a revolution and no one cared?)
Regardless, “attacks only get better”, and Copilot surely will. I’m a little surprised that Copilot/Codex appear to be trained only on entire source code files, when patches are the perfect training data for making a promptable edit-capable LM: a patch is a human-readable summary/explanation of the following changes, provides an immediately promptable description of quality before/after, programmer-level conditioning (so you can prompt for great programmers), and a compact word-diff format is an ideal output format for an LM to bugfix/update the context without the overhead of generating the entire module, particularly if done repeatedly as part of an “inner monologue” approach to incrementally update a program towards correctness rather than attempting to brute-force program writing in a single shot. (Plus, who has more Git repos or expertise in parsing Git patches than GitHub?) I look forward to seeing what better Codex models can do—Sam Altman mentioned in the 2021-09-05 SSC Q&A that the next version will be much better, and that one underappreciated benefit of the OA/MS partnership is the sheer volume of human feedback that OA can train models on.
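To make the compactness point concrete, here is a minimal sketch of a word-level diff using Python’s standard `difflib`, with output imitating the `[-removed-]{+added+}` markers of `git diff --word-diff`. The function name and whitespace tokenization are assumptions for illustration; this says nothing about how Codex is actually trained.

```python
import difflib

def word_diff(old, new):
    """Render a word-level diff of two strings with [-old-]{+new+} markers."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i2 > i1:  # words deleted or replaced in the old text
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j2 > j1:  # words inserted or replacing in the new text
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)

fix = word_diff("if x > 0: return x", "if x >= 0: return x")
ext = word_diff("total = price * qty", "total = price * qty * tax")
```

Only the changed words carry markup, so for a one-token bugfix in a large module the diff an LM would need to emit stays a few tokens long regardless of file size.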
Incidentally, Perlis also remarked⁠ that “In a 5 year period we get one superb programming language. Only we can’t control when the 5 year period will begin.” A fancy tab-complete⁠ can do wonders in making an enormously-overcomplicated ecosystem ‘discoverable’ and getting one unstuck, but it is still just a tab-complete. If you were designing a language/IDE/ecosystem on the presumption of a GPT-3-level model, would you really settle on “Visual Studio IDE for an OO-glazed ALGOL language but with a fancier tab-complete”? Given the importance of prompt engineering⁠, a native DL programming solution would probably emphasize writing docs/comments explaining the intent to the paired model—“au pair programming”.

Genetics

Everything Is Heritable:

“Genetics of substance use disorders in the era of big data”, Gelernter & Polimanti 2021
“Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses”, Barton et al 2020

Engineering:

Aurea Smigrodzki: ⁠“The first baby in history to be conceived with the help of polygenic testing” (born mid-2020; ⁠father’s panel⁠; discussion⁠, ⁠Hsu, media⁠, podcast⁠, ⁠video⁠; see also Conley⁠)
I was surprised how little discussion the announcement provoked, after all the Sturm und Drang the mere prospect of embryo selection had stirred up. But when it happened, there wasn’t a single media article for 3 months, and that article seemed to focus more on her father’s distaste for journalists. (Such a focus seems unlikely to improve Dr. Smigrodzki’s opinion of them.) What if there was a revolution and no one cared?
“CRISPR-Cas9 In Vivo Gene Editing for Transthyretin Amyloidosis”, Gillmore et al 2021
Gametogenesis (catching up):
- “Artificially produced gametes in mice, humans and other species”, Hayashi et al 2021
- “Gametes from stem cells: Status and applications in animal reproduction”, Goszczynski et al 2019 (previously: “In vitro breeding: application of embryonic stem cells to animal production”, Goszczynski et al 2018)
- “Danish dairy farmers’ acceptance of and willingness to use semen from bulls produced by means of in vitro embryo production and genomic selection”, Lund et al 2021

Statistics/Meta-Science

“What We Learned Doing Fast Grants”, Patrick Collison, Tyler Cowen⁠, & Patrick Hsu (media⁠); “Introducing Arc Institute”, Patrick Hsu & Silvana Konermann & Patrick Collison
“Shifting the impossible to the inevitable: A Private ARPA (PARPA) user manual”, Ben Reinhardt (how to try to clone the ARPA model outside government)
“Do Meta-Analyses Oversell the Longer-Term Effects of Programs? (Part 1): Detecting Follow-Up Selection Bias in Studies of Postsecondary Education Programs”, Bailey & Weiss 2022 (yes: people don’t like to follow up experiments which don’t look like they’re working…)
“Do multiple experimenters improve the reproducibility of animal studies?”, von Kortzfleisch et al 2022
“Test & Roll: Profit-Maximizing A/B Tests”, Feit & Berman 2017
“A visual introduction to Gaussian Belief Propagation: A Framework for Distributed Inference with Emerging Hardware”, Ortiz et al 2021 (explorable ‘loopy’ belief propagation for computing Bayesian networks)
“Statistical Inquiries into the Efficacy of Prayer”, Galton 1872

Politics/Religion

“Get lucky: Happy 245th, America!”, Razib Khan
The 1989 California medfly attack⁠ (a successful bioterrorism extortion of the California government; cf. “Dark Harvest”⁠)

Psychology/Biology

“Air Pollution and Adult Cognition: Evidence from Brain Training”, La Nauze & Severnini 2021 (using 4.6m scores from Lumosity brain-game players to measure PM2.5 cognitive effects: large but heterogeneous by task/age/experience—this may explain the inconsistencies of previous air-pollution/cognition studies)
“Efficacy of Wolbachia-Infected Mosquito Deployments for the Control of Dengue”, Utarini et al 2021
“Senescent cell turnover slows with age providing an explanation for the Gompertz law”, Karin et al 2019
“Human click-based echolocation: Effects of blindness and age, and real-life implications in a 10-week training program”, Norman et al 2021
Hypospray
“The Scavenging Patterns of Feral Cats on Human Remains in an Outdoor Setting”, Garcia et al 2019

Technology

“5 Years of Graduate CS Education Online and at Scale”, Joyner et al 2019 (why are there so many interesting grad students/researchers at Georgia Tech eg. in EleutherAI? perhaps their master’s MOOC, which admits everyone; an experience report)
Meteor burst communications⁠ (“AMBCS was able to greatly improve the data rates, averaging 4 kilobits per second”)
Kapitza’s pendulum⁠ (surprisingly, you can stabilize an inverted pendulum without any feedback or control by brute force—simply wiggling it fast)
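The effect is easy to check numerically. Below is a minimal sketch, assuming a rigid massless rod whose pivot is shaken vertically as −a·cos(ωt), integrated with a hand-rolled RK4 step; the parameter values (1 m rod, 10 cm shake at 100 rad/s) and the function name are illustrative assumptions, chosen to satisfy the standard averaged stability condition a²ω² > 2gl.

```python
import math

def max_tilt(a, omega, l=1.0, g=9.81, theta0=0.1, t_end=2.0, dt=1e-4):
    """Largest |tilt from upright| (radians) reached within t_end seconds.

    theta is measured from the upright position. In the pivot's frame, gravity
    is effectively g + a*omega^2*cos(omega*t), giving the equation of motion
    theta'' = (g + a*omega^2*cos(omega*t)) / l * sin(theta).
    """
    def acc(t, theta):
        return (g + a * omega**2 * math.cos(omega * t)) / l * math.sin(theta)

    theta, vel, t = theta0, 0.0, 0.0
    worst = abs(theta)
    for _ in range(int(round(t_end / dt))):
        # Classic 4th-order Runge-Kutta step for (theta, vel).
        k1x, k1v = vel, acc(t, theta)
        k2x, k2v = vel + 0.5 * dt * k1v, acc(t + 0.5 * dt, theta + 0.5 * dt * k1x)
        k3x, k3v = vel + 0.5 * dt * k2v, acc(t + 0.5 * dt, theta + 0.5 * dt * k2x)
        k4x, k4v = vel + dt * k3v, acc(t + dt, theta + dt * k3x)
        theta += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        vel += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
        worst = max(worst, abs(theta))
    return worst

shaken = max_tilt(a=0.1, omega=100.0)  # a^2 * omega^2 = 100 > 2*g*l = 19.62
still = max_tilt(a=0.0, omega=100.0)   # no shaking: an ordinary inverted pendulum
```

Started 0.1 rad off vertical, the shaken pendulum should keep wobbling near upright while the unshaken one topples; no sensing or feedback enters the code anywhere, only the fixed sinusoidal drive.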

Economics

Another reminder for my tech readers: ask for more money!⁠ As the USA comes out of coronavirus, there may well have never been a better time in history for programmers (especially for in-demand specialties) to negotiate more pay. Demand money, equity, or true remote (not ‘come in one day less’, that doesn’t let you escape the Bay Area)—nothing less. Don’t be fobbed off with trinkets like “free kitchenette snacks”. If you don’t get it, leave and get the raise elsewhere. (Only rolling stones gather moss these days.); “How Not to Bomb Your Offer Negotiation”

Miscellaneous

“Harris’s List of Covent Garden Ladies [TPDR], a directory of London prostitutes published annually 1757–1795, containing entries describing the physical appearance and sexual specialties of the women…”
Cummingtonite⁠