August 2017 News

Gwern

August 2017 Gwern.net newsletter with links on genetics, AI, measurement error, bioethics, catnip, formal schooling, 1 book review, 2 anime reviews

This is the August 2017 edition of the Gwern.net newsletter; previous, July 2017 (archives). This is a collation of links and summary of major changes, overlapping with my Changelog; brought to you by my donors on Patreon.

Writings

Nothing completed

Media

Links

Genetics:

Engineering:
- “Correction of a pathogenic gene mutation in human embryos”, Ma et al 2017 (Human CRISPR editing: no observed off-targets, 27.9% efficiency)
- “Inactivation of porcine endogenous retrovirus in pigs using CRISPR-Cas9”, Niu et al 2017 (media; the Church lab continues making progress on editing out potentially dangerous viruses from pig DNA with CRISPR germline engineering.)
- “China’s embrace of embryo selection raises thorny questions: Fertility centres are making a massive push to increase preimplantation genetic diagnosis in a bid to eradicate certain diseases” (The wind is rising.)
- “U.S. attitudes on human genome editing”, Scheufele et al 2017 (increasingly positive public opinion, especially among the most informed)
- “A Future of Genetically Engineered Children Is Closer Than You’d Think: Last week, US scientists edited a human embryo for the first time. That’s just the beginning”
- “A New Way to Reproduce: Scientists are trying to manufacture eggs and sperm in the laboratory. Will it end reproduction as we know it?” (Rapid progress towards iterated embryo selection; will it become a reality before genome synthesis?)
- “Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery”, Hickey et al 2017
- “Mothers want extraversion over conscientiousness or intelligence for their children”, Latham & von Stumm 2017
- “How Driscoll’s Reinvented the Strawberry”
Recent Evolution:
- “Evidence for evolutionary shifts in the fitness landscape of human complex traits”, Uricchio et al 2017 (Recent human evolution <50kya: selection on education/intelligence, schizophrenia, height, BMI, Crohn’s disease)
- “The Uncertain Future of North Ronaldsay’s Seaweed-Eating Sheep: Concerned islanders are rethinking old traditions to ensure the survival of this very special breed” (North Ronaldsay sheep have evolved in <180 years to eat exclusively seaweed, to the point where an ordinary grass diet might kill them by copper poisoning. The cause is an enormous wall called the Sheepdyke, which the islanders’ ancestors built around the entire island to exile the sheep onto the beaches. Oh, and winter is actually the best season for them to fatten up. You would hardly believe it inside Game of Thrones.)
Everything Is Heritable (the promised UKBB gold rush):
- “An atlas of genetic associations in UK Biobank”, Canela-Xandri et al 2017 (at least 559 of 717 UKBB traits have detectable SNP heritability, with pervasive genetic correlations; polygenic scores for all have been made public via Gene ATLAS. See previously: Ge et al 2016)
- “10 Years of GWAS Discovery: Biology, Function, and Translation”, Visscher et al 2017 (Almost a decade into the human genetic revolution—GCTA ~2009—where do we stand?)
- Intelligence:
  - “99 independent genetic loci influencing general cognitive function include genes associated with brain health and structure (n = 280,360)”, Davies et al 2017 (4% polygenic score)
  - “Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits and implications for the future”, Zhang et al 2018 (IQ GWASes still on track to crack most of SNP heritability by n = 1m, so we may see PGSes of anywhere up to 30% within a few years)
  - “Multi-polygenic score approach to trait prediction”, Krapohl et al 2017 (New IQ polygenic score: 4.8%. This result is not as good as Hill et al 2017’s 7%, because the elastic net is applied only to the selection of whole PGSes rather than the SNPs which make up each PGS & the PGSes are combined crudely in a linear regression—but note that it’s much older work, submitted almost a year ago. Note also the implications for the upcoming SSGAC paper: big boosts, double or more, possible through better regularization & genetic correlations. So >10% is entirely possible, which is amazing to think about coming so soon. Exciting!)
  - “Functional consequences of genetic loci associated with intelligence in a meta-analysis of 87,740 individuals”, Coleman et al 2017
  - “The genetic basis of human brain structure and function: 1,262 genome-wide associations found from 3,144 GWAS of multimodal brain imaging phenotypes from 9,707 UK Biobank participants”, Elliott et al 2017
- Psychiatric:
  - “Genome-wide analysis of risk-taking behavior and cross-disorder genetic correlations in 116,255 individuals from the UK Biobank cohort”, Strawbridge et al 2017 (more concentration of misery)
  - “Genome-wide association study identifies 30 loci associated with bipolar disorder”, Stahl et al 2017
  - “Genome-wide association study of depression phenotypes in UK Biobank (n = 322,580) identifies variants in excitatory synaptic pathways”, Howard et al 2018 (Insight into depression from the recent GWASes & UK Biobank data releases—and again synaptic plasticity/neurogenesis seems to be the common theme.)
  - “Genomic dissection of bipolar disorder and schizophrenia including 28 subphenotypes”, PGC Bipolar Disorder/Schizophrenia Working Group et al 2017
- “Large-scale GWAS identifies multiple loci for hand grip strength providing biological insights into muscular fitness”, Willems et al 2017 (Mendelian Randomization suggests hand grip strength has no causal effect on mortality & is confounded)
- “GWAS of habitual physical activity in over 277,000 UK Biobank participants identifies novel variants and genetic correlations with chronotype and obesity-related traits”, Klimentidis et al 2017; “Differences in genetic and environmental variation in adult BMI by sex, age, time period, and region: an individual-based pooled analysis of 40 twin cohorts”, Silventoinen et al 2017 (Consistently high heritability & low shared-environment of BMI over 60 years across North America/Europe/Australia/East Asia)
- “Cost-effectiveness of pharmacogenetic-guided treatment: are we there yet?”, Verbelen et al 2016
- “After A Comeback, 23andMe Faces Its Next Test: Can the pioneering DNA-testing company satisfy the FDA while also staying true to its founding mission: putting people in control of their healthcare?” (23andMe now has n > 2,000,000)
- “The surprising implications of familial association in disease risk”, Valberg et al 2017
- “Heritability and Characteristics of Catnip Response in 2 Domestic Cat Populations”, Villani 2011 (since Todd 1962, 55 years of catnip literature has assumed catnip response was due to a Mendelian dominant gene; the unknown thesis Villani 2011 shows this is wrong: it is a highly heritable polygenic liability-threshold trait. Why was Todd wrong? Sampling error exacerbated by measurement error: cat response to catnip is noisy, so cats need to be measured multiple times, and possibly Todd’s ratings, not blinded to his Mendelian hypothesis, were biased to find a Mendelian pattern.)
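
To make the measurement-error point in the catnip entry concrete, here is a minimal simulation sketch of a liability-threshold trait scored through noisy trials. Everything numeric in it (heritability, prevalence, noise level) is a hypothetical placeholder and not an estimate from Villani 2011 or Todd 1962; the function name is likewise made up for illustration.

```python
import numpy as np

def simulate_catnip_tests(n_cats=20_000, n_tests=1, h2=0.6, prevalence=0.7,
                          test_noise_sd=0.8, seed=0):
    """Fraction of cats misclassified after averaging n_tests noisy trials.

    All parameter defaults are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Liability: many small genetic effects summarized as one normal term
    # (variance h2) plus everything else (variance 1 - h2).
    genetic = rng.normal(0.0, np.sqrt(h2), n_cats)
    other = rng.normal(0.0, np.sqrt(1.0 - h2), n_cats)
    liability = genetic + other
    # A cat is a true responder if its liability exceeds the threshold that
    # produces the assumed population prevalence.
    threshold = np.quantile(liability, 1.0 - prevalence)
    true_responder = liability > threshold
    # Each trial sees the liability through fresh, independent noise.
    noise = rng.normal(0.0, test_noise_sd, (n_cats, n_tests))
    trials = liability[:, None] + noise
    observed_responder = trials.mean(axis=1) > threshold
    return float(np.mean(observed_responder != true_responder))

if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, "tests per cat -> misclassified:", simulate_catnip_tests(n_tests=k))
```

Under these assumptions, averaging several trials per cat shrinks the misclassification rate, which is the sense in which a single noisy rating per cat could distort apparent inheritance ratios in a small sample.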

AI:

“SMASH: One-Shot Model Architecture Search through HyperNetworks”, Brock et al 2017

Video; Code; discussion. It’s amazing this works; is there anything convolutions can’t do?

The architecture: at each training step, generate the schematics of a random NN architecture; feed the skeleton into the hypernetwork, which will directly spit out numbers for each neuron (as a convolutional hypernetwork it can handle big and small NNs the same way); with the fleshed out NN, train 1 minibatch on the image classification task as usual, and update its parameters; use that update as the ‘error’ for the hypernetwork to train it to spit out weights for that skeleton which are slightly closer to what it was after 1 minibatch. After training the hypernetwork many times on many random NN architectures, its generated weights will be close to what training each random NN architecture from scratch would have been. Now you can simply generate lots of random NN architectures, fill them in, run them on a small validation set, and see their ‘final’ performance without ever actually training them fully (which would be like 10,000× more expensive). So this runs on 1 GPU in a day or two versus papers like Zoph which used 800 GPUs for a few weeks…
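
The loop described above can be sketched on a toy problem. This is not Brock et al's code: it assumes a linear regression task, treats an "architecture" as a binary mask over input features, and uses a single linear map as the hypernetwork; every name and hyperparameter below is hypothetical. It follows the description literally (one SGD step on the generated weights, then nudge the hypernetwork toward the post-step weights).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumption): only the first 3 of 8 input features carry signal.
N_FEATURES = 8
true_w = np.zeros(N_FEATURES)
true_w[:3] = [2.0, -1.0, 0.5]

def make_batch(n):
    x = rng.normal(size=(n, N_FEATURES))
    y = x @ true_w + 0.1 * rng.normal(size=n)
    return x, y

def random_architecture():
    """An 'architecture' here is a binary mask of which inputs are wired in."""
    mask = rng.integers(0, 2, size=N_FEATURES).astype(float)
    if mask.sum() == 0:
        mask[rng.integers(N_FEATURES)] = 1.0
    return mask

def generate_weights(H, mask):
    """Hypernetwork: a linear map from the architecture encoding to weights."""
    return (H @ mask) * mask

def loss(w, x, y):
    return float(np.mean((x @ w - y) ** 2))

def train_hypernetwork(steps=4000, inner_lr=0.1, hyper_lr=0.2):
    H = np.zeros((N_FEATURES, N_FEATURES))
    for _ in range(steps):
        mask = random_architecture()
        w = generate_weights(H, mask)
        x, y = make_batch(32)
        # One ordinary SGD step on the generated weights.
        grad = 2.0 * x.T @ (x @ w - y) / len(y) * mask
        w_after = w - inner_lr * grad
        # Use the change as the hypernetwork's 'error': move H so that its
        # output for this mask is slightly closer to the post-step weights.
        error = w - w_after
        H -= hyper_lr * np.outer(error, mask)
    return H

def rank_architectures(H, candidates, x_val, y_val):
    """Score untrained candidates with generated weights; best (lowest loss) first."""
    scored = [(loss(generate_weights(H, m), x_val, y_val), m) for m in candidates]
    return sorted(scored, key=lambda pair: pair[0])

if __name__ == "__main__":
    H = train_hypernetwork()
    x_val, y_val = make_batch(500)
    candidates = [random_architecture() for _ in range(50)]
    for score, mask in rank_architectures(H, candidates, x_val, y_val)[:3]:
        print(round(score, 3), mask.astype(int))
```

In this toy, candidates that wire in all three informative features should rank at the top without any candidate being trained individually, which is the cost saving the description is pointing at.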

It’s amazing this works, and like synthetic gradients it troubles me a little. It implies that even complex, highly sophisticated NNs are in some sense simple & predictable, since their weights/error-gradients can be predicted by other NNs which are as small as linear layers or don’t even see the data. That in turn implies they are incredibly wasteful in both training & parameter size, and that there is a large hardware overhang.
“CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms”, Elgammal et al 2017 (An interesting way to define novelty in a GAN context.)
“ArtGAN: Artwork Synthesis with Conditional Categorical GANs”, Tan et al 2017 (fun samples)
“A Survey of Monte Carlo Tree Search (MCTS) Methods”, Browne et al 2012; “A Tutorial on Thompson Sampling”, Russo et al 2017
“Inside Google Waymo’s Secret World for Training Self-Driving Cars: An exclusive look at how Alphabet understands its most ambitious artificial intelligence project” (extensive discussion of Google Waymo’s simulation testing of self-driving cars)
“LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions”, Wang et al 2017
iNaturalist: a smartphone CNN app for identifying photos of 13k species of wild animals/insects/plants. (discussion)
“Learning to Infer Graphics Programs from Hand-Drawn Images”, Ellis et al 2017 (cute diagrams for LaTeX)
Make Girls Moe: in-browser anime face generation (SResnetGANs) (Surprisingly high quality: “Towards the Automatic Anime Characters Creation with Generative Adversarial Networks”, Jin et al 2017; discussion)

Statistics/Meta-Science:

“Discontinuation and Nonpublication of Randomized Clinical Trials Conducted in Children”, Pica et al 2016 (>35000 children subjected to useless human experimentation annually in the USA. I’m glad the bioethicists and IRBs are tackling the real problems, like what happens to leftover embryos or whether genetic engineering might insult the disabled or whether enough community meetings have been held & all “stakeholders” properly informed.)
“A long journey to reproducible results: Replicating our work took four years and 100,000 worms but brought surprising discoveries” (background to Lucanic et al 2017)
“The prior can generally only be understood in the context of the likelihood”, Gelman et al 2017 (“They strain at the gnat of the prior who swallow the camel of the likelihood.”)
“My IRB Nightmare”
“When Exactly Will the Eclipse Happen? A Multi-millennium Tale of Computation”

Politics/religion:

“The High Cost of Not Doing Experiments”, Nisbett 2015
“Japan’s Eightfold Fence”: re-evaluating the failure of Japanese modernization, Auslin 2017

Psychology/biology:

“Mapping the Human Exposome: It’s now possible to map a person’s lifetime exposure to nutrition, bacteria, viruses, and environmental toxins-which profoundly influence human health”; “Numerous uncharacterized and highly divergent microbes which colonize humans are revealed by circulating cell-free DNA”, Kowarsky et al 2017 (media; towards finding the nonshared-environment component: taking the Gloomy Hypothesis seriously, we are going to need GWAS-scale approaches to measuring pollutants & infections & stressors to find the actual causal agents, instead of the get-rich-quick schemes which dominate epidemiology.)
“China launches brain-imaging factory: Hub aims to make industrial-scale high-resolution brain mapping a standard tool for neuroscience” (Such data streams could also impact GWASes for intelligence and other cognitive traits: the brains will of course be genotyped and available for GWAS, so the data will be ready and waiting for hierarchical and SEM methods which fractionate intelligence. It is worth noting that the UK Biobank—the gift that keeps on giving—intends to fMRI up to 100,000 of its participants, which would be a good complement to these connectomes, as its current dataset is underpowered eg. Wigmore et al 2017)
“Intelligence and all-cause mortality in the 6-Day Sample of the Scottish Mental Survey 1947 and their siblings: testing the contribution of family background”, Iveson et al 2017 (Within-family study demonstrating IQ predicts longevity regardless of family SES)
“Responsiveness of cats (Felidae) to silver vine (Actinidia polygama), Tatarian honeysuckle (Lonicera tatarica), valerian (Valeriana officinalis) and catnip (Nepeta cataria)”, Bol et al 2017

One of the best catnip research papers ever written: using both a large number of cats (n = 100) and all 4 major cat stimulants, Bol et al 2017 is an impressive experiment. It’s worth highlighting that: the 4 stimulants were offered to the same set of cats, allowing measurement of intercorrelations; cats were exposed at least twice, with attention given to low-stress administration (reducing measurement error, as emphasized by Villani 2011); gas chromatography was used for a chemical analysis; the sample size is one of the largest ever (exceeded only by Villani 2011/Lyons 2013 and my surveys); the partnership with Big Cat Rescue extends the results to several other interesting species; and the full dataset is included with the paper (!). I took a closer look at the intercorrelations in the raw data to derive an optimal test sequence.
“Chronotype variation drives night-time sentinel-like behavior in hunter-gatherers”, Samson et al 2017 (This could also explain why there is a wrenching shift in circadian rhythms towards inefficient later night-wakefulness during puberty, which then only gradually shifts back to day-wakefulness with increasing age: this is the same time at which males become useful warriors and females start having offspring which must be minded. It becomes a stable equilibrium due to any defectors having an advantage in attacking rival tribes; anthropologists have long noted that ambushes & night raids are much preferred to open battle in the day.)
“Lower school performance in late chronotypes: underlying factors and mechanisms”, Zerbini et al 2017
“Have You Tried Solving The Problem?” (by the late Grognor)

Technology:

“NASA’s ambitious plan to save the Earth from the Yellowstone supervolcano: with an eruption brewing, it may be the only way to prevent the extinction of the human race” (Who knew fixing Yellowstone would be as easy as sticking a geothermal power plant on top of it? Obvious in retrospect. If it takes 600,000 years for Yellowstone to heat up enough to blow, then the total heat inflow per year can’t be that much. One is moving heat away to dissipate it and equalize entropy, so thermodynamics is your ally for once, and you can even use the heat gradient to power whatever mechanism you are using. And you don’t even have to take out the net flow of heat: it would be enough to merely slow the heating and gain humanity another few centuries, after which it won’t matter.)
“7.2.1.7: History of Combinatorial Generation”, Knuth (reviews the long and error-prone history of combinatorial generation of sequences/permutations.)
A twist on one-time pads: “hyper-encryption”
DCoins: a malicious ICO for DDoSing Ethereum (~$3.5m/week) (A simple game-theory mechanism to incentivize spamming the Ethereum network: a contract which pays out decreasing amounts to only low-fee transactions. Since rejected transactions cost nothing, it becomes dominant for everyone to send in a withdrawal as often and early as possible…)
Note to Reddit users: ChangeTip.com still has outstanding Bitcoin balances for many users, and the dust there may be worth the time to withdraw; for example, I had ~$200 worth of Bitcoin there.