“Modeling Functional Enrichment Improves Polygenic Prediction Accuracy in UK Biobank and 23andMe Data Sets”, Carla Márquez-Luna, Steven Gazal, Po-Ru Loh, Nicholas A. Furlotte, Adam Auton, 23andMe, Alkes Price, 2018-07-24:

Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional enrichments to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory, and LD-related annotations.

We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes.
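
To illustrate what an analytic posterior mean with annotation-informed priors can look like, here is a minimal sketch under strong simplifying assumptions that are not taken from the abstract: SNPs are unlinked (no LD), genotypes and phenotype are standardized so each marginal estimate has sampling variance of about 1/n, and each SNP has a Gaussian prior with its own variance. The function name `posterior_mean_no_ld` and the parameter `prior_var` are hypothetical; this is not the LDpred-funct implementation, which also accounts for LD and applies a cross-validated regularization step.

```python
import numpy as np

def posterior_mean_no_ld(beta_hat, prior_var, n):
    """Shrink marginal effect estimates toward zero (illustrative only).

    Assumed model, per SNP j:
        beta_j     ~ Normal(0, prior_var[j])   # prior, e.g. from functional annotations
        beta_hat_j ~ Normal(beta_j, 1 / n)     # sampling noise, unlinked standardized SNPs
    The posterior mean is then prior_var / (prior_var + 1/n) * beta_hat.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    shrink = prior_var / (prior_var + 1.0 / n)  # in [0, 1); larger prior variance, less shrinkage
    return shrink * beta_hat

# Two SNPs with the same marginal estimate but different prior variances:
# the SNP in the (hypothetically) more enriched annotation is shrunk less.
effects = posterior_mean_no_ld(beta_hat=[0.01, 0.01], prior_var=[1e-5, 1e-6], n=100_000)
```

Under these assumptions, a SNP with higher expected per-SNP heritability keeps more of its estimated effect, which is the intuition behind using functional enrichment as a prior.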

We applied LDpred-funct to predict 16 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=365K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +27% relative improvement in prediction accuracy (avg prediction R^2=0.173; highest R^2=0.417 for height) compared to existing methods that do not incorporate functional information, consistent with simulations. For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R^2 to 0.429.
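
As a rough guide to how figures like these are typically computed, the sketch below scores validation samples with a weight vector and reports the squared correlation between score and phenotype, plus a relative improvement between two methods. It assumes prediction R^2 means the squared Pearson correlation in held-out samples, without the covariate adjustment a real analysis would usually include; the function names are hypothetical and nothing here reproduces the paper's pipeline or numbers.

```python
import numpy as np

def prediction_r2(genotypes, weights, phenotype):
    """Squared Pearson correlation between a polygenic score and a phenotype.

    genotypes: (samples x SNPs) array of validation genotypes
    weights:   per-SNP effect sizes estimated on separate training data
    """
    score = np.asarray(genotypes, dtype=float) @ np.asarray(weights, dtype=float)
    r = np.corrcoef(score, np.asarray(phenotype, dtype=float))[0, 1]
    return r ** 2

def relative_improvement(r2_new, r2_old):
    """Fractional gain of one method's R^2 over another's (0.27 means +27%)."""
    return (r2_new - r2_old) / r2_old

# Toy validation set: 4 samples, 2 SNPs, phenotype exactly equal to the score.
toy_genotypes = np.array([[0, 1], [1, 1], [2, 0], [1, 2]])
toy_weights = np.array([0.5, -0.2])
toy_r2 = prediction_r2(toy_genotypes, toy_weights, toy_genotypes @ toy_weights)
```

Note that a relative improvement averaged across traits need not equal the relative improvement of the averaged R^2 values, so a baseline average should not be back-calculated from the two summary numbers alone.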

Our results show that modeling functional enrichment substantially improves polygenic prediction accuracy, bringing polygenic prediction of complex traits closer to clinical utility.