“Efficient Polygenic Risk Scores for Biobank Scale Data by Exploiting Phenotypes from Inferred Relatives”, Buu Truong, Xuan Zhou, Jisu Shin, Jiuyong Li, Julius H. J. van der Werf, Thuc D. Le, S. Hong Lee, 2020-06-17:

Polygenic risk scores are emerging as a potentially powerful tool to predict future phenotypes of target individuals, typically using unrelated individuals, thereby devaluing information from relatives. Here, for 50 traits from the UK Biobank data, we show that a design of 5,000 individuals with first-degree relatives of target individuals can achieve a prediction accuracy similar to that of around 220,000 unrelated individuals (mean prediction accuracy = 0.26 vs. 0.24, mean fold-change = 1.06 (95% CI: 0.99–1.13), p = 0.08), despite a 44-fold difference in sample size. For lifestyle traits, the prediction accuracy with 5,000 individuals including first-degree relatives of target individuals is statistically-significantly higher than that with 220,000 unrelated individuals (mean prediction accuracy = 0.22 vs. 0.16, mean fold-change = 1.40 (1.17–1.62), p = 0.025). Our findings suggest that polygenic prediction integrating family information may help to accelerate precision health and clinical intervention.

…We demonstrated that, even with a small-scale design, polygenic prediction with a reference sample containing close relatives of the target individuals outperformed analyses restricted to unrelated individuals. Compared with the analyses with second-degree or third-degree relatives, or unrelated individuals, the analysis with first-degree relatives achieved a higher prediction accuracy, because a lower value of M_e (the effective number of chromosome segments) means that fewer independent parameters need to be estimated [25,26,27]. Moreover, this higher prediction accuracy was probably also due to close relatives sharing some unknown (unmodeled) factors in addition to additive genetic effects, which may include dominance, gene-by-family interaction, and familial environmental effects. The analyses with second-degree and third-degree relatives also outperformed the analysis with unrelated individuals, although they improved the prediction accuracy less efficiently than first-degree relatives did.
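The link between a lower M_e and a higher accuracy can be illustrated with a commonly used approximation from the genomic prediction literature (a Daetwyler-style formula), in which the expected correlation between predicted and observed phenotypes is sqrt(h2 · N·h2 / (N·h2 + M_e)). The sketch below is not the authors' derivation; the function name and the heritability and M_e values are hypothetical and chosen only to show the direction of the effect.

```python
import math


def expected_accuracy(n: int, h2: float, m_e: float) -> float:
    """Approximate correlation between predicted and observed phenotype.

    Uses the approximation r = sqrt(h2 * N*h2 / (N*h2 + M_e)), where
    N is the reference sample size, h2 the heritability, and M_e the
    effective number of chromosome segments.
    """
    if n <= 0 or m_e <= 0:
        raise ValueError("n and m_e must be positive")
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("h2 must lie in [0, 1]")
    return math.sqrt(h2 * (n * h2) / (n * h2 + m_e))


# Hypothetical inputs, NOT taken from the paper.
H2 = 0.5                # assumed heritability
ME_UNRELATED = 50_000   # assumed M_e among unrelated individuals
ME_RELATIVES = 500      # assumed (much lower) M_e with first-degree relatives

acc_unrelated = expected_accuracy(220_000, H2, ME_UNRELATED)
acc_relatives = expected_accuracy(5_000, H2, ME_RELATIVES)

print(f"220,000 unrelated:       {acc_unrelated:.3f}")
print(f"5,000 with relatives:    {acc_relatives:.3f}")
```

Under these assumed values the small design with relatives is at least as accurate as the large unrelated design, because the drop in M_e more than offsets the 44-fold drop in N. Real M_e values depend on the population and the relatedness structure.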

The approach of including close relatives will be most useful in applications where accuracy matters more than distinguishing causal genetic effects from other effects. It is known that family-based heritability estimates can be inflated if nonadditive genetic effects or common environmental effects shared between close relatives are confounded with additive genetic effects [3]; such estimates are biased with respect to narrow-sense heritability, which includes additive genetic effects only. However, this bias should not be an issue when predicting the future phenotypes of a target sample (i.e., a newborn baby), because such nonadditive genetic and common environmental effects can be a valuable source of improved prediction accuracy [28,42]. Indeed, family history has been widely used as a biomarker to predict disease risk [43,44], and it can also be used to increase the power to identify causal variants in GWAS [45,46,47]. We consider our method a more systematic approach to using family-history information as well as within-family segregation [48].
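The trade-off described above can be made concrete with a toy variance-components model, assuming phenotype = additive genetics + shared family environment + residual. Under that assumption the expected phenotypic correlation between relatives is relatedness · h2 + c2. The sketch is a simplification rather than the paper's model: it ignores dominance and gene-by-family interaction and treats the shared environment as fully shared. The function names and numeric values are hypothetical.

```python
def expected_relative_correlation(relatedness: float, h2: float, c2: float) -> float:
    """Expected phenotypic correlation between two relatives.

    relatedness: expected additive genetic sharing (0.5 for first-degree).
    h2: narrow-sense heritability; c2: shared-environment variance share.
    """
    if not 0.0 <= relatedness <= 1.0:
        raise ValueError("relatedness must lie in [0, 1]")
    if h2 < 0 or c2 < 0 or h2 + c2 > 1:
        raise ValueError("h2 and c2 must be non-negative and sum to at most 1")
    return relatedness * h2 + c2


def naive_family_h2(phenotypic_correlation: float, relatedness: float) -> float:
    """Heritability estimate that wrongly attributes all resemblance to additive genetics."""
    if relatedness <= 0:
        raise ValueError("relatedness must be positive")
    return phenotypic_correlation / relatedness


# Hypothetical values, NOT taken from the paper.
H2_TRUE = 0.3
C2_TRUE = 0.1
FIRST_DEGREE = 0.5

r_additive_only = expected_relative_correlation(FIRST_DEGREE, H2_TRUE, 0.0)
r_with_shared_env = expected_relative_correlation(FIRST_DEGREE, H2_TRUE, C2_TRUE)
h2_naive = naive_family_h2(r_with_shared_env, FIRST_DEGREE)

print(f"correlation, additive only:      {r_additive_only:.3f}")
print(f"correlation, with shared env:    {r_with_shared_env:.3f}")
print(f"naive h2 estimate (true h2={H2_TRUE}): {h2_naive:.3f}")
```

In this toy setting the shared environment inflates the naive heritability estimate above the true value, yet the same component raises the resemblance between relatives, which is what a predictor can exploit.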