“Genetic Diversity Fuels Gene Discovery for Tobacco and Alcohol Use”, Gretchen R. B. Saunders, Xingyan Wang, Fang Chen, Seon-Kyeong Jang, Mengzhen Liu, Chen Wang, Shuang Gao, Yu Jiang, Chachrit Khunsriraksakul, Jacqueline M. Otto, Clifton Addison, Masato Akiyama, Christine M. Albert, Fazil Aliev, Alvaro Alonso, Donna K. Arnett, Allison E. Ashley-Koch, Aneel A. Ashrani, Kathleen C. Barnes, R. Graham Barr, Traci M. Bartz, Diane M. Becker, Lawrence F. Bielak, Emelia J. Benjamin, Joshua C. Bis, Gyda Bjornsdottir, John Blangero, Eugene R. Bleecker, Jason D. Boardman, Eric Boerwinkle, Dorret I. Boomsma, Meher Preethi Boorgula, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, Daniel I. Chasman, Sameer Chavan, Yii-Der Ida Chen, Zhengming Chen, Iona Cheng, Michael H. Cho, Hélène Choquet, John W. Cole, Marilyn C. Cornelis, Francesco Cucca, Joanne E. Curran, Mariza de Andrade, Danielle M. Dick, Anna R. Docherty, Ravindranath Duggirala, Charles B. Eaton, Marissa A. Ehringer, Tõnu Esko, Jessica D. Faul, Lilian Fernandes Silva, Edoardo Fiorillo, Myriam Fornage, Barry I. Freedman, Maiken E. Gabrielsen, Melanie E. Garrett, Sina A. Gharib, Christian Gieger, Nathan Gillespie, David C. Glahn, Scott D. Gordon, Charles C. Gu, Dongfeng Gu, Daniel F. Gudbjartsson, Xiuqing Guo, Jeffrey Haessler, Michael E. Hall, Toomas Haller, Kathleen Mullan Harris, Jiang He, Pamela Herd, John K. Hewitt, Ian Hickie, Bertha Hidalgo, John E. Hokanson, Christian Hopfer, JoukeJan Hottenga, Lifang Hou, Hongyan Huang, Yi-Jen Hung, David J. Hunter, Kristian Hveem, Shih-Jen Hwang, Chii-Min Hwu, William Iacono, Marguerite R. Irvin, Yon Ho Jee, Eric O. Johnson, Yoonjung Y. Joo, Eric Jorgenson, Anne E. Justice, Yoichiro Kamatani, Robert C. Kaplan, Jaakko Kaprio, Sharon L. R. Kardia, Matthew C. Keller, Tanika N. Kelly, Charles Kooperberg, Tellervo Korhonen, Peter Kraft, Kenneth Krauter, Johanna Kuusisto, Markku Laakso, Jessica Lasky-Su, Wen-Jane Lee, James J. 
Lee, Daniel Levy, Liming Li, Kevin Li, Yuqing Li, Kuang Lin, Penelope A. Lind, Chunyu Liu, Donald M. Lloyd-Jones, Sharon M. Lutz, Jiantao Ma, Reedik Mägi, Ani Manichaikul, Nicholas G. Martin, Ravi Mathur, Nana Matoba, Patrick F. McArdle, Matt McGue, Matthew B. McQueen, Sarah E. Medland, Andres Metspalu, Deborah A. Meyers, Iona Y. Millwood, Braxton D. Mitchell, Karen L. Mohlke, Matthew Moll, May E. Montasser, Alanna C. Morrison, Antonella Mulas, Jonas B. Nielsen, Kari E. North, Elizabeth C. Oelsner, Yukinori Okada, Valeria Orrù, Nicholette D. Palmer, Teemu Palviainen, Anita Pandit, S. Lani Park, Ulrike Peters, Annette Peters, Patricia A. Peyser, Tinca J. C. Polderman, Nicholas Rafaels, Susan Redline, Robert M. Reed, Alex P. Reiner, John P. Rice, Stephen S. Rich, Nicole E. Richmond, Carol Roan, Jerome I. Rotter, Michael N. Rueschman, Valgerdur Runarsdottir, Nancy L. Saccone, David A. Schwartz, Aladdin H. Shadyab, Jingchunzi Shi, Suyash S. Shringarpure, Kamil Sicinski, Anne Heidi Skogholt, Jennifer A. Smith, Nicholas L. Smith, Nona Sotoodehnia, Michael C. Stallings, Hreinn Stefansson, Kari Stefansson, Jerry A. Stitzel, Xiao Sun, Moin Syed, Ruth Tal-Singer, Amy E. Taylor, Kent D. Taylor, Marilyn J. Telen, Khanh K. Thai, Hemant Tiwari, Constance Turman, Thorarinn Tyrfingsson, Tamara L. Wall, Robin G. Walters, David R. Weir, Scott T. Weiss, Wendy B. White, John B. Whitfield, Kerri L. Wiggins, Gonneke Willemsen, Cristen Jennifer Willer, Bendik S. Winsvold, Huichun Xu, Lisa R. Yanek, Jie Yin, Kristin L. Young, Kendra A. Young, Bing Yu, Wei Zhao, Wei Zhou, Sebastian Zöllner, Luisa Zuccolo, 23andMe, The Biobank Japan Project, Chiara Batini, Andrew W. Bergen, Laura J. Bierut, Sean P. David, Sarah A. Gagliano Taliun, Dana B. Hancock, Bibo Jiang, Marcus R. Munafò, Thorgeir E. Thorgeirsson, Dajiang J. Liu, Scott Vrieze (2022-12-07):

Tobacco and alcohol use are heritable behaviors associated with 15% and 5.3% of worldwide deaths, respectively, due largely to broad increased risk for disease and injury. These substances are used across the globe, yet genome-wide association studies have focused largely on individuals of European ancestries.

Here we leveraged global genetic diversity across 3.4 million individuals from 4 major clines of global ancestry (~21% non-European) to power the discovery and fine-mapping of genomic loci associated with tobacco and alcohol use, to inform function of these loci via ancestry-aware transcriptome-wide association studies, and to evaluate the genetic architecture and predictive power of polygenic risk within and across populations.

…Using our multi-ancestry meta-analysis, we identified 2,143 associated loci across all phenotypes (sentinel variant p < 5 × 10⁻⁹), with 3,823 independently associated variants (Extended Data Figure 2, Supplementary Tables 2 & 3 & Supplementary Figures 2 & 3). Of these, 1,346 loci and 2,486 independent variants were associated with SmkInit, 33 loci (39 variants) with AgeSmk, 140 loci (243 variants) with CigDay, 128 loci (206 variants) with SmkCes and 496 loci (849 variants) with DrnkWk. About 64% (n = 1,364) of loci were phenotype-specific, 5 loci were associated with all 4 smoking phenotypes but not with DrnkWk, and 5 loci were associated with all 5 phenotypes. All sentinel variants within identified loci had high posterior probabilities that their effect would replicate in a sufficiently powered study according to a trans-ancestry extension of our GWAS cross-validation technique.
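
As a quick consistency check on the figures quoted above, the per-phenotype counts can be summed and compared with the stated totals. This is only arithmetic on the numbers as quoted; how loci shared between phenotypes were tallied is not stated in this excerpt, so the sketch makes no claim about the counting method.

```python
# Per-phenotype counts as quoted in the text: (loci, independent variants).
counts = {
    "SmkInit": (1346, 2486),
    "AgeSmk": (33, 39),
    "CigDay": (140, 243),
    "SmkCes": (128, 206),
    "DrnkWk": (496, 849),
}

# Sum each column across the five phenotypes.
total_loci = sum(loci for loci, _ in counts.values())
total_variants = sum(variants for _, variants in counts.values())

# Share of loci reported as phenotype-specific (n = 1,364 of 2,143).
phenotype_specific_pct = round(100 * 1364 / 2143, 1)

print(total_loci, total_variants, phenotype_specific_pct)
```

The sums come to 2,143 loci and 3,823 variants, matching the stated totals, and 1,364 of 2,143 is 63.6%, in line with the rounded 64%.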

…We found that increases in sample size and genetic diversity improved locus identification and fine-mapping resolution, and that a large majority of the 3,823 associated variants (from 2,143 loci) showed consistent effect sizes across ancestry dimensions. However, polygenic risk scores [e.g., 10% smoking] developed in one ancestry performed poorly in others, highlighting the continued need to increase sample sizes of diverse ancestries to realize any potential benefit of polygenic prediction.

…To characterize the multifactorial genetic aetiology of tobacco and alcohol use, we computed genetic correlations of our EUR-stratified results with 1,141 medical, biomarker and behavioral phenotypes from the UK Biobank (Supplementary Tables 10 & 11). An affinity propagation clustering algorithm was used to aid interpretability by grouping UK Biobank phenotypes such that each of the 5 current phenotypes was an exemplar (Supplementary Figure 5). SmkInit and AgeSmk clustered together, as did SmkCes and CigDay, with all 4 forming a broad higher-level smoking cluster. Phenotypes with high positive genetic correlations with SmkInit included addiction to any substance, neighbourhood material deprivation and diagnosis of chronic obstructive pulmonary disease; SmkInit was also negatively correlated with age at first sexual intercourse (|rg| = 0.57–0.64). For AgeSmk, the largest genetic correlations were with reproductive phenotypes such as age at first birth (rg = 0.69–0.71) and measures of years of education and attainment (rg = 0.58–0.69). CigDay and SmkCes were most highly positively correlated with respiratory and cardiovascular diseases and cancers (rg = 0.52–0.72), highlighting their genetic link to adverse disease outcomes. Finally, DrnkWk was most strongly correlated with problematic drinking behaviors (rg = 0.52–0.70), indicating extensive overlap in the genetic architecture of DrnkWk and measures of alcohol use, problems and alcohol use disorder. This is consistent with previous findings of strong but imperfect genetic correlations (for example, rg = 0.8) between alcohol consumption and alcohol use disorder from large-scale GWAS. We note, however, that genetic correlations can be difficult to interpret as they may be affected by genetic confounding, mediation effects or sampling bias.

We used the ancestry-stratified meta-analysis results to construct ancestry-specific polygenic risk scores in Add Health, an independent target sample of individuals of diverse ancestries from the United States (n = 2,199 AFR, 1,132 AMR, 525 EAS and 6,092 EUR).
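
For readers unfamiliar with the term, a polygenic risk score is, in its simplest form, a weighted sum of a person's effect-allele counts, with weights taken from GWAS effect estimates. The sketch below illustrates only that basic idea. The variant names, weights, and genotype are made up for illustration, and the excerpt does not describe the authors' actual scoring pipeline, which may differ (for example in how weights are adjusted for linkage disequilibrium).

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant).

    Variants with a weight but no genotype for this person are skipped.
    """
    return sum(beta * dosages[v] for v, beta in weights.items() if v in dosages)


# Hypothetical per-variant effect weights (NOT values from the paper).
weights = {"var_a": 0.5, "var_b": -0.25, "var_c": 0.125}

# Hypothetical genotype: number of effect alleles carried at each variant.
person = {"var_a": 2, "var_b": 1, "var_c": 0}

score = polygenic_score(weights, person)  # 0.5*2 + (-0.25)*1 + 0.125*0

# Target sample sizes per ancestry group, as quoted in the text.
add_health_n = {"AFR": 2199, "AMR": 1132, "EAS": 525, "EUR": 6092}
total_n = sum(add_health_n.values())

print(score, total_n)
```

In an ancestry-specific setting, the `weights` dictionary would come from the meta-analysis of the matching ancestry group, and the resulting scores would then be evaluated against the observed phenotype in the target sample.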