“Versatile Detection of Diverse Selective Sweeps With Flex-Sweep”, M. Elise Lauterbur, Kasper Munch, David Enard, 2022-11-17:

Understanding the selection pressures influencing modern-day genomic diversity and their overall genomic impact is a major goal of evolutionary genomics. In particular, the contribution of selective sweeps to adaptation remains an open question, because sweep detection methods remain limited in statistical power and specificity. Sweeps with subtle genomic signals have been particularly challenging to detect. While many existing powerful methods are capable of detecting specific types of sweeps and/or those with obvious signals, their power comes at the expense of versatility, so these tools are likely failing to identify many sweeps. Detecting sweeps with diverse characteristics is therefore valuable but difficult.

We present Flex-sweep, a versatile machine-learning tool designed to detect sweeps with a variety of subtle signals, including those that are thousands of generations old. It is especially valuable for detecting sweeps in non-model organisms, for which we have neither expectations about the characteristics of sweeps present in the genome nor outgroups with population-level sequencing that would otherwise facilitate detecting very old sweeps.

We show that Flex-sweep is powerful at detecting selective sweeps with more subtle signals, even in the face of demographic model complexity and misspecification, recombination rate heterogeneity, and background selection. Flex-sweep is able to detect sweeps up to 5,000 generations old (~125,000 years in humans), including those that are weak, soft, and/or incomplete; it is also capable of detecting strong, complete sweeps up to 10,000 generations old.
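The conversion quoted above (5,000 generations to roughly 125,000 years) implies a human generation time of about 25 years. A minimal sketch of that arithmetic follows; the function name and the 25-year default are illustrative assumptions inferred from the quoted figures, not part of Flex-sweep itself.

```python
def generations_to_years(generations: int, years_per_generation: float = 25.0) -> float:
    """Convert a sweep age in generations to calendar years.

    The 25-year default is an assumption inferred from the abstract's
    "5,000 generations (~125,000 years in humans)".
    """
    return generations * years_per_generation


# Upper age limit stated for weak, soft, and/or incomplete sweeps
print(generations_to_years(5_000))   # 125000.0
# Upper age limit stated for strong, complete sweeps
print(generations_to_years(10_000))  # 250000.0
```

Under the same assumed generation time, the 10,000-generation limit for strong, complete sweeps corresponds to roughly 250,000 years.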

Furthermore, we apply Flex-sweep to the 1000 Genomes data set of Yoruba in Ibadan, Nigeria and, in addition to recovering previously identified selective sweeps, show that sweeps disproportionately occur within genic regions and close to regulatory regions. We also show that virus-interacting proteins (VIPs) are strongly enriched for selective sweeps, recapitulating previous results that demonstrate the importance of viruses as a driver of adaptive evolution in humans.