“Crowdsourcing Big Data Research on Human History and Health: from Genealogies to Genomes and Back Again”, 2018-04-12:
Genealogies are likely the first, centuries-old “big data”, with their construction as old as human civilization itself. Globalization, and the identity crisis that ensued, turned many to online services for building family trees and investigating connections to historical records and other family trees. Since the beginning of the century, the number and usage of websites offering such genealogical services have exploded. About 130 million users have together created almost four billion profiles of family members across the three most popular websites for genealogy enthusiasts: Ancestry.com, MyHeritage, and Geni. More recent years have witnessed a similarly rapid increase in genetic-based services that address the same need to learn about familial relationships and ancestry.
These vast amounts of crowdsourced data, often crowdfunded as well (since users often pay for these services), offer ample scientific research opportunities that would otherwise require expansive collection.
In a paper published today in Science, et al 2018 introduce a genealogical dataset based on processing 86 million public Geni profiles. Armed with this crowdsourced dataset, they address fundamental research questions.