More than 60 percent of Americans who have some European ancestry can be identified using DNA databases — even if they have not submitted their own DNA, researchers reported Thursday.
Enough people have done some kind of DNA test to make it possible to match much of the population, the researchers said. So even if you don’t submit your own DNA, if a cousin does, it could lead people to you.
They said their findings, published in the journal Science, raise concerns about privacy. Not only could police use this information, but so could other people seeking personal information about someone.
Earlier this year, police said they used DNA from a public database to catch former California police officer Joseph DeAngelo, suspected of being the “Golden State Killer.”
A distant cousin had taken a commercial DNA sequencing test, and those sequences were used to narrow the suspect list down to DeAngelo.
DeAngelo was caught when police got DNA off a tissue he threw into a trash can. It matched samples taken from the scenes of dozens of rapes and murders across California.
Police are making use of this tool, the team at genealogy website MyHeritage and at Columbia University in New York said.
“Between April to August 2018, at least 13 cases were reportedly solved by long range familial searches,” they wrote.
“Most of these investigations focused on cold cases, for which decades of investigation failed to identify the offender. Nonetheless, one case involved a crime from April 2018, suggesting that some law enforcement agencies have incorporated long-range familial DNA searches into active investigations.”
Yaniv Erlich, chief science officer of MyHeritage, and colleagues analyzed the DNA of 1.2 million people who had submitted their DNA to MyHeritage and another company for sequencing.
They were able to find third cousins or closer relatives for 60 percent of those with mostly European ancestry, and they could trace those relationships using publicly available genealogical records.
“Genetic genealogy databases act like a GPS system for anonymous DNA,” Erlich said in a statement.
“The family trees set a coordinate system, in which the DNA of each individual in these databases is like a beacon that illuminates hundreds of the individual’s relatives who are not in the database,” he added.
“Therefore, even if a specific individual is not in these databases, a relative of theirs could be, which is enough to identify them.”
It doesn’t take too many people to build up a database that can track down relatives in the rest of the population, said geneticist Shai Carmi of the Hebrew University of Jerusalem, who worked on the study.
“We found that once a genetic database covers roughly 2 percent of the adult population, a match of a third cousin or closer is expected for almost all persons of interest,” Carmi said in a statement.
Many people have paid to have their DNA sequenced.
“As of April 2018, more than 15 million people have undergone direct-to-consumer autosomal genetic tests, with about 7 million kits sold in 2017 alone,” the team wrote.
The team ran an experiment to see if they could identify a theoretical person using DNA, genealogical information, and public information such as age and addresses of people.
“We found that the suspect list can be pruned from basic demographic information,” they wrote.
“Our simulations indicate that localizing the target to within 100 miles will exclude 57 percent of the candidates on average,” they wrote. Narrowing down the theoretical person’s age to within five years excluded about 90 percent of the candidates who remained, they said.
“Finally, inference of the biological sex of the target will halve the list to just around 16 to 17 individuals, a search space that is small enough for manual inspection,” they wrote.
“Moreover, the technique could implicate nearly any U.S. individual of European descent in the near future.”
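The pruning figures quoted above can be chained together as simple arithmetic. The short Python sketch below is an illustration only, not the researchers' simulation: it assumes each filter applies to whatever candidates remain after the previous one, and the function name `prune` and the midpoint of 16.5 for the final list are choices made here for the example.

```python
def prune(candidates: float, exclusion_fractions: list[float]) -> float:
    """Apply each exclusion fraction in turn to what is left of the list."""
    remaining = candidates
    for fraction in exclusion_fractions:
        remaining *= 1.0 - fraction
    return remaining


# Fractions quoted in the article: location (57%), age (90%), sex (50%).
EXCLUSIONS = [0.57, 0.90, 0.50]

# Share of the original candidate list that survives all three filters.
surviving_share = prune(1.0, EXCLUSIONS)

# Assumption: take 16.5 as the midpoint of "16 to 17 individuals" and
# back-calculate the starting list size those fractions would imply.
implied_start = 16.5 / surviving_share

print(f"surviving share: {surviving_share:.4f}")
print(f"implied starting list: {implied_start:.0f} candidates")
```

Under those assumptions, only about 2 percent of the initial candidates survive all three filters, which would put the starting list at roughly 770 people. That starting figure is a back-calculation from the rounded percentages quoted here, not a number reported by the team.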
People of purely African or Asian descent are less vulnerable because fewer people in those groups have had their DNA sequenced and added to a public database.