Long-read DNA analysis can give rise to errors, experts warn

On Jan 23, 2019

Advanced technologies that read long strings of DNA can produce flawed data that could affect genetic studies, research suggests.

New methods that can read lengthy sections of genetic material, which is represented as a series of letters, are up to 99.8 per cent accurate. In a genome of more than 3 billion letters, however, this may equate to millions of mistakes in the results.
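The arithmetic behind that estimate can be sketched in a few lines of Python. This is an illustration only, not the researchers' method: the function name `expected_errors` is hypothetical, and it assumes every letter is read once with the same, independent chance of error, which is a simplification of how sequencing errors actually behave.

```python
def expected_errors(genome_length: int, accuracy: float) -> int:
    """Expected number of misread letters.

    Assumption (not from the article): errors are spread uniformly and
    independently, so the expected count is length * (1 - accuracy).
    """
    return round(genome_length * (1.0 - accuracy))


# Figures quoted in the article: roughly 3 billion letters, up to 99.8% accuracy.
GENOME_LENGTH = 3_000_000_000
ACCURACY = 0.998

errors = expected_errors(GENOME_LENGTH, ACCURACY)
print(f"{errors:,} expected errors")  # prints "6,000,000 expected errors"
```

Because 99.8 per cent is the best-case accuracy and the genome is longer than 3 billion letters, about 6 million is closer to a floor than a ceiling under these assumptions.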

These errors may falsely indicate that an individual has a genetic difference that heightens their risk of a particular disease.

Researchers say data produced by these technologies should be interpreted with caution, as it may create problems for analysing genetic information from people and animals.

Previously, genetic sequencing technologies focused on reading short strings of DNA. These sequences then had to be patched together, which is time-consuming and labour-intensive.

This short-read approach is useful for reading individual genes but is inappropriate for entire organisms.

Experts from the University of Edinburgh’s Roslin Institute examined three recent studies reporting human genome sequences from long-read technologies. The data contained thousands of errors even after corrective software was used, they found.

Such mistakes could have major implications if these technologies are used in clinical studies to diagnose patients, the team suggests.

The findings are reported in a commentary in Nature Biotechnology. The Roslin Institute receives strategic funding from the Biotechnology and Biological Sciences Research Council.

Professor Mick Watson, of the University of Edinburgh’s Roslin Institute, said: “Long-read technologies are incredibly powerful but it is clear that we can’t rely on software tools to correct errors in the data—some hands-on expertise may still be required. This is important as we increasingly use genomic technologies to understand the world around us.”