New DNA sequencing technologies allow researchers to pursue different strategies for exploring the genome, but only if they have software that accurately interprets the data. Researchers led by Niranjan Nagarajan at the A*STAR Genome Institute of Singapore have developed an algorithm that could help a cutting-edge sequencing platform achieve its potential.
Most clinical genomics data is produced by so-called ‘short-read’ instruments, which generate billions of small stretches of nucleotide sequence information (or ‘reads’). These reads must be mapped back to the appropriate position in the genome — a complicated task, given that each read typically spans just 100–200 bases.
A new platform called the MinION takes a different approach, channeling individual DNA strands into protein nanopores and identifying each nucleotide as it passes through. This allows scientists to gather much longer reads that can, in principle, be mapped more precisely. Additionally, the platform is portable and relatively inexpensive, opening up new sequencing applications.
However, nanopore sequencing is more error-prone than short-read sequencing, and mapping has been further impeded by software problems. “When we heard about the MinION system, we were naturally excited and eager to explore applications,” says Nagarajan. “But, when we started mapping MinION data with existing tools, we found they didn’t perform very well.” The poor performance arises because these algorithms were developed with established platforms in mind and are not a good fit for this new type of data.
To fill the gap, Nagarajan’s team developed new software called GraphMap that maps MinION data with remarkable speed and accuracy. Their system employs a ‘funneling’ approach to improve the efficiency of plotting each individual read’s position. “This progressively eliminates incorrect alignment locations and refines alignments, ensuring that a large space of candidate alignments can be considered,” he says.
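The general idea of a funnel (consider many candidate locations cheaply, then spend more effort on the few that survive) can be illustrated with a toy sketch. This is not GraphMap's actual algorithm: the function names, the k-mer voting step, the `k` and `keep` parameters, and the use of simple mismatch counting are all assumptions made for illustration. Real nanopore reads also contain insertions and deletions, which this simplified version ignores.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming_mismatches(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def funnel_map(read, reference, k=4, keep=3):
    """Toy three-stage funnel (hypothetical, for illustration only).

    Returns (mismatches, start) for the best candidate, or None if no
    k-mer of the read occurs in the reference.
    """
    index = build_kmer_index(reference, k)

    # Stage 1 (broad): every shared k-mer votes for the read start it implies.
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for ref_pos in index.get(read[offset:offset + k], []):
            start = ref_pos - offset
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1

    # Stage 2 (prune): keep only the best-supported candidate starts.
    candidates = [start for start, _ in votes.most_common(keep)]

    # Stage 3 (refine): score the survivors with a costlier comparison.
    scored = [
        (hamming_mismatches(read, reference[s:s + len(read)]), s)
        for s in candidates
    ]
    return min(scored) if scored else None


reference = "ACGTTGCAAGGCTTACCGATGCATGCCAGT"
noisy_read = "GCTTATCGATGC"  # reference[10:22] with one substituted base
print(funnel_map(noisy_read, reference))
```

In this example the erroneous base breaks several k-mers, yet the remaining ones still vote for the true location, and the refinement stage confirms it with a single mismatch.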
GraphMap assigned locations for more than 90 per cent of the MinION reads collected over several experiments, overcoming the system’s higher error rate and mapping many nucleotides that would have been neglected by other software tools. It also consistently achieved high accuracy, making correct base assignments for more than 98 per cent of the mapped sequences. “We were able to show that GraphMap alignments enable accurate variant calling even in complex and rearranged regions of the human genome,” says Nagarajan.
Beyond using MinION to pinpoint disease-related genomic variations, Nagarajan is enthusiastic about applying it to detect harmful microbes. “GraphMap alignments enabled accurate species and strain identification,” he says, “and we’re continuing to develop protocols for better and faster pathogen identification.”
The A*STAR-affiliated researchers contributing to this research are from the Genome Institute of Singapore and the Bioinformatics Institute. For more information about the team’s research, please visit the Nagarajan laboratory webpage.