Highlights

An effective way to explore data

17 May 2016

The impressive computing power of a genetic algorithm can identify the signatures of small molecules within complex metabolomics datasets

Software developed at A*STAR has greatly improved laboratory data analysis so that molecules such as lipids (pictured) can be correctly identified in biological samples.

Systems biologists rely on the powerful analytical technique of liquid chromatography–mass spectrometry (LC-MS) to analyze biological molecules, but face a challenge in sorting through vast data sets. Now, A*STAR researchers have developed software based on a genetic algorithm that is highly adept at spotting the fingerprints of individual metabolites within the ocean of LC-MS data^1,2.

LC-MS studies produce vast amounts of complex information on biological molecules such as metabolites (metabolomics) or lipids (lipidomics), presenting a huge challenge for data analysts. “My team has analyzed ‘big’ omics data and developed mathematical models to improve the quality of various living cells for biotechnological and biomedical applications,” says Dong-Yup Lee from A*STAR’s Bioprocessing Technology Institute and National University of Singapore. “We realized that the available bioinformatics tools for metabolomics and lipidomics analysis were not suitable in terms of their throughput, capabilities and reliabilities.”

Lee’s team recognized the need to integrate several data analysis techniques to reveal how the overall phenotype of an organism might, for example, adapt to environmental changes. This was particularly challenging when trying to uncover the identities and amounts of small molecules such as metabolites, for which the available information is often ambiguous.

“Unlike genomic and proteomics data — where the identity of a gene and its products can be unambiguously determined by base sequences — in LC-MS data, the fundamental information on small molecules is not fully captured,” explains Lee. “So we need to find clues that are hidden in the noisy background. Using this imperfect description of a suspect molecule, we compare its features against a known database. If we haven’t seen the molecule before, then clearly we can’t identify it.”
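The database comparison Lee describes can be illustrated with a minimal sketch. It assumes the only feature compared is the mass-to-charge ratio (m/z) and that a match means falling within a parts-per-million tolerance. The function names, the 10 ppm default and the entries in EXAMPLE_DB are hypothetical placeholders, not the team's software or real reference data.

```python
# Illustrative only: compound names and m/z values are arbitrary placeholders.
EXAMPLE_DB = {
    "compound_A": 181.0707,
    "compound_B": 195.0877,
    "compound_C": 181.0720,
}


def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observation, in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_feature(observed_mz, database, tolerance_ppm=10.0):
    """Return (name, ppm_error) pairs within tolerance, closest match first."""
    hits = []
    for name, reference_mz in database.items():
        error = ppm_error(observed_mz, reference_mz)
        if abs(error) <= tolerance_ppm:
            hits.append((name, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# One observed mass can match several entries, so the answer is ambiguous.
candidates = match_feature(181.0710, EXAMPLE_DB)

# A mass with no entry in the database cannot be identified at all.
unknown = match_feature(300.0, EXAMPLE_DB)
```

In this toy case the first query returns two candidates and the second returns an empty list, which mirrors the two problems in the quote: an imperfect description can fit more than one known molecule, and a molecule absent from the database cannot be named.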

Furthermore, most LC-MS analyses rely on parameters that experts choose for a particular study and that might not suit another. So to select the best parameter sets for LC-MS data processing, the team adapted a common artificial intelligence technique called a genetic algorithm (GA), which is inspired by the natural Darwinian processes that maximize species survival. The parameters act as ‘genes’ in the GA, with various measures of the quality of metabolite identification collectively determining the ‘fitness’ of each candidate parameter set.
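The idea of parameters as ‘genes’ scored by a ‘fitness’ can be sketched in a few lines of Python. This is a generic genetic algorithm under stated assumptions, not the published method: the three parameter names, their ranges and the toy_fitness function are hypothetical stand-ins. A real fitness function would run the LC-MS processing pipeline and score the quality of the resulting metabolite identifications.

```python
import random

# Hypothetical processing parameters ("genes") and their allowed ranges.
PARAM_RANGES = {
    "peak_width": (5.0, 60.0),
    "snr_threshold": (1.0, 20.0),
    "mz_tolerance_ppm": (1.0, 50.0),
}


def toy_fitness(params):
    """Stand-in score that is highest at an arbitrary target setting."""
    target = {"peak_width": 20.0, "snr_threshold": 6.0, "mz_tolerance_ppm": 10.0}
    return -sum(
        ((params[key] - target[key]) / (high - low)) ** 2
        for key, (low, high) in PARAM_RANGES.items()
    )


def random_individual(rng):
    """Draw one candidate parameter set uniformly from the ranges."""
    return {key: rng.uniform(low, high) for key, (low, high) in PARAM_RANGES.items()}


def crossover(parent_a, parent_b, rng):
    """Inherit each gene from one of the two parents at random."""
    return {key: (parent_a[key] if rng.random() < 0.5 else parent_b[key])
            for key in PARAM_RANGES}


def mutate(individual, rng, rate=0.2):
    """Perturb some genes with Gaussian noise, clipped to the allowed range."""
    child = dict(individual)
    for key, (low, high) in PARAM_RANGES.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, 0.1 * (high - low))
            child[key] = min(high, max(low, child[key] + step))
    return child


def run_ga(fitness, generations=40, pop_size=30, n_elite=2, seed=0):
    """Evolve parameter sets; return the best one and the best score per generation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # the fitter half gets to breed
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - n_elite)
        ]
        population = ranked[:n_elite] + children  # elites survive unchanged
    return max(population, key=fitness), history


best_params, score_history = run_ga(toy_fitness)
```

Because the top-scoring parameter sets are carried over unchanged each generation, the best score in score_history never decreases. Swapping toy_fitness for a function that scores real identification quality is the part that would tie such a sketch to an actual dataset.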

The team that contributed to this research. Back row (from the left): Yeo Hock Chuan, Ang Kok Siong, Chin Ju Xin, Meiyappan Lakshmanan. Lee Dong-Yup is in the front row, second from the left.

The researchers successfully tested their GA with three metabolomics datasets, including data from cells expressing the antibody immunoglobulin G against the Rhesus D antigen. “We also analyzed a lipidomics dataset with no known working parameters, and the known lipids were identified very quickly,” says Lee. “This progress could shed light on the little-understood role that lipids play in regulating stem-cell differentiation and immunity.”

The A*STAR-affiliated researchers contributing to this research are from the Bioprocessing Technology Institute. For more information about the team’s research, please visit the “-omics” Technologies webpage.

Want to stay up-to-date with A*STAR’s breakthroughs? Follow us on Twitter and LinkedIn!

References

1. Yeo, H. C., Chung, B. K.-S., Chong, W., Chin, J. X., Ang, K. S. et al. A genetic algorithm-based approach for pre-processing metabolomics and lipidomics LC–MS data. Metabolomics 12, 5 (2016).
2. Lee, T. C., Ho, Y. S., Yeo, H. C., Lin, J. P. Y. & Lee, D.-Y. Precursor mass prediction by clustering ionization products in LC-MS-based metabolomics. Metabolomics 9, 0 (2013).

This article was made for A*STAR Research by Nature Research Custom Media, part of Springer Nature
