Machine learning has helped to identify potential drug candidates to treat COVID-19.

© NIAID / Flickr

The battle of the bugs goes digital

19 Oct 2020

A new machine learning technique can predict the dynamics between microbes and therapeutics with unprecedented accuracy.

Human beings play host to over 100 trillion bacteria and viruses that play a critical role in everything from digestion to immune protection. Besides these microbial friends, there are also the foes: pathogenic microbes that cause infection and disease. Maintaining the delicate balance of the human microbiome is a central focus in creating the next wave of precision therapeutics.

Previously, the drug development process relied heavily on time-consuming, expensive and labor-intensive screening methods. In recent years, however, the study of microbe-drug associations has gone digital, thanks to advances in machine learning and deep learning. By complementing traditional techniques with these computational methods, data scientists can rapidly model how microorganisms will respond to clinical interventions.

Xiaoli Li, a machine learning expert from A*STAR’s Institute for Infocomm Research (I2R), is part of the team that has created a novel technique capable of predicting the clinical efficacy of newly developed and repurposed drugs with unprecedented accuracy. They’ve named it GCNMDA, short for Graph Convolutional Network-based framework for predicting human Microbe-Drug Associations.

One of the challenges with existing computational frameworks is that they struggle to make sense of complex, multidimensional datasets. Microbial and drug databases, for instance, have intricate layers of relationships, redundancies and associations that are difficult to ‘teach’ machine learning networks. Integrating multiple biological data sources into a single heterogeneous network is another hurdle.
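To make the idea of a single heterogeneous network concrete, here is a minimal sketch in Python. It assumes three inputs that the article does not specify: a microbe-microbe similarity matrix, a drug-drug similarity matrix and a table of known microbe-drug associations. The function name and the toy numbers are hypothetical and are not taken from the GCNMDA study.

```python
import numpy as np


def build_heterogeneous_adjacency(microbe_sim, drug_sim, assoc):
    """Combine three data sources into one adjacency matrix.

    microbe_sim: (m, m) microbe-microbe similarity scores
    drug_sim:    (d, d) drug-drug similarity scores
    assoc:       (m, d) known microbe-drug associations (1 = known, 0 = unknown)
    """
    microbe_sim = np.asarray(microbe_sim, dtype=float)
    drug_sim = np.asarray(drug_sim, dtype=float)
    assoc = np.asarray(assoc, dtype=float)
    if assoc.shape != (microbe_sim.shape[0], drug_sim.shape[0]):
        raise ValueError("assoc must have shape (n_microbes, n_drugs)")
    # Microbes occupy the first m rows/columns, drugs the remaining d.
    return np.block([[microbe_sim, assoc],
                     [assoc.T, drug_sim]])


# Toy data (made up for illustration): 2 microbes, 3 drugs.
microbe_sim = [[1.0, 0.4],
               [0.4, 1.0]]
drug_sim = [[1.0, 0.7, 0.1],
            [0.7, 1.0, 0.2],
            [0.1, 0.2, 1.0]]
known_assoc = [[1, 0, 0],
               [0, 0, 1]]

network = build_heterogeneous_adjacency(microbe_sim, drug_sim, known_assoc)
print(network.shape)
```

Every microbe and every drug becomes a node in the same graph, so a model can follow links of all three kinds at once.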

GCNMDA is the first to successfully overcome these limitations, thanks to a powerful secret weapon: a Graph Convolutional Network with an embedded conditional random field (CRF) layer. “The Graph Convolutional Network can learn accurate microbe and drug representations, while CRF is a probabilistic graphical model which possesses powerful capabilities for modeling pairwise relationships between nodes, such as microbe-drug associations,” explained Li, the study’s co-corresponding author.

This addition helped the technique to independently recognize semantic information such as similarities between groups of microbes and drugs, while simultaneously making accurate guesses as to microbe-drug associations. GCNMDA’s predictions were so accurate that they significantly outperformed seven state-of-the-art computational systems.
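For readers curious about what a graph convolution does, the sketch below shows one generic graph convolutional layer followed by a simple pair-scoring step. It is an illustration under stated assumptions, not the GCNMDA implementation: the CRF layer is omitted, the weights are random rather than trained, the inner-product scoring is a stand-in, and all names and numbers are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: mix each node with its neighbors, then apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


def score_pairs(embeddings, n_microbes):
    """Score every microbe-drug pair with a sigmoid over embedding inner products."""
    microbes = embeddings[:n_microbes]
    drugs = embeddings[n_microbes:]
    return 1.0 / (1.0 + np.exp(-(microbes @ drugs.T)))


# Toy graph (made up): nodes 0-1 are microbes, nodes 2-4 are drugs.
adjacency = np.array([
    [0.0, 0.4, 1.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.7, 0.1],
    [0.0, 0.0, 0.7, 0.0, 0.2],
    [0.0, 1.0, 0.1, 0.2, 0.0],
])

rng = np.random.default_rng(0)
features = np.eye(5)               # one-hot node features (assumption)
weights = rng.normal(size=(5, 4))  # untrained weights, for illustration only

a_norm = normalize_adjacency(adjacency)
embeddings = gcn_layer(a_norm, features, weights)
scores = score_pairs(embeddings, n_microbes=2)
print(scores.shape)  # one score per microbe-drug pair
```

In a trained model, the weights would be fitted so that known associations receive high scores, and the highest-scoring unknown pairs would become candidates for follow-up.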

In one case study, the team ran data on the SARS-CoV-2 virus and a suite of potential COVID-19 antivirals through GCNMDA, generating a list of the top 40 pharmaceuticals likely to be effective against the disease, including some drugs previously verified to be successful in clinical studies. In another case study, GCNMDA accurately identified potential microbe-drug associations for two antibiotic drugs, ciprofloxacin and moxifloxacin.

In the future, this technology could radically transform how researchers develop countermeasures against global health threats. “We can use GCNMDA as a screening tool to narrow down the search space for candidate compounds, which can be developed as vaccines and drugs against drug-resistant microbes,” Li said.

To enrich the predictive capabilities of the system, the team is feeding GCNMDA larger training datasets encompassing even more biological parameters. They also plan to tap into large volumes of unlabeled data, which could potentially lead to better predictive models.

The A*STAR-affiliated researchers contributing to this research are from the Institute for Infocomm Research (I2R).



Long, Y., Wu, M., Kwoh, C.K., Luo, J., Li, X. Predicting human microbe-drug associations via graph convolutional network with conditional random field. Bioinformatics, btaa598 (2020).

About the Researcher

Xiaoli Li

Principal Scientist

Institute for Infocomm Research
Xiaoli Li is the Head of Machine Intellection and a principal scientist at A*STAR’s Institute for Infocomm Research (I2R). He is co-director of the KPMG-A*STAR joint lab and an adjunct full professor at Nanyang Technological University, Singapore. Li has published more than 200 papers (cited over 10,000 times), including a number of award-winning publications in the fields of data analytics and artificial intelligence. Li has led over ten research projects in collaboration with industry partners across a range of sectors. He actively serves as an organizer of and contributor to top global AI and data analytics conferences including AAAI, IJCAI, KDD, and ICDM.

This article was made for A*STAR Research by Wildtype Media Group