Predicting how chemical reactions unfold used to rely on trial and error at the lab bench. Computational modelling has transformed this, driving innovation in fields from pharmaceuticals to environmental engineering by reducing development time and costs while enhancing precision and efficiency.
Some reactions, particularly those involving liquids, remain exceptionally difficult to predict. Benjamin Chen and Xinglong Zhang, scientists at A*STAR’s Institute of High Performance Computing (IHPC), highlighted the complexity of predicting the behaviour of liquids in chemical reactions. “There is not one ‘best’ arrangement of liquids,” they explained. “An ensemble of arrangements may be present as the solvent molecules fluctuate over time.”
Because the number of possible arrangements is effectively unlimited, adequately sampling the representative structures in a liquid-phase reaction via molecular dynamics simulations is computationally expensive. For instance, quantum-mechanical calculations require about 10 minutes per time step, so a simulation of one million time steps adds up to roughly 10 million minutes of computing. “This translates to the entire calculation requiring 20 years,” said Chen and Zhang.
While implicit solvation models are faster and less expensive for simulating solvent interactions, they’re often inaccurate, missing critical details like hydrogen bonding and molecular disorder. To address these shortcomings, the team proposed using machine learning interatomic potentials (MLIPs). These models learn from existing data to swiftly and accurately simulate complex chemical systems at the atomic level.
“MLIPs, unlike quantum-mechanical calculations, do not have to solve Schrödinger’s equations and obtain the wavefunction of the system,” said Zhang and Chen, noting that this makes MLIPs up to 10,000 times faster, cutting the time needed for one million time steps from 20 years to 24 hours.
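The quoted figures can be checked with a few lines of arithmetic. The short Python sketch below uses only the numbers given in the article (10 minutes per quantum-mechanical time step, one million time steps, 24 hours for the MLIP run); the variable names are illustrative, and the per-step MLIP cost is merely what those numbers imply, not a reported benchmark.

```python
# Back-of-the-envelope check of the timings quoted in the article.
MINUTES_PER_QM_STEP = 10        # quoted cost of one quantum-mechanical time step
N_STEPS = 1_000_000             # quoted length of the simulation
MLIP_TOTAL_HOURS = 24           # quoted wall time for the same run with an MLIP

MINUTES_PER_YEAR = 60 * 24 * 365.25

# Total quantum-mechanical cost: 10 million minutes, or about 19 years,
# which the researchers round to 20 years.
qm_total_minutes = MINUTES_PER_QM_STEP * N_STEPS
qm_years = qm_total_minutes / MINUTES_PER_YEAR

# Speed-up implied by the 24-hour MLIP run: several thousand times,
# in line with the "up to 10,000 times faster" figure.
mlip_total_minutes = MLIP_TOTAL_HOURS * 60
speedup = qm_total_minutes / mlip_total_minutes

# Implied MLIP cost per time step, in seconds.
mlip_seconds_per_step = MLIP_TOTAL_HOURS * 3600 / N_STEPS

print(f"Quantum-mechanical run: {qm_years:.1f} years")
print(f"Implied speed-up: {speedup:,.0f}x")
print(f"Implied MLIP cost: {mlip_seconds_per_step:.4f} s per time step")
```

On these numbers, each MLIP time step takes under a tenth of a second, compared with 10 minutes for the quantum-mechanical calculation.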
The researchers tested different MLIPs to identify the most effective one for studying interactions between water and different materials. They validated the models’ accuracy by comparing the simulation results with traditional calculations and real-world data, focusing on bulk water and water-metal interfaces.
Their approach, which combined machine learning with active learning, proved successful in conducting detailed and realistic studies of liquid catalysts over extended time and length scales. Active learning continuously refines MLIPs with only the most relevant data, making simulations both faster and more accurate without needing vast initial datasets.
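The active-learning idea can be illustrated with a toy loop. The article does not describe how the team selects the "most relevant" data, so the Python sketch below swaps in a common textbook strategy, query by committee, and every name in it is hypothetical: a Lennard-Jones curve stands in for the expensive first-principles calculation, and a nearest-neighbour predictor stands in for the MLIP. It is a sketch under those assumptions, not the researchers' method.

```python
import random
import statistics


def reference_energy(r):
    """Stand-in for an expensive first-principles calculation (assumption:
    a Lennard-Jones pair energy in reduced units)."""
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)


def predict(train, r):
    """Toy surrogate model: return the energy of the closest labelled point."""
    closest = min(train, key=lambda point: abs(point[0] - r))
    return closest[1]


def committee_disagreement(train, r, rng, n_models=8):
    """Spread of predictions from models fitted on random subsets of the data.
    A large spread marks a configuration the current data describes poorly."""
    subset_size = max(2, (2 * len(train)) // 3)
    predictions = [
        predict(rng.sample(train, subset_size), r) for _ in range(n_models)
    ]
    return statistics.pstdev(predictions)


def active_learning(n_rounds=10, seed=0):
    """Grow the training set one point per round, always labelling the
    candidate on which the committee disagrees most."""
    rng = random.Random(seed)
    candidates = [0.95 + 0.01 * i for i in range(156)]  # separations 0.95 to 2.50
    labelled = {0, 75, 155}  # small initial dataset (indices into candidates)
    train = [(candidates[i], reference_energy(candidates[i])) for i in sorted(labelled)]

    for _ in range(n_rounds):
        pool = [i for i in range(len(candidates)) if i not in labelled]
        pick = max(
            pool, key=lambda i: committee_disagreement(train, candidates[i], rng)
        )
        # Only the selected configuration gets the expensive reference label.
        labelled.add(pick)
        train.append((candidates[pick], reference_energy(candidates[pick])))
    return train


training_set = active_learning(n_rounds=10)
print(f"Labelled {len(training_set)} of 156 candidate configurations")
```

The point of the design is that the expensive reference calculation runs only on the configurations the committee is least sure about, so the training set stays small; a real workflow would replace the toy surrogate with an actual MLIP and draw candidates from molecular dynamics trajectories.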
“We showed that our simulations are nearly as accurate as first-principles simulations, which are the current gold standard,” said Chen and Zhang.
The researchers believe MLIPs will allow the study of larger and more complex systems. Such advanced models will be crucial for accurately representing real-world catalysts in different environments, such as nanoconfined solvents or dynamically changing catalysts, and could lead to the development of more efficient chemical processes.
Chen, Zhang and colleagues are currently collaborating with researchers at the University of Alabama, US, to optimise the use and training of MLIPs further.
The A*STAR-affiliated researchers contributing to this research are from the Institute of High Performance Computing (IHPC).