Navigating crowded public spaces at peak hours can feel like a frustrating maze, where every step forward is blocked by bustling, unyielding crowds. If this is already challenging for humans, imagine its magnitude for robots.
Predicting the future path or movement of objects, or trajectory forecasting, is thus a critical in-built mechanism for robots designed to operate autonomously in dynamic, real-world settings.
“Predicting accurately where pedestrians will move helps robots navigate crowded areas without colliding with humans, while maintaining their personal space,” explained Niraj Bhujel, a Research Scientist with A*STAR’s Institute for Infocomm Research (I2R). Additionally, robots equipped with trajectory forecasting can navigate more efficiently, planning paths that side-step obstacles to minimise delays.
Graph Convolutional Networks, or GCNs, have emerged as powerful machine learning tools for trajectory forecasting in complex, highly dynamic environments. In essence, they give robots spatial awareness, but conventional GCNs are known to lose accuracy when predicting the future behaviour of moving objects in unfamiliar settings.
Bhujel and I2R colleague Wei-Yun Yau hypothesised that breaking crowd interaction data down into spatial and temporal factors could help next-generation machine learning models forecast pedestrian trajectories more precisely.
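To make the idea of disentangling spatial and temporal factors concrete, the sketch below separates crowd data into two simple graph views: a spatial graph linking pedestrians who are near one another in a single frame, and temporal edges linking each pedestrian's positions across consecutive frames. This is an illustrative construction only, not the researchers' actual model; the function names and the distance-thresholded adjacency are assumptions for demonstration (the 8 m radius echoes the interaction range mentioned later in the article).

```python
import numpy as np

def build_spatial_adjacency(positions, radius=8.0):
    """Spatial view: connect pedestrians within `radius` metres of each other
    in one time frame.

    positions: (N, 2) array of pedestrian x, y coordinates at a single time step.
    Returns an (N, N) 0/1 adjacency matrix (hypothetical, distance-thresholded).
    """
    diff = positions[:, None, :] - positions[None, :, :]  # pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)                  # pairwise distances
    adj = (dist < radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops in the spatial graph
    return adj

def build_temporal_edges(num_frames):
    """Temporal view: link each pedestrian's node at frame t to frame t + 1."""
    return [(t, t + 1) for t in range(num_frames - 1)]
```

Keeping the two views separate lets a model reason about who influences whom (spatial) independently of how each person's motion evolves (temporal), which is the intuition behind the disentanglement hypothesis.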
Their work culminated in the Disentangled Graph Convolutional Network (DGCN), which features neural message passing, a way of sharing and processing information between different nodes in a network. When a node receives messages from its neighbours, it combines them with its own information to build a high-resolution picture of the robot’s surroundings.
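A single round of neural message passing can be sketched in a few lines: each node pools its neighbours' feature vectors into a message, then mixes that message with its own features through learned weight matrices. This is a generic, minimal illustration of the message-passing idea described above, not the DGCN itself; the mean aggregation and ReLU nonlinearity are common choices assumed here for clarity.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def message_passing_step(node_feats, adj, w_self, w_neigh):
    """One illustrative round of neural message passing.

    node_feats: (N, D) feature vectors, one per node (e.g. per pedestrian).
    adj:        (N, N) 0/1 adjacency matrix defining who is a neighbour.
    w_self, w_neigh: (D, D) learned weight matrices (random in practice
    before training; identity works for a quick sanity check).
    """
    deg = adj.sum(axis=1, keepdims=True)
    deg = np.maximum(deg, 1.0)           # isolated nodes: avoid divide-by-zero
    messages = adj @ node_feats / deg    # mean of each node's neighbour features
    # Combine own information with the aggregated neighbour message.
    return relu(node_feats @ w_self + messages @ w_neigh)
```

Stacking several such rounds lets information propagate across the whole crowd graph, so each node's features come to reflect not just its immediate neighbours but the wider scene.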
“Such combinations give the model a special lens that shows where humans are and how their actions change over space and time,” explained Bhujel, adding that the DGCN’s initial prediction is also repeatedly refined to improve the reliability of the final prediction.
This innovative approach paid off: validation data showed that the DGCN achieved superior accuracy across a range of prediction horizons compared to conventional GCNs. The team also found that the model could effectively account for the influence of pedestrians or vehicles within a radius of up to 8 m without any loss of performance.
Building on this momentum, Bhujel and team are now tackling the open question of how to computationally capture an individual’s movement intention within crowded places. Together, these advances have the potential to enhance the efficiency and safety of autonomous navigation systems, helping integrate robots into human-centric spaces.
The A*STAR researchers contributing to this research are from A*STAR’s Institute for Infocomm Research (I2R).