Your identity is woven from the threads of those around you—truly knowing you involves exploring the dynamics of your family, colleagues and community ties. Likewise, accurately categorising cell types and the tissues they inhabit requires an appreciation of the complex networks surrounding them.
Spatial omics maps the distribution and interactions of DNA, RNA and proteins within specific tissue regions. This method adds a crucial spatial dimension to molecular data, revealing not only the presence of these molecules but also their relationships and roles within biological systems, unlocking deeper insights into health and disease.
However, many existing algorithms overlook the spatial relationships between cells revealed by spatial omics, leading to misclassification. Shyam Prabhakar, Senior Group Leader at the A*STAR Genome Institute of Singapore (A*STAR GIS), has focused on improving the accuracy of cell typing.
In collaboration with Kok Hao Chen, a fellow Group Leader at A*STAR GIS, and Hwee Kuan Lee, Deputy Director (Training and Talent) and Senior Principal Investigator at the A*STAR Bioinformatics Institute (A*STAR BII), Prabhakar developed BANKSY, an algorithm that integrates cell typing and tissue domain segmentation into a single, scalable framework for analysing spatial omics data.
The team, which included researchers from the National University of Singapore and Veranome Biosystems in the US, posited that considering a cell's own molecular profile together with the expression patterns of nearby cells (its microenvironment) could improve the accuracy of both cell classification and domain segmentation.
“The BANKSY study began with the idea that to define a cell type, we need to consider not only the properties of the cells themselves but also the characteristics of their neighbours,” Prabhakar explained. That idea led to a feature augmentation strategy that blends expression features from a cell and from its neighbours.
By adjusting the weight of neighbourhood contributions—essentially how nearby cells influence classification—the researchers found that giving more emphasis to these neighbours helped identify broader tissue domains, while reducing that weight allowed for a focus on distinguishing specific cell types.
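The feature augmentation and neighbourhood weighting described above can be illustrated with a minimal Python sketch. It is not the published BANKSY implementation: the function name `augment_with_neighbours`, the parameters `k` and `lam`, the plain unweighted mean over the k nearest neighbours, and the square-root scaling of the two feature blocks are all assumptions made for illustration.

```python
import numpy as np


def augment_with_neighbours(expr, coords, k=3, lam=0.2):
    """Append weighted neighbourhood-mean expression to each cell's own profile.

    expr   : (n_cells, n_genes) expression matrix
    coords : (n_cells, n_dims) spatial coordinates of the cells
    k      : number of nearest neighbours to average (hypothetical parameter)
    lam    : neighbourhood weight in [0, 1] (hypothetical parameter)
    """
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Brute-force pairwise squared distances; fine for a toy example only.
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a cell is not its own neighbour

    # Indices of the k nearest neighbours of every cell.
    nbr_idx = np.argsort(dist2, axis=1)[:, :k]

    # Mean expression of those neighbours: shape (n_cells, n_genes).
    nbr_mean = expr[nbr_idx].mean(axis=1)

    # Assumed weighting: own features scaled by sqrt(1 - lam),
    # neighbourhood features scaled by sqrt(lam).
    return np.hstack([np.sqrt(1.0 - lam) * expr, np.sqrt(lam) * nbr_mean])


# Toy data: 5 cells, 2 genes, random positions in a unit square.
rng = np.random.default_rng(0)
toy_expr = rng.poisson(2.0, size=(5, 2))
toy_coords = rng.random((5, 2))
augmented = augment_with_neighbours(toy_expr, toy_coords, k=2, lam=0.2)
```

In this sketch, a `lam` close to 0 leaves the augmented matrix dominated by each cell's own expression, which corresponds to the cell typing mode, while a larger `lam` shifts the emphasis towards the neighbourhood, which corresponds to the domain segmentation mode. The augmented matrix would then be passed to a standard clustering method.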
BANKSY employs a neighbourhood kernel in combination with a feature augmentation approach to analyse data from various spatial omics technologies, including RNA sequencing and protein imaging. Prabhakar recalled initial hurdles when submitting BANKSY for publication, as reviewers dismissed it as too simple, reflecting the misconception that complexity equates to quality. Nonetheless, the algorithm has demonstrated improved accuracy in clustering cell types and identifying tissue domains compared to existing methods.
With the ability to process millions of cells, BANKSY is ideal for large-scale spatial analyses in fields ranging from cancer to neurobiology. Prabhakar emphasised that this biologically inspired framework may set a new standard for spatial data analysis, enhancing our understanding of tissue organisation and cellular interactions.
The A*STAR-affiliated researchers contributing to this research are from the A*STAR Genome Institute of Singapore (A*STAR GIS) and A*STAR Bioinformatics Institute (A*STAR BII).