In the newsroom, headlines are constructed to distil the gist of a story in one glance, appealing to readers with limited time on their hands. However, relying solely on headlines can lead to a partial or skewed understanding of events. A similar limitation appears in general-purpose large language models (LLMs), such as GPT-4, particularly when summarising vast troves of multilingual news articles from local sources.
“Like someone who only listens to the loudest voices in the room, LLMs often ignore crucial, nuanced details in favour of the most repeated statements,” said Longyin Zhang, a Research Scientist at the A*STAR Institute for Infocomm Research (A*STAR I2R). “They also struggle to maintain accuracy regarding local entities and cultural nuances, sometimes confusing timelines and making up facts based on outdated information.”
Zhang and colleagues designed CLUST-McMs (Multi-lingual, Cross-lingual and Multi-document Summarisation), a two-stage artificial intelligence (AI) pipeline that produces more accurate and context-aware summaries of multilingual regional news. In the first stage, CLUST-McMs dynamically categorises articles based on specific events—such as an election period or the passage of a new law—rather than the broad topical groupings typically used by general-purpose LLMs.
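The event-level grouping in the first stage can be imagined as clustering article embeddings so that reports about the same event, in any language, land in the same group. The sketch below is a minimal illustration of that idea using toy two-dimensional embeddings and a greedy similarity threshold; it is not the team's actual implementation, and the function names and threshold are assumptions.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_by_event(embeddings, threshold=0.8):
    """Greedy single-pass clustering: each article joins the first
    event cluster whose centroid it is similar enough to; otherwise
    it starts a new cluster."""
    clusters = []
    for i, emb in enumerate(embeddings):
        for cluster in clusters:
            if cosine(cluster["centroid"], emb) >= threshold:
                cluster["members"].append(i)
                # Update the centroid as a running mean.
                n = len(cluster["members"])
                cluster["centroid"] = [
                    c + (e - c) / n for c, e in zip(cluster["centroid"], emb)
                ]
                break
        else:
            clusters.append({"centroid": list(emb), "members": [i]})
    return clusters

# Toy embeddings: articles 0 and 1 cover the same event, article 2 does not.
embs = [(1.0, 0.0), (0.95, 0.1), (0.0, 1.0)]
groups = [c["members"] for c in cluster_by_event(embs)]
print(groups)  # → [[0, 1], [2]]
```

In practice, multilingual sentence embeddings would replace the toy vectors, so that a Thai and an English article about the same election can be grouped together.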
The second stage applies a data sharpening technique to guide the summarisation process. By balancing the volume and diversity of input information, the model filters out repetitive content and prioritises dense, information-rich sentences, helping to reduce bias in the final summary.
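A rough intuition for the data-sharpening stage — filtering repetitive content and keeping information-dense sentences — can be sketched as below. This is an illustrative stand-in only: it uses simple token-overlap deduplication and vocabulary size as a density proxy, whereas the thresholds, scoring, and helper names here are assumptions, not the published method.

```python
import re

def tokens(sentence):
    """Lowercased word tokens of a sentence, as a set."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def sharpen(sentences, dup_threshold=0.6, top_k=2):
    """Drop near-duplicate sentences, then keep the most
    information-dense ones (here: largest unique-token count)."""
    kept = []
    for s in sentences:
        # Keep a sentence only if it does not heavily overlap
        # with anything already kept.
        if all(jaccard(tokens(s), tokens(k)) < dup_threshold for k in kept):
            kept.append(s)
    # Rank the survivors by a simple density proxy.
    kept.sort(key=lambda s: len(tokens(s)), reverse=True)
    return kept[:top_k]

docs = [
    "Parliament passed the new transport law on Tuesday.",
    "The new transport law was passed by parliament on Tuesday.",
    "Commuters expect fare changes and revised bus routes next month.",
]
print(sharpen(docs))
```

The first two sentences restate the same fact, so only one survives, reducing the risk that the loudest, most-repeated statement dominates the summary.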
The team also introduced a localisation step to better capture cultural and contextual nuances, akin to the judgement of a local editor. “We train it through specific question-answering tasks to strictly cite facts and timestamps directly from the local source texts,” Zhang explained.
Using a specially curated dataset of news from Southeast Asia, the researchers reported that CLUST-McMs significantly outperformed GPT-4, effectively synthesising overlapping articles across multiple languages into a single, concise English-language summary. Across three evaluation metrics, their smaller, targeted model delivered more accurate coverage and stronger fidelity to the original sources, highlighting the value of smart data sharpening and localisation over relying solely on large, general-purpose models.
Moving forward, the team hopes to expand their work and localise multimodal models to understand regional news shared not just in written form, but also in audiovisual formats.
“The AI community needs to shift its focus from merely scaling up model sizes to making AI highly faithful to real-world facts and deeply culturally aware in localised contexts,” said Zhang. “Our goal is to mitigate the cultural biases inherently embedded in global models, ensuring the AI correctly interprets local visual cues, dialects and nuances.”
The A*STAR-affiliated researchers contributing to this research are from the A*STAR Institute for Infocomm Research (A*STAR I2R).