Have you ever wondered how facial recognition on a smartphone works with just a quick scan of its user’s face? The secret lies in a machine learning technique aimed at enabling models to perform tasks proficiently with a very limited amount of training data, known as few-shot learning (FSL).
“FSL empowers artificial intelligence (AI) models to leverage their existing knowledge to learn new concepts with just a few examples,” explained Ruohan Wang, a Senior Scientist at A*STAR’s Institute for Infocomm Research (I2R).
In FSL, the primary hurdles include accumulating knowledge within the model and applying it towards new tasks. To tackle both, many researchers turn to meta-learning, which involves training an AI model on a diverse collection of tasks to enhance its adaptability.
“Many researchers held a traditional view that AI models must directly learn how to reuse knowledge by being trained on many disparate tasks,” Wang noted. However, subsequent research revealed that focusing knowledge accumulation at a global level can significantly enhance an AI model's robustness.
“For instance, if you have scattered data from many different tasks, it would be advantageous to merge these tasks into a unified task and then train an AI model on this amalgamated data,” said Wang.

An example of a three-way, two-shot classification task used for few-shot learning. An algorithm is trained across multiple training tasks. In each training task, the algorithm is presented with a support set of six images: three classes of animals, with two examples provided for each class. The algorithm is then tasked with predicting which class of animal each image in a subsequent query set belongs to. Afterwards, the model’s ability to generalise to new, unseen classes is evaluated using a test task, in which the model is presented with three new classes that do not overlap with those used in the training tasks.
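To make the episodic setup described in the caption concrete, the sketch below shows how one such support/query task might be assembled from a labelled image collection. It is a minimal illustration only: the function and variable names (sample_episode, n_query and so on) are ours rather than the team's, and the query-set size is an arbitrary choice.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=3, k_shot=2, n_query=5, rng=random):
    """Assemble one few-shot episode (support set + query set).

    `dataset` is assumed to be a list of (image, class_label) pairs with at
    least k_shot + n_query examples per class; all names are illustrative.
    """
    by_class = defaultdict(list)
    for image, label in dataset:
        by_class[label].append(image)

    # Pick n_way classes, e.g. three animal classes for a three-way task.
    classes = rng.sample(list(by_class), n_way)

    support, query = [], []
    for local_label, cls in enumerate(classes):
        examples = rng.sample(by_class[cls], k_shot + n_query)
        # k_shot labelled examples per class form the support set ...
        support += [(img, local_label) for img in examples[:k_shot]]
        # ... and the remaining examples form the query set to be classified.
        query += [(img, local_label) for img in examples[k_shot:]]
    return support, query
```

Note that each episode uses its own local labels (0 to n_way − 1) that only make sense within that task; reconciling such local labels with a shared global label set is the difficulty that MeLa, discussed below, is designed to address.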
Task merging simplifies meta-learning into pre-training and decouples knowledge accumulation from how the model intends to transfer such knowledge. Empirically, the benefits of pre-training are widely recognised for FSL, yet the theoretical reasons behind its success remain fuzzy.
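Task merging can be pictured as collapsing all of those episodes into one ordinary supervised training run. The sketch below assumes the ideal case in which every example already carries a global label (the situation the following paragraphs relax); the helper names and the small network are illustrative assumptions, not the team's implementation.

```python
import torch
import torch.nn as nn

def merge_tasks(tasks):
    """Pool the data from many small tasks into one global dataset.

    `tasks` is assumed to be a list of lists of (feature_tensor, global_label)
    pairs; all names here are illustrative rather than taken from the paper.
    """
    merged = [pair for task in tasks for pair in task]
    features = torch.stack([x for x, _ in merged])
    labels = torch.tensor([y for _, y in merged])
    return features, labels

def pretrain(features, labels, n_classes, feat_dim, epochs=10, lr=1e-3):
    """Standard supervised pre-training on the merged data: no episodes,
    just one large classification problem over the global label set."""
    model = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                          nn.Linear(256, n_classes))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        opt.step()
    return model[:-1]  # drop the classifier head; keep the learned representation
```

The returned representation can then be adapted to new few-shot tasks, which is where the knowledge-transfer step comes in.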
Recognising this knowledge gap, Wang and researchers from University College London, UK, delved into the inner workings of pre-training, demonstrating that it is essentially a form of meta-learning that improves how quickly a model learns new tasks.
Despite pre-training's advantages, one of its practical limitations is that real-world datasets seldom come pre-labelled on a global scale, which makes the method challenging to implement effectively.
To tackle this, Wang’s team developed Meta-Label Learning (MeLa), a system where the model independently infers global labels from available tasks before undergoing pre-training. Specifically, the MeLa algorithm effectively identifies hidden global labels that align with local task specifications, while also grouping diverse task data based on their similarities.
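The published MeLa procedure is more involved than can be shown here, but its core idea, grouping local classes from different tasks that appear to describe the same underlying concept, can be sketched as a clustering step over class prototypes. Everything below (the frozen embed function, the use of k-means, the assumed number of global classes) is an illustrative assumption rather than the paper's algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

def infer_global_labels(tasks, embed, n_global_classes):
    """Assign a shared pseudo-global label to each local class across tasks.

    `tasks` is assumed to be a list of support sets [(image, local_label), ...]
    and `embed` a frozen feature extractor returning a NumPy vector; this is
    an illustrative clustering-based sketch, not the published MeLa procedure.
    """
    prototypes, owners = [], []
    for task_idx, support in enumerate(tasks):
        feats = {}
        for image, local_label in support:
            feats.setdefault(local_label, []).append(embed(image))
        for local_label, vecs in feats.items():
            # One prototype per local class: the mean embedding of its examples.
            prototypes.append(np.mean(vecs, axis=0))
            owners.append((task_idx, local_label))

    # Local classes whose prototypes fall in the same cluster are treated
    # as the same underlying global class.
    clusters = KMeans(n_clusters=n_global_classes, n_init=10).fit_predict(
        np.stack(prototypes))
    return {owner: int(c) for owner, c in zip(owners, clusters)}
```

With such pseudo-global labels in hand, the merged-dataset pre-training sketched earlier becomes possible even when the original tasks share no explicit global labelling.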
The team also employed data augmentation techniques to enhance the diversity and volume of training data for more robust meta-representations. In experiments, their approach outperformed other existing meta-learning models, which affirmed the team’s findings.
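The article does not specify which augmentations were used, but a typical image-augmentation pipeline of the kind referred to might look like the following; the 84×84 crop size is an arbitrary choice, common in few-shot image benchmarks.

```python
from torchvision import transforms

# Illustrative augmentation pipeline; the team's exact transforms are not detailed here.
augment = transforms.Compose([
    transforms.RandomResizedCrop(84),        # random crops resized to 84x84
    transforms.RandomHorizontalFlip(),       # mirror images half the time
    transforms.ColorJitter(0.4, 0.4, 0.4),   # perturb brightness, contrast, saturation
    transforms.ToTensor(),
])
```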
With these new insights into the mechanics of pre-training and how MeLa can facilitate it under less-than-ideal conditions, the team is now focusing on foundation models. These models, which are vast repositories of accumulated knowledge, present an opportunity to pioneer methods that transfer knowledge to a variety of AI applications.
The A*STAR-affiliated researchers contributing to this research are from the Institute for Infocomm Research (I2R).