Instead of analyzing images in all their complexity, Deep Negative Correlation Learning (DNCL) takes a 'divide and conquer' approach to machine vision.

Machine learning gets a new syllabus

4 Nov 2020

A new deep learning method increases the accuracy and range of applications for computer vision platforms.

It takes just a split second for you to count how many people are in the elevator because human neural networks make the process of recognizing, processing and interpreting information based on visual cues seem effortless. Unsurprisingly, it is much more tedious for computers to do the same, and the science of computer vision is so much more than simply plugging a camera into a computer.

Take, for example, the task of counting the number of people at a park, based on live-streamed video footage. Computational models serve as the analytical powerhouse, allowing the computer to make predictions based on the relationship between dependent variables (in this case, the number of people) and independent variables (images of the park). While current computer vision platforms can accurately count how many joggers there are on a track, problems crop up when they have to count people sitting close together, or when some people are closer to the camera than others.

“Crowd counting and age estimation are challenging because they need the machine to have a high-level global understanding of the input images,” said study first author Le Zhang, a Scientist at A*STAR’s Institute for Infocomm Research (I2R). “For crowd counting, significant hurdles occur due to occlusions, scale variations and diverse crowd distributions. As for age estimation, one major difficulty is that different people age in different ways.”

To better ‘teach’ computers to accurately identify and classify objects from input images, Zhang and an international team of researchers have come up with a new computer vision training regime called Deep Negative Correlation Learning, or DNCL. This method first divides large training tasks into bite-sized sub-problems. Then, unlike earlier approaches, DNCL trains the system to learn a large pool of regression relationships at once.

The researchers validated the system in a range of diverse and challenging real-world applications with exciting results. “We report four real-world applications in the paper: crowd counting, age estimation, image super-resolution and apparent personality analysis,” said Zhang. “Our method also inspires some interesting follow-up studies for low-level computer vision tasks.”

As the authors describe it, their ‘divide and conquer’ approach is a major gain in efficiency: it mimics an ensemble-learning system without increasing the number of parameters, and yields superior results such as super-resolution images with sharper edges.
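The article does not spell out the training objective, so the following is only a rough sketch of the general idea behind negative correlation learning. It assumes the classic penalty from the ensemble-learning literature, in which each regressor is rewarded for accuracy and for disagreeing with the pool's average; it is not the authors' deep-network implementation. The function name `ncl_losses`, the weight `lam` and the sample numbers are hypothetical.

```python
from statistics import mean


def ncl_losses(predictions, target, lam=0.5):
    """Per-regressor negative correlation learning losses for one sample.

    predictions: outputs of the regressors in the pool (hypothetical values).
    target: ground-truth value, e.g. the true head count in an image.
    lam: assumed trade-off weight between accuracy and diversity.
    """
    ensemble = mean(predictions)  # the pool's combined answer
    losses = []
    for f_k in predictions:
        # Squared error of this regressor against the ground truth.
        accuracy_term = 0.5 * (f_k - target) ** 2
        # Negative when f_k strays from the ensemble mean, so
        # disagreement among regressors lowers the loss.
        diversity_term = -lam * (f_k - ensemble) ** 2
        losses.append(accuracy_term + diversity_term)
    return ensemble, losses


# Three hypothetical regressors estimating a crowd of 10 people.
ensemble, losses = ncl_losses([9.0, 11.0, 13.0], target=10.0)
print(ensemble, losses)
```

In this toy setting the regressors are plain numbers. According to the article, DNCL obtains the ensemble-like behavior without increasing the number of parameters, a detail this sketch does not attempt to reproduce.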

“We are now generalizing this work to the classification scenario where the output targets the discrete category labels,” added Zhang, with future work set to tackle even more challenging computer vision applications.

The A*STAR-affiliated researchers contributing to this research are from the Institute for Infocomm Research (I2R) and the Institute of High Performance Computing (IHPC).

References

Zhang, L., Shi, Z., Cheng, M.-M., Liu, Y., Bian, J.-W., et al. Nonlinear regression via deep negative correlation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019).

About the Researcher

Le Zhang

Scientist

Institute for Infocomm Research
Le Zhang is a deep learning and computer vision scientist based at A*STAR’s Institute for Infocomm Research (I2R). After completing his BEng degree at the University of Electronic Science and Technology of China, Zhang was awarded an MSc and PhD from Nanyang Technological University, Singapore. Zhang currently serves on the organizing committees of several international conferences in the fields of artificial intelligence and computer vision. He is also on the editorial boards of computer science publications, including IET Biometrics, Pattern Recognition and Neurocomputing.

This article was made for A*STAR Research by Wildtype Media Group