Highlights

In brief

Atari video games have helped researchers identify weaknesses in reinforcement learning models when it comes to real-world situations.

Expecting the unexpected from AI

17 Jun 2020

Researchers are simulating real-world complexity in machine learning models to ensure their safety before they are deployed in the wild.

When we think of artificial intelligence (AI) going rogue, prime examples from the movies include HAL 9000 from 2001: A Space Odyssey and Skynet from The Terminator, both computer systems that reacted to real-world problems in unexpected ways.

From industrial manufacturing to autonomous vehicles, machine learning models are becoming increasingly embedded in our lives. Researchers are thus exploring pre-emptive ways to avoid harm from unexpected decisions made by models deployed in real-world situations. One focus is reinforcement learning (RL), an area of machine learning in which a model learns by trial and error to choose actions that maximize a reward.

“While deep RL has indeed been very successful in achieving state-of-the-art performance in curated academic environments, it has yet to be thoroughly tested in the presence of real-world complexities,” said Abhishek Gupta, a Scientist at A*STAR's Singapore Institute of Manufacturing Technology (SIMTech) and one of the study’s senior authors.

The work, which was principally conducted by Nanyang Technological University (NTU) graduate student Xinghua Qu and jointly overseen by Gupta and A*STAR’s Chief AI Scientist Yew-Soon Ong, focused on the performance of vision-based AI, which is likely to be critical for the safe use of AI in applications such as autonomous vehicles.

Using six Atari video games, including the classic game of Pong, the group simulated visual perturbations by altering a small number of pixels in selected frames of the game environment. They then examined how well each algorithm performed in the perturbed environment by measuring the ‘accumulated reward,’ a barometer of how optimal an algorithm’s decisions are.
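To make the idea concrete, here is a minimal sketch of the evaluation described above: run the same episode with and without pixel perturbations and compare the accumulated reward. It is an illustration under stated assumptions, not the study's setup. The 8x8 frame, the brightness values, the hand-written `policy` and the `attack_every` parameter are all hypothetical stand-ins for an Atari game and a trained RL agent.

```python
import numpy as np

SIZE = 8     # hypothetical toy frame: SIZE x SIZE pixels with values in [0, 1]
BALL = 0.8   # hypothetical brightness of the single "ball" pixel in row 0

def make_frame(ball_col):
    """Build a blank frame with the ball drawn in the top row."""
    frame = np.zeros((SIZE, SIZE))
    frame[0, ball_col] = BALL
    return frame

def policy(frame):
    """Toy stand-in for a trained agent: move to the brightest column of row 0."""
    return int(np.argmax(frame[0]))

def perturb(frame, row, col, value):
    """Return a copy of the frame with exactly one pixel overwritten."""
    out = frame.copy()
    out[row, col] = value
    return out

def run_episode(ball_cols, attack_every=None):
    """Accumulated reward over one episode: +1 for each frame where the
    policy picks the ball's column. If attack_every is set, every
    attack_every-th frame has one pixel changed before the policy sees it."""
    total = 0.0
    for t, ball_col in enumerate(ball_cols):
        frame = make_frame(ball_col)
        if attack_every is not None and t % attack_every == 0:
            # Brighten the pixel next to the ball (wrapping at the edge).
            frame = perturb(frame, 0, (ball_col + 1) % SIZE, 1.0)
        action = policy(frame)
        total += 1.0 if action == ball_col else 0.0
    return total

rng = np.random.default_rng(0)
ball_cols = rng.integers(0, SIZE, size=100)

clean = run_episode(ball_cols)
attacked = run_episode(ball_cols, attack_every=4)
print(f"clean: {clean}, attacked: {attacked}")
```

In this toy setting the policy is fooled on every perturbed frame, so attacking one frame in four removes a quarter of the accumulated reward. A real agent would be probed the same way, with the game emulator supplying the frames and rewards.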

Stunningly, they found that a mere one-pixel change to input images was often enough to make the accumulated reward plummet for all four algorithms tested, including widely used ones such as Deep Q-Networks. These results indicate that although RL models thrive in familiar, standardized environments, they would be poorly equipped to handle highly variable environments such as roads and heavily populated areas, potentially to the detriment of safety.
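How might a single damaging pixel be found? The sketch below uses a plain exhaustive search over pixel positions and a few candidate values, stopping at the first change that alters the policy's chosen action. This is an assumption made for illustration and is not presented as the researchers' technique. The 4x4 frame, the left-versus-right `policy` and the function `find_one_pixel_attack` are hypothetical.

```python
import numpy as np

def policy(frame):
    """Toy stand-in for a trained agent: 0 = left, 1 = right.
    Picks the half of the frame with the larger total brightness."""
    half = frame.shape[1] // 2
    scores = [frame[:, :half].sum(), frame[:, half:].sum()]
    return int(np.argmax(scores))

def find_one_pixel_attack(frame, candidate_values=(0.0, 1.0)):
    """Return (row, col, value) for the first single-pixel change that alters
    the policy's action, or None if no such change exists."""
    original_action = policy(frame)
    for row in range(frame.shape[0]):
        for col in range(frame.shape[1]):
            for value in candidate_values:
                trial = frame.copy()
                trial[row, col] = value
                if policy(trial) != original_action:
                    return row, col, value
    return None

frame = np.zeros((4, 4))
frame[0, 0] = 0.6   # left half slightly brighter, so the clean action is 0
frame[0, 3] = 0.5

attack = find_one_pixel_attack(frame)
print("clean action:", policy(frame), "attack:", attack)
```

Exhaustive search is only practical for tiny frames like this one. On full-size game frames the number of candidate pixels and values grows quickly, which is one reason more efficient ways of generating perturbations matter.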

“Most of the work has focused on achieving highly accurate AI or deep learning models,” Gupta said. “However, this vulnerability needs to be considered before these AI are put into operational use, to ensure the integrity and reliability of AI deployment.”

The research team is now investigating more efficient techniques for generating adversarial perturbations in real-time RL applications. “This constitutes a critical step of knowing your enemies before defeating them,” Gupta said.

The A*STAR-affiliated researcher contributing to this research is from the Singapore Institute of Manufacturing Technology (SIMTech).

References

Qu, X., Sun, Z., Ong, Y.S., Wei, P., Gupta, A. Minimalistic Attacks: How Little it Takes to Fool Deep Reinforcement Learning Policies. IEEE Transactions on Cognitive and Developmental Systems (2020).

About the Researcher

Abhishek Gupta

Scientist

Singapore Institute of Manufacturing Technology

Abhishek Gupta is a Scientist in A*STAR’s Singapore Institute of Manufacturing Technology (SIMTech). He holds a PhD in Engineering Science from the University of Auckland, New Zealand. His current research is on developing algorithms at the intersection of optimization and machine learning, with particular application to cyber-physical production systems.

This article was made for A*STAR Research by Wildtype Media Group
