If you were shown two photos side by side, could you tell which one was generated by artificial intelligence (AI) and which was real? A few years ago, that would have been easy. Today, much less so. Deep learning models—a subtype of AI—have become remarkably adept at creating realistic images by repeatedly playing, and learning from, this game of spotting the fake.
This training method is called a Generative Adversarial Network (GAN). It involves two neural networks locked in competition: a generator and a discriminator. “The generator tries to create realistic images, while the discriminator tries to spot the fakes,” explained Aye Phyu Phyu Aung, a Scientist at the A*STAR Institute for Infocomm Research (A*STAR I2R). “They keep improving their tactics until neither can easily win.”
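The two-player game Aung describes can be sketched in code. The toy example below is purely illustrative, not the team's model: a one-parameter generator tries to match one-dimensional data drawn from a normal distribution, while a logistic discriminator tries to tell real samples from generated ones. The parameter names and hand-derived gradient updates are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def train_toy_gan(steps=3000, batch=64, lr=0.05, seed=0):
    """Toy 1-D GAN: the generator g(z) = mu + sigma*z tries to mimic
    real data ~ N(4, 1); the discriminator d(x) = sigmoid(w*x + c)
    tries to spot the fakes. Both take alternating gradient steps."""
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 1.0   # generator parameters
    w, c = 0.1, 0.0        # discriminator parameters
    for _ in range(steps):
        real = rng.normal(4.0, 1.0, batch)
        z = rng.normal(0.0, 1.0, batch)
        fake = mu + sigma * z

        # Discriminator step: ascend log d(real) + log(1 - d(fake))
        s_r = sigmoid(w * real + c)
        s_f = sigmoid(w * fake + c)
        w += lr * np.mean((1 - s_r) * real - s_f * fake)
        c += lr * np.mean((1 - s_r) - s_f)

        # Generator step: ascend log d(fake) (non-saturating loss)
        s_f = sigmoid(w * fake + c)
        mu += lr * np.mean((1 - s_f) * w)
        sigma += lr * np.mean((1 - s_f) * w * z)
    return mu, sigma
```

Alternating the two updates pushes the generator's outputs toward the real data, which is the stand-off Aung alludes to: once the two distributions match, the discriminator can do no better than guessing.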
However, GANs can suffer from mode collapse, where the generator and discriminator become trapped in a narrow set of strategies. The result is subpar, repetitive outputs that make for an ineffective training regime for deep learning models.
To mitigate this, Aung and A*STAR I2R colleagues, including Senior Principal Scientist Xiaoli Li and Senior Scientist J. Senthilnath, teamed up with collaborators from Nanyang Technological University, Singapore; Singapore Management University; KTH Royal Institute of Technology, Sweden; and University of Nebraska–Lincoln, US. As generator/discriminator pairs act like opponents in a game, the researchers believed that adopting game theory principles could be the key to improving GANs.
One such concept, the Double Oracle (DO) algorithm, starts with a smaller version of the fake-spotting game instead of determining the best strategies for the whole game from the get-go. “DO solves a small restricted game, asks each player’s best response to find a better strategy, adds those strategies, and repeats until no improvement is possible,” said Aung.
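The loop Aung describes can be sketched on a toy zero-sum matrix game. In the sketch below, the restricted game is solved with regret matching (a standard equilibrium-finding method chosen here for simplicity; the paper's actual solver and games differ), and the best-response "oracles" are simple argmax/argmin lookups over the full payoff matrix:

```python
import numpy as np

def regret_matching(A, iters=20000):
    """Approximate a mixed equilibrium of the zero-sum game with payoff
    matrix A (row player maximises) via regret matching; the averaged
    strategies converge to an equilibrium."""
    m, n = A.shape
    reg_r, reg_c = np.zeros(m), np.zeros(n)
    sum_p, sum_q = np.zeros(m), np.zeros(n)
    p, q = np.full(m, 1 / m), np.full(n, 1 / n)
    for _ in range(iters):
        sum_p += p
        sum_q += q
        u_r = A @ q                    # row payoffs vs current q
        u_c = -(p @ A)                 # column payoffs (column minimises)
        reg_r += u_r - p @ u_r
        reg_c += u_c - q @ u_c
        pos_r, pos_c = np.maximum(reg_r, 0), np.maximum(reg_c, 0)
        p = pos_r / pos_r.sum() if pos_r.sum() > 0 else np.full(m, 1 / m)
        q = pos_c / pos_c.sum() if pos_c.sum() > 0 else np.full(n, 1 / n)
    return sum_p / sum_p.sum(), sum_q / sum_q.sum()

def double_oracle(A, max_iters=50):
    """Double Oracle: solve a small restricted game, add each player's
    best response to it, and repeat until no improvement is possible."""
    R, C = [0], [0]                    # restricted strategy sets
    for _ in range(max_iters):
        p_r, q_r = regret_matching(A[np.ix_(R, C)])
        p = np.zeros(A.shape[0]); p[R] = p_r   # lift to full game
        q = np.zeros(A.shape[1]); q[C] = q_r
        br_row = int(np.argmax(A @ q))         # row best response
        br_col = int(np.argmin(p @ A))         # column best response
        if br_row in R and br_col in C:
            return p, q                # neither player can improve
        if br_row not in R: R.append(br_row)
        if br_col not in C: C.append(br_col)
    return p, q
```

On rock-paper-scissors, for instance, the restricted sets grow from a single strategy each until they cover the whole game, and the loop terminates at the uniform equilibrium.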
The team further complemented DO with Neural Architecture Search (NAS), melding them into a framework dubbed DONAS. Scouring through a variety of player architectures, NAS identifies those that best match the optimal strategies determined by DO—much like selecting athletes with skillsets and playstyles that align with a coach’s tactical vision.
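NAS comes in many flavours; the simplest is a random search over a space of candidate architectures, each scored by a fitness function. In DONAS that fitness would reflect how well an architecture realises the strategies chosen by DO, but the search space and scoring function below are hypothetical placeholders for illustration only:

```python
import random

# Hypothetical search space of architecture choices.
SPACE = {
    "depth": [2, 4, 8],
    "width": [32, 64, 128],
    "activation": ["relu", "tanh"],
}

def score(arch):
    """Stand-in fitness: a real NAS would train or evaluate the
    candidate network here. This toy score merely favours capacity."""
    return arch["depth"] * 0.1 + arch["width"] * 0.01

def random_search(n_trials=20, seed=0):
    """Sample random architectures and keep the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = {k: rng.choice(v) for k, v in SPACE.items()}
        s = score(arch)
        if s > best_score:
            best, best_score = arch, s
    return best
```

Practical NAS methods are far more sample-efficient than random search, but the selection principle is the same: candidates are ranked against a target criterion, much as the coach in the analogy ranks athletes against a tactical vision.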
Testing revealed that DONAS effectively enhanced GAN’s performance, making it more robust against mode collapse. “We are able to get vastly different models, generating samples of diverse features and patterns,” said Aung. “Moreover, the trained models could create realistic images resembling those in a given dataset, outperforming other GAN approaches across several benchmarks.”
The researchers also observed similar improvements when they applied DONAS to another framework, which uses classifier/attacker pairs rather than generator/discriminator pairs to analyse imaging datasets. Aung and the team have since continued to develop more robust and effective AI training frameworks, including a recently patented GAN-based module.
The A*STAR-affiliated researchers contributing to this research are from the A*STAR Institute for Infocomm Research (A*STAR I2R).