The announcement brings Apple into the fold of other major Silicon Valley giants growing their corporate machine learning research groups, including Microsoft, Facebook, and Google. While many would expect AI research from such well-funded groups to be widely publicized and available under open-source agreements, the fruit-themed toymaker has, until recently, been mostly secretive about its academic pursuits in the field.
On November 15th, the company published its first AI research paper, titled “Learning from Simulated and Unsupervised Images through Adversarial Training.” While most AI research centers on training with real-world data, the paper instead proposes a way of improving the quality of synthetic training images by having machines learn from unlabeled real images.
Researchers in the field are realizing that it is much more cost-effective to train machine learning models with synthetically generated imagery, since such images can be produced in large quantities on demand. Another advantage is that most real-world data sets require extensive annotation and labeling, while synthetic data sets are labeled automatically. However, the approach can also be risky because synthetic images often differ in quality from real-world visual data, so there is no guarantee that a model trained on them will perform as well on real images.
Generative Adversarial Networks
Apple wants to use a method of simulated and unsupervised (S+U) learning that is very similar to Generative Adversarial Networks (GANs). A GAN is a system of two neural networks that compete against each other in a zero-sum game. Instead of feeding the generating network random noise, as a standard GAN does, Apple's method feeds it simulated images. With this method, a simulated image gets sent through a “refiner,” and a “discriminator” then tries to tell the refined images apart from unlabeled real-world ones. The two networks play a minimax game: the discriminator tries to minimize its classification error, while the refiner tries to fool it without letting the refined image drift too far from the original synthetic one.
Example of minimax strategy (Source: arXiv.org)
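To make the tug-of-war between the refiner and the discriminator more concrete, here is a minimal sketch of the two losses in plain Python. It is an illustration under stated assumptions, not Apple's code: the function names, the `lam` weight, and the convention that the discriminator outputs the probability that an image is synthetic are all choices made for this example.

```python
import math

EPS = 1e-8  # small constant so we never take log(0)


def discriminator_loss(p_syn_refined, p_syn_real):
    """Loss the discriminator tries to minimize.

    Both arguments are lists of discriminator outputs, read here as
    P(image is synthetic) -- an assumed convention for this sketch.
    The loss is low when refined images score near 1 and real images near 0.
    """
    refined_term = -sum(math.log(p + EPS) for p in p_syn_refined) / len(p_syn_refined)
    real_term = -sum(math.log(1.0 - p + EPS) for p in p_syn_real) / len(p_syn_real)
    return refined_term + real_term


def refiner_loss(p_syn_refined, refined, synthetic, lam=0.1):
    """Loss the refiner tries to minimize.

    The adversarial term is low when the discriminator is fooled (scores
    refined images as real). The second term is the mean absolute pixel
    difference between refined and original synthetic images, weighted by
    the hypothetical parameter `lam`, so the refined image stays close to
    its source and the original label remains valid.
    """
    adversarial = -sum(math.log(1.0 - p + EPS) for p in p_syn_refined) / len(p_syn_refined)
    l1 = sum(abs(r - s) for r, s in zip(refined, synthetic)) / len(synthetic)
    return adversarial + lam * l1


# Toy usage: pixel values are flat lists of floats.
synthetic = [0.0, 0.5, 1.0, 0.5]
refined = [0.1, 0.5, 0.9, 0.5]
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(refiner_loss([0.9, 0.8], refined, synthetic))
```

In a real system both losses would be computed on network outputs and minimized in alternation with gradient descent; the sketch only shows what each side is scored on.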
GANs were first introduced by Ian Goodfellow and colleagues in 2014, and now Apple wants to build on that research with a few modifications. The company's program is called SimGAN, and it trains the discriminator on a history of refined images rather than only the most recent ones. The goal here is to improve the realism of synthetic images while retaining their original labeling information.
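One way such a history of refined images could be kept is a fixed-size buffer that mixes older refined images into each batch shown to the discriminator. The sketch below is only an illustration: the class name `ImageHistoryBuffer`, the `capacity` parameter, and the half-new, half-old split are assumptions made for this example, not details confirmed by the article.

```python
import random


class ImageHistoryBuffer:
    """Hypothetical buffer of previously refined images."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.images = []
        self.rng = random.Random(seed)

    def mix(self, new_batch):
        """Return a discriminator batch: half new images, half from history.

        If the buffer does not yet hold enough images, the new batch is
        returned unchanged. Either way, part of the new batch is stored
        afterwards so later calls can draw on it.
        """
        new_batch = list(new_batch)
        half = len(new_batch) // 2
        if half > 0 and len(self.images) >= half:
            old = self.rng.sample(self.images, half)
            batch = new_batch[:len(new_batch) - half] + old
        else:
            batch = new_batch
        # Store some of the new images; once full, overwrite random slots.
        for image in new_batch[:max(1, half)]:
            if len(self.images) < self.capacity:
                self.images.append(image)
            else:
                self.images[self.rng.randrange(self.capacity)] = image
        return batch


# Toy usage: strings stand in for refined images.
buffer = ImageHistoryBuffer(capacity=4, seed=0)
buffer.mix(["a1", "a2", "a3", "a4"])          # buffer is empty, batch unchanged
print(buffer.mix(["b1", "b2", "b3", "b4"]))   # two new images plus two older ones
```

The point of the design is that the discriminator keeps seeing artifacts the refiner produced earlier in training, so it does not forget them as the refiner changes.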