Which is the real photo of our Global Analytics and Cognitive Lead, and which is the output of an AI model called a GAN?

One of the photos above of our Global Analytics and Cognitive Lead, Costi Perricos, is not real and is the output of an AI model called a GAN. Whilst you think about which of the photos is the fake, read on to understand how this photo was created.
 

What makes GANs different?

Machine learning models are commonly created to make a prediction, ranging from when a car part may break to where the best location is to open a new restaurant. Whilst these two predictions don’t have anything in common (unless cars often break down in a drive-through!), the underlying principles driving these models are similar: given enough historical data, informative predictions can (usually) be made.

For example, if I wanted to train a model to classify whether a photo was of a cat or a dog, I would need to gather a large labelled training dataset of cat and dog pictures. This discriminative model could be built as a convolutional neural network, a type of AI model that excels at spotting visual patterns. Once the model has been trained, it can be shown an unseen cat or dog photo and make a prediction.

But what if the desired outcome wasn’t a predictive model, but the creation of new data that resembles the training data? This is the domain of Generative Adversarial Networks.

Generative Adversarial Networks (GANs) also require training data; however, we only require examples of the desired category (in our example, cats). The model is built to accept a series of numbers as an input, and once supplied with these numbers it will output a novel cat image, one which doesn’t exist in the real world.

This exact model already exists, and the results are stunningly good (thiscatdoesnotexist.com). You may not have realised, but all of the cat pictures in this article came from this model and do not actually exist!
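
To make the idea of "numbers in, image out" concrete, here is a minimal sketch in plain Python. It is not how the cat model works internally: the names make_toy_generator, LATENT_SIZE and IMAGE_SIZE are made up for illustration, and the weights are random rather than trained, so the output is noise rather than a cat. It only shows the shape of the interface.

```python
import math
import random

LATENT_SIZE = 4   # how many input numbers the generator accepts (assumed)
IMAGE_SIZE = 8    # the output is an 8x8 greyscale grid (assumed)


def make_toy_generator(seed=0):
    """Build an untrained stand-in generator with fixed random weights."""
    rng = random.Random(seed)
    # One weight per (pixel, input number) pair.
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(LATENT_SIZE)]
        for _ in range(IMAGE_SIZE * IMAGE_SIZE)
    ]

    def generate(latent):
        if len(latent) != LATENT_SIZE:
            raise ValueError(f"expected {LATENT_SIZE} numbers, got {len(latent)}")
        # Each pixel is a weighted sum of the inputs, squashed into [-1, 1].
        pixels = [
            math.tanh(sum(w * z for w, z in zip(row, latent)))
            for row in weights
        ]
        # Reshape the flat pixel list into the rows of the image.
        return [
            pixels[r * IMAGE_SIZE:(r + 1) * IMAGE_SIZE]
            for r in range(IMAGE_SIZE)
        ]

    return generate


generate = make_toy_generator()
rng = random.Random(42)
latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_SIZE)]
image = generate(latent)
print(len(image), "x", len(image[0]), "image from", len(latent), "numbers")
```

A real generator has the same contract, but with far more weights, all of which are learned during training rather than drawn at random.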
 

How are GANs trained?

Training a GAN actually involves training two models, not one. The first model is the generator (the G in GAN). Using a set of numerical inputs, the generator creates a new image, one that has not existed before. This can be likened to a money forger, who attempts to make fake money that looks like real money.

The second model is the discriminator, whose role is to differentiate between the training (real) images, and the images created by the generator. This can be likened to a detective, who attempts to differentiate between what is real and what is fake.

If these two models are trained against each other at the same time, an arms race results, where the generator creates ever more lifelike images and the discriminator becomes better at spotting fakes. This adversarial relationship is the A in GAN, and enables the production of incredibly realistic outputs.
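
The forger and detective loop can be sketched with a deliberately tiny example in plain Python. Everything here is an assumption made for illustration: the "real" data is a single number drawn around 4.0 rather than an image, the generator is a straight line (fake = a * z + b), the discriminator is a single logistic unit, and the function names are hypothetical. Real GANs use deep neural networks and a framework that computes the gradients automatically.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and fakes score near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator is fooled into scoring a fake near 1."""
    return -math.log(d_fake)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * z + b
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)   # "real" data (assumed distribution)
        z = rng.gauss(0.0, 1.0)      # numerical input to the generator
        fake = a * z + b

        # Detective's turn: one gradient step down discriminator_loss.
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w -= lr * (-(1.0 - d_real) * real + d_fake * fake)
        c -= lr * (-(1.0 - d_real) + d_fake)

        # Forger's turn: one gradient step down generator_loss.
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake
    return a, b, w, c


a, b, w, c = train_toy_gan()
print("generator offset b:", b, "(real data is centred on 4.0)")
```

The two updates alternate on every step, which is the arms race in miniature. Toy setups like this one can oscillate rather than settle, so the printed value should be read as indicative only.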

Not only are the results from GANs visually realistic, but since the generator produces an image from a series of numbers, the generation of the pictures is somewhat controllable. The video below from NVIDIA shows this neatly: as the input numbers are varied gradually, the output image changes smoothly.

https://www.youtube.com/watch?v=6E1_dgYlifc [1]
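
One simple way to vary the input numbers gradually is to blend between two sets of inputs in small steps. The sketch below assumes plain linear blending and a hypothetical helper called interpolate; whether the video uses exactly this scheme is not something this post confirms. Each vector in the returned path would be fed to a trained generator to produce one frame.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` input vectors evenly spaced from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("input vectors must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)   # 0.0 at the start, 1.0 at the end
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


frames = interpolate([0.0, 0.0], [1.0, 2.0], steps=3)
print(frames)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0]]
```

Because neighbouring input vectors differ only slightly, a well-trained generator tends to produce images that differ only slightly too, which is what makes the transition look smooth.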
 

GANs and me - the good, the bad, and the outright fake

Machine learning models are dependent on large, high-quality datasets. These datasets are expensive and time-consuming to curate, may not be perfectly designed for the intended model, and could contain sensitive real data. GANs have the potential to address some of these issues through synthetic data, where a GAN creates a large synthetic dataset from a smaller real dataset.

As a data scientist, this excites me, as GANs could make larger training datasets more cost-effective to produce, enabling faster AI development. Whilst GANs are not a silver bullet for creating machine learning datasets (after all, a GAN requires a dataset of its own to train on), I can see them becoming a useful tool for accelerating many projects in the future. Additionally, it should become easier to train models on more sensitive data (such as medical records), as the output of the GAN is less sensitive due to it not actually being real! I can see this having huge societal benefits, as many more data scientists (including myself) could develop models to detect various health conditions from “fake” medical records, lowering the barrier to entry as actual patient data is not used.

However, the creativity of GANs does raise societal questions. In a short span of time, stunningly realistic photos of human faces have become possible, effectively creating unlimited free photos of human faces on demand. The question is not whether GANs will disrupt the modelling industry, but whether we let them. Many other creative industries could face similar competition if comparable computing time is devoted to them, and society should consider these questions sooner rather than later.

Lastly, we should be concerned about fakery, as after all GANs are optimised to be great forgers. Of great concern are deepfakes, where a generative model such as a GAN learns a person’s face and can then use this knowledge to swap the face in an existing image or video. This could be used maliciously, for example to alter criminal evidence so that it places someone at the scene of a crime, or to generate fake news. Whilst most AI developments originate from a desire to make a positive impact, consideration should always be given to whether the technology could be misused.

https://en.wikipedia.org/wiki/File:Deepfake_example.gif [2]

What does the future hold?

As you have seen in this blog, even though the technology is relatively new, GANs can be used to generate incredibly realistic outputs. However, each trained GAN is currently quite specialised, as it is only able to create data in the domain on which it was trained.

In the future, I anticipate that GANs will not only produce more realistic outputs but also become more generalist, able to produce data from multiple domains. However, even if this is realised, I don’t think AI will be replacing our Global Analytics and Cognitive Lead any time soon.

__________________________________________________________________________

1 NVIDIA StyleGAN2
2 Man of Steel produced by DC Entertainment and Legendary Pictures, distributed by Warner Bros. Pictures. Modification done by Reddit user "derpfakes"


Image References

- Cat Pictures - thiscatdoesnotexist.com
- GAN improvements of faces over time - Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation, 2018.
- StyleGAN2 video - https://www.youtube.com/watch?v=6E1_dgYlifc (credit: NVIDIA StyleGAN2)
- Deepfake example - https://en.wikipedia.org/wiki/File:Deepfake_example.gif (credit: Man of Steel produced by DC Entertainment and Legendary Pictures, distributed by Warner Bros. Pictures. Modification done by Reddit user "derpfakes".)
