Article

Frown, step back, wrinkle and sigh … for DALL·E 2?

Introduction

Will Durant’s saying, “Every science begins as philosophy and ends as art”, fits DALL·E very well. Just over a year after the launch of DALL·E, the system’s advancement into DALL·E 2, launched in April 2022, is nothing short of a brilliant conclusion to what was so recently just a hypothesis. In September 2022, access to DALL·E 2 was opened to everyone, and the occasion deserves a moment of appreciation for the incredible advancement of this AI system.

DALL·E 2 is the second iteration of OpenAI’s text-to-image AI art generator, which is based on machine learning. Drawing on its training on a database of 650 million captioned images, the generative tool can create images from scratch or edit existing ones. DALL·E 2 offers many more features than its predecessor, which took a long time to produce low-quality images.

The main reason for the hype surrounding DALL·E 2 is that it might just be the most advanced AI image generator on the market. There are legitimate concerns about the tool but, so far, the art and design community has primarily been exploring ways to take advantage of it.


The system & its shiny new features

Put very simply, DALL·E 2 creates original and realistic images from text descriptions. The complexity of the system lies in its ability to combine various concepts, attributes, and styles. DALL·E could also do that; where DALL·E 2 advances is in its ability to enhance and edit images. DALL·E 2 has many features, including:

  • “outpainting” which allows the system to extend the borders of an uploaded image, making use of image synthesis to get it as close as possible to a realistic extension of the existing image
  • “inpainting” to fill in or replace a specific part of an uploaded image with AI-generated imagery. It can add or remove specific items while taking into consideration shadows, textures, reflections, and other features
  • DALL·E 2 can also merge several photos into one image. The system blends the styles of the images to generate a visual bridge between them

    Source, OpenAI

 

  • diffusion is a technique that allows DALL·E 2 to interpret associations between text and images. It is a process in which the system starts by creating a pattern of random dots and then progressively alters the pattern as it recognises the specifics from the prompt. Thanks to this process, DALL·E 2 can visually represent almost any text prompt, in any required style
  • DALL·E 2 creates almost photorealistic images with about four times the resolution of DALL·E
  • the AI generator is also able to imitate the styles of different artists

    Source, OpenAI

  • DALL·E 2 can add to an existing image to create new compositions or create variations of an existing image

    Source, OpenAI
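
The diffusion process described in the list above can be illustrated with a toy sketch. This is not DALL·E 2’s model: a real diffusion system relies on a trained neural network, conditioned on the text prompt, to decide what noise to remove at each step. Here the function names `toy_denoise` and `mean_abs_error`, the `target` list standing in for “what the prompt describes”, and the simple blending schedule are all assumptions made purely for illustration.

```python
import random


def mean_abs_error(image, target):
    """Average per-pixel distance between the current image and the target."""
    return sum(abs(x - t) for x, t in zip(image, target)) / len(target)


def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and move it toward `target` in small steps.

    Illustration only: `target` is a hypothetical stand-in for the image a
    trained, prompt-conditioned network would steer toward. A real diffusion
    model never sees the finished image in advance.
    """
    rng = random.Random(seed)
    # Step 0: the "pattern of random dots" (one Gaussian value per pixel).
    image = [rng.gauss(0.0, 1.0) for _ in target]
    errors = [mean_abs_error(image, target)]
    for step in range(steps):
        # Close a growing share of the remaining gap; the last step closes it all.
        blend = 1.0 / (steps - step)
        image = [x + blend * (t - x) for x, t in zip(image, target)]
        errors.append(mean_abs_error(image, target))
    return image, errors


if __name__ == "__main__":
    pixels = [0.0, 0.5, 1.0, 0.25]  # a made-up four-pixel "image"
    result, errors = toy_denoise(pixels, steps=20, seed=1)
    print("error at start:", round(errors[0], 3))
    print("error at end:  ", round(errors[-1], 3))
```

Running the script prints the distance from the target before and after the loop, which shows the pattern of random values being altered progressively rather than in a single jump.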


The many and varied uses of DALL·E 2

All DALL·E 2 generated images are owned by OpenAI, but the company grants users exclusive rights to reproduce and display their images, even with the colour swatch watermark in the bottom right corner removed, provided they comply with the content policy. OpenAI has established a credit-based pricing model in which each user is given a certain number of monthly credits that can be topped up by payment. Open access to DALL·E 2 means that the creative community can now try out their ideas with one of the most advanced text-to-image generators and potentially speed up their workflows.

DALL·E 2 can create images in any artistic style from text prompts, which opens up a wide variety of uses, including advertising and marketing. Brands such as Nestle and Heinz are already experimenting with the tool for these purposes. For example, Heinz launched a “Draw Ketchup” campaign for which it had DALL·E 2 generate images of ketchup bottles using terms such as “ketchup”, “ketchup art” and “ketchup in space” (figure 4), and then curated and shared consumers’ prompts on its social channels.

Source, OpenAI

Architects are also making extensive use of DALL·E 2 to visualise their ideas and work. The AI generator is being introduced to create backgrounds for games. Plastic surgeons are also dipping their toes in, using the system to envisage what their work could look like. So far, however, it is mainly being used, rightly, to amplify creative potential through artistic experimentation and to generate and test ideas.


Concerns and defences

OpenAI has explained that DALL·E 2’s training data was filtered to remove violent, hateful, or sexual content, limiting the system’s exposure to such content and its ability to generate it. DALL·E 2 also has restrictions to ensure it cannot be used to create images that show self-harm or illegal acts, and creating images of public figures is prohibited. The company has also stated that it has improved its filters to reject attempts to generate such content, which also violates its content policy, and that it is continuing to improve how it deals with misuse through detection and response techniques. It uses automated and human monitoring systems to guard against prohibited content and misuse. Beyond these measures and claims, OpenAI has been transparent enough to recognise that there is more work to be done in this area.

However, OpenAI recently removed the ban on uploading and editing real human faces. DALL·E 2 can produce photorealistic images, and the removal of the ban opens up a significant risk that people’s images could be edited without their permission to create deepfakes, commit fraud or carry out other criminal activity.

OpenAI has given users usage rights over the images they create with DALL·E, including the right to sell and reprint them and to use them on merchandise. This brings to light concerns about the copyright implications of training an AI model on existing images. For example, Getty Images has banned AI-generated content from its library because of unclear copyright. Some creators are also concerned that entering simple text in a prompt box could eventually eliminate the jobs of designers and illustrators.

OpenAI has stated that learning from real-world use is an important part of developing and deploying trustworthy AI, and of responsibly scaling a powerful and complex system. It has therefore been careful to open DALL·E 2 slowly as it learned more about the technology’s capabilities and limitations and gained confidence in its safety systems. The company continues to work on improving the system, for example through the recent change to generate images that reflect the world’s diverse population when race or gender is not specified in the text prompt.


Are we impressed?

DALL·E 2 currently has over 1.5 million users, including artists, creative directors, authors and architects, who generate over 2 million images every day.

A popular discussion about the images created by DALL·E 2 is whether they can, or should, be considered art. There is a solid argument that, because the artist is a machine, the work lacks inherent depth and is not drawn from experiences or feelings. In addition, the AI generator draws on a database of reference material to reproduce from, which brings the originality of the work into question. The other side of the argument is that the system is not generating images on its own: the idea, concept or composition comes from a human through the prompts. So maybe we could consider the human the artist, and the system just the generator, comparable to other tools that have long been in use.

DALL·E 2 is going to be a topic of discussion, both positive, for the opportunities it opens and the efficiency it offers, and negative, for its harmful potential, the loss of certain skills or jobs, and issues such as copyright. If guided by someone with sufficient artistic sense, DALL·E 2 can offer a lot. The AI system is also creating a new opportunity: the role of AI prompt writer. OpenAI is currently working on an Application Programming Interface (API) for prompts that it promises to release more widely and, in the meantime, prompts for specific styles or images are being sold on various platforms.

With the wider release of DALL·E 2, and the eagerness to use it to amplify creative potential, the technology is clearly here to stay. With time and further technological advancement, there will be laws and policies to support the positive use of this AI system and of AI more widely. In the meantime, it seems likely that you will one day walk through an art exhibition and find yourself appreciating an AI-generated artwork. If you frown, step back, wrinkle and sigh, like Richard Gilmore from Gilmore Girls, in appreciation of an art piece, you might also be impressed by the fact that you could not tell it was the work of a machine, and that it can make you feel such emotion.

-------------------------------------------------------------------------------------------------

References

1. https://arstechnica.com/information-technology/2022/09/openai-image-generator-dall-e-now-available-without-waitlist/

2. https://www.creativebloq.com/news/dalle2-access-open-to-all

3. https://www.creativebloq.com/news/weirdest-ai-art-from-dall-e-2
 

 
