AI illustrator draws imaginative pictures to go with text captions
A neural network uses text captions to create outlandish images – such as armchairs in the shape of avocados – demonstrating it understands how language shapes visual culture.
OpenAI, an artificial intelligence company that recently partnered with Microsoft, developed the neural network, which it calls DALL-E. It is a version of the company’s GPT-3 language model, which can create expansive written works based on short text prompts. DALL-E produces images instead.
“The world isn’t just text,” says Ilya Sutskever, co-founder of OpenAI. “Humans don’t just talk: we also see. A lot of important context comes from looking.”
DALL-E is trained using a set of images already associated with text prompts, and then uses what it learns to try to build an appropriate image when given a new text prompt.
To do this, it interprets the text prompt and then builds the image element by element, based on what it has understood from the text. If it is given part of a pre-existing image alongside the text, it also takes the visual elements in that image into account.
“We can give the model a prompt, like ‘a pentagonal green clock’, and given the preceding [elements], the model is trying to predict the next one,” says Aditya Ramesh of OpenAI.
For instance, if given an image of the head of a T. rex, and the text prompt “a T. rex wearing a tuxedo”, DALL-E can draw the body of the T. rex underneath the head and add appropriate clothing.
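The element-by-element process Ramesh describes, including the case where part of an image is supplied up front, can be illustrated with a toy sketch. This is not OpenAI's code: the token counts, the function names and the scoring function standing in for a trained network are all hypothetical, chosen only to show the shape of the loop.

```python
import random

VOCAB_SIZE = 16    # hypothetical number of distinct image elements ("tokens")
IMAGE_LENGTH = 8   # hypothetical number of image tokens in one picture


def toy_next_token_weights(context):
    """Hypothetical stand-in for a trained network.

    Returns one non-negative weight per possible next token, derived
    deterministically from everything seen so far.
    """
    seed = sum((i + 1) * tok for i, tok in enumerate(context))
    rng = random.Random(seed)
    return [rng.random() for _ in range(VOCAB_SIZE)]


def generate_image_tokens(text_tokens, image_prefix=(), seed=0):
    """Build an image one token at a time.

    text_tokens  -- the encoded caption (assumed here to be a list of ints)
    image_prefix -- optional tokens from an existing partial image,
                    such as the head of the T. rex in the example above
    """
    sampler = random.Random(seed)
    image_tokens = list(image_prefix)
    while len(image_tokens) < IMAGE_LENGTH:
        # The context is the caption plus every image token produced so far.
        context = list(text_tokens) + image_tokens
        weights = toy_next_token_weights(context)
        # Sample the next token in proportion to its weight.
        next_token = sampler.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]
        image_tokens.append(next_token)
    return image_tokens


caption = [3, 1, 4]                      # made-up encoding of a text prompt
from_scratch = generate_image_tokens(caption)
completed = generate_image_tokens(caption, image_prefix=[7, 2])
```

In the second call the supplied tokens are kept unchanged and only the remaining positions are filled in, which mirrors how a partial picture constrains the rest of the drawing.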
The neural network, which is described today on the OpenAI website, can trip up on poorly worded prompts and struggles to position objects relative to each other – or to count.