OpenAI Says DALL-E Is Generating Over 2 Million Images a Day—and That’s Just Table Stakes

The venerable stock image site, Getty, boasts a catalog of 80 million images. Shutterstock, a rival of Getty, offers 415 million images. It took a few decades to build up these prodigious libraries.

Now, it seems we’ll have to redefine prodigious. In a blog post last week, OpenAI said its machine learning algorithm, DALL-E 2, is generating over two million images a day. At that pace, its output would equal Getty and Shutterstock combined in eight months. The algorithm is producing almost as many images daily as the entire collection of free image site Unsplash.
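
For a quick sanity check on that eight-month figure, here is a minimal back-of-envelope sketch in Python using the catalog sizes and daily rate cited above (the average month length is an assumption):

```python
# Rough check of the "eight months" claim, using the figures cited in the article.
getty_catalog = 80_000_000           # Getty's reported catalog size
shutterstock_catalog = 415_000_000   # Shutterstock's reported catalog size
dalle_daily_output = 2_000_000       # DALL-E 2 images per day, per OpenAI's blog post

combined_catalogs = getty_catalog + shutterstock_catalog
days_needed = combined_catalogs / dalle_daily_output
months_needed = days_needed / 30.44  # assumes an average month of ~30.44 days

print(f"~{days_needed:.0f} days, or about {months_needed:.1f} months")
# -> ~248 days, or about 8.1 months
```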

And that was before OpenAI opened DALL-E 2 to everyone. Until last week, access was restricted, and there was a waiting list of folks eager to get their hands on the algorithm. The purpose of the blog post in which the number appeared was to announce that DALL-E 2 is now open to the public at large.

Similar algorithms, meanwhile, are already freely available. The pace, then, will likely accelerate from here.

Of course, it’s worth pointing out this is an imperfect comparison. The quality of your average Shutterstock or Getty image is generally higher out of the gate, and the sites also offer editorial images of current events. Meanwhile, DALL-E 2 and other algorithms generate multiple images for every prompt, image quality varies widely, and the best work requires polishing by a practiced hand.

Still, it’s clear DALL-E and others are unprecedented image-making machines.

A Phased Rollout

DALL-E 2, released earlier this year, has been the talk of the tech world.

Unlike its predecessor, which OpenAI first unveiled in 2021 and which produced noticeably imperfect creations, DALL-E 2 makes photorealistic images from a text prompt. Users can mix and match elements, like requesting an astronaut riding a horse, and styles, like a sea otter in the style of Girl With a Pearl Earring by Johannes Vermeer.

To limit misuse and better filter the algorithm’s output, OpenAI pursued a phased release. Trained on millions of online images and captions, DALL-E 2 and other algorithms like it are susceptible to biases in their datasets as well as misuse by users. OpenAI published a paper on DALL-E 2 in April and previewed the algorithm for 200 artists, researchers, and other users. They expanded the preview by 1,000 users a week the next month and then extended access to the algorithm in beta, with pricing, to a million people.

“Responsibly scaling a system as powerful and complex as DALL-E—while learning about all the creative ways it can be used and misused—has required an iterative deployment approach,” the company wrote in the blog post.

During the rollout, OpenAI took feedback from users and translated it into technical fixes to reduce bias and filters to prevent inappropriate content. They’re also employing a team of moderators to keep tabs on things. How well this approach will scale, as millions of images become tens of millions and more, remains to be seen, but the team was confident enough in the product so far to move ahead with a full release.

AI Art on the Rise

The rest of the AI world didn’t stand still in the meantime. Competitors rapidly followed on DALL-E’s heels. First there was DALL-E Mini—now Craiyon—a lower quality but free image generator seized upon by the internet to manufacture memes. Higher quality algorithms include Midjourney and Stable Diffusion. Google even got in on the game with its Imagen algorithm (though the company has so far kept it under wraps).

Add these to DALL-E 2’s output, and the volume of AI art is set to grow fast.

Earlier this summer, Stability AI, the company behind Stable Diffusion, said the algorithm was already producing two million images a day during testing. When the platform hit a million users in mid-September, Stability AI founder Emad Mostaque tweeted, “We should break a billion images a day sooner rather than later I imagine, especially once we turn on animation, etc.”

The rapid emergence of AI art has not been without controversy.

A piece of AI art generated in Midjourney by Jason Allen recently won the blue ribbon for digital art at the Colorado State Fair. It’s not hard to see why. The piece is beautiful and evocative. But many artists took to Twitter to voice their displeasure.

Some worry the algorithms will reduce the amount of work open to graphic designers. The combination of quality, speed, and volume with limited specialized skill required may mean companies choose a quick algorithmic creation over hiring a designer.

Recently, Ars Technica reported Shutterstock was already home to thousands of AI-generated images. Not long after, it appeared the site was taking some of them down. Meanwhile, Getty banned AI art on its platform altogether, citing copyright worries. The legal landscape is still murky and liable to shift.

“At the moment, AI-generated content will be reviewed no differently than any other type of digital illustration submission,” Shutterstock told Quartz last week. “This may change at a moment’s notice as we learn more about synthetic imagery.”

Others worry that the ability to closely mimic a working artist’s style could undercut the value and visibility of that artist’s work. It’s also unclear what’s owed to the artists whose creations helped train the algorithms. Developers expressed a similar concern last year when OpenAI released coding algorithms trained on open repositories of code.

But not everyone agrees AI art will so easily replace skilled designers and artists after the novelty dies down. And the community may further iron out issues, like deciding to train algorithms only on works in the public domain or allowing artists to opt out. (There’s already a tool for artists to see if their creations are included in training data.)

When the Dust Settles

Machine learning algorithms have generated weird, novel, but not particularly useful images and text for years. AI that could generate quality work first really hit the scene in 2020 when OpenAI released the natural language algorithm GPT-3. The algorithm could produce text that was, at times, nearly indistinguishable from something written by a human. GPT-3 is now, like DALL-E 2, a paid product and also a prodigious producer.

But even as big as GPT-3 was, DALL-E 2 may take the lead. “We’ve seen much more interest than we had anticipated, much bigger than it was for GPT-3,” OpenAI’s vice president of product and partnerships, Peter Welinder, told MIT Technology Review in July.

The trend isn’t likely to end with images. Already, AI developers are eyeing video. Just last week, Meta unveiled perhaps the most advanced such algorithm yet. Though its output is still far from perfect, the pace of improvement suggests we won’t have long to wait.

When the dust settles, it’s possible that these algorithms establish themselves as a new form of art. Photography faced similar resistance upon its arrival in the nineteenth century, and ultimately, rather than supplanting existing art forms, it joined their ranks.

“What has been true since at least the advent of photography is still true now: The art that is half-born is the most exciting. Nobody knows what a new art form can be until somebody figures out what it is,” Stephen Marche recently wrote in the Atlantic. “Figuring out what AI art is will be tremendously difficult, tremendously joyful. Let’s start now.”

Image Credit: OpenAI



* This article was originally published at Singularity Hub
