Thursday, October 17, 2024

Meta’s AI Image Generator was Trained on 1.1B Instagram/Facebook Photos

“Imagine with Meta AI” turns prompts into images, trained using public Facebook data.

On Wednesday, Meta released a free standalone AI image-generator website, “Imagine with Meta AI,” based on its Emu image-synthesis model. Meta used 1.1 billion publicly visible Facebook and Instagram images to train the AI model, which can render a novel image from a written prompt. Previously, Meta’s version of this technology—using the same data—was only available in messaging and social networking apps such as Instagram.

If you’re on Facebook or Instagram, it’s quite possible a picture of you (or one that you took) helped train Emu. The old saying, “If you’re not paying for it, you are the product,” has taken on a whole new meaning. That said, as of 2016 Instagram users were uploading over 95 million photos a day, so the dataset Meta used to train its AI model was only a small subset of its overall photo library.

Since Meta says it only uses publicly available photos for training, setting your photos private on Instagram or Facebook should prevent their inclusion in the company’s future AI model training (unless it changes that policy).

Imagine with Meta AI

[Image gallery: AI-generated sample results — barbarian_crt, cat_with_beer, flaming_cheeseburger, mickeymouse_space, handsome_man, gaming_pc, ars_sign, xmas_stockings, swirlnerd, santathread, teddy_skate, queen_of_universe]

Like Stable Diffusion, DALL-E 3, and Midjourney, Imagine with Meta AI generates new images based on what the AI model “knows” about visual concepts learned from the training data. Creating images using the new website requires a Meta account, which can be imported from an existing Facebook or Instagram account. Each generation creates four 1280×1280 pixel images that can be saved in JPEG format. Images include a small “Imagined with AI” watermark logo in the lower left-hand corner.

“We’ve enjoyed hearing from people about how they’re using imagine, Meta AI’s text-to-image generation feature, to make fun and creative content in chats,” Meta says in its news release. “Today, we’re expanding access to imagine outside of chats, making it available in the US to start at imagine.meta.com. This standalone experience for creative hobbyists lets you create images with technology from Emu, our image foundation model.”

We put Meta’s new AI image generator through a battery of low-stakes informal tests using our “Barbarian with a CRT” and “Cat with a beer” image synthesis protocol and found aesthetically novel results, as you can see above. (As an aside, when generating images of people with Emu, we noticed many looked like typical Instagram fashion posts.)

We also tried our hand at adversarial testing. The generator appears to filter out most violence, curse words, sexual topics, and the names of celebrities and historical figures (no Abraham Lincoln, sadly), but it allows commercial characters like Elmo (yes, even “with a knife”) and Mickey Mouse (though not with a machine gun).
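Meta hasn’t published how its prompt moderation works. Purely as an illustration of the simplest possible approach, here is a naive blocklist filter in Python; the blocked term below is a hypothetical example inferred from our testing, not Meta’s actual list:

```python
# Illustrative sketch only: a naive prompt blocklist, NOT Meta's actual
# moderation system, which is not publicly documented.
import re

BLOCKED_TERMS = {"machine gun"}  # hypothetical example term

def is_allowed(prompt: str) -> bool:
    """Return False if the prompt contains any blocked term (word-boundary match)."""
    text = prompt.lower()
    return not any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in BLOCKED_TERMS
    )

print(is_allowed("Elmo with a knife"))                # True
print(is_allowed("Mickey Mouse with a machine gun"))  # False
```

Real moderation systems layer classifiers on both the prompt and the generated image, but a term list like this is often the first line of defense.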

Meta’s model generally creates photorealistic images well, but not as well as Midjourney. It handles complex prompts better than Stable Diffusion XL, though perhaps not as well as DALL-E 3. It doesn’t render text well, and it produces media styles like watercolors, embroidery, and pen-and-ink with mixed results. Its images of people show a diversity of ethnic backgrounds. Overall, it’s about average among today’s AI image-synthesis models.

Facebook and Instagram made this possible

An AI-generated image of a “psychedelic emu” created on the “Imagine with Meta AI” website.

So, what do we know about Emu, the AI model behind Meta’s new AI image-generation features? Based on a research paper released by Meta in September, Emu gets its ability to generate high-quality images through “quality-tuning.” Unlike traditional text-to-image models trained with large numbers of image-text pairs, Emu focuses on “aesthetic alignment” after pre-training, using a set of relatively small but visually appealing images.
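The paper’s two-stage recipe (broad pre-training, then a short fine-tune of the same parameters on a small, hand-curated set) can be illustrated with a deliberately tiny toy model. This is a one-parameter least-squares sketch of the general idea only, not Emu’s training code:

```python
# Toy illustration of the two-stage idea behind "quality-tuning": pre-train on
# a large generic dataset, then briefly fine-tune the same parameters on a
# small, curated set. A one-parameter least-squares toy, not Emu's training code.
import random

random.seed(0)

def sgd(w, data, lr, epochs):
    """Train the one-parameter model y = w * x with plain SGD."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# Stage 1: large, noisy "pre-training" set roughly following y = 2x.
pretrain = [(x, 2 * x + random.uniform(-1, 1)) for x in range(1, 21)]
w = sgd(0.0, pretrain, lr=1e-3, epochs=5)

# Stage 2: tiny, hand-picked "quality" set with exact targets.
quality = [(1, 2.0), (2, 4.0), (3, 6.0)]
w = sgd(w, quality, lr=1e-2, epochs=50)

print(round(w, 2))  # ≈ 2.0
```

The point of the analogy: the second stage doesn’t teach the model anything new so much as nudge already-learned parameters toward the curated examples, which is roughly what “aesthetic alignment” on a small set of visually appealing images does for Emu.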

At Emu’s heart, however, is that massive pre-training dataset of 1.1 billion text-image pairs pulled from Facebook and Instagram. In the Emu research paper, Meta does not specify where the training data came from, but reports from the Meta Connect 2023 conference cite Meta President of Global Affairs Nick Clegg confirming that Meta was using social media posts as training data for AI models, including images fed into Emu.

It’s a change in approach from other AI companies, made possible because Meta has access to so much image and caption data through its own services. Other image-synthesis models rely on images scraped from the Internet without permission, licensed from commercial stock image libraries, or a combination of both.

Interestingly, Meta’s research paper on Emu is the first paper on a major image-synthesis model we’ve seen that doesn’t include a disclaimer about the model’s potential to create reality-warping disinformation or otherwise harmful content. That reflects a general acceptance (or resignation) that AI image-synthesis models are now commonplace. Whether that’s a good thing is an open question.

Still, Meta seems to be handling potentially harmful outputs with filters, a proposed watermarking system that isn’t operational yet (“In the coming weeks, we’ll add invisible watermarking to the imagine with Meta AI experience for increased transparency and traceability,” the company says), and a small disclaimer at the bottom of the website: “Images may be inaccurate or inappropriate.”
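Meta hasn’t described its invisible watermarking technique. One classic approach is least-significant-bit (LSB) embedding, sketched below purely as an illustration of the general idea; the message and pixel values are made up, and this is not Meta’s method:

```python
# Illustrative sketch of one classic "invisible watermark" technique:
# least-significant-bit (LSB) embedding. Meta has not said how its
# watermarking works; this only shows the general concept.

def embed(pixels: list, message: bytes) -> list:
    """Hide the message bits in the LSB of each 8-bit pixel value."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    assert len(bits) <= len(pixels), "image too small for message"
    out = pixels[:]
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # each pixel changes by at most 1
    return out

def extract(pixels: list, length: int) -> bytes:
    """Read `length` bytes back out of the pixel LSBs."""
    bits = [p & 1 for p in pixels[: length * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[b * 8 : b * 8 + 8]))
        for b in range(length)
    )

img = [200, 201, 202, 203] * 16  # fake 8-bit grayscale "image"
marked = embed(img, b"meta")
print(extract(marked, 4))        # b'meta'
```

Because each pixel value shifts by at most one gray level, the mark is imperceptible to the eye; the tradeoff is that naive LSB marks do not survive re-encoding, which is why production systems use more robust schemes.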

The images might not be accurate (do cats drink beer?), and they might not even be ethical in the eyes of the unnamed authors of the 1.1 billion images used to train the model. But dare we say it: Generating them can be fun. Of course, depending on your disposition and how you view the pace of AI image synthesis, an equal level of concern may balance out fun.
