Generative AI advanced rapidly in 2023. Meta, formerly Facebook, introduced Emu, its foundational model for image generation, at this year’s Meta Connect event. The model underpins a number of AI experiences across Meta’s family of apps, notably Instagram’s AI image editing tools, which let users change a photo’s visual style or background. In addition, the Imagine feature in Meta AI generates photorealistic images directly within messages or group chats.
Breakthroughs in Video Generation: Emu Video
Emu Video builds on the Emu model for text-to-video generation. The approach, based on diffusion models, is a simple yet efficient way to produce high-quality videos. It splits generation into two steps: first generating an image from a text prompt, then generating a video conditioned on both the text and that image. This factorized design makes video generation models more efficient to train. Emu Video needs only two diffusion models to produce 512×512 videos at 16 fps, whereas earlier methods chained together more models. In human evaluations, raters strongly preferred Emu Video over previous technologies on both video quality and adherence to the text prompt.
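To make the two-step idea concrete, here is a minimal sketch of the factorized flow: text to image, then text plus image to video. It is not Meta's implementation. The functions toy_text_to_image and toy_image_to_video are hypothetical stand-ins that return small random pixel grids in place of real diffusion models, and the seconds parameter is an illustrative assumption; only the 16 fps figure comes from the description above.

```python
import random
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image: rows of pixel values in [0, 1]
Video = List[Image]        # a video is an ordered list of frames


def toy_text_to_image(prompt: str, size: int = 4) -> Image:
    """Hypothetical stand-in for the first model (text -> image)."""
    rng = random.Random(prompt)  # seeded by the prompt, so output is repeatable
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def toy_image_to_video(prompt: str, first_frame: Image, num_frames: int) -> Video:
    """Hypothetical stand-in for the second model (text + image -> video).

    Each new frame is the previous frame plus a small random perturbation,
    so the clip starts from, and stays close to, the conditioning image.
    """
    rng = random.Random(prompt + "/video")
    frames: Video = [first_frame]
    for _ in range(num_frames - 1):
        previous = frames[-1]
        frames.append(
            [[min(1.0, max(0.0, p + rng.uniform(-0.05, 0.05))) for p in row]
             for row in previous]
        )
    return frames


def generate_video(
    prompt: str,
    seconds: int = 1,   # illustrative assumption, not taken from the article
    fps: int = 16,      # frame rate quoted in the article
    text_to_image: Callable[[str], Image] = toy_text_to_image,
    image_to_video: Callable[[str, Image, int], Video] = toy_image_to_video,
) -> Video:
    """Factorized generation: step 1 makes an image, step 2 animates it."""
    image = text_to_image(prompt)                         # step 1: text -> image
    return image_to_video(prompt, image, seconds * fps)   # step 2: (text, image) -> video


if __name__ == "__main__":
    clip = generate_video("a dog running on a beach", seconds=2)
    print(len(clip), "frames")
```

Because the two steps are separate callables, either one can be swapped out independently, which is the practical appeal of a factorized design.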
Revolutionizing Image Editing: Emu Edit
Meta’s Emu Edit focuses on precise, pixel-level image editing. It handles tasks such as local and global edits, background changes, and color and geometric transformations. Emu Edit stands out by altering only the pixels relevant to the editing instruction and leaving the untargeted parts of the image intact. To train it, Meta built a dataset of 10 million synthesized samples, each consisting of an input image, a description of the editing task, and the target output image. According to Meta, the model performs strongly on both instruction faithfulness and image quality.
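As a rough illustration of the training sample described above and of what "only the relevant pixels change" means, the sketch below models a sample as an (input image, instruction, target image) triple and checks that every changed pixel falls inside an allowed region. The EditSample class, the helper functions, and the idea of an explicit allowed region are assumptions made for this example; they are not part of Emu Edit or its dataset format.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Image = List[List[int]]   # toy image: rows of integer pixel values
Pixel = Tuple[int, int]   # (row, column) coordinate


@dataclass
class EditSample:
    """Hypothetical container mirroring the three parts of a training sample."""
    input_image: Image
    instruction: str
    target_image: Image


def changed_pixels(before: Image, after: Image) -> Set[Pixel]:
    """Return the coordinates whose values differ between the two images."""
    same_shape = len(before) == len(after) and all(
        len(row_b) == len(row_a) for row_b, row_a in zip(before, after)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    return {
        (r, c)
        for r, (row_b, row_a) in enumerate(zip(before, after))
        for c, (b, a) in enumerate(zip(row_b, row_a))
        if b != a
    }


def edit_is_local(sample: EditSample, allowed_region: Set[Pixel]) -> bool:
    """True if every pixel the edit changed lies inside the allowed region."""
    return changed_pixels(sample.input_image, sample.target_image) <= allowed_region


if __name__ == "__main__":
    sample = EditSample(
        input_image=[[0, 0, 0], [0, 5, 0], [0, 0, 0]],
        instruction="brighten the centre pixel",
        target_image=[[0, 0, 0], [0, 9, 0], [0, 0, 0]],
    )
    print(edit_is_local(sample, allowed_region={(1, 1)}))
```

A check like this is only a sanity test on toy data; a real system would need a way to decide which region an instruction refers to.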
The Future of Generative AI at Meta
These advances in generative AI point to a future in which creative expression is more accessible and diverse. Emu Video and Emu Edit could change how people create and share media, offering tools for everyone from professional artists to casual users. They are not substitutes for professional creators, but they give people new ways to express themselves and explore creative ideas.
Media reports emphasize the streamlined process behind Emu Video and the precise pixel-level editing of Emu Edit, highlighting the technology’s simplicity and efficiency and its potential to change video and image editing. Meta is deploying these AI tools cautiously, however, given close scrutiny from regulators. The company has said that its AI capabilities will not be available for marketing or political campaigns on Facebook and Instagram. Even so, the platform’s general advertising rules do not currently address AI specifically.