OpenAI’s New Model Makes Realistic Images And Text: Try It Free

OpenAI has built image generation with its 4o model into ChatGPT, so users can create images directly in the chatbot. Users no longer need to access OpenAI’s Dall-E image generation model separately, although Dall-E remains an option for those who prefer it. OpenAI’s Sora AI video generator is now accessible within ChatGPT as well.

The new features are available now to ChatGPT users on the free, Plus, Team, and Pro plans. Enterprise and education users can expect access next week.

OpenAI generated image - A candid paparazzi-style photo of Karl Marx hurriedly walking through the parking lot of the Mall of America.

OpenAI generated image - Realistic photograph of a horse galloping from right to left across a vast, calm ocean surface.

OpenAI generated image - Photorealistic image of a farmer's market in Toronto on a Saturday in summer 2006.

In the past, Dall-E 3 was the image generator available to paid ChatGPT subscribers, while free users had access to a basic version through Microsoft Copilot.

The new model has been praised as one of the leading image generators, especially in its paid tiers. All ChatGPT users can now generate images with the 4o model, but those on the free plan should expect certain limitations, such as caps on file uploads and data analysis, CNET notes.

Regardless, ChatGPT users will get more realistic images with clearer text. According to the Wall Street Journal, this follows a year-long effort to train the GPT-4o model using a technique known as “reinforcement learning from human feedback” (RLHF).

After unveiling GPT-4o in May 2024, OpenAI employed more than 100 “human trainers” to refine the model by correcting errors, particularly in hands and faces, according to Gabriel Goh, the project’s lead researcher.

The new model also lets ChatGPT create images with transparent backgrounds. That is particularly useful for business users and creatives designing logos or other graphics, Jackie Shannon, the multimodal product lead at ChatGPT, told WSJ.

Despite these enhancements, the updated GPT-4o model still has shortcomings. It retains a tendency to “hallucinate,” a problem common across AI models. Keeping edits consistent within ChatGPT remains another hurdle, but OpenAI has promised updates, potentially as soon as next week.

Ethical and legal concerns remain significant issues for OpenAI. The company says the model was trained on “publicly available data” and on proprietary data acquired through partnerships with companies like Shutterstock, WSJ reports.

Images produced in ChatGPT with the 4o model will not bear AI watermarks. They will, however, include C2PA metadata identifying them as AI-generated, in line with industry standards.