• About Us
  • Contact Us
  • Advertise
  • Privacy Policy
  • Guest Post
No Result
View All Result
Digital Phablet
  • Home
  • NewsLatest
  • Technology
    • Education Tech
    • Home Tech
    • Office Tech
    • Fintech
    • Digital Marketing
  • Social Media
  • Gaming
  • Smartphones
  • AI
  • Reviews
  • Interesting
  • How To
  • Home
  • NewsLatest
  • Technology
    • Education Tech
    • Home Tech
    • Office Tech
    • Fintech
    • Digital Marketing
  • Social Media
  • Gaming
  • Smartphones
  • AI
  • Reviews
  • Interesting
  • How To
No Result
View All Result
Digital Phablet
No Result
View All Result

Home » I Tested Gemini’s Wild New Native Image Generation Feature

I Tested Gemini’s Wild New Native Image Generation Feature

Maisah Bustami by Maisah Bustami
March 14, 2025
in AI
Reading Time: 4 mins read
A A
I Tested Gemini's Wild New Native Image Generation Feature
ADVERTISEMENT

Select Language:

The term ‘natively multimodal‘ has been making waves in the AI community for over a year, yet companies have only recently begun to fully harness the multimodal capabilities of their AI models. Google has now unveiled its latest “Gemini 2.0 Flash Experimental” model, which includes the ability to generate and edit images directly.

You may be asking yourself, what’s the fuss about image generation? True, AI-generated images have been a feature in many popular chatbots like ChatGPT for some time. However, image generation in platforms like ChatGPT or Gemini typically involves sending prompts to specialized diffusion models such as Dall-E 3 or Imagen 3. These models are specifically trained to create images and function as add-ons to the primary AI model, rather than being integrated within it.

In contrast, language-vision models like Gemini are inherently multimodal. They possess the unique capability to understand, create, and alter both text and images natively. Up until now, no tech company has provided this level of functionality to users. OpenAI introduced its own image generation feature with GPT-4o in 2024, but it was never made publicly available.

With native image generation, you benefit from enhanced consistency since multimodal models are trained on extensive datasets that include various forms of content. This leads to a better grasp of concepts and a broader general knowledge base.

In addition to generating images, you can effortlessly edit them using simple prompts. For instance, you can upload an image and request the model to add sunglasses, insert legible text, remove objects, and more. Unlike diffusion models, which regenerate the entire image each time you make a request, natively multimodal models ensure consistency across multiple edits.

Native Image Generation with Gemini 2.0 Flash Experimental

As of now, the native image generation feature is not available to the general public. The Gemini 2.0 Flash Experimental model with this capability can only be accessed through Google’s AI Studio (visit) at no cost.

Having tried out the model on AI Studio, I found it to be a thrilling experience. To begin, I created a visual guide showcasing the consistency of Gemini’s image generation capabilities by asking it to illustrate the steps for making an omelet, generating an image for each step.

The results were impressively consistent, with no noticeable glitches. Even small details, like the bowl, remained the same between images. The images can be downloaded in a resolution of 1024 x 680, allowing you to produce visual guides on a variety of topics.

Next, I requested Gemini to create an aesthetically pleasing table and then to display the table from a central camera angle. It executed this task flawlessly. I then asked Gemini to add a PlayStation to the table and give me a closer look. Once again, it delivered beautifully, capturing the PS5’s reflection in a nearby mirror.

  • creating another image of a table with Gemini native

Native Image Editing with Gemini 2.0 Flash Experimental

To showcase Gemini’s image editing feature, I uploaded an image from my gallery and instructed Gemini 2.0 to remove a wine glass from the table. Afterward, I requested it to add mushrooms to a pizza and was impressed by the outcome. Then, I asked Gemini to include a croissant, and it delivered once again, demonstrating the full potential of AI image editing, thanks to Gemini’s native multimodal capabilities.

  • editing pizza image using Gemini
  • editing another pizza image using Gemin

Then, I uploaded a personal image and asked Gemini to add sunglasses, followed by text that read “Beebom” on my shirt. Both requests were executed adeptly.

  • editing images using Gemini
  • editing another image using Gemini

Lastly, I asked Gemini to colorize an image, which it executed beautifully. The end result was even more stunning than the original, free from glitches or distortions.

colorizing images with Gemini

There are countless possibilities you can explore with Gemini’s new multimodal capabilities. Google has done an impressive job integrating native image generation and editing, and I plan to use it extensively in the upcoming weeks to push its boundaries.

Following the launch of Veo 2 for video generation and Imagen 3 for specialized image generation, it seems that Google is outpacing OpenAI in several areas beyond just text generation. It’ll be interesting to see how OpenAI responds to reclaim its position as a leader with ChatGPT.

Arjun Sha

Enthusiastic about Windows, ChromeOS, Android, and issues regarding security and privacy. I enjoy tackling everyday computing challenges.


ChatGPT Add us on ChatGPT Perplexity AI Add us on Perplexity
Tags: AI image editingAI image generationand It Was Like Talking to a Real PersonGeminiGoogle’s Lightweight Gemma 3 Open Model Nearly Matches DeepSeek R1GPT-4oHow to Use Photoshop’s AI Generative Fill Tool Right NowI Tried Sesame AI’s Voice CompanionImagen 3OpenAI Releases Its Next-Generation GPT-4.5 Model to ChatGPT Pro UsersVeo 2visitWhat Is China’s Manus AI Agent? Explained
ADVERTISEMENT
Maisah Bustami

Maisah Bustami

Maisah is a writer at Digital Phablet, covering the latest developments in the tech industry. With a bachelor's degree in Journalism from Indonesia, Maisah aims to keep readers informed and engaged through her writing.

Related Posts

Google’s Gemini Live AI Is Going All In On Mobile Apps
News

Google’s Gemini Live AI Is Going All In On Mobile Apps

August 21, 2025
GPT-4o Returns To ChatGPT After OpenAI Reverses Decision
News

GPT-4o Returns To ChatGPT After OpenAI Reverses Decision

August 11, 2025
How to Get Taliferro Visiting in Fields of Mistria by Completing and Solving
Gaming

How to Get Taliferro Visiting in Fields of Mistria by Completing and Solving

July 30, 2025
Dangerous Chrome Extensions Could Be Harming Your Device Discover Them.jpg
Technology

Dangerous Chrome Extensions Could Be Harming Your Device: Discover Them

July 9, 2025
Next Post
Black Mirror Season 7 Trailer Hints At USS Callister Return

Black Mirror Season 7 Trailer Hints At USS Callister Return

  • About Us
  • Contact Us
  • Advertise
  • Privacy Policy
  • Guest Post

© 2025 Digital Phablet

No Result
View All Result
  • Home
  • News
  • Technology
    • Education Tech
    • Home Tech
    • Office Tech
    • Fintech
    • Digital Marketing
  • Social Media
  • Gaming
  • Smartphones

© 2025 Digital Phablet