Gemini 3.1 Flash — Fast multimodal Google AI
Google DeepMind · Gemini

Gemini 3.1 Flash

Fast Gemini for high-throughput multimodal apps

Gemini 3.1 Flash brings the latest Gemini quality at high speed and low cost, with dedicated Audio, Image and Live variants for real-time products.

Context window: 1M tokens
Max output: 64K tokens
Released: April 15, 2026 (Audio); February 26, 2026 (Image)
Pricing: lower-cost Gemini tier
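
As a rough illustration of how the two token limits above interact, here is a minimal sketch of a pre-flight budget check. It makes several assumptions that the page does not confirm: "1M" is read as 1,000,000 and "64K" as 64,000, output tokens are assumed to count against the context window, and the four-characters-per-token estimate is a generic heuristic rather than the model's tokenizer. The function names are hypothetical.

```python
# Limits quoted in the spec list; the exact integer values are assumptions
# (1M read as 1,000,000 and 64K read as 64,000).
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; a real tokenizer will give different counts."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def fits_budget(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if the request stays within both assumed limits."""
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    # Assumption: prompt and output share the same context window.
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS
```

A caller might run `fits_budget(estimate_tokens(prompt), 8_000)` before sending a long document and fall back to chunking when it returns False.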

Key features

  • Sub-second latency for chat and tool-using agents.
  • Audio, Image and Live variants for real-time multimodal UX.
  • 1M-token context window.
  • Significantly cheaper than Gemini 3.1 Pro.

Best for

Pick Flash for chatbots, voice agents, image-aware UX and any product where fast multimodal responses matter more than absolute peak intelligence.

Frequently Asked Questions

Is Gemini 3.1 Flash multimodal?

Yes. Gemini 3.1 Flash natively handles text, audio, image and video with sub-second latency, and its dedicated Audio, Image and Live variants target real-time applications.

How does Gemini 3.1 Flash compare to Gemini 3.1 Pro?

Flash trades some reasoning depth for much faster responses and lower cost. Use Pro for the hardest tasks; use Flash for high-throughput products where speed matters more than peak intelligence.
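
The trade-off described above can be sketched as a simple routing rule. This is only an illustration under stated assumptions: the `Task` fields and the `pick_tier` function are hypothetical, and the "flash" and "pro" strings are placeholder labels, not real API model identifiers.

```python
from dataclasses import dataclass

# Placeholder tier labels (assumptions), not actual API model names.
FLASH = "flash"
PRO = "pro"


@dataclass
class Task:
    needs_deep_reasoning: bool  # one of "the hardest tasks"
    latency_sensitive: bool     # speed matters more than peak intelligence


def pick_tier(task: Task) -> str:
    """Route to Pro only when depth is needed and latency is not critical."""
    if task.needs_deep_reasoning and not task.latency_sensitive:
        return PRO
    return FLASH
```

For example, a voice agent would be modeled as `Task(needs_deep_reasoning=False, latency_sensitive=True)` and routed to the Flash tier, while an offline analysis job that needs deep reasoning would go to Pro.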
