Gemini 3.1 Flash-Lite
Flash-Lite is Google's lowest-cost Gemini variant, tuned for high-volume, low-latency workloads like routing, classification and embedded agents.
Smallest, fastest, cheapest Gemini model
Use Flash-Lite when you need Gemini quality at the lowest possible cost, for example in moderation, intent classification or batch processing.
Flash-Lite is ideal for moderation, intent classification, batch summarisation and any high-volume API workload where cost per query dominates.
Flash-Lite accepts text, audio and image inputs natively, making it useful for lightweight multimodal classification and routing pipelines.