Overview
gemini-3.5-flash is Google’s newest Flash-tier model on Kyma. It pairs a 1M-token context window with multimodal input (text, image, audio, and video) and fast reasoning, making it a strong pick for long-context and mixed-media workloads that still need low latency.
Specs
| Field | Value |
|---|---|
| Model ID | gemini-3.5-flash |
| Best for | Long context, multimodal input, fast reasoning |
| Context window | 1M |
| Max output tokens | 8K |
| Input modalities | Text, image, audio, video |
| Output modalities | Text |
| Tool calling | Yes |
| Structured outputs | Yes |
| Reasoning | Yes |
| Vision | Yes |
| Prompt caching | Yes |
| Speed | Fast |
| Cost band | Premium |
| Release stage | Stable |
Use this when
- You need a very large (1M) context window for long documents, transcripts, or codebases.
- Your input mixes text with images, audio, or video.
- You want fast responses without dropping to a weaker model.
- You build agents that need tool calling and structured JSON outputs.
Pick something else when
- You want the cheapest long-context option: use
gemini-2.5-flash. - You need the strongest tool-heavy agent behavior: use
kimi-k2.6. - You want top reasoning over speed: use
deepseek-v4-pro.