Overview

gemini-2.5-flash is the best choice when context length is the main constraint. It offers a 1M-token context window and multimodal input without stepping up to a premium flagship model.

Specs

Field               Value
Model ID            gemini-2.5-flash
Best for            Long context, fast throughput, multimodal analysis
Context window      1M tokens
Max output tokens   8K
Input modalities    Text, image, audio, video
Output modalities   Text
Tool calling        Yes
Structured outputs  Yes
Prompt caching      Yes
Speed               Fast
Cost band           Cheap
Release stage       Stable

Use this when

  • You need extremely long context.
  • You want a cheap long-context model for analysis or extraction.
  • You need multimodal input across more than just images.

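For the multimodal case above, one way to attach an image is the OpenAI-style content-parts message format with a base64 data URL. This is a minimal sketch, assuming the endpoint accepts that format; the `build_image_part` helper is hypothetical, not part of any SDK.

```python
import base64

def build_image_part(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Encode raw image bytes as an OpenAI-style image_url content part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}

# A mixed text + image user message for the chat completions endpoint:
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What does this chart show?"},
        build_image_part(b"\x89PNG...")  # substitute the raw bytes of your image file
    ],
}
```

The same pattern extends to audio and video parts where the API supports them; only the MIME type and part type change.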
Pick something else when

  • You want the best default quality: use qwen-3.6-plus.
  • You need stronger agentic coding behavior: use kimi-k2.5.
  • You need deeper reasoning more than raw throughput: use deepseek-r1.

Example

from openai import OpenAI

client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Summarize this long document set and extract the key decisions."}
    ],
)

# The model's reply is in the first choice's message content.
print(response.choices[0].message.content)