Start here
If you do not want to think too hard about model choice:
- Start with qwen-3.6-plus for the best default across general work, coding, and reasoning.
- Switch to kimi-k2.6 if your workflow looks like an agent with tools, long sessions, or screenshots.
- Switch to deepseek-v4-pro if you need top reasoning with 1M context.
- Switch to deepseek-v4-flash if you want V4-tier behavior at the lowest price.
- Switch to gemini-2.5-flash if context length is the main problem.
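The starting rules above can be read as a small lookup. The following is a minimal sketch only: the function name `starting_model` and the need labels are hypothetical and not part of the Kyma API; only the model IDs come from the list above.

```python
# Hypothetical helper: maps a primary need to the starting model suggested above.
# The need labels are illustrative; only the model IDs come from this guide.
STARTING_MODELS = {
    "default": "qwen-3.6-plus",
    "agent": "kimi-k2.6",
    "top-reasoning": "deepseek-v4-pro",
    "lowest-price": "deepseek-v4-flash",
    "long-context": "gemini-2.5-flash",
}


def starting_model(need: str = "default") -> str:
    """Return the suggested starting model, using the general default for unknown needs."""
    return STARTING_MODELS.get(need, STARTING_MODELS["default"])


print(starting_model("agent"))  # prints "kimi-k2.6"
```

Unknown needs fall through to qwen-3.6-plus, which mirrors the guide's advice to start with the general default when nothing else clearly applies.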
Quick picks
| If you need… | First pick | Why | Second pick |
|---|---|---|---|
| One default model | qwen-3.6-plus | Best overall quality and safest default | deepseek-v4-flash |
| Best value | deepseek-v4-flash | V4-tier quality, 1M context, native reasoning, lowest V4 price | deepseek-v3 |
| Top reasoning | deepseek-v4-pro | 1.6T MoE flagship, 1M context, native reasoning | deepseek-r1 |
| Deep reasoning | deepseek-r1 | Best for logic, math, hard analysis | deepseek-v4-pro |
| Tool-heavy agents | kimi-k2.6 | Strong tool use, long context, multimodal | glm-5.1 |
| Long-running coding agents | glm-5.1 | Better for repo-scale engineering and sustained execution | minimax-m2.5 |
| Agentic coding | minimax-m2.5 | Strong engineering workflow fit for typical coding agents | glm-5.1 |
| Fast coding loops | qwen-3-32b | Lower latency while staying strong on code | qwen-3-coder |
| 1M context | gemini-2.5-flash | Cheapest long-context option | gemini-3-flash |
| Vision / screenshots | gemma-4-31b | Cheapest solid multimodal option | kimi-k2.6 |
| Cheap bulk automation | glm-4.5-air | Low-cost agentic path | glm-4.7-flash |
| Cheap long-context throughput | glm-4.7-flash | Fast, efficient, long context | gemini-2.5-flash |
| Default image generation | recraft-v4 | #1 HF Arena, design-quality, $0.054 | flux-2-pro |
| Photoreal / hero shots | flux-2-pro | BFL 32B, multi-reference, gen+edit | recraft-v4-pro |
| Multi-reference blend | flux-2-pro | Up to 10 source images via image_urls | — |
| Image edit / inpaint | flux-kontext-pro | Image-to-image editor, requires image_url | flux-2-pro |
| Logos, typography, packaging | ideogram-v3 | Best legible-text image model | recraft-v4-vector |
| Native SVG / vector | recraft-v4-vector | True paths + layers, edit in Figma | recraft-v4-vector-pro |
| Print-ready design (4MP) | recraft-v4-pro | V4 quality at 4MP for print | recraft-v4-vector-pro |
| Hero-quality TTS (narration) | eleven-multilingual-v2 | 29 languages, expressive, brand-safe | eleven-turbo-v2-5 |
| Real-time voice agent | eleven-flash-v2-5 | ~75 ms time-to-first-byte, 32 languages | eleven-turbo-v2-5 |
| Music generation | elevenlabs-music | Prompt-driven, lyrics support, 1s..5min | — |
| Sound effects | elevenlabs-sfx | Whoosh, explosion, ambient — 0.5..22s | — |
Choose by constraint
I want the safest default
Pick qwen-3.6-plus. Use it when you want one model that is strong at general work, coding, reasoning, and multilingual tasks without forcing a lot of tradeoff thinking.
I care most about value
Pick deepseek-v4-flash. It is the best first stop when you want strong quality at lower cost, with 1M context, native reasoning, and an MIT license. If you need an even cheaper lane for routine workloads, look at gpt-oss-120b or glm-4.5-air. If you have a stable production workload already on deepseek-v3, it stays available.
I need top reasoning at flagship quality
Pick deepseek-v4-pro. It is the V4 flagship: 1.6T MoE, 1M context, native reasoning. It is the best fit for complex coding, multi-step analysis, and research-grade work where quality wins over latency.
I need deep reasoning
Pick deepseek-r1. Use it for difficult analysis, logic, math, and multi-step planning where you want a slower, deeper chain-of-thought trace. If it feels too slow, fall back to deepseek-v4-pro or qwen-3.6-plus.
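I need the best model for an agent
Start with kimi-k2.6, the first pick for tool-heavy agents in Quick picks: strong tool use, long context, and multimodal input. The second pick there is glm-5.1.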
I need a long-running coding agent
Start with glm-5.1. It is the better fit when the work is repo-scale, multi-file, and long-horizon. If your agent is more typical day-to-day coding than sustained engineering execution, fall back to minimax-m2.5.
I need agentic coding specifically
Start with minimax-m2.5. It fits engineering workflows well for normal coding-agent usage. If you want a newer productivity-oriented variant, try minimax-m2.7. If you want a stronger long-horizon engineering model, move up to glm-5.1.
I need a fast coding model
Pick qwen-3-32b. It is the best fit for tight edit-run-debug loops. If you want a more code-specialized model with more context, try qwen-3-coder.
I need long context
Pick gemini-2.5-flash for the safest long-context default. If you want a newer preview path, use gemini-3-flash. If you want cheaper long-context throughput without needing multimodal input, look at glm-4.7-flash.
I need vision or screenshot understanding
Start with gemma-4-31b. It is the cheapest strong multimodal option. If you also need stronger agent behavior, upgrade to kimi-k2.6.
I need the cheapest route for bulk work
Start with glm-4.5-air. It is the cheaper agentic lane for repeated automation tasks. If the workload is more about long context and throughput than agent behavior, try glm-4.7-flash.
Tradeoffs that matter
| Model | Main strength | Main tradeoff |
|---|---|---|
| qwen-3.6-plus | Best overall default | Not the cheapest or fastest |
| deepseek-v4-pro | Top reasoning, 1M context, native reasoning | Premium price, preview stage |
| deepseek-v4-flash | Best value V4, 1M context, native reasoning | Preview stage |
| deepseek-v3 | Stable previous-gen flagship | Smaller context, no native reasoning |
| deepseek-r1 | Best chain-of-thought reasoning | Slower |
| kimi-k2.6 | Best tool-heavy agent behavior | Premium cost |
| minimax-m2.5 | Strong engineering workflows | Less general-purpose than Qwen flagship |
| qwen-3-32b | Fast coding | Shorter context |
| gemini-2.5-flash | 1M context at good price | Not the strongest default for coding agents |
| gemma-4-31b | Cheap vision | Weaker than flagship text models on hard reasoning |
| glm-5.1 | Long-running coding agents and repo-scale engineering | Premium lane, text-only, less battle-tested in Kyma than Qwen/Kimi |
| glm-4.5-air | Cheap agentic bulk tasks | Lower ceiling than flagship models |
| glm-4.7-flash | Cheap long-context throughput | Preview-stage model |
Use by task
| Task | First pick | Fallback |
|---|---|---|
| General chat / assistant | qwen-3.6-plus | deepseek-v4-flash |
| Coding assistant | qwen-3.6-plus | qwen-3-32b |
| Autonomous coding agent | kimi-k2.6 | minimax-m2.5 |
| Repo-scale engineering work | glm-5.1 | deepseek-v4-pro |
| Math / reasoning / hard analysis | deepseek-v4-pro | deepseek-r1 |
| Long document summarization | gemini-2.5-flash | glm-4.7-flash |
| Data extraction / structured output | qwen-3-32b | glm-4.5-air |
| Screenshot / image understanding | gemma-4-31b | kimi-k2.6 |
| Cheap automation | glm-4.5-air | gpt-oss-120b |
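One way to apply the first pick and fallback columns in code is to try the first pick and retry once with the fallback if the call fails. The sketch below assumes a caller-supplied `call_model(model, prompt)` function; that function, the task keys, and `run_with_fallback` are all hypothetical names for illustration, not part of any documented Kyma client. Only the model pairs come from the table above.

```python
from typing import Callable

# First pick and fallback per task, copied from the "Use by task" table.
# The task keys are illustrative labels, not API values.
TASK_MODELS: dict[str, tuple[str, str]] = {
    "general-chat": ("qwen-3.6-plus", "deepseek-v4-flash"),
    "coding-assistant": ("qwen-3.6-plus", "qwen-3-32b"),
    "autonomous-coding-agent": ("kimi-k2.6", "minimax-m2.5"),
    "repo-scale-engineering": ("glm-5.1", "deepseek-v4-pro"),
    "hard-reasoning": ("deepseek-v4-pro", "deepseek-r1"),
    "long-document-summarization": ("gemini-2.5-flash", "glm-4.7-flash"),
    "structured-extraction": ("qwen-3-32b", "glm-4.5-air"),
    "image-understanding": ("gemma-4-31b", "kimi-k2.6"),
    "cheap-automation": ("glm-4.5-air", "gpt-oss-120b"),
}


def run_with_fallback(
    task: str, prompt: str, call_model: Callable[[str, str], str]
) -> str:
    """Try the first pick for a task; on any exception, retry once with the fallback.

    `call_model(model, prompt)` is supplied by the caller (hypothetical here) and
    should send the prompt to the given model and return its text reply.
    """
    first, fallback = TASK_MODELS[task]
    try:
        return call_model(first, prompt)
    except Exception:
        return call_model(fallback, prompt)


def fake_call(model: str, prompt: str) -> str:
    # Stand-in for a real client; pretends glm-5.1 is unavailable.
    if model == "glm-5.1":
        raise RuntimeError("model unavailable")
    return f"{model}: ok"


print(run_with_fallback("repo-scale-engineering", "Refactor the billing module", fake_call))
# prints "deepseek-v4-pro: ok"
```

Catching every exception keeps the sketch short. A real integration would likely retry only on availability or rate-limit errors and let other failures surface.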
Still not sure?
- Use alias best for qwen-3.6-plus
- Use alias agent for kimi-k2.6
- Use alias reasoning for deepseek-r1
- Use alias long-context for gemini-2.5-flash
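If you log or audit which concrete model a request used, it can help to mirror the alias list locally. This is a minimal sketch under that assumption: `resolve_model` is a hypothetical helper, and the mapping only restates the pairs listed above. How the service resolves aliases internally is not described here.

```python
# Alias-to-model pairs as listed above. This table only mirrors the documented
# pairs; it does not describe how the service resolves aliases.
ALIASES = {
    "best": "qwen-3.6-plus",
    "agent": "kimi-k2.6",
    "reasoning": "deepseek-r1",
    "long-context": "gemini-2.5-flash",
}


def resolve_model(name: str) -> str:
    """Return the model an alias points to; pass non-alias names through unchanged."""
    return ALIASES.get(name, name)


print(resolve_model("agent"))    # prints "kimi-k2.6"
print(resolve_model("glm-5.1"))  # prints "glm-5.1"
```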