Which model should I use?

Start here

If you do not want to think too hard about model choice:

Start with qwen-3.6-plus for the best default across general work, coding, and reasoning.
Switch to kimi-k2.6 if your workflow looks like an agent with tools, long sessions, or screenshots.
Switch to deepseek-v4-pro if you need top reasoning with 1M context.
Switch to deepseek-v4-flash if you want V4-tier behavior at the lowest price.
Switch to gemini-2.5-flash if context length is the main problem.
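The list above can be read as a small decision function. The following is a minimal sketch, not an official selector: the `Workload` fields are hypothetical names invented for illustration, and the order in which the checks run is an assumption, since the guide does not rank the "switch" conditions against each other.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative only."""
    agentic: bool = False              # tools, long sessions, or screenshots
    needs_top_reasoning: bool = False  # top reasoning with 1M context
    lowest_price: bool = False         # V4-tier behavior at the lowest price
    context_bound: bool = False        # context length is the main problem


def start_here(w: Workload) -> str:
    """Return a starting model ID following the list above.

    The check order is an assumption; the guide does not prioritize them.
    """
    if w.agentic:
        return "kimi-k2.6"
    if w.needs_top_reasoning:
        return "deepseek-v4-pro"
    if w.lowest_price:
        return "deepseek-v4-flash"
    if w.context_bound:
        return "gemini-2.5-flash"
    # No special constraint: use the general default.
    return "qwen-3.6-plus"
```

Calling `start_here(Workload())` returns the default, and setting a single flag returns the matching "switch to" model.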

Quick picks

If you need…	First pick	Why	Second pick
One default model	`qwen-3.6-plus`	Best overall quality and safest default	`deepseek-v4-flash`
Best value	`deepseek-v4-flash`	V4-tier quality, 1M context, native reasoning, lowest V4 price	`deepseek-v3`
Top reasoning	`deepseek-v4-pro`	1.6T MoE flagship, 1M context, native reasoning	`deepseek-r1`
Deep reasoning	`deepseek-r1`	Best for logic, math, hard analysis	`deepseek-v4-pro`
Tool-heavy agents	`kimi-k2.6`	Strong tool use, long context, multimodal	`glm-5.1`
Long-running coding agents	`glm-5.1`	Better for repo-scale engineering and sustained execution	`minimax-m2.5`
Agentic coding	`minimax-m2.5`	Strong engineering workflow fit for typical coding agents	`glm-5.1`
Fast coding loops	`qwen-3-32b`	Lower latency while staying strong on code	`qwen-3-coder`
1M context	`gemini-2.5-flash`	Cheapest long-context option	`gemini-3-flash`
Vision / screenshots	`gemma-4-31b`	Cheapest solid multimodal option	`kimi-k2.6`
Cheap bulk automation	`glm-4.5-air`	Low-cost agentic path	`glm-4.7-flash`
Cheap long-context throughput	`glm-4.7-flash`	Fast, efficient, long context	`gemini-2.5-flash`
Default image generation	`recraft-v4`	#1 HF Arena, design-quality, $0.054	`flux-2-pro`
Photoreal / hero shots	`flux-2-pro`	BFL 32B, multi-reference, gen+edit	`recraft-v4-pro`
Multi-reference blend	`flux-2-pro`	Up to 10 source images via `image_urls`	—
Image edit / inpaint	`flux-kontext-pro`	Image-to-image editor, requires `image_url`	`flux-2-pro`
Logos, typography, packaging	`ideogram-v3`	Best legible-text image model	`recraft-v4-vector`
Native SVG / vector	`recraft-v4-vector`	True paths + layers, edit in Figma	`recraft-v4-vector-pro`
Print-ready design (4MP)	`recraft-v4-pro`	V4 quality at 4MP for print	`recraft-v4-vector-pro`
Hero-quality TTS (narration)	`eleven-multilingual-v2`	29 languages, expressive, brand-safe	`eleven-turbo-v2-5`
Real-time voice agent	`eleven-flash-v2-5`	~75ms time-to-first-byte, 32 languages	`eleven-turbo-v2-5`
Music generation	`elevenlabs-music`	Prompt-driven, lyrics support, 1s..5min	—
Sound effects	`elevenlabs-sfx`	Whoosh, explosion, ambient — 0.5..22s	—

These are decision shortcuts, not absolute rankings. If your workload changes, your best model changes too.
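Two image rows in the Quick picks table name request parameters: `image_urls` (up to 10 source images for `flux-2-pro`) and `image_url` (required by `flux-kontext-pro`). The sketch below only checks those two constraints while building a plain dictionary. The `model` and `prompt` field names and the function itself are assumptions made for illustration; the guide does not document the request shape, so check the API reference before relying on it.

```python
from typing import Any, Dict, List, Optional

MAX_REFERENCE_IMAGES = 10  # "Up to 10 source images" per the table


def build_image_request(
    model: str,
    prompt: str,
    image_url: Optional[str] = None,
    image_urls: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a request body as a dict; nothing is sent over the network.

    "model" and "prompt" are assumed field names, not confirmed by the guide.
    """
    body: Dict[str, Any] = {"model": model, "prompt": prompt}
    if model == "flux-kontext-pro":
        # The table says this image-to-image editor requires `image_url`.
        if not image_url:
            raise ValueError("flux-kontext-pro requires image_url")
        body["image_url"] = image_url
    elif model == "flux-2-pro" and image_urls:
        # The table says flux-2-pro blends up to 10 source images.
        if len(image_urls) > MAX_REFERENCE_IMAGES:
            raise ValueError("flux-2-pro accepts at most 10 image_urls")
        body["image_urls"] = list(image_urls)
    # For any other model, image inputs are ignored in this sketch.
    return body
```

Validating on the client side like this surfaces a missing or oversized image list before a request is made.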

Choose by constraint

I want the safest default

Pick qwen-3.6-plus. Use it when you want one model that is strong at general work, coding, reasoning, and multilingual tasks without having to weigh tradeoffs.

I care most about value

Pick deepseek-v4-flash. It is the best first stop when you want strong quality at lower cost: 1M context, native reasoning, MIT license. If you need an even cheaper lane for routine workloads, look at gpt-oss-120b or glm-4.5-air. If you already run a stable production workload on deepseek-v3, that model remains available.

I need top reasoning at flagship quality

Pick deepseek-v4-pro. It is the V4 flagship: 1.6T MoE, 1M context, native reasoning. It is the best fit for complex coding, multi-step analysis, and research-grade work where quality matters more than latency.

I need deep reasoning

Pick deepseek-r1. Use it for difficult analysis, logic, math, and multi-step planning where you want a slower, deeper chain-of-thought trace. If it feels too slow, fall back to deepseek-v4-pro or qwen-3.6-plus.
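The "too slow, fall back" advice can be automated with an ordered chain. This is a minimal sketch under stated assumptions: `call_model` is a hypothetical client function that you supply, it is assumed to raise `TimeoutError` when a model does not answer within `timeout_s`, and the 60-second default is arbitrary. The chain order follows the sentence above.

```python
from typing import Callable, Optional, Sequence, Tuple

# Order taken from the paragraph above: deep reasoning first, then fallbacks.
REASONING_CHAIN = ("deepseek-r1", "deepseek-v4-pro", "qwen-3.6-plus")


def ask_with_fallback(
    prompt: str,
    call_model: Callable[[str, str, float], str],
    models: Sequence[str] = REASONING_CHAIN,
    timeout_s: float = 60.0,
) -> Tuple[str, str]:
    """Try each model in order and return (model_used, answer).

    `call_model(model, prompt, timeout_s)` is a hypothetical caller-supplied
    function that raises TimeoutError when the model is too slow.
    """
    last_error: Optional[BaseException] = None
    for model in models:
        try:
            return model, call_model(model, prompt, timeout_s)
        except TimeoutError as exc:
            last_error = exc  # too slow: move on to the next model
    raise RuntimeError("all models in the chain timed out") from last_error
```

Returning the model name alongside the answer makes it easy to log how often the fallback path is taken.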

I need the best model for an agent

Start with kimi-k2.6. It is the best first pick when your workload uses tools, screenshots, or long multi-step sessions. If you want a text-only engineering alternative for repo-scale work, try glm-5.1.

I need a long-running coding agent

Start with glm-5.1. It is the better fit when the work is repo-scale, multi-file, and long-horizon. If your agent does typical day-to-day coding rather than sustained engineering execution, fall back to minimax-m2.5.

I need agentic coding specifically

Start with minimax-m2.5. It fits engineering workflows well for normal coding-agent usage. If you want a newer productivity-oriented variant, try minimax-m2.7. If you want a stronger long-horizon engineering model, move up to glm-5.1.

I need a fast coding model

Pick qwen-3-32b. It is the best fit for tight edit-run-debug loops. If you want a more code-specialized model with more context, try qwen-3-coder.

I need long context

Pick gemini-2.5-flash for the safest long-context default. If you want a newer preview path, use gemini-3-flash. If you want cheaper long-context throughput and do not need multimodal input, look at glm-4.7-flash.

I need vision or screenshot understanding

Start with gemma-4-31b. It is the cheapest strong multimodal option. If you also need stronger agent behavior, upgrade to kimi-k2.6.

I need the cheapest route for bulk work

Start with glm-4.5-air. It is the cheaper agentic lane for repeated automation tasks. If the workload is more about long context and throughput than agent behavior, try glm-4.7-flash.

Tradeoffs that matter

Model	Main strength	Main tradeoff
`qwen-3.6-plus`	Best overall default	Not the cheapest or fastest
`deepseek-v4-pro`	Top reasoning, 1M context, native reasoning	Premium price, preview stage
`deepseek-v4-flash`	Best value V4, 1M context, native reasoning	Preview stage
`deepseek-v3`	Stable previous-gen flagship	Smaller context, no native reasoning
`deepseek-r1`	Best chain-of-thought reasoning	Slower
`kimi-k2.6`	Best tool-heavy agent behavior	Premium cost
`minimax-m2.5`	Strong engineering workflows	Less general-purpose than Qwen flagship
`qwen-3-32b`	Fast coding	Shorter context
`gemini-2.5-flash`	1M context at good price	Not the strongest default for coding agents
`gemma-4-31b`	Cheap vision	Weaker than flagship text models on hard reasoning
`glm-5.1`	Long-running coding agents and repo-scale engineering	Premium lane, text-only, less battle-tested in Kyma than Qwen/Kimi
`glm-4.5-air`	Cheap agentic bulk tasks	Lower ceiling than flagship models
`glm-4.7-flash`	Cheap long-context throughput	Preview-stage model
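Several rows in the table above list "preview stage" as a tradeoff. For production workloads, one option is to skip preview models when walking a candidate list. The sketch below copies a subset of the tradeoff text from the table; the helper names are hypothetical, and treating models missing from the dictionary as non-preview is a simplifying assumption.

```python
from typing import Iterable, Optional

# Tradeoff text copied from the table above (subset of rows).
TRADEOFFS = {
    "qwen-3.6-plus": "Not the cheapest or fastest",
    "deepseek-v4-pro": "Premium price, preview stage",
    "deepseek-v4-flash": "Preview stage",
    "deepseek-v3": "Smaller context, no native reasoning",
    "deepseek-r1": "Slower",
    "glm-4.7-flash": "Preview-stage model",
}


def is_preview(model: str) -> bool:
    """True if the table marks the model as preview stage.

    Models missing from TRADEOFFS are treated as non-preview (an assumption).
    """
    text = TRADEOFFS.get(model, "").lower().replace("-", " ")
    return "preview stage" in text


def first_stable(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that is not preview stage, or None."""
    for model in candidates:
        if not is_preview(model):
            return model
    return None
```

For example, passing the "Best value" first and second picks, `["deepseek-v4-flash", "deepseek-v3"]`, skips the preview model and selects `deepseek-v3`.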

Use by task

Task	First pick	Fallback
General chat / assistant	`qwen-3.6-plus`	`deepseek-v4-flash`
Coding assistant	`qwen-3.6-plus`	`qwen-3-32b`
Autonomous coding agent	`kimi-k2.6`	`minimax-m2.5`
Repo-scale engineering work	`glm-5.1`	`deepseek-v4-pro`
Math / reasoning / hard analysis	`deepseek-v4-pro`	`deepseek-r1`
Long document summarization	`gemini-2.5-flash`	`glm-4.7-flash`
Data extraction / structured output	`qwen-3-32b`	`glm-4.5-air`
Screenshot / image understanding	`gemma-4-31b`	`kimi-k2.6`
Cheap automation	`glm-4.5-air`	`gpt-oss-120b`
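The task table above maps directly onto a routing dictionary. This sketch is illustrative only: the task keys are shortened labels chosen here, and the `unavailable` set is a hypothetical way to express that a first pick cannot be used (for example, because of budget or an outage).

```python
from typing import AbstractSet, Dict, Tuple

# (first pick, fallback) pairs copied from the table above.
TASK_ROUTES: Dict[str, Tuple[str, str]] = {
    "general chat": ("qwen-3.6-plus", "deepseek-v4-flash"),
    "coding assistant": ("qwen-3.6-plus", "qwen-3-32b"),
    "autonomous coding agent": ("kimi-k2.6", "minimax-m2.5"),
    "repo-scale engineering": ("glm-5.1", "deepseek-v4-pro"),
    "hard reasoning": ("deepseek-v4-pro", "deepseek-r1"),
    "long document summarization": ("gemini-2.5-flash", "glm-4.7-flash"),
    "structured output": ("qwen-3-32b", "glm-4.5-air"),
    "image understanding": ("gemma-4-31b", "kimi-k2.6"),
    "cheap automation": ("glm-4.5-air", "gpt-oss-120b"),
}


def route(task: str, unavailable: AbstractSet[str] = frozenset()) -> str:
    """Return the first pick for a task, or its fallback if the pick is out."""
    first, fallback = TASK_ROUTES[task]  # KeyError for unknown tasks
    if first not in unavailable:
        return first
    if fallback not in unavailable:
        return fallback
    raise LookupError(f"no available model for task: {task}")
```

Keeping the table in one dictionary means a change to the recommendations only needs to be made in one place.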

Still not sure?

Use the alias "best" for qwen-3.6-plus
Use the alias "agent" for kimi-k2.6
Use the alias "reasoning" for deepseek-r1
Use the alias "long-context" for gemini-2.5-flash
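If you want to know which concrete model an alias is expected to reach (for logging or cost tracking, say), the list above can be mirrored in a small lookup. This is a client-side illustration only; how the platform resolves aliases is not described here, and the targets below are a snapshot of this page that may change.

```python
# Alias targets as listed above; treat the live catalog as canonical.
ALIASES = {
    "best": "qwen-3.6-plus",
    "agent": "kimi-k2.6",
    "reasoning": "deepseek-r1",
    "long-context": "gemini-2.5-flash",
}


def resolve(name: str) -> str:
    """Map an alias to its model ID; pass concrete model IDs through unchanged."""
    return ALIASES.get(name, name)
```

Because unknown names pass through unchanged, the same function accepts either an alias or a concrete model ID.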

See model aliases and all models for the canonical live catalog.
