
Documentation Index

Fetch the complete documentation index at: https://docs.kymaapi.com/llms.txt

Use this file to discover all available pages before exploring further.

Kyma currently serves 50 models across language, image, video, and audio. All models are verified working and accessible through the same /v1 API. Language models are pay-per-token; image and video are flat per-call or pay-per-second depending on SKU; audio is per-character (TTS), per-minute (transcription / understand), or flat per call (music, voice clone, voice design). Live list and pricing available via GET /v1/models and GET /v1/credits/pricing.
curl https://kymaapi.com/v1/models

How to choose quickly

  • Start with qwen-3.6-plus if you want the best default for general work, coding, and reasoning.
  • Use kimi-k2.6 for tool-heavy agents, long coding sessions, and image-aware workflows.
  • Use deepseek-v4-pro for top reasoning and complex coding with 1M context.
  • Use deepseek-v4-flash when you want V4-family quality at the cheapest price.
  • Use gemini-2.5-flash when you need 1M context or cheap long-context throughput.
  • Use qwen-3-32b when latency matters and you still want strong coding quality.
  • Use glm-5.1 when you need a long-running coding agent for repo-scale engineering work.
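The picks above plug straight into the chat endpoint. A minimal sketch, assuming the /v1 API follows the common OpenAI-style `/v1/chat/completions` path with Bearer auth and a `model` + `messages` payload (the path and payload shape are assumptions, not confirmed by this page):

```shell
# Build a minimal chat request for the default pick.
# NOTE: the /v1/chat/completions path and payload fields are assumed
# (OpenAI-style); check the live docs before relying on them.
PAYLOAD='{
  "model": "qwen-3.6-plus",
  "messages": [{"role": "user", "content": "Summarize this repo in one line."}]
}'
echo "$PAYLOAD"

# Send only when an API key is configured.
if [ -n "${KYMA_API_KEY:-}" ]; then
  curl -s https://kymaapi.com/v1/chat/completions \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```

Swapping `model` for any ID in the tables below is the only change needed to try a different pick.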

Filter the catalog

The live GET /v1/models endpoint now supports capability filters so agents can select models programmatically instead of hardcoding a shortlist.
# Tool-capable models for coding agents
curl "https://kymaapi.com/v1/models?recommended_for=coding&tools=true&supported_parameters=tools,structured_outputs"

# Fast, cheap models with at least 128K context
curl "https://kymaapi.com/v1/models?latency_tier=fast&cost_tier=cheap&min_context_window=128000"

# Vision-capable models
curl "https://kymaapi.com/v1/models?vision=true&input_modalities=text,image"
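After filtering server-side, an agent still has to pick one ID out of the response. A minimal client-side sketch, assuming the response is a JSON object with a `data` array of `{"id": ...}` entries (the OpenAI-style list shape, not confirmed by this page), using python3 in place of jq:

```shell
# Sample response in the assumed OpenAI-style list shape.
RESPONSE='{"data":[{"id":"kimi-k2.6"},{"id":"qwen-3-coder"}]}'

# Take the first matching model ID; the server already ranked by the filters.
MODEL=$(printf '%s' "$RESPONSE" | python3 -c \
  'import json, sys; print(json.load(sys.stdin)["data"][0]["id"])')
echo "$MODEL"   # kimi-k2.6

# In practice, replace the sample with the live call, e.g.:
# RESPONSE=$(curl -s "https://kymaapi.com/v1/models?recommended_for=coding&tools=true")
```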

Qwen 3.6 Plus

#1 most popular. Closed-weight, highest quality overall. 131K context.
model="qwen-3.6-plus"

DeepSeek V4 Flash

Best value V4. 1M context, MIT, native reasoning. $0.19/M input.
model="deepseek-v4-flash"

DeepSeek V4 Pro

Top reasoning. 1.6T MoE flagship, 1M context, complex coding.
model="deepseek-v4-pro"

Kimi K2.6

Best for agents. Multimodal agentic model. 262K context.
model="kimi-k2.6"

Capability Guide

| Need | Best first pick | Why |
|---|---|---|
| General default | qwen-3.6-plus | Best overall quality, strong multilingual reasoning |
| Tool-heavy agents | kimi-k2.6 | Strong tool use, long context, multimodal |
| Top reasoning | deepseek-v4-pro | 1.6T MoE flagship, 1M context, native reasoning |
| Best value | deepseek-v4-flash | V4-tier quality at the lowest price, 1M context |
| Long-running coding agents | glm-5.1 | Better fit for repo-scale engineering and multi-step execution |
| Fast coding | qwen-3-32b | Lower latency while staying strong on code and math |
| 1M context | gemini-2.5-flash | Cheapest long-context option on Kyma |
| Vision | gemma-4-31b | Reliable image + text workflows |
| Image generation | recraft-v4 | #1 HF Arena. Design-quality default for brand and illustration. |
| Video generation | kling-3-pro | Premium cinematic clips, hero brand video |

Tier 1 — Highest Quality

| Model ID | Name | Context | Speed | Best For |
|---|---|---|---|---|
| qwen-3.6-plus | Qwen 3.6 Plus | 131K | Medium | General, #1 traffic |
| deepseek-v4-pro | DeepSeek V4 Pro | 1M | Medium | Top reasoning, complex coding |
| deepseek-v4-flash | DeepSeek V4 Flash | 1M | Fast | Best value, long context |
| deepseek-v3 | DeepSeek V3 | 160K | Medium | Previous-gen flagship, stable |
| deepseek-r1 | DeepSeek R1 | 64K | Slow | Reasoning, analysis |
| kimi-k2.6 | Kimi K2.6 | 262K | Medium | Agentic coding, multimodal |
| gemma-4-31b | Gemma 4 31B | 128K | Medium | Multimodal, vision |
| qwen-3-32b | Qwen 3 32B | 32K | Fast | Code, math, multilingual |
| llama-3.3-70b | Llama 3.3 70B | 128K | Fast | General, most popular open model |
| minimax-m2.5 | MiniMax M2.5 | 196K | Medium | Agentic coding (SWE-bench 80.2%) |
| glm-5.1 | GLM 5.1 | 203K | Medium | Long-running coding agents, repo-scale engineering |

Tier 2 — High Quality

| Model ID | Name | Context | Speed | Best For |
|---|---|---|---|---|
| minimax-m2.7 | MiniMax M2.7 | 205K | Medium | Agentic coding, productivity, debugging |
| gpt-oss-120b | GPT-OSS 120B | 128K | Medium | Writing, general intelligence |
| qwen-3-coder | Qwen 3 Coder | 131K | Medium | Code generation |
| gemini-2.5-flash | Gemini 2.5 Flash | 1M | Fast | Long context |
| gemini-3-flash | Gemini 3 Flash | 1M | Fast | Ultra-long context |
| glm-4.5-air | GLM 4.5 Air | 131K | Fast | Cheap agentic bulk tasks |
| glm-4.7-flash | GLM 4.7 Flash | 203K | Fast | Cheap long-context throughput |

Image Generation

Async endpoint at POST /v1/images/generations — pay per image, no token billing. See the Image Generation guide for prompting tips and full examples.
| Model | Best For | Cost / image | Input |
|---|---|---|---|
| recraft-v4 | Default — design-quality, brand assets | $0.054 | text |
| recraft-v4-pro | 4MP print-ready design | $0.338 | text |
| recraft-v4-vector | Native SVG — logos, icons | $0.108 | text |
| recraft-v4-vector-pro | 4MP SVG, print-ready | $0.405 | text |
| flux-2-pro | Photoreal, multi-reference blend | $0.041–0.101 | text + image(s) |
| flux-kontext-pro | Image edit, inpaint, refine | $0.054 | text + image |
| ideogram-v3 | Typography, logos, packaging | $0.108 | text |
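Submitting a job follows the same pattern as the other endpoints. A minimal sketch, assuming the async endpoint accepts a JSON body with `model` and `prompt` fields and returns a job object (field names are assumptions; see the Image Generation guide for the confirmed schema):

```shell
# Kick off an async image generation job.
# NOTE: the "model" and "prompt" field names are assumed, not confirmed
# by this page; consult the Image Generation guide for the real schema.
PAYLOAD='{
  "model": "recraft-v4",
  "prompt": "Flat two-color vector logo of a mountain at sunrise"
}'
echo "$PAYLOAD"

# Submit only when an API key is configured; the response is a job to poll.
if [ -n "${KYMA_API_KEY:-}" ]; then
  curl -s -X POST https://kymaapi.com/v1/images/generations \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```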

Video Generation

Async endpoint at POST /v1/videos/generations — pay per second of generated footage. See the Video Generation guide for prompting tips and the full per-SKU breakdown.
| Model | Best For | Cost / sec | 5s clip | Audio | Input |
|---|---|---|---|---|---|
| kling-2.5-pro | Budget cinematic, b-roll | $0.0945 | $0.4725 | | text + image |
| kling-3-pro | Premium cinematic, hero video | $0.1512 | $0.7560 | | text + image |
| kling-3-pro-audio | Cinematic w/ diegetic sound | $0.2268 | $1.1340 | native | text + image |
| seedance-2-pro | Action, multi-shot, social | $0.40959 | $2.04795 | bundled | text + image |
| seedance-2-fast | Social shorts, rapid iteration | $0.326565 | $1.63283 | bundled | text + image |
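Because billing is per second of footage, it is worth fixing the clip length in the request. A minimal sketch, assuming the async endpoint accepts `model`, `prompt`, and a `duration` field in seconds (all three field names are assumptions; the Video Generation guide has the confirmed per-SKU schema):

```shell
# Kick off an async video generation job for a 5-second clip.
# NOTE: "model", "prompt", and "duration" are assumed field names;
# check the Video Generation guide before relying on them.
PAYLOAD='{
  "model": "kling-3-pro",
  "prompt": "Slow dolly-in on a lighthouse at dusk, cinematic lighting",
  "duration": 5
}'
echo "$PAYLOAD"

# Submit only when an API key is configured; poll the returned job for results.
if [ -n "${KYMA_API_KEY:-}" ]; then
  curl -s -X POST https://kymaapi.com/v1/videos/generations \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```

At the listed $0.1512/sec rate, the 5-second request above would cost $0.7560, matching the "5s clip" column.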
For the live canonical list, use GET /v1/models; disabled models are intentionally omitted from this page, and the catalog is updated regularly.