Documentation Index
Fetch the complete documentation index at: https://docs.kymaapi.com/llms.txt
Use this file to discover all available pages before exploring further.
Pricing & access
Is Kyma really free?
Is Kyma really free?
What happens when I hit the rate limit?
What happens when I hit the rate limit?
429 response. Wait a few seconds and retry. Limits are tier-based: Tier 0 = 30 RPM, up to Tier 4 = 300 RPM. See Rate Limits.Can I increase my rate limit?
Can I increase my rate limit?
Models & capabilities
Which model should I use?
Which model should I use?
qwen-3.6-plus if you want the safest default. Use kimi-k2.6 for tool-heavy agents, deepseek-r1 for hard reasoning, and gemini-2.5-flash for long-context tasks. See Which model should I use? for the full decision guide.Are these models truly open source?
Are these models truly open source?
Do you support vision or multimodal inputs?
Do you support vision or multimodal inputs?
gemma-4-31b is the cheapest strong multimodal option, and kimi-k2.6 is the better pick if you also need tool-heavy agent behavior.Can I generate images?
Can I generate images?
flux-1.1-ultra (cinematic photo), flux-kontext-pro (image edit), ideogram-v3 (typography / logos), and recraft-v3 (vector / illustration) — through async POST /v1/images/generations. Pricing is per-image, 0.108 depending on model. See the Image Generation guide for prompts and examples.What is the maximum context window?
What is the maximum context window?
GET /v1/models for the live catalog and context windows.Technical limits
Does Kyma have a hidden output token limit?
Does Kyma have a hidden output token limit?
Are models on Kyma different from the original?
Are models on Kyma different from the original?
Why did my response get cut short?
Why did my response get cut short?
- You didn’t set
max_tokens— defaults to 4096 tokens. Set it higher for longer outputs (e.g.,max_tokens: 8192). - Model’s output limit — each model has a maximum output capacity (check
max_output_tokensin/v1/models). For long generation, use models likeglm-5.1(65K) ordeepseek-r1(32K). finish_reason: length— the model hit its limit naturally. Increasemax_tokensor use a model with higher output capacity.
finish_reason is length, the model itself stopped.What is the default max_tokens if I don't set it?
What is the default max_tokens if I don't set it?
max_tokens explicitly in your request if you need longer outputs. Kyma forwards your value directly to the model.Where can I see all model limits?
Where can I see all model limits?
GET /v1/models — each model includes max_output_tokens, context_window, gateway_output_limit (always null), and max_tokens_passthrough (always true). For a gateway-level overview, call GET /v1/capabilities.Compatibility
Is Kyma compatible with the OpenAI SDK?
Is Kyma compatible with the OpenAI SDK?
base_url to https://kymaapi.com/v1 and use your ky- API key. All OpenAI SDK features work.Does Kyma work with the Anthropic SDK?
Does Kyma work with the Anthropic SDK?
/v1/messages. See our Anthropic guide.Can I use Kyma with LangChain?
Can I use Kyma with LangChain?
Can I use Kyma in Cursor, Claude Code, or Continue.dev?
Can I use Kyma in Cursor, Claude Code, or Continue.dev?
ky- API key and https://kymaapi.com/v1 as the base URL.Can I use Kyma with Google Antigravity?
Can I use Kyma with Google Antigravity?
Reliability
What happens when a model's provider is down?
What happens when a model's provider is down?
X-Kyma-Fallback response header to know if a fallback was used.What are the X-Kyma-Fallback headers?
What are the X-Kyma-Fallback headers?
X-Kyma-Fallback: true— a fallback was usedX-Kyma-Fallback-Layer: 1|2|3— 1 = same model different source, 3 = different model
X-Kyma-Fallback-Layer to decide whether to retry. Layer 1 means you got the same model you asked for; Layer 3 means an equivalent-quality substitute.What is Kyma's uptime?
What is Kyma's uptime?
API keys & account
How do API keys work?
How do API keys work?
ky-. You can create multiple keys in the Dashboard. Each key shares the same account balance and rate limits.Is my data private?
Is my data private?
Can I use this in production?
Can I use this in production?