FAQ

Pricing & access

Is Kyma really free?

Yes to start. You get

0.50 free credits on signup. No credit card required. That's enough to try Kyma on lower-cost models. For heavier usage, buy credits starting at

What happens when I hit the rate limit?

You’ll get a 429 response. Wait a few seconds and retry. Limits are tier-based: Tier 0 = 30 RPM, up to Tier 4 = 300 RPM. See Rate Limits.

Can I increase my rate limit?

Yes. Rate limits increase automatically as you purchase credits. Tier 1 unlocks at $10 purchased (60 RPM). See Rate Limits for all tiers.

Models & capabilities

Which model should I use?

Start with qwen-3.6-plus if you want the safest default. Use kimi-k2.6 for tool-heavy agents, deepseek-r1 for hard reasoning, and gemini-2.5-flash for long-context tasks. See Which model should I use? for the full decision guide.

Are these models truly open source?

Yes. All models served by Kyma are open-source or open-weight models from Meta, Alibaba, Google, OpenAI, and others. Kyma itself is a managed service — not open source.

Do you support vision or multimodal inputs?

Yes. gemma-4-31b is the cheapest strong multimodal option, and kimi-k2.6 is the better pick if you also need tool-heavy agent behavior.

Can I generate images?

Yes. Kyma serves four image-generation models — flux-1.1-ultra (cinematic photo), flux-kontext-pro (image edit), ideogram-v3 (typography / logos), and recraft-v3 (vector / illustration) — through async POST /v1/images/generations. Pricing is per-image,

0.054 to

0.108 depending on model. See the Image Generation guide for prompts and examples.

What is the maximum context window?

Up to 1M tokens with Gemini models. Many other active models support 128K-262K. Check GET /v1/models for the live catalog and context windows.

Technical limits

Does Kyma have a hidden output token limit?

No. Kyma does not impose any output token cap on the gateway. The max_tokens parameter from your request is forwarded directly to the model. The max_output_tokens value shown in /v1/models reflects the model creator’s published specification, not a Kyma restriction. You can verify this via GET /v1/capabilities.

Are models on Kyma different from the original?

No. Kyma serves the exact same model weights published by the creator (Alibaba, Google, DeepSeek, Meta, etc.). The models run on high-performance inference infrastructure. Kyma does not fine-tune, quantize, or modify any model.

Why did my response get cut short?

Three common causes:

You didn’t set max_tokens — defaults to 4096 tokens. Set it higher for longer outputs (e.g., max_tokens: 8192).
Model’s output limit — each model has a maximum output capacity (check max_output_tokens in /v1/models). For long generation, use models like glm-5.1 (65K) or deepseek-r1 (32K).
finish_reason: length — the model hit its limit naturally. Increase max_tokens or use a model with higher output capacity.

Kyma never truncates output. If finish_reason is length, the model itself stopped.

What is the default max_tokens if I don't set it?

4096 tokens. Always set max_tokens explicitly in your request if you need longer outputs. Kyma forwards your value directly to the model.

Where can I see all model limits?

Call GET /v1/models — each model includes max_output_tokens, context_window, gateway_output_limit (always null), and max_tokens_passthrough (always true). For a gateway-level overview, call GET /v1/capabilities.

Compatibility

Is Kyma compatible with the OpenAI SDK?

Yes. Just change base_url to https://kymaapi.com/v1 and use your ky- API key. All OpenAI SDK features work.

Does Kyma work with the Anthropic SDK?

Yes. Kyma supports the Anthropic Messages API at /v1/messages. See our Anthropic guide.

Can I use Kyma with LangChain?

Yes. Use the OpenAI provider in LangChain with Kyma’s base URL.

Can I use Kyma in Cursor, Claude Code, or Continue.dev?

Yes. Configure Kyma as an OpenAI-compatible provider with your ky- API key and https://kymaapi.com/v1 as the base URL.

Can I use Kyma with Google Antigravity?

The official Google Antigravity IDE does not support custom API endpoints — it only works with Gemini, Claude Sonnet, and GPT-OSS via Google’s auth. However, you can use the community open-antigravity fork which supports custom OpenAI-compatible endpoints like Kyma. See our Antigravity guide for setup instructions.

Reliability

What happens when a model's provider is down?

Kyma automatically retries your request on backup infrastructure. Most outages are invisible to you — your request succeeds with the same model from a different source. If the exact model is unavailable everywhere, Kyma can substitute an equivalent-quality model. Check the X-Kyma-Fallback response header to know if a fallback was used.

What are the X-Kyma-Fallback headers?

When a fallback occurs, these headers are included in the response:

X-Kyma-Fallback: true — a fallback was used
X-Kyma-Fallback-Layer: 1|2|3 — 1 = same model different source, 3 = different model

Use X-Kyma-Fallback-Layer to decide whether to retry. Layer 1 means you got the same model you asked for; Layer 3 means an equivalent-quality substitute.

What is Kyma's uptime?

Kyma runs on redundant infrastructure with auto-failover. Individual provider outages don’t affect you because requests are automatically rerouted. Check Status for real-time model availability.

API keys & account

How do API keys work?

Keys start with ky-. You can create multiple keys in the Dashboard. Each key shares the same account balance and rate limits.

Is my data private?

Kyma does not store your prompts or responses. We only log metadata (model, latency, tokens) for usage tracking.

Can I use this in production?

Yes. Higher tiers unlock automatically as you purchase credits. Tier 1+ gets 60–300 RPM, credits never expire, and you can switch models without changing your integration.

Getting Started

More

Pricing & access

Models & capabilities

Technical limits

Compatibility

Reliability

API keys & account

Getting Started

More

Documentation Index

​Pricing & access

​Models & capabilities

​Technical limits

​Compatibility

​Reliability

​API keys & account

Pricing & access

Models & capabilities

Technical limits

Compatibility

Reliability

API keys & account