Documentation Index
Fetch the complete documentation index at: https://docs.kymaapi.com/llms.txt
Use this file to discover all available pages before exploring further.
Best Model for This
| Model | Why | Cost per 1K messages |
|---|---|---|
qwen-3.6-plus | Best quality, handles nuance well | ~$0.30 |
llama-3.3-70b | Fastest response, great all-rounder | ~$0.80 |
gemini-2.5-flash | Cheapest at scale, 1M context | ~$0.20 |
Quick Start
Tips & Best Practices
- Stream always — users perceive streamed responses as 3-5x faster even at the same token speed.
- Cap history length — trim
messagesto the last 10-20 turns or ~4K tokens to keep latency low and costs predictable. - System prompt sets personality — define tone, scope, and any constraints in the first system message.
- Use
qwen-3.6-plusfor quality,llama-3.3-70bfor speed — swap the model string; the code is identical.
Cost Estimate
| Volume | Model | Monthly cost |
|---|---|---|
| 1K messages/day | qwen-3.6-plus | ~$9/month |
| 1K messages/day | llama-3.3-70b | ~$24/month |
| 10K messages/day | gemini-2.5-flash | ~$60/month |
Next Steps
- Streaming — deeper streaming patterns
- Model Aliases — use
best,fast,balancedshortcuts - Prompt Caching — cache system prompts to cut costs by up to 90%