Overview
llama-3.3-70b is the balanced open-model choice on Kyma. It is not the absolute best at any one thing, but it is dependable, broadly useful, and easy to justify as a default in cost-sensitive systems.
Specs
| Field | Value |
|---|---|
| Model ID | llama-3.3-70b |
| Best for | General work, coding, balanced workloads |
| Context window | 128K |
| Max output tokens | 8K |
| Input modalities | Text |
| Output modalities | Text |
| Tool calling | Yes |
| Structured outputs | Yes |
| Prompt caching | Yes |
| Speed | Medium |
| Cost band | Balanced |
| Release stage | Stable |
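The context window and max output limits in the table interact when you size a request. The following is a minimal sketch of that arithmetic, not Kyma's API. It assumes "128K" and "8K" mean 128 × 1024 and 8 × 1024 tokens, and that output tokens count against the context window. Neither assumption is confirmed by the table. `ModelLimits` and `clamp_max_output` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    """Token limits for one model (hypothetical helper type)."""
    model_id: str
    context_window: int      # assumed to cover prompt + output tokens
    max_output_tokens: int


# Assumption: "128K" and "8K" are read as 128 * 1024 and 8 * 1024 tokens.
LLAMA_3_3_70B = ModelLimits("llama-3.3-70b", 128 * 1024, 8 * 1024)


def clamp_max_output(limits: ModelLimits, prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output budget that respects both limits.

    Raises ValueError if the inputs are invalid or the prompt alone
    already fills the context window.
    """
    if prompt_tokens < 0 or requested_output <= 0:
        raise ValueError("prompt_tokens must be >= 0 and requested_output must be > 0")
    remaining = limits.context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    # The output budget is capped by the request, the model's output limit,
    # and whatever room the prompt leaves in the context window.
    return min(requested_output, limits.max_output_tokens, remaining)
```

Under these assumptions, a long prompt can reduce the usable output budget below the 8K cap, so it may be worth clamping before sending a request rather than relying on the server to reject it.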
Use this when
- You want a broadly useful open model without flagship pricing.
- You need a balanced baseline for general workloads.
- You want a good fallback for mixed use cases.
Pick something else when
- You want the best overall default: use qwen-3.6-plus.
- You want better value for reasoning/coding: use deepseek-v3.
- You want lower latency coding loops: use qwen-3-32b.
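The selection guidance above can be expressed as a small lookup. This is only an illustrative sketch: the model IDs come from this page, but the need labels (`best_overall`, `reasoning_value`, `low_latency_coding`) and the `pick_model` function are hypothetical and not part of any Kyma interface.

```python
from typing import Optional

# Balanced default described on this page.
DEFAULT_MODEL = "llama-3.3-70b"

# Hypothetical need labels mapped to the alternatives listed above.
ALTERNATIVES = {
    "best_overall": "qwen-3.6-plus",
    "reasoning_value": "deepseek-v3",
    "low_latency_coding": "qwen-3-32b",
}


def pick_model(need: Optional[str] = None) -> str:
    """Return the suggested model ID for a need, falling back to the default."""
    return ALTERNATIVES.get(need, DEFAULT_MODEL)
```

For example, `pick_model("low_latency_coding")` returns `"qwen-3-32b"`, while `pick_model()` or an unrecognized label returns `"llama-3.3-70b"`, which matches its role as a fallback for mixed use cases.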