Overview

llama-3.3-70b is the balanced open-model choice on Kyma. It is not the absolute best at any one thing, but it is dependable, broadly useful, and easy to justify as a default in cost-sensitive systems.

Specs

Field               Value
Model ID            llama-3.3-70b
Best for            General work, coding, balanced workloads
Context window      128K tokens
Max output tokens   8K tokens
Input modalities    Text
Output modalities   Text
Tool calling        Yes
Structured outputs  Yes
Prompt caching      Yes
Speed               Medium
Cost band           Balanced
Release stage       Stable

Use this when

  • You want a broadly useful open model without flagship pricing.
  • You need a balanced baseline for general workloads.
  • You want a good fallback for mixed use cases.

Pick something else when

  • You want the best overall default: use qwen-3.6-plus.
  • You want better value for reasoning/coding: use deepseek-v3.
  • You want lower latency coding loops: use qwen-3-32b.

Example

from openai import OpenAI

# Point the OpenAI client at Kyma's OpenAI-compatible endpoint.
client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "user",
            "content": "Summarize this design proposal and suggest three improvements.",
        }
    ],
)

print(response.choices[0].message.content)