Overview

llama-3.3-70b is the balanced open-model choice on Kyma. It is not the absolute best at any one thing, but it is dependable, broadly useful, and easy to justify as a default in cost-sensitive systems.

Specs

Field               Value
Model ID            llama-3.3-70b
Best for            General work, coding, balanced workloads
Context window      128K tokens
Max output tokens   8K tokens
Input modalities    Text
Output modalities   Text
Tool calling        Yes
Structured outputs  Yes
Prompt caching      Yes
Speed               Medium
Cost band           Balanced
Release stage       Stable

Use this when

  • You want a broadly useful open model without flagship pricing.
  • You need a balanced baseline for general workloads.
  • You want a good fallback for mixed use cases.

Pick something else when

  • You want the best overall default: use qwen-3.6-plus.
  • You want better value for reasoning/coding: use deepseek-v3.
  • You want lower latency coding loops: use qwen-3-32b.

Example

from openai import OpenAI

# Point the OpenAI client at Kyma's OpenAI-compatible endpoint.
client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "user",
            "content": "Summarize this design proposal and suggest three improvements.",
        }
    ],
)

print(response.choices[0].message.content)