Skip to main content

Decision Tree

Not sure where to start? Follow this guide:
β†’ Llama 3.3 70B (llama-3.3-70b)The most popular open source model. Great at coding, reasoning, writing, and general tasks. 128K context window. Ultra-fast via Kyma.
model="llama-3.3-70b"
β†’ Qwen 3 32B (qwen-3-32b)Top coding model. Excellent at code generation, debugging, math, and multilingual tasks. 32K context.
model="qwen-3-32b"
β†’ Qwen 3 235B (qwen-3-235b-cerebras)The largest model available on Kyma. 235 billion parameters. Best for complex analysis, research, and nuanced reasoning. Still ultra-fast.
model="qwen-3-235b-cerebras"
β†’ Llama 4 Scout (llama-4-scout)512K context window β€” can process entire books, codebases, or large datasets in a single request.
model="llama-4-scout"
β†’ Gemma 4 31B (gemma-4-31b)Google’s newest open model. Supports image understanding. 128K context.
model="gemma-4-31b"
β†’ Llama 3.1 8B (llama-3.1-8b)Smallest model, lowest latency. Perfect for classification, extraction, or simple Q&A where speed matters most.
model="llama-3.1-8b"
β†’ Kimi K2 (kimi-k2)Purpose-built for agentic coding and tool use. Excels at multi-step reasoning and function calling.
model="kimi-k2"

Full Model Comparison

ModelContextSpeedBest ForQuality
⭐ Llama 3.3 70B128KFastGeneral, code, reasoning⭐⭐⭐⭐⭐
⭐ Qwen 3 32B32KFastCode, math, multilingual⭐⭐⭐⭐⭐
⭐ Qwen 3 235B32KFastComplex reasoning⭐⭐⭐⭐⭐
⭐ Gemma 4 31B128KMediumMultimodal, vision⭐⭐⭐⭐
πŸ”₯ Llama 4 Scout512KFastLong documents⭐⭐⭐⭐
πŸ”₯ Kimi K2128KFastAgentic coding⭐⭐⭐⭐
πŸ”₯ GPT-OSS 120B128KMediumWriting, general⭐⭐⭐⭐
πŸ”₯ Gemma 4 26B MoE128KFastEfficient general⭐⭐⭐⭐
Gemini 3 Flash1MMediumUltra-long context⭐⭐⭐⭐
Gemini 2.5 Flash1MMediumLong context⭐⭐⭐
GPT-OSS 20B128KFastSimple tasks⭐⭐⭐
Gemma 3 27B128KMediumGeneral⭐⭐⭐
Llama 3.1 8B8KFastestQuick tasks⭐⭐⭐
Llama 3.1 8B (alt)8KFastestQuick tasks⭐⭐⭐
Start with Llama 3.3 70B β€” it’s the best default choice for most applications. Switch to a specialized model only when you have a specific need.

Use by task

TaskRecommended ModelWhy
ChatbotLlama 3.3 70BBest general quality + fast
Code generationQwen 3 32BTop coding benchmarks
Code reviewKimi K2Built for agentic code tasks
SummarizationLlama 4 Scout512K context for long docs
Data extractionLlama 3.1 8BFast + structured output
Creative writingGPT-OSS 120BStrong writing capabilities
Research analysisQwen 3 235BHighest reasoning quality
Image understandingGemma 4 31BMultimodal support