Call GET /v1/models to retrieve the latest list programmatically.
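A minimal sketch of calling this endpoint from Python, using only the standard library. The base URL, the Bearer-token header, and the OpenAI-style response shape (`{"data": [{"id": ...}]}`) are assumptions, not confirmed by this page; the function names `list_models` and `extract_model_ids` are hypothetical. Adjust them to match the actual API.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model IDs out of a decoded /v1/models response.

    Assumption: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str, api_key: str) -> list[str]:
    """Send GET {base_url}/v1/models and return the model IDs.

    Assumption: the API accepts an "Authorization: Bearer <key>" header.
    """
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_model_ids(json.load(response))
```

Calling `list_models("https://api.example.com", "YOUR_API_KEY")` with your real base URL and key would return the IDs currently served, which may differ from the static table on this page.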
Recommended Models ⭐
These models offer the best balance of quality, speed, and capabilities.
Llama 3.3 70B
Best all-rounder. General tasks, coding, reasoning. 128K context. Ultra-fast.
Qwen 3 32B
Best for coding. Code generation, math, multilingual. 32K context.
Gemma 4 31B
Best multimodal. Vision capable, Google’s newest. 128K context.
Qwen 3 235B
Highest quality. 235B params, complex reasoning. Ultra-fast.
All Models
| Model ID | Name | Context | Speed | Best For |
|---|---|---|---|---|
| llama-3.3-70b | Llama 3.3 70B | 128K | ⚡ Fast | General, code, reasoning |
| llama-4-scout | Llama 4 Scout 17B | 512K | ⚡ Fast | Long documents, analysis |
| llama-3.1-8b | Llama 3.1 8B | 8K | ⚡⚡ Fastest | Quick tasks, classification |
| qwen-3-32b | Qwen 3 32B | 32K | ⚡ Fast | Code, math, multilingual |
| kimi-k2 | Kimi K2 | 128K | ⚡ Fast | Agentic coding, tool use |
| gpt-oss-120b | GPT-OSS 120B | 128K | Medium | General intelligence, writing |
| gpt-oss-20b | GPT-OSS 20B | 128K | ⚡ Fast | Fast general tasks |
| gemma-4-31b | Gemma 4 31B | 128K | Medium | Multimodal, vision |
| gemma-4-26b-moe | Gemma 4 26B MoE | 128K | ⚡ Fast | Efficient inference |
| gemini-3-flash | Gemini 3 Flash | 1M | Medium | Ultra-long context |
| gemini-2.5-flash | Gemini 2.5 Flash | 1M | Medium | Long context |
| gemma-3-27b | Gemma 3 27B | 128K | Medium | General |
| llama-3.1-8b-cerebras | Llama 3.1 8B | 8K | ⚡⚡ Fastest | Ultra-fast inference |
| qwen-3-235b-cerebras | Qwen 3 235B | 32K | ⚡ Fast | Complex reasoning |
Models are updated regularly. New models are added as they become available from providers.
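As a rough illustration of choosing a model by context window, the sketch below copies the Context column from the table into a dictionary and filters on a minimum size. The names `CONTEXT_TOKENS` and `models_with_context` are hypothetical, and treating "K" as 1,000 tokens and "M" as 1,000,000 is an assumption; the exact limits may differ, and the table itself can go stale as models are updated.

```python
# Context windows copied from the table above.
# Assumption: "K" means 1,000 tokens and "M" means 1,000,000 tokens.
CONTEXT_TOKENS = {
    "llama-3.3-70b": 128_000,
    "llama-4-scout": 512_000,
    "llama-3.1-8b": 8_000,
    "qwen-3-32b": 32_000,
    "kimi-k2": 128_000,
    "gpt-oss-120b": 128_000,
    "gpt-oss-20b": 128_000,
    "gemma-4-31b": 128_000,
    "gemma-4-26b-moe": 128_000,
    "gemini-3-flash": 1_000_000,
    "gemini-2.5-flash": 1_000_000,
    "gemma-3-27b": 128_000,
    "llama-3.1-8b-cerebras": 8_000,
    "qwen-3-235b-cerebras": 32_000,
}


def models_with_context(min_tokens: int) -> list[str]:
    """Return model IDs whose context window is at least min_tokens, sorted by ID."""
    return sorted(
        model_id
        for model_id, tokens in CONTEXT_TOKENS.items()
        if tokens >= min_tokens
    )
```

For example, `models_with_context(500_000)` narrows the list to the models suited to very long inputs. For anything beyond a quick filter, prefer the live list from GET /v1/models over a hard-coded copy.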