
Overview

qwen-3-32b offers strong Qwen-style coding quality at lower latency than the flagship models. It is especially useful in tight edit-run-debug loops.

Specs

| Field | Value |
| --- | --- |
| Model ID | qwen-3-32b |
| Best for | Coding, math, multilingual tasks, low-latency workflows |
| Context window | 32K |
| Max output tokens | 8K |
| Input modalities | Text |
| Output modalities | Text |
| Tool calling | Yes |
| Structured outputs | Yes |
| Prompt caching | Yes |
| Speed | Fast |
| Cost band | Balanced |
| Release stage | Stable |
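With a 32K context window and an 8K max output, prompts must leave room for the response. A minimal budgeting sketch, assuming a rough ~4-characters-per-token heuristic rather than the model's actual tokenizer:

```python
# Rough token budgeting for qwen-3-32b's 32K context window.
# The 4-chars-per-token ratio is a heuristic assumption, not the real tokenizer.
CONTEXT_WINDOW = 32_000
MAX_OUTPUT = 8_000

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(messages: list[dict], max_output: int = MAX_OUTPUT) -> bool:
    """Check whether prompt tokens plus the requested output fit in the window."""
    prompt_tokens = sum(estimate_tokens(m["content"]) for m in messages)
    return prompt_tokens + max_output <= CONTEXT_WINDOW

messages = [{"role": "user", "content": "Refactor this function for speed."}]
print(fits_in_context(messages))  # True for a short prompt
```

For production use, prefer the tokenizer that ships with the model rather than a character-count heuristic.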

Use this when

  • You need faster iteration on code tasks.
  • You want a cheaper/faster Qwen option than qwen-3.6-plus.
  • You do not need extremely long context.

Pick something else when

  • You want the strongest overall quality: use qwen-3.6-plus.
  • You need longer context for agent sessions: use kimi-k2.5.
  • You need 1M context: use gemini-2.5-flash.
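The guidance above can be sketched as a small routing helper. This is a hypothetical function, not part of any SDK; the 200K threshold for kimi-k2.5 is an assumption, since its exact context window is not stated here:

```python
# Hypothetical model router reflecting the guidance above.
# Model IDs come from this page; the 200K cutoff for kimi-k2.5 is assumed.
def pick_model(context_tokens: int, need_top_quality: bool = False) -> str:
    if need_top_quality:
        return "qwen-3.6-plus"      # strongest overall quality
    if context_tokens > 200_000:    # assumed: beyond kimi-k2.5's window
        return "gemini-2.5-flash"   # 1M context
    if context_tokens > 32_000:     # beyond qwen-3-32b's window
        return "kimi-k2.5"          # longer-context agent sessions
    return "qwen-3-32b"             # fast, low-latency default

print(pick_model(100_000))  # kimi-k2.5
```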

Example

from openai import OpenAI

client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="qwen-3-32b",
    messages=[{"role": "user", "content": "Refactor this TypeScript function for readability and speed."}],
)

print(response.choices[0].message.content)
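The specs list tool calling as supported. Through the same OpenAI-compatible API, tools are declared as JSON schemas passed via the `tools` parameter; a minimal local sketch (the `get_time` tool and `run_tool` dispatcher are hypothetical, shown without the network round-trip):

```python
import json
from datetime import datetime, timezone

# Hypothetical tool in the OpenAI-compatible "tools" schema format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO 8601 string.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]

def run_tool(name: str, arguments: str) -> str:
    """Dispatch a tool call the model returned (simulated locally here)."""
    args = json.loads(arguments or "{}")  # tool arguments arrive as a JSON string
    if name == "get_time":
        return datetime.now(timezone.utc).isoformat()
    raise ValueError(f"unknown tool: {name}")

# Pass tools=tools to client.chat.completions.create(...); when the model
# replies with tool_calls, execute each one and send the result back as a
# {"role": "tool", ...} message before asking for the final answer.
print(run_tool("get_time", "{}"))
```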