Overview

glm-4.7-flash is the efficiency-oriented GLM option on Kyma. It is aimed at routine, high-volume workloads where long context and low cost matter more than flagship depth.

Specs

Model ID: glm-4.7-flash
Best for: Cheap long context, bulk throughput, routine tasks
Context window: 203K tokens
Max output tokens: 65K
Input modalities: Text
Output modalities: Text
Tool calling: Yes
Structured outputs: Yes
Prompt caching: Yes
Speed: Fast
Cost band: Cheap
Release stage: Preview
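Since the specs list tool calling as supported, here is a minimal sketch of how a tool call might look through an OpenAI-compatible client. The tool name, schema, and helper function below are illustrative assumptions, not part of the Kyma API; the client is constructed as in the Example section.

```python
import json

# Hypothetical tool the model may choose to call; the name and
# schema are illustrative, not defined by Kyma.
LOG_COUNT_TOOL = {
    "type": "function",
    "function": {
        "name": "count_log_lines",
        "description": "Count log lines matching a severity level.",
        "parameters": {
            "type": "object",
            "properties": {
                "severity": {"type": "string", "enum": ["info", "warn", "error"]},
            },
            "required": ["severity"],
        },
    },
}

def ask_with_tools(client, question):
    """Send a question with the tool attached.

    `client` is an OpenAI-compatible client pointed at the Kyma base URL.
    Returns (tool_name, arguments) if the model called the tool,
    or (None, text) if it answered directly.
    """
    response = client.chat.completions.create(
        model="glm-4.7-flash",
        messages=[{"role": "user", "content": question}],
        tools=[LOG_COUNT_TOOL],
    )
    message = response.choices[0].message
    if message.tool_calls:
        call = message.tool_calls[0]
        return call.function.name, json.loads(call.function.arguments)
    return None, message.content
```

Your code is responsible for executing the tool and sending the result back in a follow-up `tool` role message.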

Use this when

  • You need cheap long-context throughput.
  • You want a faster, lighter GLM option for routine tasks.
  • You are comfortable using a preview-stage model for volume work.

Pick something else when

  • You want the stronger GLM flagship: use glm-5.1.
  • You want the safer stable cheap GLM pick: use glm-4.5-air.
  • You need better multimodal support: use gemini-2.5-flash or kimi-k2.5.

Example

from openai import OpenAI

client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="glm-4.7-flash",
    messages=[{"role": "user", "content": "Summarize these logs and group recurring issues by type."}],
)

print(response.choices[0].message.content)
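The specs also list structured outputs as supported. A minimal sketch, assuming Kyma accepts the OpenAI-style `response_format` parameter with a JSON schema; the schema and helper below are illustrative, not part of the Kyma API.

```python
import json

# Illustrative JSON schema for grouped log issues; the field names
# are assumptions for this sketch, not defined by Kyma.
ISSUE_SCHEMA = {
    "type": "json_schema",
    "json_schema": {
        "name": "log_issue_groups",
        "schema": {
            "type": "object",
            "properties": {
                "groups": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "issue_type": {"type": "string"},
                            "count": {"type": "integer"},
                        },
                        "required": ["issue_type", "count"],
                    },
                }
            },
            "required": ["groups"],
        },
    },
}

def summarize_structured(client, logs):
    """Request a JSON-shaped summary of `logs`.

    `client` is an OpenAI-compatible client pointed at the Kyma base URL.
    Assumes the endpoint honors the OpenAI-style `response_format` field.
    """
    response = client.chat.completions.create(
        model="glm-4.7-flash",
        messages=[{"role": "user", "content": f"Group recurring issues by type:\n{logs}"}],
        response_format=ISSUE_SCHEMA,
    )
    return json.loads(response.choices[0].message.content)
```

Parsing the reply with `json.loads` gives you a plain dict you can feed straight into downstream batch tooling, which is the typical use for a cheap bulk-throughput model like this one.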