Overview

glm-4.7-flash is the efficiency-oriented GLM option on Kyma. It is aimed at routine, high-volume workloads where long context and low cost matter more than flagship depth.

Specs

Model ID: glm-4.7-flash
Best for: Cheap long context, bulk throughput, routine tasks
Context window: 203K tokens
Max output tokens: 65K
Input modalities: Text
Output modalities: Text
Tool calling: Yes
Structured outputs: Yes
Prompt caching: Yes
Speed: Fast
Cost band: Cheap
Release stage: Preview
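Since the specs list tool calling as supported, here is a minimal sketch of how a tool call might look through an OpenAI-compatible client. The tool name, schema, and helper function below are illustrative assumptions, not part of the Kyma API; the client is constructed as in the Example section.

```python
import json

# Hypothetical tool the model may choose to call; the name and
# schema are illustrative, not defined by Kyma.
LOG_COUNT_TOOL = {
    "type": "function",
    "function": {
        "name": "count_log_lines",
        "description": "Count log lines matching a severity level.",
        "parameters": {
            "type": "object",
            "properties": {
                "severity": {"type": "string", "enum": ["info", "warn", "error"]},
            },
            "required": ["severity"],
        },
    },
}

def ask_with_tools(client, question):
    """Send a question with the tool attached.

    `client` is an OpenAI-compatible client pointed at the Kyma base URL.
    Returns (tool_name, arguments) if the model called the tool,
    or (None, text) if it answered directly.
    """
    response = client.chat.completions.create(
        model="glm-4.7-flash",
        messages=[{"role": "user", "content": question}],
        tools=[LOG_COUNT_TOOL],
    )
    message = response.choices[0].message
    if message.tool_calls:
        call = message.tool_calls[0]
        return call.function.name, json.loads(call.function.arguments)
    return None, message.content
```

Your code is responsible for executing the tool and sending the result back in a follow-up `tool` role message.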

Use this when

  • You need cheap long-context throughput.
  • You want a faster, lighter GLM option for routine tasks.
  • You are comfortable using a preview-stage model for volume work.

Pick something else when

  • You want the stronger GLM flagship: use glm-5.1.
  • You want the safer stable cheap GLM pick: use glm-4.5-air.
  • You need better multimodal support: use gemini-2.5-flash or kimi-k2.5.

Example

from openai import OpenAI

client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="glm-4.7-flash",
    messages=[{"role": "user", "content": "Summarize these logs and group recurring issues by type."}],
)

print(response.choices[0].message.content)
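The specs also list structured outputs as supported. A minimal sketch, assuming Kyma accepts the OpenAI-style `response_format` parameter with a JSON schema; the schema and helper below are illustrative, not part of the Kyma API.

```python
import json

# Illustrative JSON schema for grouped log issues; the field names
# are assumptions for this sketch, not defined by Kyma.
ISSUE_SCHEMA = {
    "type": "json_schema",
    "json_schema": {
        "name": "log_issue_groups",
        "schema": {
            "type": "object",
            "properties": {
                "groups": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "issue_type": {"type": "string"},
                            "count": {"type": "integer"},
                        },
                        "required": ["issue_type", "count"],
                    },
                }
            },
            "required": ["groups"],
        },
    },
}

def summarize_structured(client, logs):
    """Request a JSON-shaped summary of `logs`.

    `client` is an OpenAI-compatible client pointed at the Kyma base URL.
    Assumes the endpoint honors the OpenAI-style `response_format` field.
    """
    response = client.chat.completions.create(
        model="glm-4.7-flash",
        messages=[{"role": "user", "content": f"Group recurring issues by type:\n{logs}"}],
        response_format=ISSUE_SCHEMA,
    )
    return json.loads(response.choices[0].message.content)
```

Parsing the reply with `json.loads` gives you a plain dict you can feed straight into downstream batch tooling, which is the typical use for a cheap bulk-throughput model like this one.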