
Overview

deepseek-v4-flash is the fast variant in DeepSeek’s V4 lineup (April 2026): a 284B-total-parameter mixture-of-experts model with 13B active parameters per token, released under the MIT license. It offers the same V4-family behavior at a fraction of the cost, built for general work, coding, and long-context bulk tasks where you want value over flagship quality.

Specs

Model ID: deepseek-v4-flash
Best for: General work, coding, long context, value
Context window: 1,000,000 tokens
Max output tokens: 65,536
Input modalities: Text
Output modalities: Text
Tool calling: Yes
Structured outputs: Yes
Reasoning: Yes
Prompt caching: Yes
Speed: Fast
Cost band: Cheap
Release stage: Preview

Pricing

Per 1M tokens:
Input: $0.189
Output: $0.378
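
At these rates, per-request cost is simple arithmetic. A minimal sketch using the listed prices (`estimate_cost` is an illustrative helper, not part of any SDK):

```python
# Per-1M-token rates for deepseek-v4-flash, from the pricing table above.
INPUT_PER_M = 0.189   # USD per 1M input tokens
OUTPUT_PER_M = 0.378  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the request cost in USD at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: an 800k-token context with a 4k-token answer.
print(f"${estimate_cost(800_000, 4_000):.4f}")  # → $0.1527
```

Even a near-full context window costs well under a quarter per request, which is what makes the long-document and bulk-extraction cases below practical.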

Use this when

  • You want strong V4-family behavior at the lowest possible price.
  • You need 1M context for long documents or large repos but don’t want flagship cost.
  • You’re running high-volume coding or extraction workloads.
  • You want a default cheap model that still handles tools and reasoning.
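
For the long-context cases above, it helps to check whether a document will fit in the 1,000,000-token window before sending it. A rough sketch using the common ~4-characters-per-token heuristic (an approximation only; use a real tokenizer for exact counts):

```python
CONTEXT_WINDOW = 1_000_000  # tokens, per the specs above
MAX_OUTPUT = 65_536         # tokens, per the specs above

def fits_in_context(text: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """Rough fit check: ~4 chars/token, leaving room for the max output."""
    approx_tokens = len(text) // 4
    return approx_tokens + reserved_output <= CONTEXT_WINDOW

doc = "x" * 3_000_000  # ~750k tokens by the heuristic
print(fits_in_context(doc))  # → True
```

The heuristic is deliberately conservative about nothing: it can be off by 2x on code or non-English text, so treat a near-limit result as a signal to count properly.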

Pick something else when

Example

from openai import OpenAI

client = OpenAI(base_url="https://kymaapi.com/v1", api_key="ky-...")

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Summarize the key tradeoffs in this RFC, then list the open questions."}],
)
print(response.choices[0].message.content)