Documentation Index

Fetch the complete documentation index at: https://docs.kymaapi.com/llms.txt

Use this file to discover all available pages before exploring further.

Overview

whisper-v3-turbo is OpenAI’s Whisper Large v3 Turbo, the fast variant of Whisper Large v3, distilled to run with far fewer decoder layers without losing meaningful accuracy. On Kyma it serves both the transcribe alias and the whisper-v3-turbo SKU at 228x realtime inference speed. It’s the right pick for transcripts, voice-agent input, podcast captions, and any pipeline where speed and cost matter more than the last 1% of WER.

Specs

Field               Value
Model ID            whisper-v3-turbo
Creator             OpenAI
License             MIT (model weights)
Best for            Transcription, captions, voice agents
Max file size       25 MB
Max duration        ~30 min (mono 16 kHz mp3)
Input modalities    Audio (mp3, wav, m4a, ogg, webm, flac)
Output modalities   Text
Pricing mode        Per minute
Min billable        1 minute (rounded up)
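
The file-size limit above can be checked client-side before uploading. A minimal sketch, assuming the 25 MB cap counts binary megabytes (25 × 1024² bytes) — the API may use decimal megabytes, so treat the constant as an assumption:

```python
import os

# Assumed binary interpretation of the documented 25 MB limit.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024

def within_size_limit(path: str) -> bool:
    """Return True if the file is small enough to upload in one request."""
    return os.path.getsize(path) <= MAX_UPLOAD_BYTES
```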

Pricing

                    Cost
Per minute          $0.0009
1-hour file         $0.054
5-second clip       $0.0009 (rounds up to 1 min)
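
The rows above follow directly from per-minute pricing with a one-minute minimum, rounded up. A sketch of the arithmetic (illustrative only — confirm billing behavior against your invoice):

```python
import math

PRICE_PER_MINUTE = 0.0009  # USD, from the pricing table above

def estimated_cost(duration_seconds: float) -> float:
    """Billable minutes round up, with a 1-minute minimum."""
    minutes = max(1, math.ceil(duration_seconds / 60))
    return minutes * PRICE_PER_MINUTE
```

A 5-second clip bills as one minute; a 1-hour file bills as 60 minutes.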

Use this when

  • You need accurate transcripts at roughly 50x lower cost than full multimodal LLM analysis.
  • You’re feeding a voice agent or building real-time captions where end-to-end latency matters.
  • You want the OpenAI Whisper API shape with no code changes — just swap the base_url.

Pick something else when

  • You need to know the mood or background sound, not just the words: use gemini-3-flash-audio.
  • Your file is longer than ~30 minutes — split the audio first, or wait for the upcoming Files API path.
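
Splitting a long recording means choosing cut points under the ~30-minute ceiling. A minimal sketch of the boundary math — the actual cutting would be done with an external tool such as ffmpeg (not shown), and the 25-minute chunk size is an assumed safety margin, not an API requirement:

```python
def chunk_spans(total_seconds: float, max_chunk: float = 25 * 60) -> list[tuple[float, float]]:
    """Split a duration into (start, end) spans, each at most max_chunk seconds."""
    spans = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_chunk, total_seconds)
        spans.append((start, end))
        start = end
    return spans
```

Each span can then be transcribed independently and the segment timestamps offset by the span's start.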

Example

curl -X POST https://kymaapi.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -F "file=@meeting.mp3" \
  -F "model=whisper-v3-turbo" \
  -F "response_format=verbose_json"
Response includes the full transcript, per-segment timestamps, and detected language. See endpoint reference for all parameters.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://kymaapi.com/v1",
    api_key="kyma-...",
)

with open("meeting.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-v3-turbo",
        file=f,
        response_format="verbose_json",
    )

print(result.text)
for segment in result.segments:
    print(f"[{segment.start:.2f}-{segment.end:.2f}] {segment.text}")
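
For the podcast-captions use case, the per-segment timestamps from verbose_json map directly onto SRT cues. A sketch, assuming segments expose the same start/end/text attributes used in the loop above:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render objects with .start, .end, .text into SRT cue blocks."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(f"{i}\n{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n{seg.text.strip()}\n")
    return "\n".join(cues)
```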

Aliases that resolve here

  • transcribe — auto-tracks the current best ASR model on Kyma. Today that’s this SKU.
If you want stable behavior across alias changes, pin whisper-v3-turbo directly. If you want to ride future upgrades automatically, use transcribe.

See also