Overview

gpt-4o-mini-transcribe-2025-12-15 is OpenAI’s premium speech-to-text model, surfaced on Kyma through the transcribe-quality alias. It’s the right pick when accuracy on real-world audio matters more than raw cost: conversational dialogue, noisy environments, and multilingual code-switching (Vietnamese ↔ English, Mandarin ↔ English, etc.), where Whisper Turbo’s distilled decoder loses accuracy. It ships alongside the default transcribe alias, which resolves to Whisper Large v3 Turbo. Two aliases, one decision per request: pick transcribe for high-volume bulk transcription and transcribe-quality when accuracy is the constraint.

Specs

Field | Value
Model ID | gpt-4o-mini-transcribe-2025-12-15
Alias | transcribe-quality
Creator / Provider | OpenAI
Best for | Noisy / conversational / code-switching audio, multilingual dictation
Max file size | 25 MB (multipart upload)
Input modalities | Audio (mp3, wav, m4a, ogg, webm, flac)
Output modalities | Text
Pricing mode | Per minute
Min billable | 1 minute (rounded up)

Pricing

Usage | Cost
Per minute | $0.00405
1-hour file | $0.243
5-second clip | $0.00405 (rounds up to 1 min)
~4.5× the cost of default Whisper Turbo ($0.0009/min) — that’s the premium you pay for OpenAI’s accuracy on hard audio. For high-volume bulk transcription, default to transcribe; reserve transcribe-quality for the cases that need it.
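
The pricing arithmetic above can be sketched as a small estimator. This is only an illustration: the function name and the rate table are hypothetical, and it assumes every file is rounded up to whole minutes (the page only confirms the one-minute minimum and the 1-hour figure).

```python
import math
from decimal import Decimal

# Per-minute rates quoted in this section (USD).
RATE_PER_MINUTE = {
    "transcribe-quality": Decimal("0.00405"),
    "transcribe": Decimal("0.0009"),
}


def estimate_cost(duration_seconds: float, alias: str = "transcribe-quality") -> Decimal:
    """Estimate the charge for one audio file.

    Assumption: the duration is rounded up to whole minutes,
    with a minimum of one billable minute.
    """
    billable_minutes = max(1, math.ceil(duration_seconds / 60))
    return RATE_PER_MINUTE[alias] * billable_minutes


print(estimate_cost(5))                  # 5-second clip, billed as 1 minute
print(estimate_cost(3600))               # 1-hour file, billed as 60 minutes
print(estimate_cost(3600, "transcribe")) # same file on the default alias
```

Decimal is used instead of float so that small per-minute rates multiply without binary rounding noise.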

Use this when

  • Audio contains code-switching (e.g. Vietnamese + English in the same utterance) and Whisper Turbo is producing garbled output on the non-primary language.
  • Background noise, low-quality recording, or far-field microphones — accuracy matters more than throughput.
  • Conversational dictation where capitalization, punctuation, and proper nouns must be right first time.

Pick something else when

  • High-volume bulk transcription where ~99% accuracy is fine: use whisper-v3-turbo at less than a quarter of the per-minute price ($0.0009 vs. $0.00405).
  • The audio is over 25 MB — Quality tier currently only supports multipart upload. Use transcribe with the JSON audio_url mode for files up to 100 MB.
  • You need transcription, including full per-segment timestamps, to stay available during an upstream outage: Quality tier does not fall back to Vertex Gemini (the default transcribe does), so a 5xx from OpenAI surfaces directly.
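
The two lists above boil down to a per-request routing decision, which could be sketched roughly as follows. The function name and the hard_audio flag are hypothetical, and the byte limits assume that "MB" means 10^6 bytes, which this page does not specify.

```python
# Assumption: "25 MB" and "100 MB" are decimal megabytes (10**6 bytes);
# the real limits may be defined differently.
MULTIPART_LIMIT_BYTES = 25 * 1000 * 1000
AUDIO_URL_LIMIT_BYTES = 100 * 1000 * 1000


def choose_alias(file_size_bytes: int, hard_audio: bool) -> str:
    """Pick an alias for one request.

    hard_audio is a caller-supplied flag (hypothetical) meaning the audio is
    noisy, conversational, or code-switched, so accuracy is the constraint.
    """
    if file_size_bytes > AUDIO_URL_LIMIT_BYTES:
        raise ValueError("file exceeds the 100 MB audio_url limit")
    if file_size_bytes > MULTIPART_LIMIT_BYTES:
        # Too large for the Quality tier's multipart upload; only the
        # default alias accepts the JSON audio_url mode.
        return "transcribe"
    return "transcribe-quality" if hard_audio else "transcribe"
```

Keeping the decision in one function makes the cost trade-off explicit at the call site instead of scattering alias strings through the codebase.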

No fallback

The default transcribe alias has an automatic Whisper → Vertex Gemini fallback chain on upstream outage. The Quality tier opts out by design: you explicitly chose OpenAI for accuracy, and silently swapping to a different provider would defeat that contract. OpenAI outages return as 502 transcription_failed so your client knows to retry or downgrade. See the Fallback chain section on the endpoint reference for the full rules.
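
Since the Quality tier never downgrades on its own, the retry-or-downgrade choice belongs to the client. A minimal sketch of one such policy follows; the send callable is a hypothetical transport wrapper (it takes an alias and returns the HTTP status plus the parsed JSON body), and the retry count and backoff values are arbitrary.

```python
import time
from typing import Callable, Tuple

# Hypothetical transport: takes an alias, returns (HTTP status, parsed JSON body).
Send = Callable[[str], Tuple[int, dict]]


def transcribe_with_downgrade(
    send: Send, retries: int = 2, backoff_seconds: float = 1.0
) -> Tuple[str, dict]:
    """Try transcribe-quality; after repeated 502s, downgrade to transcribe.

    Returns the alias that actually produced the result, so the caller can
    record that a downgrade happened.
    """
    for attempt in range(retries + 1):
        status, body = send("transcribe-quality")
        if status == 200:
            return "transcribe-quality", body
        if status != 502:
            # Not an upstream outage: retrying or downgrading will not help.
            raise RuntimeError(f"transcription failed with HTTP {status}")
        if attempt < retries:
            time.sleep(backoff_seconds * (2 ** attempt))  # exponential backoff

    # Explicit, caller-visible downgrade to the default alias.
    status, body = send("transcribe")
    if status != 200:
        raise RuntimeError(f"fallback transcription failed with HTTP {status}")
    return "transcribe", body
```

In practice send would wrap the requests.post call from the Example section. Whether to downgrade at all depends on the workload: if the audio was routed here because Whisper Turbo garbles it, failing loudly may be the better choice.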

Concurrency limits

Routed through the openai audio sub-pool. Per-tier caps:
Tier | Concurrent slots
Tier 0 | 1
Tier 1 | 2
Tier 2 | 4
Tier 3 | 8
Tier 4 | 18
Saturating the OpenAI sub-pool does not affect Groq (transcribe) or any other audio capability. See Rate Limits — Audio limits.
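
One way to stay under the per-tier cap is to gate requests client-side with a semaphore. The sketch below is an assumption about how a client might do that, not part of the Kyma API: SlotLimiter and run_all are hypothetical names, and the slot counts are copied from the table above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Per-tier concurrent slots, copied from the table above.
CONCURRENT_SLOTS = {0: 1, 1: 2, 2: 4, 3: 8, 4: 18}


class SlotLimiter:
    """Allow at most CONCURRENT_SLOTS[tier] calls to run at the same time."""

    def __init__(self, tier: int):
        self._sem = threading.BoundedSemaphore(CONCURRENT_SLOTS[tier])

    def run(self, fn, *args, **kwargs):
        # Blocks until a slot is free, then releases it when fn returns or raises.
        with self._sem:
            return fn(*args, **kwargs)


def run_all(limiter: SlotLimiter, fn, items, max_workers: int = 32):
    """Submit every item to a thread pool; the limiter caps how many call fn at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: limiter.run(fn, item), items))
```

For example, run_all(SlotLimiter(tier=2), transcribe_file, paths) would keep at most four uploads in flight, where transcribe_file stands for whatever function performs a single request.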

Example

curl -X POST https://kymaapi.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -F "file=@interview.mp3" \
  -F "model=transcribe-quality" \
  -F "response_format=verbose_json"

Python (raw HTTP — OpenAI SDK doesn’t expose Kyma aliases)

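import os
import requests

# Upload the audio as multipart form data; the alias goes in the "model" field.
with open("interview.mp3", "rb") as f:
    resp = requests.post(
        "https://kymaapi.com/v1/audio/transcriptions",
        headers={"Authorization": f"Bearer {os.environ['KYMA_API_KEY']}"},
        files={"file": f},
        data={"model": "transcribe-quality", "response_format": "verbose_json"},
        timeout=300,  # arbitrary upper bound; tune it for your file lengths
    )
# Raise on 4xx/5xx (e.g. 502 transcription_failed) instead of parsing an error body.
resp.raise_for_status()
result = resp.json()
print(result["text"])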
The Python OpenAI SDK pins to whisper-1 internally for transcription, so passing model="transcribe-quality" through the SDK won’t reach Kyma’s alias resolver. Raw requests works fine.

Aliases that resolve here

  • transcribe-quality — premium ASR tier on Kyma.
If you want stable behavior across alias changes, pin gpt-4o-mini-transcribe-2025-12-15 directly. If you want to ride future Quality-tier upgrades (e.g. if Kyma promotes a newer OpenAI STT SKU), use the alias.

See also