Speech-to-text. Multipart file upload, OpenAI Whisper API compatible. Synchronous, billed per minute.
Synchronous endpoint. Upload an audio file, get the transcript back in one call. Compatible with the OpenAI Whisper API: a drop-in replacement for https://api.openai.com/v1/audio/transcriptions.
| Field | Description |
|---|---|
| `file` | Audio file, sent as a `multipart/form-data` upload. Supported formats: mp3, wav, m4a, ogg, webm, flac. Max 25 MB. ~30 minutes of mono 16 kHz mp3 fits comfortably. |
| `model` | `transcribe` (recommended; auto-tracks the current best ASR model) or a pinned SKU like `whisper-v3-turbo`. See Audio models. |
| `language` | Language code (e.g. `en`, `vi`, `ja`). Optional. Whisper auto-detects when omitted. Supplying it improves accuracy on short clips. |
| `response_format` | `json`, `verbose_json`, `text`, `srt`, `vtt`. JSON formats embed a billing block in the response body. `text` returns the bare transcript and `srt` / `vtt` return subtitle files; for those three, billing rides on `X-Kyma-*` response headers so the body stays a clean transcript or subtitle file. |

A successful request returns 200 OK with the transcript and a Kyma billing block.

The JSON body reports the detected language (e.g. "English"), and some fields are present only when `response_format` is `verbose_json`.

When `response_format` is `text`, `srt`, or `vtt`, the body is a plain transcript or subtitle file (no JSON envelope) and billing comes back on response headers:
| Header | Meaning |
|---|---|
| `X-Kyma-Model` | The model SKU that served the request |
| `X-Kyma-Duration-Sec` | Detected audio duration in seconds |
| `X-Kyma-Billable-Minutes` | Minutes charged |
| `X-Kyma-Cost-USD` | Final cost in USD |
| `X-Kyma-Balance-USD` | Remaining account balance |
srt returns a SubRip subtitle file (application/x-subrip; charset=utf-8); vtt returns a WebVTT file (text/vtt; charset=utf-8). Both are built from the same per-segment timestamps verbose_json exposes, so the timing matches across formats.
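As an illustration of the request fields and billing headers described above, here is a minimal Python sketch that checks a file against the stated limits, uploads it with `response_format=text`, and collects the `X-Kyma-*` headers. Several things are assumptions rather than documented facts: the base URL is a placeholder (this page does not state it), Bearer-token authentication is inferred from the OpenAI compatibility claim, the 25 MB limit is read conservatively as 25,000,000 bytes, and the numeric headers are assumed to hold plain decimal strings. The helper names are hypothetical, and `requests` is a third-party package.

```python
import os
from pathlib import Path

# Assumption: the real base URL is not given in this section; set KYMA_BASE_URL yourself.
BASE_URL = os.environ.get("KYMA_BASE_URL", "https://api.example.invalid/v1")

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".webm", ".flac"}
# Assumption: "25 MB" read conservatively as 25,000,000 bytes.
MAX_BYTES = 25 * 1000 * 1000

# Header name -> (result key, converter). Numeric parsing is an assumption.
BILLING_HEADERS = {
    "X-Kyma-Model": ("model", str),
    "X-Kyma-Duration-Sec": ("duration_sec", float),
    "X-Kyma-Billable-Minutes": ("billable_minutes", float),
    "X-Kyma-Cost-USD": ("cost_usd", float),
    "X-Kyma-Balance-USD": ("balance_usd", float),
}


def check_audio_file(path):
    """Raise ValueError if the extension or size falls outside the stated limits."""
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {p.suffix!r}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes, over the 25 MB limit")
    return size


def parse_billing_headers(headers):
    """Collect the X-Kyma-* headers into a dict; absent headers are skipped."""
    lowered = {k.lower(): v for k, v in headers.items()}
    result = {}
    for name, (key, convert) in BILLING_HEADERS.items():
        raw = lowered.get(name.lower())
        if raw is not None:
            result[key] = convert(raw)
    return result


def transcribe_text(path, api_key, model="transcribe", language=None):
    """Upload one file and return (transcript, billing dict)."""
    import requests  # third-party; imported here so the helpers above work without it

    check_audio_file(path)
    data = {"model": model, "response_format": "text"}
    if language:
        data["language"] = language
    with open(path, "rb") as fh:
        resp = requests.post(
            f"{BASE_URL}/audio/transcriptions",
            headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
            files={"file": (Path(path).name, fh)},
            data=data,
            timeout=300,
        )
    resp.raise_for_status()
    return resp.text, parse_billing_headers(resp.headers)
```

With `response_format` set to `srt` or `vtt`, the same header parsing would apply and `resp.text` would hold the subtitle file instead of the bare transcript.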
| Model | Per minute |
|---|---|
| `whisper-v3-turbo` | $0.0009 |
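To make the per-minute pricing concrete, the sketch below multiplies a billable-minute count by the listed rate. How audio duration is rounded into billable minutes is not stated on this page, so the function takes billable minutes as an input (for example, the value of `X-Kyma-Billable-Minutes`) instead of guessing a rounding rule. The function name is hypothetical, and only the SKU that appears in the table is included.

```python
from decimal import Decimal

# Per-minute rate copied from the pricing table above.
RATES_USD_PER_MINUTE = {"whisper-v3-turbo": Decimal("0.0009")}


def estimate_cost_usd(model, billable_minutes):
    """Return rate * billable minutes as a Decimal; raises KeyError for unlisted SKUs."""
    rate = RATES_USD_PER_MINUTE[model]
    # Going through str() avoids carrying binary float error into the Decimal.
    return rate * Decimal(str(billable_minutes))


print(estimate_cost_usd("whisper-v3-turbo", 30))  # 0.0270
```

At this rate a 30-minute file costs under three cents. The `transcribe` alias is not priced here because the table lists only pinned SKUs.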
| Status | error.code | When |
|---|---|---|
| 400 | `invalid_request` | Missing `file` field, or not `multipart/form-data` |
| 400 | `not_a_transcription_model` | `model` is not a transcription SKU |
| 401 | `auth_error` | Missing or invalid API key |
| 402 | `billing_error` | Insufficient credits |
| 404 | `not_enabled` | Audio gate not enabled on this account |
| 413 | `invalid_request` | File > 25 MB |
| 502 | `provider_error` | Upstream transcription failed |
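A client might turn the table above into a small lookup that decides whether a failed call is worth retrying. The sketch below is one possible approach, not documented behaviour: it assumes the error body is JSON shaped like `{"error": {"code": "..."}}` (inferred from the `error.code` column), and the retry flags and hint strings are this example's own policy. The function name is hypothetical.

```python
# (status, error.code) pairs copied from the error table above.
# The retry flag and hint text are this sketch's policy, not documented behaviour.
KNOWN_ERRORS = {
    (400, "invalid_request"): (False, "send a multipart/form-data body with a file field"),
    (400, "not_a_transcription_model"): (False, "use a transcription model such as 'transcribe'"),
    (401, "auth_error"): (False, "check the API key"),
    (402, "billing_error"): (False, "add credits to the account"),
    (404, "not_enabled"): (False, "audio is not enabled on this account"),
    (413, "invalid_request"): (False, "file exceeds 25 MB; compress or split it"),
    (502, "provider_error"): (True, "upstream failure; retry with backoff"),
}


def classify_error(status, body):
    """Return (should_retry, hint) for a failed response.

    `body` is the parsed JSON body, or None if it could not be parsed.
    """
    error = body.get("error") if isinstance(body, dict) else None
    code = error.get("code") if isinstance(error, dict) else None
    fallback = (False, f"unexpected error: HTTP {status}, code {code!r}")
    return KNOWN_ERRORS.get((status, code), fallback)
```

Only the 502 case is marked retryable here, since the other rows describe problems that a retry of the same request would not fix.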
transcribe alias
watch-cli: open-source CLI that uses these endpoints to give any agent eyes and ears for any social video