Overview

minimax-voice-clone is a voice-cloning service, not a direct TTS model. Upload reference audio of someone speaking (10 seconds to 5 minutes) and get back a voice_id, which you can then pass to /v1/audio/speech with any MiniMax voice model. It suits brand voice cloning when you have voice talent, character voices for games and animation, and custom narrator profiles for podcasts.

Specs

| Field | Value |
| --- | --- |
| Model ID | minimax-voice-clone |
| Creator | MiniMax |
| Best for | Brand voice cloning, character voices, custom narrators |
| Reference audio | 10 sec – 5 min, MP3/WAV/M4A, max 10 MB |
| Output | voice_id (kyma-namespaced) |
| Pricing mode | Per call (flat) |
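
The reference-audio limits in the table can be checked locally before uploading. The following is a minimal sketch, not part of the Kyma API: the function name check_reference_audio is hypothetical, it treats "10 MB" as 10,000,000 bytes (an assumption, chosen as the stricter reading since the docs do not say whether MB or MiB is meant), and it does not verify the 10 sec to 5 min duration, which would need a tool such as ffprobe.

```shell
# check_reference_audio FILE
# Returns 0 if FILE exists, has an accepted extension, and fits the size limit.
check_reference_audio() {
  file="$1"
  max_bytes=10000000   # assumption: "10 MB" read as 10,000,000 bytes

  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # Lower-case the extension so .MP3 and .mp3 are treated alike.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|wav|m4a) ;;
    *) echo "unsupported format: $file" >&2; return 1 ;;
  esac

  # wc -c may pad its output with spaces on some systems; strip them.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$max_bytes" ]; then
    echo "file too large: $size bytes" >&2
    return 1
  fi
  return 0
}
```

Calling `check_reference_audio reference.mp3 && echo ok` before the upload avoids paying for a request that would be rejected on format or size.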

Pricing

|  | Cost |
| --- | --- |
| Per cloned voice | $2.10 flat (one-time) |

The voice_id is then free to use in /v1/audio/speech; only the speech calls themselves bill at MiniMax’s per-character rate.
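
The pricing above reduces to simple arithmetic: a flat $2.10 per cloned voice plus whatever the speech calls cost. A minimal sketch follows; the function name estimate_cost is hypothetical, and the per-character rate is a caller-supplied placeholder because this page does not state MiniMax's actual rate.

```shell
# estimate_cost VOICES CHARS RATE_PER_CHAR
# Prints the total in dollars: $2.10 per cloned voice (from the table above)
# plus CHARS * RATE_PER_CHAR for speech. RATE_PER_CHAR is NOT a documented
# value; pass the real rate from the speech model's pricing page.
estimate_cost() {
  awk -v voices="$1" -v chars="$2" -v rate="$3" \
    'BEGIN { printf "%.2f\n", voices * 2.10 + chars * rate }'
}

# Example: 2 cloned voices and 50,000 characters of speech
# at a made-up rate of $0.0001 per character.
estimate_cost 2 50000 0.0001   # prints 9.20
```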

Use this when

  • You have voice talent and want their voice usable across all your TTS workflows.
  • You’re building a character cast for a game or animation.
  • You want a single brand voice spanning multiple languages and use cases.

Pick something else when

  • You don’t have a reference recording → use minimax-voice-design ($4.20 per voice, generated from a text description).
  • You’re fine with stock MiniMax voices → skip cloning, browse the system voices via GET /v1/audio/voices, and use one directly.
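
For the stock-voice route, listing the system voices is a single GET request. A minimal sketch, assuming the endpoint lives under the same https://kymaapi.com base URL and accepts the same Bearer header as the cloning example on this page; the response format is not documented here, so the output is printed as-is. The function name list_system_voices is hypothetical.

```shell
# Print the raw response of GET /v1/audio/voices.
# Assumption: same host and Authorization header as the other endpoints.
list_system_voices() {
  curl -sS https://kymaapi.com/v1/audio/voices \
    -H "Authorization: Bearer $KYMA_API_KEY"
}
```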

Example

# Step 1: clone the voice
curl -X POST https://kymaapi.com/v1/audio/voice-clone \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -F file=@reference.mp3 \
  -F name="brand-narrator"
# → { "voice_id": "kyma_3f8e2a1b4c9d7e60", ... }

# Step 2: use the voice
curl -X POST https://kymaapi.com/v1/audio/speech \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-speech-hd",
    "input": "Welcome to the show.",
    "voice_id": "kyma_3f8e2a1b4c9d7e60"
  }' \
  --output intro.mp3
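
The two steps can be chained in a script by capturing the voice_id from the step 1 response. A minimal sketch, assuming the response is JSON with a "voice_id" string as shown in the comment above; the helper names extract_voice_id and clone_voice are hypothetical, and the sed-based extraction is a dependency-free stand-in for a real JSON parser such as jq.

```shell
# Read JSON on stdin and print the first "voice_id" string value.
# Assumes the key and its value sit on the same line.
extract_voice_id() {
  sed -n 's/.*"voice_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# clone_voice FILE NAME: upload reference audio and print the new voice_id.
clone_voice() {
  curl -sS -X POST https://kymaapi.com/v1/audio/voice-clone \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -F "file=@$1" \
    -F "name=$2" | extract_voice_id
}

# Usage (a real call bills $2.10, so it is left commented out):
# voice_id=$(clone_voice reference.mp3 brand-narrator)
# echo "cloned voice: $voice_id"
```

Storing the returned ID once and reusing it matters here, since each clone call is billed separately while reuse of the ID is free.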

Ownership

Cloned voice IDs are scoped to the Kyma user who created them. If user A passes user B’s voice_id to /v1/audio/speech, the request fails with 403 voice_not_owned.
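
A caller can tell an ownership failure apart from other errors by inspecting the HTTP status. A minimal sketch, assuming a successful request returns a 2xx status and that a 403 on this endpoint means voice_not_owned as described above; the function name speak_with_voice and its exit codes are hypothetical, and the JSON body is built by plain string interpolation, so the text must not contain double quotes or backslashes.

```shell
# speak_with_voice VOICE_ID TEXT OUTFILE
# Exit codes (this sketch's own convention): 0 ok, 3 not owned, 1 other error.
speak_with_voice() {
  voice_id="$1"; text="$2"; out="$3"

  # -o writes the response body to OUTFILE; -w prints only the status code.
  status=$(curl -sS -o "$out" -w '%{http_code}' \
    -X POST https://kymaapi.com/v1/audio/speech \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"minimax-speech-hd\", \"input\": \"$text\", \"voice_id\": \"$voice_id\"}")

  case "$status" in
    2??) return 0 ;;
    403)
      echo "403: voice_id $voice_id is not owned by this user" >&2
      rm -f "$out"   # the body is an error payload, not audio
      return 3 ;;
    *)
      echo "request failed with HTTP status $status" >&2
      rm -f "$out"
      return 1 ;;
  esac
}
```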
