The POST returns 202 with a job_id immediately; poll GET /v1/jobs/{job_id} until status is succeeded.
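As a rough illustration of the polling step, the sketch below waits for a job to reach a terminal state. It assumes a caller-supplied `get_job` function (hypothetical) that performs GET /v1/jobs/{job_id} and returns the parsed JSON as a dict with a `status` field. The polling interval and timeout are illustrative defaults, not documented values.

```python
import time
from typing import Any, Callable, Dict

# Terminal statuses named elsewhere on this page.
TERMINAL_STATUSES = {"succeeded", "failed", "expired"}


def wait_for_job(
    get_job: Callable[[str], Dict[str, Any]],  # hypothetical: wraps GET /v1/jobs/{job_id}
    job_id: str,
    interval_s: float = 2.0,    # assumed polling interval
    timeout_s: float = 600.0,   # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status, then return the job dict."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = get_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

The caller still needs to check whether the returned status is succeeded, since failed and expired also end the loop.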
Pick a model
| Model | Best for | Cost / sec | Default 5s clip | Audio | Input |
|---|---|---|---|---|---|
| kling-2.5-pro | Budget cinematic clips, b-roll | $0.0945 | $0.4725 | none | text + image |
| kling-3-pro | Premium cinematic, hero brand video | $0.1512 | $0.7560 | none | text + image |
| kling-3-pro-audio | Cinematic w/ diegetic sound, talking heads | $0.2268 | $1.1340 | native | text + image |
| seedance-2-pro | Action, multi-shot, social w/ synced audio | $0.40959 | $2.04795 | bundled | text + image |
| seedance-2-fast | Social shorts, rapid iteration, UI motion | $0.326565 | $1.63283 | bundled | text + image |
Pass image_url to switch from text-to-video into image-to-video.
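The "Default 5s clip" column is the per-second rate multiplied by the clip duration. A minimal sketch of that arithmetic, using the rates from the table above, follows. The function name is hypothetical, and `Decimal` is used only to avoid floating-point rounding. Whether a given model accepts a given duration is not checked here.

```python
from decimal import Decimal

# Per-second rates copied from the model table.
RATE_PER_SEC = {
    "kling-2.5-pro": Decimal("0.0945"),
    "kling-3-pro": Decimal("0.1512"),
    "kling-3-pro-audio": Decimal("0.2268"),
    "seedance-2-pro": Decimal("0.40959"),
    "seedance-2-fast": Decimal("0.326565"),
}


def estimate_clip_cost(model: str, duration_s: int = 5) -> Decimal:
    """Hypothetical helper: per-second rate x duration, in dollars."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return RATE_PER_SEC[model] * duration_s


print(estimate_clip_cost("kling-2.5-pro"))      # 0.4725
print(estimate_clip_cost("kling-2.5-pro", 10))  # 0.9450
print(estimate_clip_cost("seedance-2-pro"))     # 2.04795
```

For seedance-2-fast the exact product is 1.632825, which the table shows rounded to $1.63283.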
kling-2.5-pro
Cheapest cinematic tier. Photoreal humans, smooth motion, 5–10s clips. The right pick when you need a lot of cinematic b-roll without paying flagship prices.
- Cost: $0.0945 / sec ($0.4725 for 5s)
- Modes: text-to-video (default), image-to-video (pass image_url)
- Best for: brand b-roll, character shots on a budget
kling-3-pro
Flagship Kling. Sharper than 2.5 Pro, photoreal humans, smooth motion. Use this for hero shots and premium brand video where the quality needs to stand up at full screen.
- Cost: $0.1512 / sec ($0.7560 for 5s)
- Modes: text-to-video, image-to-video
- Best for: hero brand video, character/face shots, premium cinematic
No audio track. If you need sound, use kling-3-pro-audio instead.
kling-3-pro-audio
Kling 3 Pro with native audio. Same visuals as kling-3-pro plus synchronized ambient sound and dialogue. The audio track makes it 50% more expensive per second ($0.2268 vs. $0.1512).
- Cost: $0.2268 / sec ($1.1340 for 5s)
- Audio: native (ambient + dialogue baked into the video)
- Best for: talking-head shots, atmospheric scenes, anything that needs diegetic sound
seedance-2-pro
ByteDance flagship. Multi-shot composition, dynamic camera moves, native audio bundled. 720p output. Best when motion and energy matter: action, product demos, fast-cut social.
- Cost: $0.40959 / sec ($2.04795 for 5s)
- Resolution: 720p
- Audio: bundled
- Best for: action, multi-shot scenes, social with synced audio, product motion
seedance-2-fast
Seedance 2 fast tier. Quicker generation, ~20% cheaper than seedance-2-pro. Same family behavior, native audio bundled. Right for rapid iteration and short social clips where turnaround beats absolute fidelity.
- Cost: $0.326565 / sec ($1.63283 for 5s)
- Resolution: 720p
- Audio: bundled
- Best for: social shorts, UI motion, rapid iteration, product demos
Image-to-video (I2V)
Every model above accepts an image_url. When present, Kyma routes the request to the model's image-to-video variant: the image becomes the first frame and the prompt drives the motion.
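To make the switch concrete, here is a small sketch of how a request body might be assembled. Only `image_url` and the model names come from this page. The `model` and `prompt` field names and the helper itself are assumptions for illustration, so check the POST /v1/videos/generations reference for the real schema.

```python
from typing import Any, Dict, Optional


def build_video_request(
    model: str,
    prompt: str,
    image_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: text-to-video by default, image-to-video when image_url is set."""
    # "model" and "prompt" are assumed field names, not confirmed by this page.
    body: Dict[str, Any] = {"model": model, "prompt": prompt}
    if image_url is not None:
        # Presence of image_url is what selects the image-to-video variant.
        body["image_url"] = image_url
    return body


t2v = build_video_request("kling-3-pro", "slow dolly shot across a desk")
i2v = build_video_request(
    "kling-3-pro",
    "the camera pushes in slowly",
    image_url="https://example.com/first-frame.png",  # placeholder URL
)
```

The two bodies differ only in the image_url key, which mirrors the routing rule described above.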
Billing flow
- POST creates a job and places a hold for estimated_cost (per-second rate × duration, markup applied).
- On succeeded, the hold is finalized as a usage transaction at the actual cost.
- On failed or expired, the hold is fully refunded; you only pay for clips you receive.
GET /v1/jobs/{id}: charged_amount is the final billed amount, estimated_cost is what was held up front.
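The hold lifecycle above can be modeled in a few lines. This is only a sketch of the accounting as described on this page, not the service's implementation. The function name and the `released` field are hypothetical, and reporting charged_amount as zero after a refund is an assumption.

```python
from decimal import Decimal
from typing import Dict


def settle_hold(status: str, estimated_cost: Decimal, actual_cost: Decimal) -> Dict[str, Decimal]:
    """Hypothetical model of how a hold resolves once a job reaches a terminal status."""
    if status == "succeeded":
        # The hold is finalized as a usage transaction at the actual cost.
        charged = actual_cost
    elif status in ("failed", "expired"):
        # The hold is fully refunded (assumed here to mean a zero charge).
        charged = Decimal("0")
    else:
        raise ValueError(f"job is not in a terminal state: {status!r}")
    return {
        "estimated_cost": estimated_cost,
        "charged_amount": charged,
        # Hypothetical field: the part of the hold returned to the balance.
        "released": estimated_cost - charged,
    }
```

For a failed 5s kling-2.5-pro clip with a $0.4725 hold, this model yields a charged amount of 0 and releases the full $0.4725.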
Idempotency
Pass idempotency_key to make POST safe to retry. The same (api_key, idempotency_key) pair always returns the same job: no duplicate charges, no duplicate generations.
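A retry wrapper along these lines shows the intended usage: generate one key per logical request and reuse it on every attempt. The `post` callable is a hypothetical stand-in for whatever HTTP client sends the POST. Placing idempotency_key in the request body is an assumption, and retrying only on `ConnectionError` is a simplification.

```python
import uuid
from typing import Any, Callable, Dict


def submit_with_retry(
    post: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical: sends the POST, returns JSON
    body: Dict[str, Any],
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Send the same payload, with the same idempotency_key, until it goes through."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key for the whole logical request, so retries map to the same job.
    # Assumption: the key is sent as a body field named idempotency_key.
    payload = {**body, "idempotency_key": str(uuid.uuid4())}
    last_error: Exception = RuntimeError("unreachable")
    for _ in range(max_attempts):
        try:
            return post(payload)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Generating a fresh key inside the loop would defeat the purpose, since each attempt would then look like a new job.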
See also
- POST /v1/videos/generations: full request/response reference
- Image Generation: pay-per-image SKUs on the same async pattern
- Pricing, Video generation: per-second cost table