Kling 3 Pro (Audio)

Overview

kling-3-pro-audio is kling-3-pro with native audio. Same photoreal humans, same smooth motion, same sharp output — plus synchronized ambient sound and dialogue. About 50% more expensive per second than the silent variant.

Specs

Field	Value
Model ID	`kling-3-pro-audio`
Creator	Kuaishou
Best for	Cinematic clips needing diegetic sound, talking-head shots, ambient atmosphere
Default duration	5 seconds
Max duration	10 seconds
Input modalities	Text, image (I2V)
Output modalities	Video, audio
Resolution	1080p
Audio	Native (ambient + dialogue, baked into the video)
Pricing mode	Per second

Pricing

Unit	Cost
Per second	$0.2268
Default 5s clip	$1.1340
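
The default-clip price is the per-second rate multiplied by the duration. As a rough sketch, assuming billing scales linearly with the requested duration (this page does not say how partial seconds are rounded), the helper below reproduces that figure and estimates a maximum-length clip. The name `clip_cost` is illustrative, not part of any API.

```shell
# Per-second rate in USD, copied from the Pricing table.
RATE_PER_SECOND=0.2268

clip_cost() {
  # $1 = clip duration in seconds; prints the estimated cost in USD.
  # Assumes cost = rate * seconds with no minimum charge or rounding step.
  awk -v rate="$RATE_PER_SECOND" -v secs="$1" \
    'BEGIN { printf "%.4f\n", rate * secs }'
}

clip_cost 5    # default duration, prints 1.1340
clip_cost 10   # max duration, prints 2.2680
```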

Use this when

The clip needs diegetic sound — espresso machine hiss, footsteps, dialogue, ambient cafe murmur.
You’re producing talking-head shots or atmospheric scenes.
You’d otherwise have to add Foley in post.

Pick something else when

The clip is silent or you’ll add audio in post: use kling-3-pro and save ~33% per second.
You want fast-cut social clips with bundled audio: use seedance-2-pro or seedance-2-fast.
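
The "about 50% more expensive" and "~33% saved" figures are the same price gap seen from opposite sides. A small sketch of the arithmetic, taking the rounded 1.5x ratio from the Overview as an assumption rather than an exact published number (`saving_pct` is an illustrative name):

```shell
saving_pct() {
  # $1 = price ratio of audio to silent (1.5 means "50% more expensive").
  # Prints the percentage saved per second by choosing the silent variant.
  awk -v ratio="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / ratio) * 100 }'
}

saving_pct 1.5   # prints 33.3
```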

Example

curl -X POST https://kymaapi.com/v1/videos/generations \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3-pro-audio",
    "prompt": "a barista pulling an espresso shot, the machine hisses, ambient cafe murmur",
    "duration": 5
  }'

The endpoint is asynchronous: the POST returns 202 with a job_id. Poll GET /v1/jobs/{id} with that job_id until status is succeeded.
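
The submit-then-poll flow could be scripted roughly as follows. This is a sketch under several assumptions that this page does not confirm: the 202 body and the job resource are flat JSON with string fields named `job_id` and `status`, jobs are served from the same host as the generation endpoint, and `failed` is a possible terminal status. The helper names (`json_field`, `submit_job`, `wait_for_job`) are hypothetical.

```shell
API_BASE="${API_BASE:-https://kymaapi.com}"   # assumed: jobs share this host

json_field() {
  # $1 = field name; reads JSON on stdin and prints the first string value.
  # Only handles flat "key": "value" pairs; a real JSON parser is sturdier.
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

submit_job() {
  # $1 = JSON request body; prints the job_id from the 202 response.
  curl -sS -X POST "$API_BASE/v1/videos/generations" \
    -H "Authorization: Bearer $KYMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$1" | json_field job_id
}

wait_for_job() {
  # $1 = job_id, $2 = seconds between polls (default 5),
  # $3 = maximum number of polls (default 120).
  local job_id="$1" interval="${2:-5}" max_polls="${3:-120}" status i=0
  while [ "$i" -lt "$max_polls" ]; do
    status=$(curl -sS -H "Authorization: Bearer $KYMA_API_KEY" \
      "$API_BASE/v1/jobs/$job_id" | json_field status)
    case "$status" in
      succeeded) echo "succeeded"; return 0 ;;
      failed)    echo "failed" >&2; return 1 ;;   # hypothetical status name
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  echo "timed out waiting for job $job_id" >&2
  return 2
}
```

To use it, pass the request body from the example above to `submit_job`, then hand the printed id to `wait_for_job`. Capping the number of polls keeps a stuck job from blocking the script indefinitely.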
