Overview
veo-3 is Google's flagship Veo 3 tier: 1080p output with native audio, including synchronized dialogue, ambient sound, and lip-sync. It is best suited to hero brand video, talking-head shots, and premium cinematic content where audio matters.
For a faster budget tier without audio, use veo-3-fast.
Specs
| Field | Value |
|---|---|
| Model ID | veo-3 |
| Creator | Google |
| Backend | Vertex AI (veo-3.0-generate-001, us-central1) |
| Best for | Hero brand video, talking-head, premium cinematic with native audio |
| Resolution | 1080p |
| Audio | Yes — native dialogue + ambient + lip-sync |
| Aspect ratios | 16:9 (default), 9:16 |
| Duration | 4, 6, or 8 seconds (Vertex enum) |
| First-frame I2V | Yes — pass image_url |
| Pricing mode | Per second × duration |
| Default latency | ~60–180s end-to-end (heavier inference than 720p) |
| Output | Blob-hosted MP4 (Vercel CDN, durable URL) |
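As a rough illustration of the constraints in the table above, the sketch below validates a request body before it is sent. The helper name `build_veo3_request` is hypothetical. The fields `model`, `prompt`, `duration`, and `image_url` are the ones this page uses. The table lists two aspect ratios but does not name the request field for them, so this sketch leaves aspect ratio out rather than guess.

```python
from typing import Optional

# Allowed clip lengths in seconds, taken from the Specs table.
ALLOWED_DURATIONS = (4, 6, 8)


def build_veo3_request(prompt: str, duration: int,
                       image_url: Optional[str] = None) -> dict:
    """Return a JSON-serializable body for a veo-3 generation request.

    Hypothetical helper: it only checks the constraints stated in the
    Specs table and does not call the API.
    """
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(
            f"duration must be one of {ALLOWED_DURATIONS}, got {duration}"
        )
    body = {"model": "veo-3", "prompt": prompt, "duration": duration}
    if image_url is not None:
        # Supplying image_url switches the request to first-frame I2V.
        body["image_url"] = image_url
    return body
```

Rejecting an unsupported duration on the client side avoids a round trip for a request the backend enum would not accept.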
Pricing
Pricing is per second of generated video. The Kyma list price is the provider cost × 1.35.
| Variant | Provider $/s | Kyma list $/s | 8s clip |
|---|---|---|---|
| veo-3 | $0.40 | $0.540 | $4.32 |
Live source: GET https://kymaapi.com/v1/pricing.
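The arithmetic behind the pricing table can be reproduced with a short sketch. The function names are illustrative, and the 1.35 multiplier and $0.40 provider rate are the values stated on this page; the live pricing endpoint remains the authoritative source. `Decimal` is used so the results are not affected by binary floating-point rounding.

```python
from decimal import Decimal

# List price = provider cost x 1.35, as stated in the Pricing section.
MARKUP = Decimal("1.35")


def list_price_per_second(provider_per_second: str) -> Decimal:
    """Kyma list price per second for a given provider price per second."""
    return Decimal(provider_per_second) * MARKUP


def clip_cost(provider_per_second: str, duration_s: int) -> Decimal:
    """List cost of one clip: per-second list price times duration."""
    return list_price_per_second(provider_per_second) * duration_s


print(list_price_per_second("0.40"))  # 0.5400
print(clip_cost("0.40", 8))           # 4.3200
```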
Compared to other video models on Kyma
| Strength | veo-3 | veo-3-fast | kling-3-pro-audio | seedance-2-pro |
|---|---|---|---|---|
| Native audio | ★★★★★ | — | ★★★★★ | ★★★★★ |
| Lip-sync quality | ★★★★★ | n/a | ★★★★ | ★★★★ |
| Cost $/8s | $4.32 | $1.08 | $1.81 | $2.43 |
| Resolution | 1080p | 720p | configurable | 720p |
| Best for | hero / talking-head | drafts | cinematic + audio | action + audio |
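One way to use the comparison is to treat it as data and rank the models by cost for a given audio requirement. The sketch below copies the native-audio and cost rows from the table; the dictionary layout and the `models_by_cost` helper are hypothetical and not part of the Kyma API.

```python
from typing import List

# Native-audio support and list cost of an 8s clip, from the table above.
COMPARISON = {
    "veo-3": {"native_audio": True, "cost_8s": 4.32},
    "veo-3-fast": {"native_audio": False, "cost_8s": 1.08},
    "kling-3-pro-audio": {"native_audio": True, "cost_8s": 1.81},
    "seedance-2-pro": {"native_audio": True, "cost_8s": 2.43},
}


def models_by_cost(require_audio: bool) -> List[str]:
    """Return model IDs sorted by 8s clip cost, cheapest first.

    When require_audio is True, models without native audio are dropped.
    """
    candidates = [
        name for name, row in COMPARISON.items()
        if row["native_audio"] or not require_audio
    ]
    return sorted(candidates, key=lambda name: COMPARISON[name]["cost_8s"])


print(models_by_cost(require_audio=True))
# ['kling-3-pro-audio', 'seedance-2-pro', 'veo-3']
```

Cost is only one axis: the table rates veo-3 highest for lip-sync and lists it as the only fixed 1080p option, which this ranking does not capture.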
Use this when
- You need native audio + 1080p + photoreal lip-sync.
- The brief is talking-head, hero brand video, premium cinematic.
- Per-clip cost is acceptable for the deliverable.
Pick something else when
Example — text-to-video with audio
curl -X POST https://kymaapi.com/v1/videos/generations \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "A barista hands a customer a coffee saying \"have a great day\", soft cafe ambience",
    "duration": 8
  }'
Example — image-to-video with audio
curl -X POST https://kymaapi.com/v1/videos/generations \
  -H "Authorization: Bearer $KYMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "The subject turns to camera and waves, friendly background music",
    "image_url": "https://example.com/portrait.jpg",
    "duration": 6
  }'
Generation is asynchronous: poll GET /v1/jobs/{id} until the job reports succeeded (typically ~60–180s). The output is a durable Vercel blob URL.
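A minimal polling sketch, using only the Python standard library, might look like the following. The endpoint path, the bearer-token header, and the `succeeded` state come from this page. The JSON field name `status`, the `failed` state, and the interval and timeout defaults are assumptions made for illustration, so check them against an actual job response before relying on them.

```python
import json
import os
import time
import urllib.request
from typing import Callable

API_BASE = "https://kymaapi.com"


def fetch_job(job_id: str) -> dict:
    """GET /v1/jobs/{id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/v1/jobs/{job_id}",
        headers={"Authorization": f"Bearer {os.environ['KYMA_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(job_id: str,
                 fetch: Callable[[str], dict] = fetch_job,
                 interval_s: float = 5.0,
                 timeout_s: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a job until it succeeds, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(job_id)
        status = job.get("status")  # ASSUMPTION: the state is in "status"
        if status == "succeeded":
            return job
        if status == "failed":  # ASSUMPTION: name of the terminal error state
            raise RuntimeError(f"job {job_id} failed: {job}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        sleep(interval_s)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access. A 300-second default timeout leaves headroom above the ~60–180s latency quoted above.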