Generative Media Quickstart

⁠

Welcome! This guide walks developers through the fastest path to creating images, music, sound effects, voices, and short video clips using the VideoDB Python SDK. It also covers transcript translation, automated dubbing, and YouTube search utilities.

Audience: Python developers who already have a VideoDB account and want to add generative features or localization to their media workflows and applications.

⁠

1. Installation

pip install --upgrade videodb

The SDK supports Python ≥ 3.8 on Linux, macOS, and Windows.

⁠

2. Authentication & First Collection

import videodb

API_KEY = "YOUR_API_KEY"  # Replace with the key from https://console.videodb.io

conn = videodb.connect(api_key=API_KEY)

# Fetch the default collection, where generated assets are stored
coll = conn.get_collection()
print("Connected to collection:", coll.id)

If your organisation uses multiple collections, pass a collection_id argument to get_collection() instead of calling it with no arguments.
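Hard-coding the key is convenient for a first run, but keeping it out of source files is usually safer. The sketch below reads the key from an environment variable before connecting. The variable name VIDEO_DB_API_KEY, the helper names, and the optional collection_id argument to get_collection() are assumptions made for illustration, not something this guide specifies.

```python
import os


def load_api_key(env=None, var_name="VIDEO_DB_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


def connect_from_env(collection_id=None):
    """Connect with the key from the environment and return (conn, coll)."""
    import videodb  # imported lazily so load_api_key() works without the SDK

    conn = videodb.connect(api_key=load_api_key())
    # Assumption: get_collection() accepts an optional collection_id argument.
    if collection_id:
        coll = conn.get_collection(collection_id)
    else:
        coll = conn.get_collection()
    return conn, coll
```

With the variable exported in your shell, conn, coll = connect_from_env() replaces the hard-coded snippet above.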

⁠

At‑a‑Glance Cheat Sheet

# Image

coll.generate_image(prompt, aspect_ratio='1:1', callback_url=None)

# Music

coll.generate_music(prompt, duration=5, callback_url=None)

# SFX

coll.generate_sound_effect(prompt, duration=2, config={}, callback_url=None)

# Voice

coll.generate_voice(text, voice_name='Default', config={}, callback_url=None)

# Dub

coll.dub_video(video_id, language_code, callback_url=None)

# Video

coll.generate_video(prompt, duration=5, callback_url=None)

# YouTube

conn.youtube_search(query, result_threshold=10, duration='medium')

# Translate

video.translate_transcript(language, additional_notes='', callback_url=None)

⁠

3. Generative End‑points

Each generative call is asynchronous: the SDK returns an asset object (Audio, Video, or Image) immediately.

Call .generate_url() (or .play() for video) to fetch the finished file.

Optionally supply a callback_url to receive a webhook when rendering completes.

generate_image()

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | str | Yes | - | Text description of the desired image. |
| aspect_ratio | Literal['1:1','9:16','16:9','4:3','3:4'] \| None | No | '1:1' | Any other ratio raises ValueError. |
| callback_url | str \| None | No | None | Receives a JSON POST when the asset is ready. |


# Returns an Image object
image = coll.generate_image(
    prompt="Green neon jellyfish photography",
    aspect_ratio="9:16",
)
print(image.generate_url())
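Because only five aspect ratios are accepted, it can help to map arbitrary pixel dimensions to the nearest supported value before calling generate_image(). The helper below is an illustrative sketch; its name and the log-distance comparison are choices made here, not part of the SDK.

```python
import math

# The five ratios listed for the aspect_ratio parameter.
SUPPORTED_ASPECT_RATIOS = ("1:1", "9:16", "16:9", "4:3", "3:4")


def closest_aspect_ratio(width, height):
    """Return the supported ratio string closest to width / height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = width / height

    def distance(ratio):
        w, h = ratio.split(":")
        # Compare on a log scale so 2x too wide and 2x too tall count equally.
        return abs(math.log((int(w) / int(h)) / target))

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)
```

For example, coll.generate_image(prompt, aspect_ratio=closest_aspect_ratio(1080, 1920)) requests a portrait image.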

⁠

generate_music()

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | str | Yes | - | Musical style & mood. |
| duration | int | No | 5 | Total seconds. Values <1 or >300 raise ValueError. |
| callback_url | str \| None | No | None | |


# returns Audio object

music = coll.generate_music(prompt="Upbeat electronic background", duration=10)

⁠

generate_sound_effect()

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | str | Yes | - | |
| duration | int | No | 2 | Seconds. Short SFX ≤30 s recommended. |
| config | dict | No | {} | Model‑specific options such as prompt_influence. |
| callback_url | str \| None | No | None | |
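This guide gives no sound-effect example, so here is a minimal sketch of a call that uses every parameter. The prompt text and the prompt_influence value of 0.5 are illustrative assumptions; only the parameter names come from this guide.

```python
def generate_door_creak(coll, callback_url=None):
    """Request a short sound effect and return the resulting Audio object."""
    return coll.generate_sound_effect(
        prompt="Old wooden door creaking open slowly",
        duration=3,  # seconds; short effects of 30 s or less are recommended
        config={"prompt_influence": 0.5},  # assumed value, not a documented default
        callback_url=callback_url,
    )
```

Calling generate_door_creak(coll).generate_url() should then return a link to the finished audio, as with the other asset types.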

⁠

generate_voice() (Text‑to‑Speech)

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| text | str | Yes | - | Up to 5,000 characters. |
| voice_name | str | No | 'Default' | See the voice catalogue (built‑in presets) below. |
| config | dict | No | {} | Provider‑specific keys such as stability, style, similarity_boost. |
| callback_url | str \| None | No | None | |


Config Parameters

config = {
    "stability": 0.0,  # Lower = more emotional variation; higher = more monotone
    "similarity_boost": 1.0,  # Higher = closer match to original voice
    "style": 0.0,  # Higher = more exaggerated speaking style
}

Voice catalogue (built‑in presets)

| Name | Voice Style | Accent | Gender |
| --- | --- | --- | --- |
| Aria | Expressive | American | Female |
| Roger | Confident | American | Male |
| Sarah | Soft | American | Young Female |
| Laura | Upbeat | American | Young Female |
| Charlie | Natural | Australian | Male |
| George | Warm | British | Middle-aged Male |
| Callum | Intense | Transatlantic | Male |
| River | Confident | American | Non-binary |
| Liam | Articulate | American | Young Male |
| Charlotte | Seductive | Swedish | Young Female |
| Alice | Confident | British | Middle-aged Female |
| Matilda | Friendly | American | Middle-aged Female |
| Will | Friendly | American | Young Male |
| Jessica | Expressive | American | Young Female |
| Eric | Friendly | American | Middle-aged Male |
| Chris | Casual | American | Middle-aged Male |
| Brian | Deep | American | Middle-aged Male |
| Daniel | Authoritative | British | Middle-aged Male |
| Lily | Warm | British | Middle-aged Female |
| Bill | Trustworthy | American | Old Male |
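Since text is limited to 5,000 characters, longer scripts need to be split before they are sent to generate_voice(). The sketch below breaks text on whitespace and requests one clip per chunk with a preset voice. The helper names and the config values are illustrative assumptions, and joining the resulting clips is left out.

```python
def split_for_tts(text, max_chars=5000):
    """Split text on whitespace into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def narrate(coll, text, voice_name="Aria", max_chars=5000):
    """Return one Audio object per chunk of text."""
    return [
        coll.generate_voice(
            text=chunk,
            voice_name=voice_name,
            config={"stability": 0.5, "similarity_boost": 0.75},  # illustrative values
        )
        for chunk in split_for_tts(text, max_chars)
    ]
```

Splitting on whitespace keeps words intact; a production version might prefer sentence boundaries so that each clip ends on a natural pause.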

⁠

generate_video()

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | str | Yes | - | |
| duration | int | No | 5 | Must be 5 to 8 seconds inclusive. Invalid values raise ValueError. |
| callback_url | str \| None | No | None | |


# returns a video object

clip = coll.generate_video(prompt="Cinematic lion close‑up", duration=7)

clip.play()
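The duration limits quoted in this section (1 to 300 seconds for music, 5 to 8 seconds for video) can also be checked before a request is made, which gives a clearer error message when the value comes from user input. This is a sketch with hypothetical helper names; the SDK's own validation still applies.

```python
def check_duration(seconds, minimum, maximum, label):
    """Return seconds unchanged if it is an int within [minimum, maximum]."""
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError(f"{label} duration must be an int, got {type(seconds).__name__}")
    if not minimum <= seconds <= maximum:
        raise ValueError(
            f"{label} duration must be between {minimum} and {maximum} seconds, got {seconds}"
        )
    return seconds


def check_music_duration(seconds):
    """Music accepts 1 to 300 seconds."""
    return check_duration(seconds, 1, 300, "music")


def check_video_duration(seconds):
    """Video accepts 5 to 8 seconds inclusive."""
    return check_duration(seconds, 5, 8, "video")
```

A call then reads coll.generate_video(prompt=prompt, duration=check_video_duration(user_value)).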

⁠

4. Dub video

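Dub a video into the language you specify. The call returns a new video object that can be used for downstream tasks.

# `video` is an existing Video in this collection, for example one returned by coll.upload()
dubbed = coll.dub_video(video_id=video.id, language_code="hi")
dubbed.play()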

dub_video() parameters

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| video_id | str | Yes | - | Must belong to caller's collection. |
| language_code | str | Yes | - | ISO 639‑1. Supported languages listed in docs. |
| callback_url | str \| None | No | None | |
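Localization work often needs the same video in several languages. A minimal sketch, assuming each dub_video() call is independent and that the chosen codes are among the supported languages (the helper name is hypothetical):

```python
def dub_into_languages(coll, video_id, language_codes, callback_url=None):
    """Return a dict mapping each language code to its dubbed video object."""
    dubbed = {}
    for code in language_codes:
        dubbed[code] = coll.dub_video(
            video_id=video_id,
            language_code=code,  # ISO 639-1 code such as "hi" or "fr"
            callback_url=callback_url,
        )
    return dubbed
```

For example, dub_into_languages(coll, video.id, ["hi", "fr"])["fr"].play() plays the French version.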

⁠

5. YouTube Utilities

Search YouTube directly from the Python SDK.

results = conn.youtube_search(
    query="learn python programming",
    result_threshold=3,
    duration="long",
)

youtube_search()

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| query | str | Yes | - | |
| result_threshold | int \| None | No | 10 | Max results. None returns all. |
| duration | str | No | 'medium' | Duration filter: short, medium, or long. |


youtube_search() returns a list of dicts containing at minimum title and link keys.
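Because each result is a dict with at least title and link keys, the list can be turned into a readable summary with plain Python. The helper below is an illustrative sketch; it relies only on those two keys and ignores any others.

```python
def summarize_results(results, limit=None):
    """Return numbered 'title - link' lines for a list of search results."""
    lines = []
    for index, item in enumerate(results[:limit], start=1):
        title = item.get("title", "(untitled)")
        link = item.get("link", "")
        lines.append(f"{index}. {title} - {link}")
    return lines
```

Printing "\n".join(summarize_results(results)) lists every hit, and a link taken from a result could be passed to coll.upload(url=...) as in the next section.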

6. Transcript Translation

Upload

video = coll.upload(url="https://youtu.be/…")

video.play()

Index spoken words (required once)

video.index_spoken_words()

Translate transcript

fr_text = video.translate_transcript(language="fr")

translate_transcript() parameters

| Parameter | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| language | str | Yes | - | ISO 639‑1 code. |
| additional_notes | str | No | '' | Style guidance for the model. |
| callback_url | str \| None | No | None | |
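The three steps in this section (upload, index, translate) can be wrapped in one function. This sketch assumes a freshly uploaded video that has not been indexed yet; since indexing is required only once, a video that is already indexed should skip that call. The function name is hypothetical.

```python
def translate_uploaded_video(coll, url, language, additional_notes=""):
    """Upload a video, index its speech, and return the translated transcript."""
    video = coll.upload(url=url)
    video.index_spoken_words()  # required once per video before translating
    return video.translate_transcript(
        language=language,
        additional_notes=additional_notes,
    )
```

The additional_notes argument carries style guidance, for example "Keep product names in English".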

⁠

Callback Workflow (Optional)

All generative calls accept callback_url. VideoDB will POST a JSON payload when processing finishes:

{
  "asset_id": "abc123",
  "status": "completed",
  "url": "https://cdn.videodb.io/..."
}
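A receiver for this callback can be sketched with the Python standard library alone. The example assumes every payload carries the three fields shown in the sample; failure payloads may look different, and a real deployment would add HTTPS and request authentication. The handler, function names, and port are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Fields present in the sample payload.
REQUIRED_KEYS = ("asset_id", "status", "url")


def parse_callback(body):
    """Decode a callback body and check that the expected fields are present."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("callback payload must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"callback payload is missing: {', '.join(missing)}")
    return payload


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_callback(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON
            self.send_response(400)
            self.end_headers()
            return
        print(payload["asset_id"], payload["status"], payload["url"])
        self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Listen for callbacks until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Run serve() on a publicly reachable host and pass that host's URL as callback_url.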

Error Handling & Webhooks

⁠

All generative methods can raise ValueError, VideoDBAPIError, or VideoDBRateLimitError.

400 – invalid parameters

401 – bad API key

429 – rate limit (check Retry-After header)
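Rate-limit errors are usually transient, so retrying with a growing delay is a common response. The helper below is a generic sketch that does not depend on the SDK: the caller passes in the exception types to retry on, and the retry_after attribute it looks for is hypothetical, since this guide does not say how the Retry-After value is exposed.

```python
import time


def call_with_retry(func, retryable=(Exception,), max_attempts=4,
                    base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retryable as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Prefer a server-provided delay if the exception carries one
            # (hypothetical attribute); otherwise wait 1 s, 2 s, 4 s, ...
            delay = getattr(exc, "retry_after", None) or base_delay * 2 ** (attempt - 1)
            sleep(delay)
```

A call such as call_with_retry(lambda: coll.generate_music(prompt="Upbeat electronic background", duration=10), retryable=(VideoDBRateLimitError,)) would retry only on the rate-limit exception named above; its import path is not given in this guide.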

Next Steps

Check out Dynamic Video Streams to create powerful video editing and creation workflow automations.

Check out the Editing Agent on our open-source VideoDB Director framework for inspiration.

🎥 Must-See Tutorials: Check out these powerful demos and see GenAI integration in action:

🌟 Text-to-Movie Generation

🎙️ Integrating AI Voiceovers

🤖 Creating Voice Cloning Agents

Initial usage limits apply. DM us if you'd like additional access to explore fully before making your decision.
