Generate Audio with AI

Generate audio using AI

curl --request POST \
  --url https://api.videodb.io/collection/{collection_id}/generate/audio/ \
  --header 'Content-Type: application/json' \
  --header 'x-access-token: <api-key>' \
  --data '
{
  "prompt": "Generate upbeat background music",
  "audio_type": "music",
  "callback_url": "https://webhook.example.com/callback"
}
'

{
  "success": true,
  "status": "processing",
  "data": {
    "id": "job-123",
    "output_url": "https://api.videodb.io/async-response/job-123"
  }
}
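The same request can be assembled in Python with only the standard library. The following is a minimal sketch, not an official client: `build_generate_audio_request` and `API_BASE` are hypothetical names introduced for illustration, and the function only builds the request object, leaving the actual network call to the caller.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.videodb.io"


def build_generate_audio_request(collection_id, api_key, prompt, audio_type, callback_url=None):
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper for illustration; it is not part of the VideoDB SDK.
    """
    body = {"prompt": prompt, "audio_type": audio_type}
    if callback_url is not None:
        # callback_url is optional in the request body.
        body["callback_url"] = callback_url
    return urllib.request.Request(
        url=f"{API_BASE}/collection/{collection_id}/generate/audio/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-access-token": api_key,
        },
        method="POST",
    )


req = build_generate_audio_request(
    collection_id="default",
    api_key="<api-key>",
    prompt="Generate upbeat background music",
    audio_type="music",
    callback_url="https://webhook.example.com/callback",
)
print(req.get_method(), req.full_url)
# To send it, pass the request to urllib.request.urlopen(req).
```

Keeping construction separate from sending makes the request easy to inspect or log before any credits are spent.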

POST /collection/{collection_id}/generate/audio/

Generate music, sound effects, and AI voices from text. The SDK provides specialized methods for each audio type with appropriate parameters.

Music Generation

Create background music and ambient soundtracks from text prompts.

import videodb

conn = videodb.connect(api_key="your_api_key")
coll = conn.get_collection()

music = coll.generate_music(
    prompt="Upbeat electronic dance music",
    duration=10
)

print(music.id)
print(music.generate_url())

Sound Effect Generation

Generate sound effects and environmental audio from descriptions.

import videodb

conn = videodb.connect(api_key="your_api_key")
coll = conn.get_collection()

sfx = coll.generate_sound_effect(
    prompt="Heavy rain with distant thunder",
    duration=5
)

print(sfx.id)
print(sfx.generate_url())

Voice Generation

Convert text to natural-sounding speech in multiple voices.

import videodb

conn = videodb.connect(api_key="your_api_key")
coll = conn.get_collection()

voice = coll.generate_voice(
    text="Welcome to our video demo",
    voice_name="Default"
)

print(voice.id)
print(voice.generate_url())

Default duration is 5 seconds for music and 2 seconds for sound effects.
Audio IDs have an a- prefix (e.g., a-a1b2c3d4).
Generation is asynchronous; set callback_url to receive a webhook notification.
Every generation method returns an Audio object with a generate_url() method.
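When no webhook is available, an asynchronous job can be polled instead. The sketch below assumes, without confirmation from this page, that fetching the job's output_url returns the same envelope with a "status" of processing, done, or failed. `wait_for_job` and its parameters are hypothetical names; the fetch function is injected so the loop can be exercised without network access.

```python
import time


def wait_for_job(fetch_status, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until the job reports "done" or "failed".

    fetch_status is any callable returning a dict with a "status" key.
    Assumption: the status values match the documented response enum.
    """
    for attempt in range(max_attempts):
        payload = fetch_status()
        status = payload.get("status")
        if status == "done":
            return payload
        if status == "failed":
            raise RuntimeError(f"audio generation failed: {payload}")
        # Still processing: wait before the next attempt, except after the last one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still processing after {max_attempts} attempts")


# Demo with canned responses in place of real HTTP calls.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "done", "data": {"id": "job-123"}},
])
result = wait_for_job(lambda: next(_responses), sleep=lambda seconds: None)
print(result["status"])
```

In real use, fetch_status would wrap an authenticated GET to the job's output_url; a webhook via callback_url avoids the polling traffic entirely.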

Generative Media Guide

Learn about all AI generation capabilities

Voiceovers Tutorial

Add narration to silent footage with AI voices

Authorizations

x-access-token (string, header, required): API key for authentication (sk-xxx format)

Path Parameters

collection_id (string, required). Example: "default"

Body

application/json

prompt (string, required). Example: "Generate upbeat background music"

audio_type (enum<string>, required). Available options: speech, sound_effect, music. Example: "music"

callback_url (string). Example: "https://webhook.example.com/callback"
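The body rules above can be checked on the client before a request is sent. This is a minimal sketch under stated assumptions: `validate_generate_audio_body` is a hypothetical helper, and treating an empty or whitespace-only prompt as invalid is an assumption, since the reference only says that prompt is a required string.

```python
# Allowed values copied from the audio_type enum in this reference.
ALLOWED_AUDIO_TYPES = ("speech", "sound_effect", "music")


def validate_generate_audio_body(body):
    """Return a list of problems found in a request body dict (empty if none)."""
    errors = []

    prompt = body.get("prompt")
    # Assumption: an empty prompt is rejected, although the docs only say "required".
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required and must be a non-empty string")

    audio_type = body.get("audio_type")
    if audio_type not in ALLOWED_AUDIO_TYPES:
        errors.append(
            "audio_type is required and must be one of: " + ", ".join(ALLOWED_AUDIO_TYPES)
        )

    callback_url = body.get("callback_url")
    # callback_url is optional, but must be a string when present.
    if callback_url is not None and not isinstance(callback_url, str):
        errors.append("callback_url must be a string when provided")

    return errors


print(validate_generate_audio_body({"prompt": "Heavy rain", "audio_type": "sound_effect"}))
print(validate_generate_audio_body({"prompt": "", "audio_type": "podcast"}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.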

Response

200 - application/json

Audio generation started

success (boolean). Example: true

status (enum<string>). Available options: processing, done, failed. Example: "processing"

data (object)
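A response of this shape can be read into a small typed record. The sketch below is illustrative only: `GenerationJob` and `parse_generation_response` are hypothetical names, and the `id` and `output_url` fields inside `data` are taken from the example response on this page rather than from a documented schema, so both are treated as optional.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Status values copied from the response enum in this reference.
KNOWN_STATUSES = ("processing", "done", "failed")


@dataclass
class GenerationJob:
    success: bool
    status: str
    job_id: Optional[str]
    output_url: Optional[str]


def parse_generation_response(raw):
    """Parse the JSON text of a 200 response into a GenerationJob."""
    payload = json.loads(raw)
    status = payload.get("status")
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    # data may be missing or null; fall back to an empty dict.
    data = payload.get("data") or {}
    return GenerationJob(
        success=bool(payload.get("success")),
        status=status,
        job_id=data.get("id"),
        output_url=data.get("output_url"),
    )


example = """
{
  "success": true,
  "status": "processing",
  "data": {
    "id": "job-123",
    "output_url": "https://api.videodb.io/async-response/job-123"
  }
}
"""
job = parse_generation_response(example)
print(job.status, job.job_id, job.output_url)
```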
