
# AI Voiceovers

> Add professional narration to silent footage

> **Case: Automatically Creating Voiceover for Silent Footage of the Underwater World**

<a href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/examples/AI_Voiceover.ipynb" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" noZoom />
</a>

## Overview

Voiceovers turn silent footage into stories. Narration adds context, emotion, and pacing that the visuals alone cannot carry.

Traditionally, this workflow required stitching together multiple tools: one for script writing (LLM), one for voice generation (TTS), and another for video editing.

**VideoDB** simplifies this by bringing everything under one roof. In this tutorial, we will:

1. **Upload** a silent video.
2. **Analyze** the video to understand its visual content.
3. **Generate** a narration script using VideoDB's text generation.
4. **Generate** a professional AI voiceover using VideoDB's voice generation.
5. **Merge** them instantly into a final video.

## Setup

### Installing VideoDB

<CodeGroup>
  ```python Python theme={null}
  !pip install videodb
  ```

  ```javascript Node.js theme={null}
  npm install videodb
  ```
</CodeGroup>

### API Keys

<Note>
  You only need a VideoDB API key, which you can get from the [VideoDB Console](https://console.videodb.io). The first 50 uploads are free, and no credit card is required.
</Note>

## Implementation

### Step 1: Connect to VideoDB

Connect to VideoDB using your API key to establish a session.

<CodeGroup>
  ```python Python theme={null}
  import videodb

  # Set your API key
  api_key = "your_api_key"

  # Connect to VideoDB
  conn = videodb.connect(api_key=api_key)
  coll = conn.get_collection()
  ```

  ```javascript Node.js theme={null}
  import { connect } from 'videodb';

  // Connect to VideoDB
  const conn = await connect({ apiKey: process.env.VIDEO_DB_API_KEY });
  const coll = await conn.getCollection();
  ```
</CodeGroup>
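
The Python snippet above hard-codes the key, while the Node.js snippet reads it from `process.env.VIDEO_DB_API_KEY`. A minimal sketch of the same approach in Python, assuming the key is stored in an environment variable with that name. `load_api_key` is a hypothetical helper written for this example, not part of the VideoDB SDK.

```python
import os


def load_api_key(env=None, name="VIDEO_DB_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error.

    Hypothetical helper for illustration; not part of the VideoDB SDK.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage (assumes the variable is set in your shell or notebook):
# conn = videodb.connect(api_key=load_api_key())
```

Keeping the key out of the notebook makes it safer to share the file or commit it to version control.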

### Step 2: Upload Video

We'll upload the silent underwater footage directly from YouTube.

<CodeGroup>
  ```python Python theme={null}
  # Upload a video by URL
  video = coll.upload(url='https://youtu.be/RcRjY5kzia8')
  ```

  ```javascript Node.js theme={null}
  // Upload a video by URL
  const video = await coll.uploadURL({ url: 'https://youtu.be/RcRjY5kzia8' });
  ```
</CodeGroup>

### Step 3: Analyze Visuals

We need to know what is happening in the video to write a script for it. We'll use `index_scenes()` to analyze the visual content.

<CodeGroup>
  ```python Python theme={null}
  video_scenes_id = video.index_scenes()
  ```

  ```javascript Node.js theme={null}
  const videoScenesId = await video.indexScenes();
  ```
</CodeGroup>

Let's view the description of the first scene in the video:

<CodeGroup>
  ```python Python theme={null}
  video_scenes = video.get_scene_index(video_scenes_id)

  import json
  print(json.dumps(video_scenes[0], indent=2))
  ```

  ```javascript Node.js theme={null}
  const videoScenes = await video.getSceneIndex(videoScenesId);

  console.log(JSON.stringify(videoScenes[0], null, 2));
  ```
</CodeGroup>

**Output:**

```json theme={null}
{
  "description": "The scene immerses the viewer in a vibrant, fluid expanse dominated by myriad blue and aqua forms. These countless, somewhat irregular shapes are densely packed, giving the impression of an immense, teeming mass in constant, gentle motion. Each form possesses a darker core that gradually lightens towards its edges, creating a translucent, almost glowing effect, as if illuminated from within. The varying shades, ranging from deep sapphire to brilliant turquoise, blend and shift across the frame, conjuring the image of a vast underwater environment. It evokes a colossal school of luminous marine creatures, perhaps fish or jellyfish, drifting together in a mesmerizing, organic dance, filling the visual field with their shimmering presence and dynamic, watery energy.",
  "end": 15.033,
  "metadata": {},
  "scene_metadata": {},
  "start": 0.0
}
```
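
The record above has `start`, `end`, and `description` keys. Assuming every entry in `video_scenes` has that same shape, a small sketch like the one below prints a compact overview of all scenes before you build the prompt. `summarize_scenes` is an illustrative helper, and the sample data is made up.

```python
def summarize_scenes(scenes, max_chars=60):
    """Return one line per scene: time range, duration, truncated description."""
    lines = []
    for i, scene in enumerate(scenes):
        start, end = float(scene["start"]), float(scene["end"])
        text = scene["description"]
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + "..."
        lines.append(f"[{i}] {start:.1f}s-{end:.1f}s ({end - start:.1f}s) {text}")
    return lines


# Made-up sample with the same keys as the record shown above
sample = [
    {"start": 0.0, "end": 15.033, "description": "A dense school of blue fish."},
    {"start": 15.033, "end": 30.0, "description": "A coral reef in soft light."},
]
for line in summarize_scenes(sample):
    print(line)
```

In the tutorial you would pass `video_scenes` instead of `sample`.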

### Step 4: Generate Script

Now, we use VideoDB's `generate_text` method to write a voiceover script based on the scene descriptions we just retrieved.

<CodeGroup>
  ```python Python theme={null}
  # Construct a prompt with the scene context
  scene_context = "\n".join([f"- {scene['description']}" for scene in video_scenes])

  prompt = f"""
  Here is a visual description of a video about the underwater world:
  {scene_context}

  Based on this, write a short, engaging voiceover script in the style of a nature documentary narrator (like David Attenborough).
  Keep it synced to the flow of the visuals described.
  Return ONLY the raw text of the narration, no stage directions or titles.
  """

  # Generate the script using VideoDB
  script_response = coll.generate_text(
      prompt=prompt,
      model_name="pro")

  script_text = script_response["output"]

  print("--- Generated Script ---")
  print(script_text)
  ```

  ```javascript Node.js theme={null}
  // Construct a prompt with the scene context
  const sceneContext = videoScenes.map(scene => `- ${scene.description}`).join('\n');

  const prompt = `
  Here is a visual description of a video about the underwater world:
  ${sceneContext}

  Based on this, write a short, engaging voiceover script in the style of a nature documentary narrator (like David Attenborough).
  Keep it synced to the flow of the visuals described.
  Return ONLY the raw text of the narration, no stage directions or titles.
  `;

  // Generate the script using VideoDB
  const scriptResponse = await coll.generateText(
      prompt,
      "pro"
  );

  const scriptText = scriptResponse.output;

  console.log("--- Generated Script ---");
  console.log(scriptText);
  ```
</CodeGroup>
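
The prompt asks for a "short" script but gives the model no length target, so the narration may run longer than the footage. One way to address this, sketched below, is to turn the video duration into a word budget. The rate of 150 words per minute is an assumption about typical narration pace, not a VideoDB setting, and both function names are hypothetical.

```python
def word_budget(duration_seconds, words_per_minute=150):
    """Approximate number of words that fit in `duration_seconds` of narration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return max(1, int(duration_seconds * words_per_minute / 60))


def length_hint(duration_seconds, words_per_minute=150):
    """Build a sentence that can be appended to the prompt."""
    budget = word_budget(duration_seconds, words_per_minute)
    return f"Keep the narration under {budget} words."


print(length_hint(60))
```

In this tutorial you could pass `float(video.length)` as the duration and append the returned sentence to `prompt` before calling `generate_text`. The actual speaking rate depends on the voice, so treat the budget as a rough guide.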

### Step 5: Generate Voiceover Audio

We can now turn that text into speech using `generate_voice`. This returns an Audio object directly, so we don't need to save or upload files manually.

<CodeGroup>
  ```python Python theme={null}
  # Generate speech directly as a VideoDB Audio Asset
  audio = coll.generate_voice(
      text=script_text,
      voice_name="Default")

  print(f"Generated Audio Asset ID: {audio.id}")
  ```

  ```javascript Node.js theme={null}
  // Generate speech directly as a VideoDB Audio Asset
  const audio = await coll.generateVoice(
      scriptText,
      "Default"
  );

  console.log(`Generated Audio Asset ID: ${audio.id}`);
  ```
</CodeGroup>
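
Even when the prompt asks for raw text only, a language model may still return bracketed stage directions or markdown emphasis, which a text-to-speech voice might read aloud. A minimal cleanup sketch, assuming you want to strip those before generating the voice. `clean_script` is a hypothetical helper, and its rules are deliberately simple.

```python
import re


def clean_script(text):
    """Strip common formatting leftovers from an LLM-written narration."""
    text = re.sub(r"\[[^\]]*\]", "", text)  # remove [stage directions]
    text = re.sub(r"[*_#`]+", "", text)     # remove markdown emphasis and heading marks
    text = re.sub(r"\s+", " ", text)        # collapse newlines and repeated spaces
    return text.strip()


raw = "**[soft music]** Beneath the waves,\n  life drifts.\n"
print(clean_script(raw))
```

You could apply it to `script_text` before the `generate_voice` call. Review the result first, since these patterns would also remove brackets or underscores that belong in the narration.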

### Step 6: Compose the Video

We have the video and the generated voiceover. Now we merge them using the Timeline Editor.

<CodeGroup>
  ```python Python theme={null}
  from videodb.editor import Timeline, Track, Clip, VideoAsset, AudioAsset

  # Create a timeline
  timeline = Timeline(conn)

  # 1. Create a Video Track
  video_track = Track()
  video_asset = VideoAsset(id=video.id)
  # Add the video clip
  video_clip = Clip(asset=video_asset, duration=float(video.length))
  video_track.add_clip(0, video_clip)

  # 2. Create an Audio Track for the voiceover
  audio_track = Track()
  # Use the audio object we generated in Step 5
  audio_asset = AudioAsset(id=audio.id)
  audio_clip = Clip(asset=audio_asset, duration=float(audio.length))
  audio_track.add_clip(0, audio_clip)

  # Add tracks to timeline
  timeline.add_track(video_track)
  timeline.add_track(audio_track)
  ```

  ```javascript Node.js theme={null}
  import { EditorTimeline, Track, Clip, EditorVideoAsset, EditorAudioAsset } from 'videodb';

  // Create a timeline
  const timeline = new EditorTimeline(conn);

  // 1. Create a Video Track
  const videoTrack = new Track();
  const videoAsset = new EditorVideoAsset({ id: video.id });
  // Add the video clip
  const videoClip = new Clip({ asset: videoAsset, duration: parseFloat(video.length) });
  videoTrack.addClip(0, videoClip);

  // 2. Create an Audio Track for the voiceover
  const audioTrack = new Track();
  // Use the audio object we generated in Step 5
  const audioAsset = new EditorAudioAsset({ id: audio.id });
  const audioClip = new Clip({ asset: audioAsset, duration: parseFloat(audio.length) });
  audioTrack.addClip(0, audioClip);

  // Add tracks to timeline
  timeline.addTrack(videoTrack);
  timeline.addTrack(audioTrack);
  ```
</CodeGroup>
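
The timeline above uses the full length of both clips. If the generated narration turns out longer than the footage, the audio track would outlast the video track. A minimal sketch of one way to handle that, assuming you prefer to cut the narration at the end of the video. `clip_durations` is a hypothetical helper, not part of the VideoDB editor API.

```python
def clip_durations(video_length, audio_length):
    """Return (video_duration, audio_duration) with audio capped at the video length."""
    video_length = float(video_length)
    audio_length = float(audio_length)
    if video_length <= 0 or audio_length <= 0:
        raise ValueError("lengths must be positive")
    return video_length, min(audio_length, video_length)


# Narration 5 s longer than the footage: the audio duration is capped
print(clip_durations("60.0", "65.0"))
# Narration shorter than the footage: the audio duration is left unchanged
print(clip_durations(60.0, 42.5))
```

The two returned values can be passed as the `duration` arguments of the video and audio `Clip(...)` calls. Capping cuts the narration mid-sentence, so regenerating a shorter script is often the better fix when the overrun is large.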

### Step 7: Review and Share

Generate the final stream URL and watch your AI-narrated video!

<CodeGroup>
  ```python Python theme={null}
  from videodb import play_stream

  stream_url = timeline.generate_stream()
  play_stream(stream_url)
  ```

  ```javascript Node.js theme={null}
  const streamUrl = await timeline.generateStream();
  console.log(streamUrl);
  ```
</CodeGroup>

**Output:**

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/gsU14KgORgg" title="AI-Narrated Underwater Video" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />

## Conclusion

Congratulations! You have automated voiceover creation with VideoDB, going from raw silent footage and a single prompt to a narrated video.

To improve the quality and accuracy of the narration, experiment with different prompts and scene analysis techniques.

<Card icon="notebook" title="Explore Full Notebook" href="https://colab.research.google.com/github/video-db/videodb-cookbook/blob/main/examples/AI_Voiceover.ipynb">
  Open the complete implementation in Google Colab with all code examples.
</Card>

## Related Tutorials

<CardGroup cols={2}>
  <Card title="Video Dubbing" icon="globe" href="/examples-and-tutorials/content-factory/dubbing">
    Dub videos into multiple languages with AI voice synthesis
  </Card>

  <Card title="Faceless Video Creator" icon="wand" href="/examples-and-tutorials/content-factory/faceless-video-creator">
    Build complete faceless videos with AI scripts, voiceovers, and multi-layer composition
  </Card>
</CardGroup>
