VideoDB Documentation

Adding AI Generated Voiceovers with VideoDB and LOVO

Case: Automatically Creating a Fun & Chirpy Voiceover for Silent Footage of the Underwater World

Overview

Crafting AI content and blending results from multiple tools can be difficult and time-intensive, and doing it all manually is harder still. But what if you could accomplish everything with just a few lines of code?

VideoDB integrates with AI services such as OpenAI and LOVO, which makes it straightforward to create engaging AI-generated voiceovers. This tutorial shows how to use the VideoDB platform to add dynamic AI-generated narration to silent footage.

Let's begin a fun experiment: creating an exciting narration for this footage, all within the VideoDB environment.

Setup

📦 Installing packages

Ensure you have the necessary packages installed:

%pip install openai

%pip install videodb

🔑 API Keys

Before proceeding, make sure you have API keys for VideoDB, OpenAI, and LOVO. If you don't, sign up for API access on the respective platforms.

Get your VideoDB API key from the VideoDB Console. (Free for the first 50 uploads, no credit card required.) 🎉

import os

os.environ["OPENAI_API_KEY"] = ""

os.environ["LOVO_API_KEY"] = ""

os.environ["VIDEO_DB_API_KEY"] = ""
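Because the keys above are left blank as placeholders, it can help to confirm they are filled in before making any API calls. The following is a minimal sketch; the helper name `missing_keys` is hypothetical and not part of any of the three SDKs.

```python
import os

# The three environment variables this tutorial sets above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "LOVO_API_KEY", "VIDEO_DB_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# Report which keys still need to be filled in (hypothetical check).
missing = missing_keys(os.environ)
if missing:
    print("Missing API keys:", ", ".join(missing))
```

Running this check right after setting the variables surfaces an empty key early, rather than as an authentication error in a later step.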

Note for Free-tier Users of LOVO

Users without a paid LOVO plan might run into issues when using the API. If the LOVO API does not work as expected, please reach out to their support team.

🎙️ LOVO's Speaker ID

You will also need the LOVO Speaker ID of the voice you want to use.

For this demo, we'll choose a cheerful voice from LOVO's Voice Library. You can choose a different one based on the style of voiceover you wish to create.

Check out the LOVO Guide if you want to select a custom speaker.

speaker_id = "640f477d2babeb0024be422b"

Implementation

🌐 Step 1: Connect to VideoDB

Begin by establishing a connection to VideoDB using your API key:

from videodb import connect

# Connect to VideoDB using your API key

conn = connect()

coll = conn.get_collection()

🎥 Step 2: Upload Video

# Upload a video by URL (replace the url with your video)

video = conn.upload(url='https://youtu.be/RcRjY5kzia8')

🔍 Step 3: Analyze Scenes and Generate Scene Descriptions

Start by analyzing the scenes within your video using VideoDB's scene indexing capabilities. This will provide context for generating the script prompt.

video.index_scenes()

Let's view the description of the first scene in the video:

scenes = video.get_scenes()

print(f"{scenes[0]['start']} - {scenes[0]['end']}")

print(scenes[0]["response"])

Output:

0 - 9.033333333333333

The image presents a close-up, textured pattern reminiscent of organic forms. Dominated by a cool color palette of blue and turquoise hues, it gives the impression of looking at a magnified cluster of cells, crystalline structures, or perhaps a zoomed-in portion of aquatic flora. The shapes appear irregular yet symmetrical, like petals or leaves clustered tightly together. Light and shadow play across the surfaces, creating a dynamic interplay that suggests depth and complexity. The overall effect is one of natural beauty, with a soothing, almost hypnotic visual rhythm. The image is abstract enough to allow for multiple interpretations, depending on the viewer's perspective.

🎤 Step 4: Create Voiceover Script with LLM

Combine the scene descriptions with a script prompt that instructs the LLM to create a playful narration suitable for kids (the example prompt below asks for the voice and style of Santa Claus).

This script prompt can be refined and tweaked to generate the most suitable output. Check out these examples to explore more use cases.

import openai

client = openai.OpenAI()

script_prompt = "Here's the data from a scene index for a video about the underwater world. Study this and then generate a synced script based on the description below. Make sure the script is in the language, voice and style of Santa Claus"

# Append every scene from the scene index to the prompt
full_prompt = script_prompt + "\n\n"
for scene in scenes:
    full_prompt += f"- {scene}\n"

# Ask the LLM to write the voiceover script
openai_res = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": full_prompt}],
)
voiceover_script = openai_res.choices[0].message.content
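The loop above appends each scene dictionary verbatim. As an alternative sketch, the prompt could include only the time range and description of each scene, using the `start`, `end`, and `response` keys shown in Step 3. The helper name `build_prompt` and the sample scene are illustrative assumptions, not part of the VideoDB SDK.

```python
def build_prompt(script_prompt, scenes):
    """Join the instruction with one '[start - end] description' line per scene."""
    lines = [script_prompt, ""]
    for scene in scenes:
        # float() accepts both numeric and string timestamps.
        start = float(scene["start"])
        end = float(scene["end"])
        lines.append(f"- [{start:.1f}s - {end:.1f}s] {scene['response']}")
    return "\n".join(lines)


# Illustrative data shaped like the Step 3 output (not real API output).
sample_scenes = [
    {"start": 0, "end": 9.033333333333333, "response": "A close-up, textured pattern."},
]
print(build_prompt("Write a narration for these scenes.", sample_scenes))
```

A compact format like this keeps the prompt shorter and gives the model explicit timestamps to sync against.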

🎤 Step 5: Generate Voiceover Audio with LOVO

Due to a limitation of the LOVO API, we split the voiceover script into chunks of up to 500 characters each.

chunk_size = 500

chunks = [voiceover_script[i:i+chunk_size] for i in range(0, len(voiceover_script), chunk_size)]
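The slicing above cuts at fixed character offsets, so a chunk can end in the middle of a word. The sketch below shows one possible alternative that breaks on whitespace while keeping every chunk within the limit. The function name `chunk_text` is hypothetical, and the approach collapses runs of whitespace (including newlines) into single spaces.

```python
def chunk_text(text, chunk_size=500):
    """Split text into chunks of at most chunk_size characters, breaking on whitespace."""
    chunks = []
    current = ""
    for word in text.split():
        # Fallback: a single word longer than chunk_size is hard-split.
        while len(word) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:chunk_size])
            word = word[chunk_size:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("aaa bbb ccc", chunk_size=7))
```

With this helper, `chunks = chunk_text(voiceover_script, chunk_size)` could replace the list comprehension, so each TTS request starts and ends on a word boundary.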

Use the generated script to synthesize a cheerful, fun narration for kids with LOVO's API:

import requests
import time

# Call the LOVO API to generate the voiceover
url = "https://api.genny.lovo.ai/api/v1/tts"
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "X-API-KEY": os.environ.get("LOVO_API_KEY"),
}

outputs = []

# Initiate a TTS job for each chunk
for chunk in chunks:
    payload = {"text": chunk, "speaker": speaker_id}
    lovo_res = requests.request("POST", url, json=payload, headers=headers)
    print(lovo_res.json())
    job_id = lovo_res.json()["id"]
    outputs.append({"job_id": job_id})

# Poll the API until each job has succeeded.
# Note: this loop waits indefinitely if a job never reaches "succeeded".
poll_time = 1  # seconds between status checks
for output in outputs:
    completed = False
    while not completed:
        lovo_res = requests.request("GET", f"{url}/{output['job_id']}", headers=headers)
        job = lovo_res.json()["data"][0]
        completed = job["status"] == "succeeded"
        if completed:
            output["audio_url"] = job["urls"][0]
        else:
            time.sleep(poll_time)
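The polling loop above only stops when a job reports `succeeded`. As a hedged sketch, a small generic helper could also stop on failure or after a timeout. The name `poll_until` is hypothetical, and the `"failed"` and `"in_progress"` status values used in the simulation are assumptions for illustration; only `"succeeded"` appears in the code above.

```python
import time


def poll_until(fetch_status, is_done, is_failed, interval=1.0, timeout=120.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until is_done(result) is true; raise on failure or timeout."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if is_done(result):
            return result
        if is_failed(result):
            raise RuntimeError(f"Job failed: {result}")
        if clock() >= deadline:
            raise TimeoutError("Job did not finish before the timeout")
        sleep(interval)


# Simulated job that succeeds on the third check (stand-in for a real status request).
responses = iter([
    {"status": "in_progress"},
    {"status": "in_progress"},
    {"status": "succeeded"},
])
final = poll_until(
    fetch_status=lambda: next(responses),
    is_done=lambda r: r["status"] == "succeeded",
    is_failed=lambda r: r["status"] == "failed",
    interval=0.0,
)
print(final["status"])
```

In the tutorial's loop, `fetch_status` would wrap the GET request and return the parsed job entry, so a stuck or failed job raises an error instead of blocking the notebook.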

🎬 Step 6: Add Voiceover to Video with VideoDB

To use the voiceover generated above, first upload the audio files to VideoDB:

for output in outputs:
    audio = coll.upload(url=output["audio_url"])
    output["audio_id"] = audio.id
    output["audio_length"] = float(audio.length)
    print("Audio Uploaded with id", audio.id)

Finally, add the AI-generated voiceover to the original footage using the timeline feature:

from videodb.timeline import Timeline
from videodb.asset import VideoAsset, AudioAsset

# Create a timeline object
timeline = Timeline(conn)

# Add the video asset to the timeline for playback
video_asset = VideoAsset(asset_id=video.id)
timeline.add_inline(asset=video_asset)

# Overlay each audio clip right after the previous one
seek = 0
for output in outputs:
    audio_asset = AudioAsset(asset_id=output["audio_id"])
    timeline.add_overlay(start=seek, asset=audio_asset)
    seek += output['audio_length']
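The `seek` variable above is a running total of the clip lengths, so each overlay starts where the previous one ends. The sketch below isolates that arithmetic so it can be checked without any API calls. The function name `overlay_starts` and the sample durations are illustrative assumptions.

```python
from itertools import accumulate


def overlay_starts(audio_lengths):
    """Return the start time of each clip when clips are played back to back."""
    # accumulate with initial=0 yields 0, l1, l1+l2, ...; drop the final total.
    return list(accumulate(audio_lengths, initial=0))[:-1]


lengths = [12.5, 10.0, 7.25]  # illustrative clip durations in seconds
print(overlay_starts(lengths))  # [0, 12.5, 22.5]
```

Comparing the sum of the clip lengths with the duration of the video is a quick way to spot a narration that runs longer than the footage.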
