videodb
VideoDB Documentation

Overlay a Word-Counter on Video Stream

Introduction

With an endless stream of new video content on our feeds, engaging the audience with dynamic visual elements can make educational and promotional videos much more impactful. VideoDB's suite of features allows you to enhance videos with programmatic editing.
In this tutorial, we'll explore how to create a video that visually counts and displays instances of a specified word as it's spoken. We'll use VideoDB to index the spoken words and search them by keyword, then apply audio and text overlays to show a counter that updates in real time, with synchronized audio cues.

Setup

📦 Installing packages

%pip install videodb

🔑 API Keys

Before proceeding, make sure you have access to VideoDB and have set up your API key.
Get your API key from VideoDB. (Free for the first 50 uploads, no credit card required.) 🎉
import os

os.environ["VIDEO_DB_API_KEY"] = ""
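A blank or missing key tends to surface only later, as an authentication error. The sketch below shows one way to fail early instead. The helper name `require_api_key` is hypothetical (it is not part of the VideoDB SDK); it only assumes the key is stored in `VIDEO_DB_API_KEY`, as above.

```python
import os


def require_api_key(env=os.environ, name="VIDEO_DB_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add your VideoDB API key first")
    return key
```

Calling `require_api_key()` once before connecting turns a silent misconfiguration into an immediate, readable error.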

Steps

🌐 Step 1: Connect to VideoDB

Import connect from the VideoDB library and establish a session for uploading videos. Called with no arguments, connect() reads the API key from the VIDEO_DB_API_KEY environment variable set above.
from videodb import connect

conn = connect()
coll = conn.get_collection()

🗳️ Step 2: Upload Video

Upload the video and play it to confirm it loaded correctly. The YouTube video at the URL below is the one used throughout this tutorial.
video = coll.upload(url="https://www.youtube.com/watch?v=Js4rTM2Z1Eg")
video.play()

📝 Step 3: Indexing Spoken Words

Index the video to identify and timestamp all spoken words.
video.index_spoken_words()

🔍 Step 4: Keyword Search

Search within the video for the keyword ("education" in this example), and note each occurrence.
from videodb import SearchType

result = video.search(query="education", search_type=SearchType.keyword)
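Conceptually, a keyword search over indexed speech yields the time spans in which the word is spoken. The sketch below illustrates that idea on a hand-made transcript. The `Word` type and `find_keyword` helper are hypothetical and purely illustrative; this is not how VideoDB implements search.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # the spoken token
    start: float  # start time in seconds
    end: float    # end time in seconds


def find_keyword(words, keyword):
    """Return (start, end) spans of every word matching `keyword`, ignoring case and punctuation."""
    target = keyword.lower()
    return [
        (w.start, w.end)
        for w in words
        if w.text.lower().strip(".,!?;:\"'") == target
    ]
```

Each returned span plays the role that a shot's `start` and `end` play in the steps that follow.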


🎼 Step 5: Setup Timeline and Audio

Initialize the timeline and retrieve an audio asset to use for each word occurrence.
from videodb.timeline import Timeline
from videodb.asset import AudioAsset
from videodb import MediaType

timeline = Timeline(conn)

audio = conn.upload(url="https://github.com/video-db/videodb-cookbook-assets/raw/main/audios/twink.mp3", media_type=MediaType.audio)

audio_asset = AudioAsset(
    asset_id=audio.id,
    start=0,
    end=1.7,
    disable_other_tracks=False,
    fade_in_duration=1,
    fade_out_duration=0,
)


💬 Step 6: Overlay Text and Audio

Add text and audio overlays at each instance where the word is spoken.
Note: The padding is optional. It adds a little context before and after each identified instance, which results in a better compiled output.
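To make the effect of padding concrete, here is a minimal sketch of how a padded window can be computed for one occurrence. The helper `padded_window` and its optional `video_length` parameter are assumptions for illustration, not part of the VideoDB SDK. Clamping keeps the window inside the source video when the word is spoken very near the beginning or end.

```python
def padded_window(start, end, padding, video_length=None):
    """Widen [start, end] by `padding` seconds on each side, clamped to the video bounds."""
    lo = max(start - padding, 0.0)
    hi = end + padding
    if video_length is not None:
        hi = min(hi, video_length)
    return lo, hi
```

A word spoken half a second into the video would otherwise produce a negative start time once 1.5 seconds of padding is subtracted.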
from videodb.asset import TextAsset, TextStyle, VideoAsset, AudioAsset

seeker = 0  # current position on the compiled timeline, in seconds
counter = 0  # number of occurrences shown so far
padding = 1.5  # seconds of context added before and after each shot

for shot in result.shots:
    # Clamp so the padding never pushes the clip start below 0
    clip_start = max(shot.start - padding, 0)
    clip_end = shot.end + padding
    duration = clip_end - clip_start

    # VideoAsset for each Shot
    video_asset = VideoAsset(
        asset_id=shot.video_id, start=clip_start, end=clip_end
    )

    # TextAsset that displays the running count
    counter += 1
    text_asset = TextAsset(
        text=f"Count-{counter}",
        duration=duration,
        style=TextStyle(
            font="Do Hyeon",
            fontsize="(h/10)",
            x="w-1.5*text_w",
            y="0+(2*text_h)",
            fontcolor="#000100",
            box=True,
            boxcolor="F702A4",
        ),
    )

    timeline.add_inline(asset=video_asset)
    # The text covers the whole clip; the audio cue starts where the word is spoken
    timeline.add_overlay(asset=text_asset, start=seeker)
    timeline.add_overlay(asset=audio_asset, start=seeker + (shot.start - clip_start))

    seeker += duration
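The bookkeeping in this loop is easier to follow in isolation. The sketch below reproduces only the arithmetic, with no VideoDB calls: `plan_overlays` is a hypothetical helper that takes plain `(start, end)` tuples in place of shots and assumes the text should cover each padded clip while the audio cue fires `padding` seconds into it.

```python
def plan_overlays(shots, padding=1.5):
    """Compute where each clip, label, and audio cue lands on the compiled timeline.

    `shots` is a list of (start, end) tuples in source-video seconds.
    """
    seeker = 0.0  # running position on the compiled timeline
    plan = []
    for count, (start, end) in enumerate(shots, start=1):
        duration = (end - start) + 2 * padding
        plan.append(
            {
                "label": f"Count-{count}",
                "clip_at": seeker,             # clip and text overlay begin here
                "audio_at": seeker + padding,  # cue lands where the word is spoken
                "duration": duration,
            }
        )
        seeker += duration
    return plan
```

Because inline clips are laid end to end, `seeker` advances by each clip's padded duration, so every overlay is positioned relative to the compiled timeline rather than the source video.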