Overlay a Word-Counter on Video Stream

Introduction

With an endless stream of new video content on our feeds, engaging the audience with dynamic visual elements can make educational and promotional videos much more impactful. VideoDB's suite of features allows you to enhance videos with programmatic editing.
In this tutorial, we'll create a video that counts and displays each instance of a specified word as it is spoken. We'll use VideoDB's spoken-word indexing and keyword search to find the word, then apply audio and text overlays to show a counter that updates in real time with synchronized audio cues.

Setup

📦 Installing packages

%pip install videodb

🔑 API Keys

Before proceeding, make sure you have access to VideoDB and set your API key as an environment variable.
Get your VideoDB API key (free for the first 50 uploads, no credit card required) 🎉
import os

os.environ["VIDEO_DB_API_KEY"] = ""
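An empty key is easy to overlook until a later call fails. A minimal sketch of an early check, assuming the key is read from the `VIDEO_DB_API_KEY` environment variable as set above; the helper name `require_api_key` is hypothetical and not part of the VideoDB SDK:

```python
import os


def require_api_key(env=os.environ, name="VIDEO_DB_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add your VideoDB API key first")
    return key
```

Calling `require_api_key()` before Step 1 would surface a missing key immediately, rather than at the first request.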

Steps

🌐 Step 1: Connect to VideoDB

Establish a session for uploading videos. Import `connect` from the VideoDB library, open a connection, and get the default collection.
from videodb import connect

conn = connect()
coll = conn.get_collection()

🗳️ Step 2: Upload Video

Upload and play the video to ensure it's correctly loaded. We'll use the YouTube video below for this tutorial.
video = coll.upload(url="https://www.youtube.com/watch?v=Js4rTM2Z1Eg")
video.play()

📝 Step 3: Indexing Spoken Words

Index the video to identify and timestamp all spoken words.
video.index_spoken_words()

🔍 Step 4: Keyword Search

Search within the video for the keyword ("education" in this example), and note each occurrence.
from videodb import SearchType

result = video.search(query="education", search_type=SearchType.keyword)
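Before building the timeline, it can help to see where the keyword was found. The sketch below lists each hit with its timestamps. It assumes only that each shot exposes `start` and `end` in seconds, as Step 6 uses them; `Shot` is a stand-in for the real shot objects, and `format_timestamp` and `summarize_hits` are hypothetical helpers, not SDK functions:

```python
from collections import namedtuple

# Stand-in for the items in result.shots; only start/end (seconds) are assumed.
Shot = namedtuple("Shot", ["start", "end"])


def format_timestamp(seconds):
    """Format a time in seconds as M:SS, dropping the fractional part."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def summarize_hits(shots):
    """Return one 'n. M:SS - M:SS' line per shot, numbered from 1."""
    return [
        f"{i}. {format_timestamp(s.start)} - {format_timestamp(s.end)}"
        for i, s in enumerate(shots, start=1)
    ]


demo = [Shot(12.4, 13.1), Shot(75.0, 75.9)]
for line in summarize_hits(demo):
    print(line)
```

With the real search result, `summarize_hits(result.shots)` should work the same way under that assumption.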


🎼 Step 5: Setup Timeline and Audio

Initialize the timeline and retrieve an audio asset to use for each word occurrence.
from videodb.timeline import Timeline
from videodb.asset import AudioAsset
from videodb import MediaType

timeline = Timeline(conn)

audio = conn.upload(url="https://github.com/video-db/videodb-cookbook-assets/raw/main/audios/twink.mp3", media_type=MediaType.audio)

audio_asset = AudioAsset(
    asset_id=audio.id,
    start=0,
    end=1.7,
    disable_other_tracks=False,
    fade_in_duration=1,
    fade_out_duration=0,
)


💬 Step 6: Overlay Text and Audio

Add text and audio overlays at each instance where the word is spoken.
Note: Adding the `padding` is optional. It keeps a little extra context around each identified instance, which makes the compiled output flow better.
from videodb.asset import TextAsset, TextStyle, VideoAsset, AudioAsset

seeker = 0  # current position on the compiled timeline, in seconds
counter = 0
padding = 1.5  # seconds of context kept on each side of the keyword

for shot in result.shots:
    duration = shot.end - shot.start + 2 * padding

    # VideoAsset for each shot, widened by the padding
    video_asset = VideoAsset(
        asset_id=shot.video_id, start=shot.start - padding, end=shot.end + padding
    )

    # TextAsset that displays the count
    text_asset = TextAsset(
        text=f"Count-{counter}",
        duration=duration,
        style=TextStyle(
            font="Do Hyeon",
            fontsize="(h/10)",
            x="w-1.5*text_w",
            y="0+(2*text_h)",
            fontcolor="#000100",
            box=True,
            boxcolor="F702A4",
        ),
    )

    timeline.add_inline(asset=video_asset)
    # The text lasts for the whole clip, so it starts where the clip starts
    # (seeker - padding would be negative on the first pass)
    timeline.add_overlay(asset=text_asset, start=seeker)
    # The audio cue plays once the leading padding has elapsed
    timeline.add_overlay(asset=audio_asset, start=seeker + padding)

    seeker += duration
    counter += 1
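The offset bookkeeping in the loop can be checked without calling the API. The sketch below is a variation on it, not the tutorial's exact logic: it also clamps the clip start at zero so a keyword spoken in the first `padding` seconds does not produce a negative time. `Shot` is a stand-in for the real shot objects and `plan_segments` is a hypothetical helper. The end of the clip is not clamped, because the tutorial does not use the video's length.

```python
from collections import namedtuple

# Stand-in for the items in result.shots; only start/end (seconds) are assumed.
Shot = namedtuple("Shot", ["start", "end"])


def plan_segments(shots, padding=1.5):
    """Compute where each padded clip lands on the compiled timeline.

    Returns one dict per shot with:
      clip_start / clip_end : range cut from the source video (seconds)
      timeline_start        : offset of the clip on the new timeline
      cue_at                : timeline time at which the keyword begins
    """
    plan = []
    seeker = 0.0
    for shot in shots:
        # Clamp so a keyword near the start never yields a negative time.
        clip_start = max(0.0, shot.start - padding)
        clip_end = shot.end + padding
        plan.append(
            {
                "clip_start": clip_start,
                "clip_end": clip_end,
                "timeline_start": seeker,
                "cue_at": seeker + (shot.start - clip_start),
            }
        )
        # Advance by the clip's actual length, which may be shorter after clamping.
        seeker += clip_end - clip_start
    return plan


for segment in plan_segments([Shot(0.5, 1.0), Shot(10.0, 11.0)]):
    print(segment)
```

In this example the first clip is cut from 0.0 to 2.5 seconds and its cue falls at 0.5. The second clip starts at 2.5 on the new timeline and its cue falls at 4.0.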

⚡️ Step 7: Generate and Play the Stream

Finally, generate a streaming URL for your edited video and play it.
from videodb import play_stream

stream_url = timeline.generate_stream()
play_stream(stream_url)


The resulting stream shows the counter updating at each occurrence of the word "education".

Conclusion

This tutorial showcases VideoDB's capabilities to create a video that programmatically counts and displays the frequency of a specific keyword spoken throughout the video. This method can be adapted for various applications where dynamic text overlays add significant value to video content.

Tips and Tricks

Use different text styles and positions based on your video's theme.
Add background sounds or effects to enhance the viewer's experience.
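When adjusting text position, it helps to work out what the placement expressions from Step 6 evaluate to. The sketch below assumes they follow FFmpeg drawtext-style variables, where `w` and `h` are the frame width and height and `text_w` and `text_h` are the rendered text size in pixels. That reading is an assumption, not confirmed by this tutorial, and `text_position` is a hypothetical helper:

```python
def text_position(w, h, text_w, text_h):
    """Evaluate the Step 6 placement expressions for concrete pixel sizes."""
    x = w - 1.5 * text_w  # "w-1.5*text_w": leaves half a text-width gap on the right
    y = 0 + (2 * text_h)  # "0+(2*text_h)": two text-heights down from the top
    fontsize = h / 10     # "(h/10)": font size scales with frame height
    return x, y, fontsize


print(text_position(w=1280, h=720, text_w=200, text_h=72))
```

For a 1280x720 frame and a 200x72 pixel label, this places the text at x=980, y=144 with a font size of 72. Changing the multipliers moves the label relative to the frame edges.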

For more explorations like this, refer to the VideoDB documentation and join the VideoDB community for support and collaboration.
Feel free to share your creations on social media or other platforms to inspire others in the community.