Generate Automated Video Outputs with Text Prompts | VideoDB


💬 Overview

Creating video storyboards for app user flows is traditionally a laborious process involving scriptwriting, recording voiceovers, designing frames, and editing them together.
VideoDB automates this entire pipeline.
In this tutorial, we will build a Storyboard Generator Tool.
  • Input: You provide a short app description and a list of user steps.
  • Process: VideoDB’s AI agents generate:
    • Step-by-step narration scripts (Text Gen)
    • Professional voiceovers (Voice Gen)
    • Concept art for each screen (Image Gen)
  • Output: A fully compiled video walkthrough with visual overlays and synced audio.
No external tools or complex integrations required.

Setup

📦 Installing VideoDB

!pip install videodb

🔑 API Keys

You only need your VideoDB API Key.
Get your API key from . (Free for first 50 uploads, No credit card required).
import videodb
import os
from getpass import getpass

# Prompt user for API key securely
api_key = getpass("Please enter your VideoDB API Key: ")
os.environ["VIDEO_DB_API_KEY"] = api_key
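
In a script or scheduled job there is nobody to answer an interactive prompt. The following is a minimal sketch, assuming you would rather reuse an already-set `VIDEO_DB_API_KEY` and prompt only as a fallback. `resolve_api_key` is a hypothetical helper written for this tutorial, not part of the `videodb` package.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt=getpass):
    """Return the API key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of the videodb package.
    """
    key = env.get("VIDEO_DB_API_KEY", "").strip()
    if not key:
        # Fall back to an interactive prompt (hidden input by default).
        key = prompt("Please enter your VideoDB API Key: ").strip()
    if not key:
        raise ValueError("A VideoDB API key is required.")
    # Store the key where the rest of the notebook expects to find it.
    env["VIDEO_DB_API_KEY"] = key
    return key
```

Passing `env` and `prompt` as arguments keeps the helper easy to exercise without touching the real environment or the terminal.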

Implementation

🌐 Step 1: Connect to VideoDB

Connect to VideoDB using your API key to establish a session.
from videodb import connect

conn = connect()
coll = conn.get_collection()

💬 Step 2: Set up the primary text inputs

In a real app, you would expose these fields to your users; whatever they enter becomes the foundation for the rest of the workflow.
For this tutorial, we use a sample case: a user requests a storyboard for their meditation app through the storyboarding tool we’re building.
# Define Your App Concept
app_description = "A meditation app for busy people with anxiety."

# Define the User Flow
raw_steps = [
    "Set up profile",
    "Select preference for theme & music",
    "Set meditation session timing",
    "Start the session"
]
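
Because these fields would come from end users in a real app, it can help to clean them before spending generation calls on them. The following is a rough sketch under that assumption. `normalize_steps` and its `max_steps` limit are hypothetical and chosen only for illustration.

```python
def normalize_steps(raw_steps, max_steps=10):
    """Trim whitespace, drop empty entries, and remove case-insensitive duplicates.

    Hypothetical helper; max_steps is an arbitrary illustrative limit.
    Order of first occurrence is preserved.
    """
    seen = set()
    cleaned = []
    for step in raw_steps:
        text = " ".join(str(step).split())  # collapse runs of whitespace
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    if not cleaned:
        raise ValueError("Provide at least one user step.")
    if len(cleaned) > max_steps:
        raise ValueError(f"Too many steps: {len(cleaned)} > {max_steps}")
    return cleaned
```

You could then write `raw_steps = normalize_steps(raw_steps)` before moving on to asset generation.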

🕹️ Step 3: Generate Assets

We will now iterate through each step of the user journey. For every step, we use VideoDB to:
  • Write a Script: Generate a short, conversational script based on the step name.
  • Synthesize Voice: Turn the script into audio.
  • Create Visuals: Generate a sketch-style illustration of the user action.
We store the resulting asset IDs directly, skipping any manual file management.
import json

storyboard_assets = []

for i, step_name in enumerate(raw_steps):
    # 1. Generate Script using Text Generation
    # We ask for a short sentence.
    script_prompt = f"""
    Write a single conversational sentence for a video narration explaining the step: '{step_name}'
    for an app described as: '{app_description}'.
    Keep it encouraging and brief.
    """

    # Generate text
    text_response = coll.generate_text(prompt=script_prompt, model_name="pro")
    script_text = text_response["output"]

    # 2. Generate Voiceover
    audio_asset = coll.generate_voice(
        text=script_text,
        voice_name="Aria"
    )

    # 3. Generate Image
    # We create a consistent art style prompt
    image_prompt = f"""
    A minimal, stippling black ballpoint pen illustration of a user interface or scene representing: '{step_name}'.
    Context: {app_description}.
    Clean white background, professional storyboard style.
    """

    image_asset = coll.generate_image(
        prompt=image_prompt
    )

    # Store everything we need for the timeline
    storyboard_assets.append({
        "step_name": step_name,
        "audio_id": audio_asset.id,
        "image_id": image_asset.id,
        "duration": float(audio_asset.length)
    })
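
If you plan to tune the prompts, it may be easier to build them in small functions that can be checked without calling any API. The sketch below reuses the prompt wording from the loop above. `build_script_prompt` and `build_image_prompt` are hypothetical helper names, not VideoDB functions.

```python
def build_script_prompt(step_name, app_description):
    """Return the narration prompt for one step as a single line of text."""
    return (
        "Write a single conversational sentence for a video narration "
        f"explaining the step: '{step_name}' "
        f"for an app described as: '{app_description}'. "
        "Keep it encouraging and brief."
    )


def build_image_prompt(step_name, app_description):
    """Return the illustration prompt, keeping one art style across all steps."""
    return (
        "A minimal, stippling black ballpoint pen illustration of a user "
        f"interface or scene representing: '{step_name}'. "
        f"Context: {app_description}. "
        "Clean white background, professional storyboard style."
    )
```

The returned strings can be passed as the `prompt` argument in the loop in place of the inline f-strings.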

🎥 Step 4: Create the Timeline

Now we assemble the video. We will use:
  • Background: A generic looping video to serve as a canvas.
  • Image Track: The AI-generated sketches overlaid on the center.
  • Audio Track: The generated voiceovers sequenced one after another.
  • Text Track: A label at the bottom showing the current step name.

Background Track

We use a stock video as a dynamic background.
base_vid = conn.upload(url="https://www.youtube.com/watch?v=4dW1ybhA5bM")

from videodb.editor import (
    Timeline, Track, Clip,
    VideoAsset, ImageAsset, AudioAsset, TextAsset,
    Font, Background, Alignment, HorizontalAlignment, VerticalAlignment, Position
)

# Initialize Timeline
timeline = Timeline(conn)


# Calculate total duration
total_duration = sum(item['duration'] for item in storyboard_assets)

# Create main track loop
main_track = Track()
video_asset = VideoAsset(id=base_vid.id)
video_clip = Clip(asset=video_asset, duration=total_duration)
main_track.add_clip(0, video_clip)
timeline.add_track(main_track)

# Setup Overlay Tracks
image_track = Track()
audio_track = Track()
text_track = Track()

current_time = 0

# Assemble the Sequence
for asset in storyboard_assets:
    duration = asset['duration']

    # A. Visual: The AI Sketch (Centered)
    image_clip = Clip(
        asset=ImageAsset(id=asset['image_id']),
        duration=duration,
        position=Position.center,
    )
    image_track.add_clip(current_time, image_clip)

    # B. Audio: The Voiceover
    audio_clip = Clip(
        asset=AudioAsset(id=asset['audio_id']),
        duration=duration
    )
    audio_track.add_clip(current_time, audio_clip)

    # C. Text: The Step Name Label
    text_clip = Clip(
        asset=TextAsset(
            text=asset['step_name'],
            font=Font(family="League Spartan", size=36, color="#FFFAFA"),
            background=Background(color="#FF4500", border_width=10, opacity=1.0),
            alignment=Alignment(horizontal=HorizontalAlignment.center, vertical=VerticalAlignment.bottom),
        ),
        duration=duration,
        position=Position.bottom
    )
    text_track.add_clip(current_time, text_clip)

    # Advance the seeker
    current_time += duration

# Add all tracks to timeline
timeline.add_track(image_track)
timeline.add_track(audio_track)
timeline.add_track(text_track)
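
The sequencing above rests on one idea: each clip starts where the previous one ended, and the background must last as long as all voiceovers combined. The following is a small, SDK-independent sketch of that arithmetic. `clip_start_times` is a hypothetical helper, useful mainly for sanity-checking offsets before building a timeline.

```python
from itertools import accumulate


def clip_start_times(durations):
    """Return (start_offsets, total_duration) for clips played back to back."""
    # accumulate(..., initial=0.0) yields 0, d0, d0+d1, ...; the final value
    # is the total length, and every earlier value is a clip's start offset.
    offsets = list(accumulate(durations, initial=0.0))
    return offsets[:-1], offsets[-1]


starts, total = clip_start_times([2.5, 3.0, 1.5])
print(starts, total)
```

Here `starts` plays the role of `current_time` at each iteration of the loop, and `total` corresponds to `total_duration` used for the background clip.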

📺 Step 5: Watch the Storyboard

Generate the stream and view your automated video creation.
from videodb import play_stream

stream_url = timeline.generate_stream()
play_stream(stream_url)

Conclusion

You have successfully built a Generative AI Storyboard Tool in under 50 lines of logic.
You can now expand this to generate marketing videos, tutorials, or dynamic social media content instantly.
Explore more at