
Overview

Creating video storyboards for app user flows is traditionally a laborious process involving scriptwriting, recording voiceovers, designing frames, and editing them together. VideoDB automates this entire pipeline. In this tutorial, we will build a Storyboard Generator Tool:
  1. Input: You provide an app name and a list of user steps.
  2. Process: VideoDB’s AI agents generate:
    • Step-by-step narration scripts (Text Gen)
    • Professional voiceovers (Voice Gen)
    • Concept art for each screen (Image Gen)
  3. Output: A fully compiled video walkthrough with visual overlays and synced audio.
No external tools or complex integrations required.

Setup

Installing VideoDB

!pip install videodb

API Keys

You only need your VideoDB API key. Get it from the VideoDB Console; it's free for your first 50 uploads, and no credit card is required.
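
For notebooks you plan to share, it is safer to read the key from an environment variable than to paste it into a cell. Here is a minimal sketch using the standard library; the variable name VIDEO_DB_API_KEY is our own convention here, not something the SDK mandates.

import os

# Read the key from the environment instead of hardcoding it in the notebook.
api_key = os.environ.get("VIDEO_DB_API_KEY")
if not api_key:
    raise RuntimeError("Set the VIDEO_DB_API_KEY environment variable first.")

If you go this route, you can skip the hardcoded api_key assignment in the next step.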

Implementation

Step 1: Connect to VideoDB

Connect to VideoDB using your API key to establish a session.
import videodb

# Set your API key
api_key = "your_api_key"

# Connect to VideoDB
conn = videodb.connect(api_key=api_key)
coll = conn.get_collection()
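
Optionally, you can confirm the connection works before generating any assets by listing what the collection already contains. This sketch assumes the collection object exposes a get_videos() helper; adjust it to whatever listing method your SDK version provides.

# Optional sanity check: list videos already in the collection.
existing_videos = coll.get_videos()
print(f"Connected. The collection currently holds {len(existing_videos)} video(s).")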

Step 2: Set up the primary text inputs

In a real app, these input fields would be exposed to your users, and their input becomes the foundation for the rest of this workflow. For this tutorial, we use the sample use case of a user requesting a storyboard for their meditation app via the storyboarding tool we're building (a sketch of collecting these values interactively follows the code below).
# Define Your App Concept
app_description = "A meditation app for busy people with anxiety."

# Define the User Flow
raw_steps = [
    "Set up profile",
    "Select preference for theme & music",
    "Set meditation session timing",
    "Start the session"
]
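
In a real tool, these two values would come from your users rather than being hardcoded. The sketch below shows one way to collect them interactively with plain Python; the collect_inputs helper is purely illustrative and involves no VideoDB calls.

def collect_inputs():
    # Hypothetical helper: in a real app these values would come from a form or an API request.
    description = input("Describe your app in one sentence: ").strip()
    steps = []
    print("Enter the user-flow steps, one per line (blank line to finish):")
    while True:
        step = input("> ").strip()
        if not step:
            break
        steps.append(step)
    return description, steps

# Example usage (uncomment to replace the hardcoded values above):
# app_description, raw_steps = collect_inputs()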

Step 3: Generate Assets

We will now iterate through each step of the user journey. For every step, we use VideoDB to:
  1. Write a Script: Generate a short, conversational script based on the step name.
  2. Create Visuals: Generate a sketch-style illustration of the user action.
  3. Synthesize Voice: Turn the script into audio.
We store the resulting Asset IDs directly, skipping any manual file management.

storyboard_assets = []

for i, step_name in enumerate(raw_steps):
    # 1. Generate Script using Text Generation
    # We ask for a short sentence.
    script_prompt = f"""
    Write a single conversational sentence for a video narration explaining the step: '{step_name}'
    for an app described as: '{app_description}'.
    Keep it encouraging and brief.
    """

    # Generate the narration text
    script_text = coll.generate_text(prompt=script_prompt, model_name="pro")

    # 2. Generate Voiceover
    audio_asset = coll.generate_voice(
        text=script_text,
        voice_name="Aria")

    # 3. Generate Image
    # We create a consistent art style prompt
    image_prompt = f"""
    A minimal, stippling black ballpoint pen illustration of a user interface or scene representing: '{step_name}'.
    Context: {app_description}.
    Clean white background, professional storyboard style.
    """

    image_asset = coll.generate_image(
        prompt=image_prompt)

    # Store everything we need for the timeline
    storyboard_assets.append({
        "step_name": step_name,
        "audio_id": audio_asset.id,
        "image_id": image_asset.id,
        "duration": float(audio_asset.length)
    })
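
Generation calls take time and consume credits, so it is worth persisting the returned asset IDs once you have them. Here is a minimal sketch that caches storyboard_assets to a local JSON file and reloads it on later runs; the filename is our own choice.

import json
import os

CACHE_FILE = "storyboard_assets.json"  # hypothetical local cache file

# After generation: save the asset IDs so a re-run can skip regeneration.
with open(CACHE_FILE, "w") as f:
    json.dump(storyboard_assets, f, indent=2)

# On a later run: reload the cached IDs instead of calling the generation APIs again.
if os.path.exists(CACHE_FILE):
    with open(CACHE_FILE) as f:
        storyboard_assets = json.load(f)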

Step 4: Create the Timeline

Now we assemble the video. We will use:
  • Background: A generic looping video to serve as a canvas.
  • Image Track: The AI-generated sketches overlaid in the center.
  • Audio Track: The generated voiceovers sequenced one after another.
  • Text Track: A label at the bottom showing the current step name.

Background Track

We use a stock video as a dynamic background.
base_vid = coll.upload(url="https://www.youtube.com/watch?v=4dW1ybhA5bM")
from videodb.editor import (
    Timeline, Track, Clip,
    VideoAsset, ImageAsset, AudioAsset, TextAsset,
    Font, Background, Alignment, HorizontalAlignment, VerticalAlignment, Position)

# Initialize Timeline
timeline = Timeline(conn)

# Calculate total duration
total_duration = sum(item['duration'] for item in storyboard_assets)

# Create the main track with the looping background video
main_track = Track()
video_asset = VideoAsset(id=base_vid.id)
video_clip = Clip(asset=video_asset, duration=total_duration)
main_track.add_clip(0, video_clip)
timeline.add_track(main_track)

# Setup Overlay Tracks
image_track = Track()
audio_track = Track()
text_track = Track()

current_time = 0

# Assemble the Sequence
for asset in storyboard_assets:
    duration = asset['duration']

    # A. Visual: The AI Sketch (Centered)
    image_clip = Clip(
        asset=ImageAsset(id=asset['image_id']),
        duration=duration,
        position=Position.center,)
    image_track.add_clip(current_time, image_clip)

    # B. Audio: The Voiceover
    audio_clip = Clip(
        asset=AudioAsset(id=asset['audio_id']),
        duration=duration)
    audio_track.add_clip(current_time, audio_clip)

    # C. Text: The Step Name Label
    text_clip = Clip(
        asset=TextAsset(
            text=asset['step_name'],
            font=Font(family="League Spartan", size=36, color="#FFFAFA"),
            background=Background(color="#FF4500", border_width=10, opacity=1.0),
            alignment=Alignment(horizontal=HorizontalAlignment.center, vertical=VerticalAlignment.bottom),),
        duration=duration,
        position=Position.bottom)
    text_track.add_clip(current_time, text_clip)

    # Advance the playhead
    current_time += duration

# Add all tracks to timeline
timeline.add_track(image_track)
timeline.add_track(audio_track)
timeline.add_track(text_track)
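
Before rendering, it helps to print a quick summary of what the timeline should contain so you can spot missing steps or odd durations. This uses only the data collected earlier and makes no extra API calls.

# Quick sanity check on the assembled timeline data.
print(f"Background duration: {total_duration:.1f}s across {len(storyboard_assets)} steps")
for item in storyboard_assets:
    print(f"  - {item['step_name']}: {item['duration']:.1f}s")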

Step 5: Watch the Storyboard

Generate the stream and watch your automatically assembled storyboard.
from videodb import play_stream

stream_url = timeline.generate_stream()
play_stream(stream_url)

Conclusion

You have successfully built a Generative AI Storyboard Tool in under 50 lines of logic. You can now expand this to generate marketing videos, tutorials, or dynamic social media content instantly.

Explore Full Notebook

Open the complete implementation in Google Colab with all code examples.