Realtime Video Editor SDK

Imagine building videos like coding - declarative, composable, and infinitely reusable.
VideoDB Editor lets you create videos programmatically using code instead of clicking timelines. You define what you want (assets, effects, timing), and the engine handles the rendering.
This guide is your complete conceptual introduction. By the end, you’ll understand how to compose anything from simple clips to complex multi-layer productions - all through code.

Why Code-First Video Editing?

Traditional video editors are built for one-off productions. But what if you need to:
Generate 100 personalized videos from a template
Build a TikTok content pipeline that runs daily
Create video variations for A/B testing
Automate highlight reels from live streams
Code changes everything:
Reusability – One video asset, infinite variations
Scalability – Loop over data to generate hundreds of videos
Version control – Git-track your compositions
Automation – Integrate with AI, databases, APIs

The 4-Layer Architecture

VideoDB Editor uses a hierarchy where each layer has one job. Understanding this structure is the key to mastering composition:
Asset → Clip → Track → Timeline
image.png
Let’s walk through each layer using the simplest possible example: one video asset playing for 10 seconds. This is the “Hello World” of Editor - understanding this foundation lets you build anything.
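
Here's the destination at a glance - the full four-layer "Hello World" in one sketch. The Track and Timeline placement calls (add_clip, add_track) are assumptions previewed here for orientation and unpacked from Layer 3 onward; check the SDK reference for exact signatures.
from videodb.editor import Timeline, Track, Clip, VideoAsset

# Layer 1: the raw material - a video in your VideoDB collection
asset = VideoAsset(id=video.id)

# Layer 2: how it appears and for how long
clip = Clip(asset=asset, duration=10)

# Layer 3: when it plays (the add_clip signature is an assumption)
track = Track()
track.add_clip(start=0, clip=clip)

# Layer 4: the finished composition (add_track is an assumption)
timeline = Timeline()
timeline.add_track(track)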

Installing VideoDB in your environment

VideoDB is available as a Python package on PyPI. In a notebook, install it with:
!pip install videodb
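
The examples in this guide assume a video object from your VideoDB collection. A minimal way to get one with the core SDK (replace the API key and URL with your own):
from videodb import connect

# Connect to VideoDB and upload a source video to your collection
conn = connect(api_key="YOUR_API_KEY")
video = conn.upload(url="https://www.youtube.com/watch?v=YOUR_VIDEO_ID")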

Layer 1: Assets – Your Raw Materials

Assets are your content library. They reference media that exists in your VideoDB collection but don't define how or when that media plays.

VideoAsset

Your main video content. Each VideoAsset points to a video file via its ID.
Key parameters:
id (required) – The VideoDB media ID
start (optional) – Trim point in seconds (e.g., start=10 skips first 10s of source)
volume (optional) – Audio level: 0.0 (muted) to 2.0 (200%), default 1.0
Real example:
from videodb.editor import Timeline, Track, Clip, VideoAsset

# Create a VideoAsset pointing to a video file in your collection
video_asset = VideoAsset(
    id=video.id,
    start=0,
    volume=1
)
# Ready to use in a Clip
This says: “Use the video from your VideoDB collection, start from the beginning (start=0), and keep original volume (volume=1).”
Important distinction: VideoAsset.start trims the source file. Where it appears on the timeline is controlled later at the Track layer. This “double start” concept is critical - we’ll explore it more in Layer 3 (Tracks).

AudioAsset

Background music, voiceovers, or sound effects. Works exactly like VideoAsset.
Key parameters:
id (required) – The VideoDB audio file ID
start (optional) – Same trim behavior as VideoAsset
volume (optional) – 0.0-2.0 range (0.2 = 20% volume)
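
For example, background music trimmed and turned down to 20% volume might look like this (a sketch assuming audio is an audio file already uploaded to your collection):
from videodb.editor import AudioAsset

# Background music: skip the first 5 seconds of the source, play at 20% volume
music_asset = AudioAsset(
    id=audio.id,
    start=5,
    volume=0.2
)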

ImageAsset

Logos, watermarks, title cards, or static backgrounds.
Key parameters:
id (required) – The VideoDB image ID
crop (optional) – Rarely used; trims edges before rendering
Crop the sides of an asset by a relative amount, specified on a scale from 0 to 1. A left crop of 0.5 crops half of the asset from the left; a top crop of 0.25 crops the top quarter.
Images are static by nature - duration, position, and size are controlled at the Clip layer.
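
A typical ImageAsset is just a pointer to the media (a sketch assuming image is an image already in your collection):
from videodb.editor import ImageAsset

# A logo/watermark image - duration, position, and size come later at the Clip layer
logo = ImageAsset(id=image.id)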

TextAsset

Custom text overlays with full typography control.
Key parameters:
text (required) – The string to display
font (optional) – Font object with family, size, color
border, shadow, background (optional) – Styling objects
Color format: ASS-style &HAABBGGRR in hex (e.g., &H00FFFFFF = white)
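
A minimal TextAsset needs only the string itself; font, border, shadow, and background styling all fall back to defaults (a sketch - styling object signatures aren't shown here):
from videodb.editor import TextAsset

# A simple title overlay using only the required parameter
title = TextAsset(text="Chapter 1: Getting Started")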

CaptionAsset

Auto-generated subtitles synced to speech. This is where VideoDB gets magical.
Important: CaptionAsset is a separate asset type from TextAsset. While TextAsset is for custom text overlays you write yourself, CaptionAsset automatically generates subtitles from video speech.
Key parameters:
src (required) – Set to "auto" to generate captions from video speech
animation (optional) – How words appear: reveal, karaoke, supersize, box_highlight
primary_color, secondary_color (optional) – ASS-style colors
font, positioning, border, shadow styling (optional)
Critical requirement: Before using CaptionAsset(src="auto"), you must call video.index_spoken_words() on the source video. This indexes the speech for auto-caption generation. Without it, captions won’t generate.
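
Putting the requirement and the asset together (a sketch - the keyword arguments mirror the parameter list above, but exact names may differ):
from videodb.editor import CaptionAsset

# Index spoken words first - without this, captions won't generate
video.index_spoken_words()

# Auto-generate subtitles with a karaoke-style word animation
captions = CaptionAsset(src="auto", animation="karaoke")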

Supported Fonts for Text and Caption Assets:

Supported Indic fonts:
Noto Sans Kannada
Noto Sans Devanagari
Noto Sans Gujarati
Noto Sans Gurmukhi

Recap: Assets answer "What content exists?" They don't define timing, size, position, or effects - that's the Clip layer's job.

Layer 2: Clips – The Presentation Engine

Clips wrap Assets and define how and how long they appear. This is your effects layer.
Every Clip must have an asset and a duration. Everything else is optional.


Duration – How Long It Plays

duration is a float in seconds. It defines how long the clip plays on the timeline.
Real example:
from videodb.editor import Clip

clip = Clip(
    asset=video_asset,
    duration=10
)
“Play this VideoAsset for 10 seconds.”
Key insight: Duration is independent of the source file’s length. If your source is 2 minutes but you set duration=10, only 10 seconds play (starting from VideoAsset.start).
You'll get an error if the clip duration is greater than the source video/audio length.

Fit – How It Scales to Canvas

When your asset’s aspect ratio doesn’t match the timeline’s, fit controls scaling behavior.
Four modes:
Fit.crop (most common) – Fills the canvas completely, cropping edges if needed
Use when: Filling the frame is priority, cropping is acceptable
Example: 16:9 video on a 9:16 (vertical) timeline
Fit.contain – Fits the entire asset inside the canvas, adding bars if needed
Use when: Showing all content is priority, bars are acceptable
Example: Preserving widescreen footage in a square format
Fit.cover – Stretches to fill canvas (distortion possible)
Use when: Artistic effect or abstract content
Fit.none – Uses native pixel dimensions (no scaling)
Use when: Precise pixel control needed (e.g., 1:1 pixel mapping)
Real example:
from videodb.editor import Fit

clip = Clip(
    asset=video_asset,
    duration=10,
    fit=Fit.crop
)
“Fill the canvas completely, crop edges if aspect ratios don’t match.”

Position – Where It Appears

Position uses a 9-zone grid system:
top_left top top_right
center_left center center_right
bottom_left bottom bottom_right
Real example:
from videodb.editor import Position

logo_clip = Clip(
    asset=logo,
    duration=30,
    position=Position.top_right
)
“Place the logo in the top-right corner.”

Offset – Fine-Tuned Positioning

from videodb.editor import Offset

clip = Clip(
    asset=logo,
    duration=30,
    position=Position.center,
    offset=Offset(x=0.3, y=-0.2)
)
This shifts the logo 30% right, 20% up from center.

Scale – Size Adjustment

scale is a multiplier applied after fit. Default is 1.0.
Real example:
pip_clip = Clip(
    asset=overlay_video,
    duration=15,
    scale=0.3
)
“Shrink this video to 30% of its fitted size” (perfect for picture-in-picture).
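
For a complete picture-in-picture setup, combine scale with the position parameter covered earlier - a sketch reusing the same overlay_video asset:
pip_clip = Clip(
    asset=overlay_video,
    duration=15,
    scale=0.3,
    position=Position.bottom_right
)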

Opacity – Transparency

opacity ranges from 0.0 (invisible) to 1.0 (opaque).
Real example:
watermark_clip = Clip(
    asset=logo,
    duration=30,
    opacity=0.6
)
“Make the logo 60% opaque (semi-transparent).”

Filter – Visual Effects

Apply color/blur effects:
from videodb.editor import Filter

clip = Clip(
    asset=VideoAsset(id=video.id),
    duration=10,
    filter=Filter.greyscale
)
Available filters: greyscale, blur, boost (saturation), contrast, darken, lighten, muted, negative.
Filter | Effect
Filter.greyscale | Removes all color, creating a black-and-white look
Filter.blur | Blurs the scene for artistic or privacy effects
Filter.contrast | Increases contrast, making darks darker and lights lighter
Filter.darken | Darkens the entire scene
Filter.lighten | Lightens the entire scene
Filter.boost | Boosts both contrast and saturation for vibrant colors
Filter.muted | Reduces saturation and contrast for a subdued look
Filter.negative | Inverts colors for a surreal, negative effect

Transition – Fades

Fade in/out at clip start/end:
from videodb.editor import Transition

clip = Clip(
    asset=VideoAsset(id=video.id),
    duration=10,
    transition=Transition(
        in_="fade",
        out="fade",
        duration=2
    )
)
“Fade in over 2 seconds at the start, fade out over 2 seconds at the end.”
Recap: A Clip wraps an Asset and defines how long it plays (duration) and how it appears (fit, position, scale, opacity, filter, transition). Now let’s see how to place clips on the timeline.

Layer 3: Tracks – Sequencing and Layering

Tracks are timeline lanes. They control when clips play and how they stack.

The Track Object

A Track is a container you add clips to:
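(The placement API below is a sketch - the add_clip method and its start parameter are assumptions based on the description above; check the SDK reference for exact signatures.)
# Sequence two clips on one track; intro_clip and main_clip are
# hypothetical Clip objects like the ones built in Layer 2
track = Track()
track.add_clip(start=0, clip=intro_clip)   # intro plays at t=0 on the timeline
track.add_clip(start=5, clip=main_clip)    # main content starts at t=5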