CaptionAsset synchronizes text to audio timestamps, creating subtitles that move with spoken words.
Think of it like automatic subtitles that know exactly when each word is spoken - the text appears and animates in perfect sync with your video’s audio.
Unlike TextAsset, which displays static text overlays at fixed positions, CaptionAsset is built for speech-driven content where timing matters.
CaptionAsset vs TextAsset
CaptionAsset uses ASS format for subtitle rendering, which enables time-synchronized animations and professional subtitle styling.
Auto-Caption Generation
CaptionAsset can automatically generate subtitles from speech in your video. This means you don’t need to manually type out transcripts or time-stamp each word - the system listens to your audio and creates perfectly synchronized captions for you.
Required: Video Indexing
Before using src="auto", you must index the video for spoken words:
video.index_spoken_words()
This is a one-time operation that analyzes your video’s audio track and figures out when each word is spoken.
The indexing creates a timestamp map that tells the caption system exactly when to display each word. Without this indexing step, the auto-caption feature won’t have the timing data it needs to work.
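To make the idea of a timestamp map concrete, here is a minimal sketch of how word-level timings could drive what appears on screen. The `WordTiming` structure and `words_visible_at` helper are hypothetical illustrations, not VideoDB's actual index format or API.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    """One spoken word with start/end times in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def words_visible_at(timings, t):
    """Return the words whose time window contains playback time t."""
    return [w.text for w in timings if w.start <= t < w.end]


# Made-up timings for a short phrase; real data would come from indexing.
timings = [
    WordTiming("hello", 0.0, 0.4),
    WordTiming("and", 0.4, 0.6),
    WordTiming("welcome", 0.6, 1.2),
]

print(words_visible_at(timings, 0.5))
```

Without timing data of this kind, there is nothing to tell the renderer when each word should appear, which is why indexing has to happen first.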
Basic Usage
from videodb.editor import CaptionAsset, Clip, Track
caption_clip = Clip(
asset=CaptionAsset(src="auto"),
duration=float(video.length)
)
track = Track()
track.add_clip(0, caption_clip)
The caption clip duration should match or exceed the video duration to ensure all words display.
Animation Types
CaptionAsset supports four animation modes that make your subtitles more dynamic:
Code Example
from videodb.editor import CaptionAnimation
caption_asset = CaptionAsset(
src="auto",
animation=CaptionAnimation.karaoke,
primary_color="&H00FFFFFF", # White
secondary_color="&H0000FFFF" # Yellow highlight
)
Example with CaptionAnimation.karaoke
ASS Color Format
ASS (Advanced SubStation Alpha) is a professional subtitle format that’s been used in video production for years.
It uses BGR (Blue-Green-Red) byte order with an alpha channel - which is backwards from the RGB format you might be used to from web colors.
This quirk exists for historical reasons in subtitle rendering systems.
Format Structure
&HAABBGGRR or &H00BBGGRR
AA = Alpha (00 = opaque, FF = transparent)
BB = Blue, GG = Green, RR = Red
HTML to ASS Conversion
To convert HTML colors to ASS format:
HTML #RRGGBB → Extract RGB bytes
Reverse the byte order to BGR
Add prefix &H00 (opaque) or &HAA (with transparency)
Example: HTML #FF6600 (orange)
RGB: Red=FF, Green=66, Blue=00
BGR: Blue=00, Green=66, Red=FF
ASS: &H000066FF
Common Colors
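The conversion can be sketched as a small helper, which is also a convenient way to derive common color values. This is a minimal sketch assuming the &HAABBGGRR layout described above; `html_to_ass` is a hypothetical name, not part of the VideoDB API.

```python
import string


def html_to_ass(html_color: str, alpha: int = 0x00) -> str:
    """Convert an HTML '#RRGGBB' color to ASS '&HAABBGGRR' (hypothetical helper)."""
    value = html_color.lstrip("#")
    if len(value) != 6 or not all(c in string.hexdigits for c in value):
        raise ValueError(f"expected #RRGGBB, got {html_color!r}")
    if not 0 <= alpha <= 0xFF:
        raise ValueError("alpha must be between 0x00 and 0xFF")
    # Split into byte pairs, then emit them in reversed (BGR) order.
    red, green, blue = value[0:2], value[2:4], value[4:6]
    return f"&H{alpha:02X}{blue}{green}{red}".upper()


for name, html in [("orange", "#FF6600"), ("white", "#FFFFFF"), ("yellow", "#FFFF00")]:
    print(name, html, html_to_ass(html))
```

Passing `alpha=0x80` yields a half-transparent variant of the same color, since 00 is opaque and FF is fully transparent.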
Styling Parameters
CaptionAsset styling is organized into three parameter groups: FontStyling, Positioning, and BorderAndShadow.
FontStyling
Controls how your subtitle text looks - the font face, size, and whether it’s bold or italic. Think of this as the basic typography settings for making your captions readable and on-brand.
from videodb.editor import FontStyling
FontStyling(
size=36, # Font size in points
bold=True, # Bold weight
italic=False, # Italic style
name="Arial" # Font family
)
Positioning
Controls where on the screen your captions appear and how much spacing you want from the edges. You can place captions at the bottom like traditional subtitles, or anywhere else on screen with precise margin control.
from videodb.editor import Positioning, CaptionAlignment
Positioning(
alignment=CaptionAlignment.bottom_center,
margin_v=100, # Vertical margin in pixels
margin_l=20, # Left margin in pixels
margin_r=20 # Right margin in pixels
)
# Corners
CaptionAlignment.top_left
CaptionAlignment.top_right
CaptionAlignment.bottom_left
CaptionAlignment.bottom_right
# Edges
CaptionAlignment.top
CaptionAlignment.bottom
CaptionAlignment.left
CaptionAlignment.right
CaptionAlignment.center
# Center positions
CaptionAlignment.middle_center
CaptionAlignment.bottom_center
Example:
position=Positioning(
alignment=CaptionAlignment.bottom_center,
margin_v=50 # 50px from bottom
),
font=FontStyling(
size=48,
bold=True,
name="Clear Sans",
)
BorderAndShadow
Controls outlines and shadows that make your text readable over any background.
These parameters are crucial because subtitles need to be legible whether they’re over bright skies, dark scenes, or complex imagery - borders and shadows ensure the text always stands out.
from videodb.editor import BorderAndShadow, CaptionBorderStyle
BorderAndShadow(
style=CaptionBorderStyle.outline_and_shadow,
outline=3.0, # Outline width in pixels
shadow=2.0, # Shadow depth in pixels
outline_color="&H00000000", # Black outline (ASS format)
shadow_color="&H80000000" # Semi-transparent black shadow
)
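To read a value such as the shadow color above, it can help to split an ASS color string into its components. The sketch below assumes the &HAABBGGRR layout described in the ASS Color Format section; `parse_ass_color` is a hypothetical helper, not a VideoDB function.

```python
def parse_ass_color(ass_color: str) -> dict:
    """Split '&HAABBGGRR' into integer components (hypothetical helper)."""
    value = ass_color.upper()
    if not value.startswith("&H") or len(value) != 10:
        raise ValueError(f"expected &HAABBGGRR, got {ass_color!r}")
    digits = value[2:]
    # Byte pairs appear in alpha, blue, green, red order.
    alpha, blue, green, red = (int(digits[i:i + 2], 16) for i in (0, 2, 4, 6))
    return {
        "alpha": alpha,
        "red": red,
        "green": green,
        "blue": blue,
        # ASS alpha is inverted: 00 is opaque, FF is fully transparent.
        "opacity": round(1 - alpha / 255, 2),
    }


print(parse_ass_color("&H80000000"))
```

For `&H80000000` the alpha byte is 0x80 (128), which works out to roughly 50% opacity, matching the "semi-transparent black" comment.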
CaptionBorderStyle Options:
CaptionBorderStyle.outline_and_shadow - Outline + drop shadow
CaptionBorderStyle.opaque_box - Solid background box
Example:
border=BorderAndShadow(
style=CaptionBorderStyle.outline_and_shadow,
outline=5,
outline_color="&H00000000", # Black outline
shadow=3
)
Complete Example
From the notebook, here’s a complete CaptionAsset with all styling parameters:
from videodb.editor import CaptionAsset, CaptionAnimation, Positioning, CaptionAlignment, FontStyling, BorderAndShadow