CaptionAsset synchronizes text to audio timestamps, creating subtitles that move with spoken words. Unlike TextAsset which displays static text overlays, CaptionAsset is built for speech-driven content where timing matters.
CaptionAsset can automatically generate subtitles from speech in your video. This means you don’t need to manually type out transcripts or time-stamp each word - the system listens to your audio and creates perfectly synchronized captions for you.
Before using src="auto", you must index the video for spoken words:
```python
video.index_spoken_words()
```
This is a one-time operation that analyzes your video’s audio track and figures out when each word is spoken. The indexing creates a timestamp map that tells the caption system exactly when to display each word. Without this indexing step, the auto-caption feature won’t have the timing data it needs to work.
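To get a feel for how word-level timing data ends up in a subtitle file, here is a small illustrative helper (not part of videodb) that formats a start time in seconds the way subtitle files typically express timestamps (H:MM:SS.cc):

```python
def to_subtitle_time(seconds: float) -> str:
    """Format a time in seconds as H:MM:SS.cc (centisecond precision)."""
    cs = round(seconds * 100)          # total centiseconds
    h, rem = divmod(cs, 360000)        # 360000 cs per hour
    m, rem = divmod(rem, 6000)         # 6000 cs per minute
    s, cs = divmod(rem, 100)           # 100 cs per second
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"

print(to_subtitle_time(75.5))     # 0:01:15.50
print(to_subtitle_time(3661.25))  # 1:01:01.25
```

The indexing step produces this kind of per-word start/end data automatically, so you never have to compute it by hand.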
ASS (Advanced SubStation Alpha) is a professional subtitle format that’s been used in video production for years. It uses BGR (Blue-Green-Red) byte order with an alpha channel - which is backwards from the RGB format you might be used to from web colors. This quirk exists for historical reasons in subtitle rendering systems.
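If you are more comfortable with web-style hex colors, a small helper (hypothetical, not part of the library) can convert a `#RRGGBB` value into the ASS `&HAABBGGRR` notation, where alpha `0x00` is fully opaque:

```python
def rgb_to_ass(hex_rgb: str, alpha: int = 0) -> str:
    """Convert a web-style #RRGGBB color to ASS &HAABBGGRR notation.

    ASS stores bytes in alpha, blue, green, red order; alpha 0x00 is opaque.
    """
    hex_rgb = hex_rgb.lstrip("#")
    r, g, b = (int(hex_rgb[i:i + 2], 16) for i in (0, 2, 4))
    return f"&H{alpha:02X}{b:02X}{g:02X}{r:02X}"

print(rgb_to_ass("#FF0000"))        # pure red -> &H000000FF
print(rgb_to_ass("#000000", 0x80))  # semi-transparent black -> &H80000000
```

Note how red and blue swap places compared to the web hex string - that is the BGR byte order in action.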
Controls how your subtitle text looks - the font face, size, and whether it’s bold or italic. Think of this as the basic typography settings for making your captions readable and on-brand.
```python
from videodb.editor import FontStyling

FontStyling(
    size=36,       # Font size in points
    bold=True,     # Bold weight
    italic=False,  # Italic style
    name="Arial"   # Font family
)
```
Controls where on the screen your captions appear and how much spacing you want from the edges. You can place captions at the bottom like traditional subtitles, or anywhere else on screen with precise margin control.
```python
Positioning(
    alignment=CaptionAlignment.bottom_center,
    margin_v=100,  # Vertical margin in pixels
    margin_l=20,   # Left margin in pixels
    margin_r=20    # Right margin in pixels
)
```
| Parameter | Type | Description |
|---|---|---|
| alignment | CaptionAlignment | Where on screen the captions appear (see alignment options below) |
| margin_v | int | Vertical margin in pixels from top or bottom edge |
| margin_l | int | Left margin in pixels from left edge |
| margin_r | int | Right margin in pixels from right edge |
```python
# Corners
CaptionAlignment.top_left
CaptionAlignment.top_right
CaptionAlignment.bottom_left
CaptionAlignment.bottom_right

# Edges
CaptionAlignment.top
CaptionAlignment.top_center
CaptionAlignment.bottom
CaptionAlignment.left
CaptionAlignment.right

# Center positions
CaptionAlignment.middle_center
CaptionAlignment.bottom_center
```
Example:
```python
position=Positioning(
    alignment=CaptionAlignment.bottom_center,
    margin_v=50  # 50px from bottom
),
font=FontStyling(
    size=48,
    bold=True,
    name="Clear Sans",
)
```
Controls outlines and shadows that make your text readable over any background. These parameters are crucial because subtitles need to be legible whether they’re over bright skies, dark scenes, or complex imagery - borders and shadows ensure the text always stands out.
```python
from videodb.editor import BorderAndShadow, CaptionBorderStyle

BorderAndShadow(
    style=CaptionBorderStyle.outline_and_shadow,
    outline=3.0,  # Outline width in pixels
    shadow=2.0,   # Shadow depth in pixels
    outline_color="&H00000000",  # Black outline (ASS format)
    shadow_color="&H80000000"    # Semi-transparent black shadow
)
```
| Parameter | Type | Description |
|---|---|---|
| style | CaptionBorderStyle | How the border/background is rendered |
| outline | float | Outline width in pixels around each letter |
| shadow | float | Shadow depth in pixels for drop shadow effect |
| outline_color | str | Outline color in ASS format |
| shadow_color | str | Shadow color in ASS format |
CaptionBorderStyle Options:
- CaptionBorderStyle.outline_and_shadow - Outline + drop shadow