
Quick Example

import videodb

conn = videodb.connect()
coll = conn.get_collection()
rtstream = coll.get_rtstream("rts-xxx")

# Index visuals with a prompt
scene_index = rtstream.index_visuals(
    prompt="Describe activity and detect unusual behavior",
    batch_config={"type": "time", "value": 5, "frame_count": 2}
)

# Create a reusable event
event_id = conn.create_event(
    event_prompt="Detect when someone enters restricted area",
    label="intrusion_detected"
)

# Set up alert
alert_id = scene_index.create_alert(
    event_id=event_id,
    callback_url="https://your-backend.com/webhooks/alerts"
)

Visual Indexing

Convert video frames into structured descriptions using prompts.
scene_index = rtstream.index_visuals(
    prompt="Describe the scene and highlight congestion",
    batch_config={"type": "time", "value": 5, "frame_count": 2},
    name="traffic_monitor",
    ws_connection_id=ws.connection_id  # optional, for real-time events
)

batch_config Options

Field        Type  Description
type         str   Batching strategy; only "time" is supported
value        int   Window size in seconds
frame_count  int   Frames to extract per window
Examples:
# Every 5 seconds, extract 2 frames
{"type": "time", "value": 5, "frame_count": 2}

# Every 10 seconds, extract 5 frames
{"type": "time", "value": 10, "frame_count": 5}

Managing Indexes

# List all indexes
indexes = rtstream.list_scene_indexes()

# Get specific index
scene_index = rtstream.get_scene_index(index_id)

# Poll scenes
scenes = scene_index.get_scenes(
    start=0, end=None, page=1, page_size=100
)

Audio Indexing

Extract insights from audio tracks:
audio_index = rtstream.index_audio(
    prompt="Identify key speakers and main topics",
    batch_config={"type": "word", "value": 50}
)

Audio batch_config

Type        Value    Description
"word"      count    Segment every N words
"sentence"  count    Segment every N sentences
"time"      seconds  Segment every N seconds

Transcription

Real-time speech-to-text:
# Start transcription
rtstream.start_transcript(ws_connection_id=ws.connection_id)

# Stop transcription
rtstream.stop_transcript(mode="graceful")

# Poll transcripts
transcript = rtstream.get_transcript(
    start=0, page=1, page_size=100
)

Search

Query indexed content with natural language:
results = rtstream.search(
    query="white car moving fast",
    score_threshold=0.5
)

for shot in results.shots:
    print(f"Match at {shot.start}: {shot.text}")
    shot.play()  # Opens in browser

Search Results

Each shot contains:
Attribute     Description
start         Start timestamp
end           End timestamp
text          Content description
search_score  Relevance score (0-1)
stream_url    Playback URL

Events and Alerts

Events are reusable detection rules. Alerts wire events to indexes for notifications.

Create Event

event_id = conn.create_event(
    event_prompt="Detect pedestrians crossing the zebra",
    label="pedestrian_detected"
)

Create Alert

alert_id = scene_index.create_alert(
    event_id=event_id,
    callback_url="https://your-backend.com/webhooks/alerts",
    ws_connection_id=ws.connection_id  # optional
)

Alert Delivery

Method     Latency    Use Case
Webhook    Under 1s   Server-to-server POST
WebSocket  Real-time  Frontend dashboards
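
On the webhook side, your backend receives a POST at the callback_url. A minimal sketch of parsing the delivery body; the field names (label, start, end) are assumptions for illustration, so inspect an actual delivery to confirm the payload schema:

```python
import json

# Hypothetical alert-payload parser for a webhook endpoint.
# Field names here are assumed, not a documented schema.
def handle_alert(body: bytes) -> str:
    payload = json.loads(body)
    label = payload.get("label", "unknown")
    window = (payload.get("start"), payload.get("end"))
    return f"alert '{label}' for window {window}"
```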

Manage Alerts

# List alerts
alerts = scene_index.list_alerts()

# Enable/disable
scene_index.enable_alert(alert_id)
scene_index.disable_alert(alert_id)

WebSocket Events

Receive real-time events by passing ws_connection_id:
ws = conn.connect_websocket()
await ws.connect()

# Pass ws_connection_id to methods
rtstream.start_transcript(ws_connection_id=ws.connection_id)

# Receive events
async for ev in ws.stream():
    channel = ev.get("channel")

    if channel == "transcript":
        print(f"TRANSCRIPT: {ev['data']['text']}")
    elif channel == "scene_index":
        print(f"SCENE: {ev['data']['text']}")
    elif channel == "alert":
        print(f"ALERT: {ev['data']}")

Event Channels

Channel      Source              Content
transcript   start_transcript()  Real-time speech-to-text
scene_index  index_visuals()     Visual analysis
audio_index  index_audio()       Audio analysis
alert        create_alert()      Alert notifications
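
The if/elif chain shown earlier works, but a dispatch table keyed by channel scales better as channels are added. A sketch, with illustrative handler bodies:

```python
# Route events to per-channel handlers via a dict instead of if/elif.
handlers = {
    "transcript": lambda d: f"TRANSCRIPT: {d.get('text')}",
    "scene_index": lambda d: f"SCENE: {d.get('text')}",
    "audio_index": lambda d: f"AUDIO: {d.get('text')}",
    "alert": lambda d: f"ALERT: {d}",
}

def dispatch(ev: dict):
    """Return the formatted line for a known channel, else None."""
    handler = handlers.get(ev.get("channel"))
    return handler(ev.get("data", {})) if handler else None
```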

Next Steps