Capture sessions emit structured events in real time. Use webhooks for durable delivery and WebSockets for live UI updates.
## Quick Example

```python
import videodb

conn = videodb.connect()
ws = conn.connect_websocket()
await ws.connect()

# Listen for events
async for ev in ws.stream():
    channel = ev.get("channel")
    if channel == "transcript":
        print(f"TRANSCRIPT: {ev['data']['text']}")
    elif channel == "scene_index":
        print(f"SCENE: {ev['data']['text']}")
    elif channel == "audio_index":
        print(f"AUDIO: {ev['data']['text']}")
```
## Event Types

### Transcript Events

Real-time speech-to-text from audio channels:

```json
{
  "channel": "transcript",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "mic:default",
  "data": {
    "text": "Let's schedule the meeting for Thursday",
    "is_final": true,
    "start": 1710000001234,
    "end": 1710000002345
  }
}
```

| Field | Description |
|---|---|
| `text` | Transcribed speech |
| `is_final` | `true` for final, `false` for interim |
| `start` / `end` | Timestamps (ms) |
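A minimal consumption pattern, assuming each interim event carries the full phrase so far (so interims replace rather than append); the handler and state names below are ours, only the event shape comes from the payload above:

```python
# Sketch: keep one provisional caption, commit a line when is_final is true.
final_lines = []
live_caption = ""

def handle_transcript(ev):
    """Fold a transcript event into caption state."""
    global live_caption
    data = ev["data"]
    if data["is_final"]:
        final_lines.append(data["text"])  # committed transcript line
        live_caption = ""                 # clear the provisional text
    else:
        live_caption = data["text"]       # interim: replace, don't append

handle_transcript({"channel": "transcript",
                   "data": {"text": "Let's sched", "is_final": False,
                            "start": 1710000001234, "end": 1710000001834}})
handle_transcript({"channel": "transcript",
                   "data": {"text": "Let's schedule the meeting for Thursday",
                            "is_final": True,
                            "start": 1710000001234, "end": 1710000002345}})
```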
### Visual Index Events

Scene descriptions from screen capture, delivered on the `scene_index` channel:

```json
{
  "channel": "scene_index",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "display:1",
  "data": {
    "text": "User is viewing a Slack conversation with 3 unread messages",
    "start": 1710000012340,
    "end": 1710000018900
  }
}
```
### Audio Index Events

Semantic understanding of audio:

```json
{
  "channel": "audio_index",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "mic:default",
  "data": {
    "text": "Discussion about scheduling a team meeting",
    "start": 1710000021500,
    "end": 1710000029200
  }
}
```
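The `start`/`end` values in these payloads are epoch milliseconds. A small helper (ours, not part of the SDK) renders them as readable UTC timestamps:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_iso(ms: int) -> str:
    """Render an epoch-millisecond timestamp as an ISO-8601 UTC string."""
    # timedelta keeps integer milliseconds exact (no float rounding)
    return (EPOCH + timedelta(milliseconds=ms)).isoformat(timespec="milliseconds")

ms_to_iso(1710000021500)  # → "2024-03-09T16:00:21.500+00:00"
```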
### Alert Events

Custom detection rules firing:

```json
{
  "channel": "alert",
  "rtstream_id": "rts-xxx",
  "data": {
    "label": "sensitive_content",
    "triggered": true,
    "confidence": 0.92,
    "start": 1710000045100,
    "end": 1710000047800
  }
}
```
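A consumer will typically gate actions on `triggered` and `confidence`; a sketch, where the threshold value and function name are ours:

```python
CONFIDENCE_THRESHOLD = 0.8  # illustrative cutoff, tune per detection rule

def should_notify(ev: dict) -> bool:
    """True when an alert event fired and its confidence clears the bar."""
    if ev.get("channel") != "alert":
        return False
    data = ev["data"]
    return data.get("triggered", False) and data.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD
```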
## WebSocket Channels

| Channel | Source | Content |
|---|---|---|
| `capture_session` | Session lifecycle | Status changes |
| `transcript` | `start_transcript()` | Speech-to-text |
| `scene_index` | `index_visuals()` | Visual analysis |
| `audio_index` | `index_audio()` | Audio analysis |
| `alert` | `create_alert()` | Alert notifications |
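The if/elif chain from the quick example can be replaced with a lookup table keyed on channel. Only the channel names come from the table above; the formatters are illustrative:

```python
def dispatch(ev: dict):
    """Return a display line for a known channel, or None for unknown ones."""
    formatters = {
        "transcript":  lambda d: f"TRANSCRIPT: {d['text']}",
        "scene_index": lambda d: f"SCENE: {d['text']}",
        "audio_index": lambda d: f"AUDIO: {d['text']}",
        "alert":       lambda d: f"ALERT: {d['label']} ({d['confidence']:.2f})",
    }
    fmt = formatters.get(ev.get("channel"))
    # Unknown channels return None, keeping the loop forward-compatible
    return fmt(ev["data"]) if fmt else None
```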
## Connecting

```python
conn = videodb.connect()
ws = conn.connect_websocket()
await ws.connect()

# Pass ws.connection_id when starting AI operations
rtstream.start_transcript(ws_connection_id=ws.connection_id)
rtstream.index_visuals(prompt="...", ws_connection_id=ws.connection_id)
rtstream.index_audio(prompt="...", ws_connection_id=ws.connection_id)
```
## Webhooks

Durable, at-least-once delivery for session lifecycle events.

### Webhook Envelope

```json
{
  "version": "2",
  "event": "capture_session.active",
  "timestamp": "2026-01-20T12:34:56Z",
  "capture_session_id": "cap-xxx",
  "end_user_id": "user_abc",
  "status": "active",
  "data": {}
}
```
### Session Lifecycle Events

| Event | Status | Key Data |
|---|---|---|
| `capture_session.created` | `created` | — |
| `capture_session.starting` | `starting` | — |
| `capture_session.active` | `active` | `rtstreams[]` |
| `capture_session.stopping` | `stopping` | — |
| `capture_session.stopped` | `stopped` | — |
| `capture_session.exported` | `exported` | `exported_video_id` |
| `capture_session.failed` | `failed` | error object |
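One way to consume the table above is a single routing function. The action strings returned here are placeholders for your own backend logic, not part of the API:

```python
def route_webhook(payload: dict) -> str:
    """Map a lifecycle event to the action our backend takes (illustrative)."""
    event = payload["event"]
    if event == "capture_session.active":
        return "start_ai_pipelines"  # fan out transcript/index jobs per rtstream
    if event == "capture_session.exported":
        return f"fetch_video:{payload['data']['exported_video_id']}"
    if event == "capture_session.failed":
        return f"log_error:{payload['data']['error']}"
    # created / starting / stopping / stopped: just record the status change
    return "record_status"
```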
### Key Webhook: capture_session.active

This is where you start AI pipelines:

```json
{
  "event": "capture_session.active",
  "capture_session_id": "cap-xxx",
  "status": "active",
  "data": {
    "rtstreams": [
      { "rtstream_id": "rts-1", "name": "mic:default", "media_types": ["audio"] },
      { "rtstream_id": "rts-2", "name": "system_audio:default", "media_types": ["audio"] },
      { "rtstream_id": "rts-3", "name": "display:1", "media_types": ["video"] }
    ]
  }
}
```

```python
def on_active_webhook(payload):
    # Session object, if you need its metadata
    cap = conn.get_capture_session(payload["capture_session_id"])
    for rts_info in payload["data"]["rtstreams"]:
        rtstream = conn.get_rtstream(rts_info["rtstream_id"])
        if "audio" in rts_info["media_types"]:
            rtstream.start_transcript()
            rtstream.index_audio(prompt="Extract key decisions")
        if "video" in rts_info["media_types"]:
            rtstream.index_visuals(prompt="Describe what user is doing")
```
## Delivery Semantics

| Method | Guarantee | Handle |
|---|---|---|
| WebSocket | Best-effort | Reconnect on disconnect |
| Webhook | At-least-once | Deduplicate by event ID |
Webhooks may deliver duplicates. Respond 2xx quickly, process asynchronously, implement idempotency.
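A sketch of idempotent handling. The envelope above carries no separate event-ID field, so this keys on `(capture_session_id, event, timestamp)`; swap in a real event ID if your payloads include one. The in-memory set is for illustration only; a production service would use a shared store with a TTL:

```python
_seen = set()  # in-memory dedup store (illustrative; use Redis or a DB in production)

def handle_once(payload: dict, process) -> bool:
    """Run `process(payload)` once per unique event; return False on a duplicate."""
    key = (payload["capture_session_id"], payload["event"], payload["timestamp"])
    if key in _seen:
        return False  # redelivery: acknowledge but skip processing
    _seen.add(key)
    process(payload)
    return True
```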
## Next Steps