# Real-time Context

> Events you receive from capture: transcripts, visual indexes, audio indexes, and alerts

Capture sessions emit structured events in real time. Use webhooks when you need durable delivery, and WebSockets when you need low-latency updates for a live UI.

<Note>
  Desktop capture currently supports **macOS** and **Windows**.
</Note>

## Quick Example

<CodeGroup>
  ```python Python theme={null}
  import asyncio

  import videodb


  async def main():
      conn = videodb.connect()
      ws = conn.connect_websocket()
      await ws.connect()

      # Listen for events and route them by channel
      async for ev in ws.stream():
          channel = ev.get("channel")

          if channel == "transcript":
              print(f"TRANSCRIPT: {ev['data']['text']}")
          elif channel == "scene_index":
              print(f"SCENE: {ev['data']['text']}")
          elif channel == "audio_index":
              print(f"AUDIO: {ev['data']['text']}")


  # `await` and `async for` must run inside a coroutine
  asyncio.run(main())
  ```

  ```javascript Node.js theme={null}
  import { connect } from 'videodb';

  const conn = connect();
  const ws = conn.connectWebsocket();
  await ws.connect();

  // Listen for events
  for await (const ev of ws.stream()) {
      const channel = ev.channel;

      if (channel === "transcript") {
          console.log(`TRANSCRIPT: ${ev.data.text}`);
      } else if (channel === "scene_index") {
          console.log(`SCENE: ${ev.data.text}`);
      } else if (channel === "audio_index") {
          console.log(`AUDIO: ${ev.data.text}`);
      }
  }
  ```
</CodeGroup>

***

## Event Types

### Transcript Events

Real-time speech-to-text from audio channels:

```json theme={null}
{
  "channel": "transcript",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "mic:default",
  "data": {
    "text": "Let's schedule the meeting for Thursday",
    "is_final": true,
    "start": 1710000001234,
    "end": 1710000002345
  }
}
```

| Field          | Description                           |
| :------------- | :------------------------------------ |
| `text`         | Transcribed speech                    |
| `is_final`     | `true` for final, `false` for interim |
| `start`, `end` | Segment start and end timestamps (ms) |
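
The `is_final` flag lets a consumer show interim text transiently and persist only final segments. The following is a minimal sketch, assuming events have the shape shown above and that an interim result is eventually followed by a final one; `TranscriptBuffer` is a hypothetical helper, not part of the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Hypothetical helper: stores final segments and the latest interim text."""

    finals: list = field(default_factory=list)
    interim: str = ""

    def handle(self, event: dict) -> None:
        if event.get("channel") != "transcript":
            return  # ignore events from other channels
        data = event["data"]
        if data["is_final"]:
            # Persist the final segment and clear the pending interim text.
            self.finals.append((data["start"], data["end"], data["text"]))
            self.interim = ""
        else:
            # Interim text is kept only until the next transcript event.
            self.interim = data["text"]

    def text(self) -> str:
        """Join all final segments in arrival order."""
        return " ".join(segment_text for _, _, segment_text in self.finals)


buf = TranscriptBuffer()
buf.handle({
    "channel": "transcript",
    "data": {"text": "Let's sched", "is_final": False,
             "start": 1710000001234, "end": 1710000001800},
})
buf.handle({
    "channel": "transcript",
    "data": {"text": "Let's schedule the meeting for Thursday", "is_final": True,
             "start": 1710000001234, "end": 1710000002345},
})
print(buf.text())
```

Keeping interim text separate from the stored segments means a UI can redraw the pending line freely without ever writing partial results to storage.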

### Visual Index Events

Scene descriptions from screen capture:

```json theme={null}
{
  "channel": "scene_index",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "display:1",
  "data": {
    "text": "User is viewing a Slack conversation with 3 unread messages",
    "start": 1710000012340,
    "end": 1710000018900
  }
}
```

### Audio Index Events

Semantic understanding of audio:

```json theme={null}
{
  "channel": "audio_index",
  "rtstream_id": "rts-xxx",
  "rtstream_name": "mic:default",
  "data": {
    "text": "Discussion about scheduling a team meeting",
    "start": 1710000021500,
    "end": 1710000029200
  }
}
```

### Alert Events

Emitted when a custom detection rule fires:

```json theme={null}
{
  "channel": "alert",
  "rtstream_id": "rts-xxx",
  "data": {
    "label": "sensitive_content",
    "triggered": true,
    "confidence": 0.92,
    "start": 1710000045100,
    "end": 1710000047800
  }
}
```
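
Since alert events carry both a `triggered` flag and a `confidence` score, a consumer may want to act only on confident detections. The following is a minimal sketch under that assumption; `should_notify` and its `min_confidence` default are illustrative and not part of the SDK.

```python
def should_notify(event: dict, min_confidence: float = 0.8) -> bool:
    """Return True for triggered alerts at or above the confidence threshold.

    `min_confidence` is an illustrative default, not an SDK setting.
    """
    if event.get("channel") != "alert":
        return False
    data = event.get("data", {})
    return bool(data.get("triggered")) and data.get("confidence", 0.0) >= min_confidence


alert = {
    "channel": "alert",
    "rtstream_id": "rts-xxx",
    "data": {
        "label": "sensitive_content",
        "triggered": True,
        "confidence": 0.92,
        "start": 1710000045100,
        "end": 1710000047800,
    },
}

if should_notify(alert):
    print(f"ALERT: {alert['data']['label']}")
```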

***

## WebSocket Channels

| Channel           | Source               | Content             |
| :---------------- | :------------------- | :------------------ |
| `capture_session` | Session lifecycle    | Status changes      |
| `transcript`      | `start_transcript()` | Speech-to-text      |
| `scene_index`     | `index_visuals()`    | Visual analysis     |
| `audio_index`     | `index_audio()`      | Audio analysis      |
| `alert`           | `create_alert()`     | Alert notifications |
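
With several channels arriving on one connection, a small dispatch table can replace a growing `if`/`elif` chain. The sketch below assumes each event is a dict with a `channel` key, as in the payloads above; `ChannelRouter` is a hypothetical helper, not an SDK class.

```python
from typing import Callable, Dict

Handler = Callable[[dict], None]


class ChannelRouter:
    """Hypothetical helper that routes events to per-channel handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, channel: str, handler: Handler) -> None:
        """Register the handler for one channel name."""
        self._handlers[channel] = handler

    def dispatch(self, event: dict) -> bool:
        """Call the handler for the event's channel; return False if none is registered."""
        handler = self._handlers.get(event.get("channel"))
        if handler is None:
            return False
        handler(event)
        return True


received = []
router = ChannelRouter()
router.on("transcript", lambda ev: received.append(("transcript", ev["data"]["text"])))
router.on("alert", lambda ev: received.append(("alert", ev["data"]["label"])))

router.dispatch({"channel": "transcript", "data": {"text": "hello"}})
router.dispatch({"channel": "capture_session", "data": {}})  # no handler registered
```

In a real consumer, `router.dispatch(ev)` would be called once per event inside the `async for ev in ws.stream()` loop.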

### Connecting

<CodeGroup>
  ```python Python theme={null}
  # Run inside an async function; `rtstream` is an existing RTStream object
  conn = videodb.connect()
  ws = conn.connect_websocket()
  await ws.connect()

  # Pass ws.connection_id when starting AI operations
  rtstream.start_transcript(ws_connection_id=ws.connection_id)
  rtstream.index_visuals(prompt="...", ws_connection_id=ws.connection_id)
  rtstream.index_audio(prompt="...", ws_connection_id=ws.connection_id)
  ```

  ```javascript Node.js theme={null}
  const conn = connect();
  const ws = conn.connectWebsocket();
  await ws.connect();

  // Pass wsConnectionId when starting AI operations
  await rtstream.startTranscript(ws.connectionId);
  await rtstream.indexVisuals({ prompt: "...", wsConnectionId: ws.connectionId });
  await rtstream.indexAudio({ prompt: "...", wsConnectionId: ws.connectionId });
  ```
</CodeGroup>

***

## Webhooks

Durable, at-least-once delivery for session lifecycle events.

### Webhook Envelope

```json theme={null}
{
  "version": "2",
  "event": "capture_session.active",
  "timestamp": "2026-01-20T12:34:56Z",
  "capture_session_id": "cap-xxx",
  "end_user_id": "user_abc",
  "status": "active",
  "data": {}
}
```
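
Parsing the envelope into a typed object at the edge of a webhook handler makes missing fields fail early. The sketch below covers only the fields shown above; `WebhookEnvelope` and `parse_envelope` are hypothetical names, and treating `end_user_id` and `data` as optional is an assumption.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class WebhookEnvelope:
    version: str
    event: str
    timestamp: datetime
    capture_session_id: str
    status: str
    data: dict
    end_user_id: Optional[str] = None  # assumption: may be absent


def parse_envelope(raw: str) -> WebhookEnvelope:
    """Parse a raw JSON body; raises KeyError if a required field is missing."""
    body = json.loads(raw)
    return WebhookEnvelope(
        version=body["version"],
        event=body["event"],
        # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
        # so normalise it to an explicit UTC offset first.
        timestamp=datetime.fromisoformat(body["timestamp"].replace("Z", "+00:00")),
        capture_session_id=body["capture_session_id"],
        status=body["status"],
        data=body.get("data", {}),
        end_user_id=body.get("end_user_id"),
    )


raw_body = json.dumps({
    "version": "2",
    "event": "capture_session.active",
    "timestamp": "2026-01-20T12:34:56Z",
    "capture_session_id": "cap-xxx",
    "end_user_id": "user_abc",
    "status": "active",
    "data": {},
})
envelope = parse_envelope(raw_body)
print(envelope.event, envelope.timestamp.isoformat())
```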

### Session Lifecycle Events

| Event                      | Status     | Key Data            |
| :------------------------- | :--------- | :------------------ |
| `capture_session.created`  | `created`  | —                   |
| `capture_session.starting` | `starting` | —                   |
| `capture_session.active`   | `active`   | `rtstreams[]`       |
| `capture_session.stopping` | `stopping` | —                   |
| `capture_session.stopped`  | `stopped`  | —                   |
| `capture_session.exported` | `exported` | `exported_video_id` |
| `capture_session.failed`   | `failed`   | `error` object      |

### Key Webhook: capture\_session.active

This webhook carries the session's `rtstreams[]`, so it is the point at which to start AI pipelines:

```json theme={null}
{
  "event": "capture_session.active",
  "capture_session_id": "cap-xxx",
  "status": "active",
  "data": {
    "rtstreams": [
      { "rtstream_id": "rts-1", "name": "mic:default", "media_types": ["audio"] },
      { "rtstream_id": "rts-2", "name": "system_audio:default", "media_types": ["audio"] },
      { "rtstream_id": "rts-3", "name": "display:1", "media_types": ["video"] }
    ]
  }
}
```

<CodeGroup>
  ```python Python theme={null}
  def on_active_webhook(payload):
      # `conn` is the connection created earlier with videodb.connect()
      for rts_info in payload["data"]["rtstreams"]:
          rtstream = conn.get_rtstream(rts_info["rtstream_id"])
          media_types = rts_info["media_types"]

          # Audio streams (mic, system audio): transcribe and index
          if "audio" in media_types:
              rtstream.start_transcript()
              rtstream.index_audio(prompt="Extract key decisions")

          # Video streams (display): describe on-screen activity
          if "video" in media_types:
              rtstream.index_visuals(prompt="Describe what the user is doing")

  ```javascript Node.js theme={null}
  async function onActiveWebhook(payload) {
      // `conn` is the connection created earlier with connect()
      for (const rtsInfo of payload.data.rtstreams) {
          const rtstream = await conn.getRtstream(rtsInfo.rtstream_id);
          const mediaTypes = rtsInfo.media_types;

          // Audio streams (mic, system audio): transcribe and index
          if (mediaTypes.includes("audio")) {
              await rtstream.startTranscript();
              await rtstream.indexAudio({ prompt: "Extract key decisions" });
          }

          // Video streams (display): describe on-screen activity
          if (mediaTypes.includes("video")) {
              await rtstream.indexVisuals({ prompt: "Describe what the user is doing" });
          }
      }
  }
  ```
</CodeGroup>

***

## Delivery Semantics

| Method    | Guarantee     | Handle                  |
| :-------- | :------------ | :---------------------- |
| WebSocket | Best-effort   | Reconnect on disconnect |
| Webhook   | At-least-once | Deduplicate by event ID |
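
For the WebSocket row, "reconnect on disconnect" is usually implemented as a retry loop with exponential backoff. The sketch below is generic rather than SDK-specific: `open_stream` is a hypothetical factory that opens a fresh connection and returns an async iterator of events, and `ConnectionError` stands in for whatever exception the real client raises on a drop. Because WebSocket delivery is best-effort, events emitted while disconnected may not be replayed.

```python
import asyncio
from typing import AsyncIterator, Callable


async def consume_with_reconnect(
    open_stream: Callable[[], AsyncIterator[dict]],
    handle: Callable[[dict], None],
    max_retries: int = 5,
    base_delay: float = 0.5,
) -> None:
    """Consume events, reopening the stream with exponential backoff on drops."""
    attempt = 0
    while True:
        try:
            async for event in open_stream():
                attempt = 0  # a delivered event means the connection is healthy
                handle(event)
            return  # the stream ended cleanly
        except ConnectionError:
            attempt += 1
            if attempt > max_retries:
                raise
            # Delays grow as base_delay, 2x, 4x, ... between attempts.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Simulated stream that drops twice before delivering one event.
calls = {"n": 0}
seen = []


async def flaky_stream():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated drop")
    yield {"channel": "transcript", "data": {"text": "hello"}}


asyncio.run(consume_with_reconnect(flaky_stream, seen.append, base_delay=0.01))
```

Resetting the attempt counter after each delivered event keeps a long-lived connection from exhausting its retry budget over many unrelated drops.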

<Warning>
  Webhooks may deliver the same event more than once. Respond with a 2xx status quickly, process the payload asynchronously, and make your handlers idempotent.
</Warning>
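
The pattern in the warning can be sketched as acknowledging immediately, enqueueing work, and dropping repeats. The envelope shown earlier has no explicit event ID field, so this sketch assumes a key derived from `capture_session_id`, `event`, and `timestamp`; if your payloads include an event ID, use that instead. `receive_webhook` and `idempotency_key` are hypothetical names, and the in-memory set stands in for a persistent store.

```python
import queue

seen_keys = set()            # assumption: a persistent store in production
work_queue = queue.Queue()   # drained by a background worker


def idempotency_key(envelope: dict) -> tuple:
    # Assumption: no explicit event ID is available, so derive a key
    # from fields the envelope is shown to contain.
    return (envelope["capture_session_id"], envelope["event"], envelope["timestamp"])


def receive_webhook(envelope: dict) -> int:
    """Acknowledge immediately and enqueue only the first copy of each event."""
    key = idempotency_key(envelope)
    if key not in seen_keys:
        seen_keys.add(key)
        work_queue.put(envelope)
    # Duplicates are acknowledged as well, so redelivery is not treated as an error.
    return 200


envelope = {
    "version": "2",
    "event": "capture_session.active",
    "timestamp": "2026-01-20T12:34:56Z",
    "capture_session_id": "cap-xxx",
    "status": "active",
    "data": {},
}
first_status = receive_webhook(envelope)
second_status = receive_webhook(envelope)  # simulated duplicate delivery
print(first_status, second_status, work_queue.qsize())
```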

***

## Next Steps

<CardGroup cols={2}>
  <Card icon="camera" title="Capture Overview" href="/pages/ingest/capture-sdks/overview">
    Architecture and quickstart
  </Card>

  <Card icon="search" title="Storage & Search" href="/pages/ingest/capture-sdks/storage-and-search">
    Export and persistence patterns
  </Card>
</CardGroup>
