
Give Your AI Agents Eyes and Ears

Your agents read text. They generate text. But the world isn’t text: it’s video calls, security feeds, screen recordings, and live streams. VideoDB is the perception layer that lets agents see, hear, remember, and act on continuous media.

What You Can Build

Desktop Agents

Stream screen, mic, and camera. Get real-time context about what the user is doing and saying.
Sales Copilot →

Video RAG

Search across hours of meetings, lectures, or archives. Get timestamped moments with playable evidence.
Multimodal Search →

Real-time Monitoring

Connect RTSP cameras and drones. Detect events as they happen. Trigger alerts and automations.
Intrusion Detection →

Media Automation

Compose videos with code. Generate voice, music, and images. Export to any format.
Faceless Video Creator →

Browse All Examples

Explore 30+ examples across AI Copilots, Video RAG, Live Intelligence, Content Factory, and more.

The Platform Loop

Every workflow follows the same pattern:
See → Understand → Act
Stage      | What Happens
-----------|--------------------------------------------------------------------------
See        | Ingest from files, live streams, or desktop capture.
Understand | Index with prompts. Search with natural language. Get timestamped moments.
Act        | Trigger alerts, compose edits, export streams.
import videodb

conn = videodb.connect()

# See: Get an active stream (from desktop capture or RTSP)
rtstream = conn.get_rtstream("rts-abc123")

# Understand: Create indexes on the live stream
visual_index = rtstream.index_visuals(prompt="Describe what the user is doing")
audio_index = rtstream.index_audio(prompt="Extract key decisions and action items")

# Act: Create an event and attach an alert
event = conn.create_event(
    event_prompt="Detect when someone mentions a deadline or due date"
)
alert = audio_index.create_alert(
    webhook_url="https://your-backend.com/webhooks/deadline-mentioned"
)

# Real-time events arrive via WebSocket or webhook, for example:
# {
#   "channel": "alert",
#   "timestamp": "2026-02-11T12:18:00.968810+00:00",
#   "rtstream_id": "rts-xxx",
#   "rtstream_name": "Meeting",
#   "data": {
#     "event_id": "event-77aae6b981970542",
#     "label": "objection",
#     "triggered": true,
#     "confidence": 0.9,
#     "start": 1770812246.3445818,
#     "end": 1770812277.3488276
#   }
# }
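
On the receiving side, a backend has to decode that payload and decide whether to act on it. The sketch below shows one way this might look using only the Python standard library. It assumes the payload has the shape of the sample above and that `start` and `end` are Unix epoch seconds (which is consistent with the sample's `timestamp`). The names `parse_alert` and `min_confidence` are illustrative and are not part of the VideoDB SDK.

```python
import json
from datetime import datetime, timezone


def parse_alert(raw, min_confidence=0.5):
    """Summarize a triggered alert message, or return None to ignore it.

    Hypothetical helper: the field names follow the sample payload above,
    and min_confidence is an illustrative threshold, not an SDK setting.
    """
    message = json.loads(raw)
    if message.get("channel") != "alert":
        return None  # not an alert message

    data = message.get("data", {})
    if not data.get("triggered") or data.get("confidence", 0.0) < min_confidence:
        return None  # not triggered, or below the chosen threshold

    # Assumption: start/end are Unix epoch seconds.
    start, end = data["start"], data["end"]
    return {
        "rtstream_id": message["rtstream_id"],
        "event_id": data["event_id"],
        "label": data["label"],
        "started_at": datetime.fromtimestamp(start, tz=timezone.utc),
        "duration_seconds": end - start,
    }


# The sample payload from the snippet above, as a JSON string.
sample = json.dumps({
    "channel": "alert",
    "timestamp": "2026-02-11T12:18:00.968810+00:00",
    "rtstream_id": "rts-xxx",
    "rtstream_name": "Meeting",
    "data": {
        "event_id": "event-77aae6b981970542",
        "label": "objection",
        "triggered": True,
        "confidence": 0.9,
        "start": 1770812246.3445818,
        "end": 1770812277.3488276,
    },
})

summary = parse_alert(sample)
print(summary["label"], summary["started_at"].isoformat())
```

Returning `None` for non-alert or low-confidence messages keeps the caller simple: a webhook route or WebSocket loop can pass every incoming message through the same function and act only on a non-empty result.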

Install the SDK

pip install videodb

Philosophy

Why perception is the next frontier for AI agents.