AI Agent Skills

Your AI agents can write code and automate tasks brilliantly. But they’re missing one critical capability: the ability to work with video and audio - capturing screens, searching through recordings, editing clips, and streaming results. VideoDB Skills give agents like Claude Code and Codex the power to execute server-side video workflows, turning text-only agents into multimodal collaborators.

Install VideoDB Skills

Get video and audio perception in your agent with one command:

NPX (Recommended):

npx skills add video-db/skills

Claude Code Plugin:

/plugin marketplace add video-db/skills
/plugin install videodb@videodb-skills

Then run /videodb setup to configure your API key and verify connectivity.

VideoDB Skills on GitHub

Complete source code, installation guide, and configuration examples

Prerequisites

VideoDB API Key

Get a free API key from console.videodb.io. No credit card required. Free tier includes 50 uploads.

System Requirements

Python 3.9+
Platform: macOS, Linux, Windows (PowerShell)

Set Your API Key

Export your API key in your shell:

export VIDEO_DB_API_KEY=your-key-here

Or add it to a .env file in your project root.
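
Before running setup, it can help to confirm that the environment matches the prerequisites above. The following is a minimal sketch and not part of VideoDB Skills: it checks the Python version and looks for VIDEO_DB_API_KEY in the environment, falling back to a simple KEY=VALUE .env file. The helper names (read_dotenv, resolve_api_key, preflight) and the .env parsing rules are assumptions made for illustration.

```python
import os
import sys
from pathlib import Path

MIN_PYTHON = (3, 9)            # from the System Requirements above
KEY_NAME = "VIDEO_DB_API_KEY"  # variable name used in the export command above


def read_dotenv(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for raw in p.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        name, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[name.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env=None, dotenv_path=".env"):
    """Return the key from the environment first, then from the .env file, else None."""
    env = os.environ if env is None else env
    key = env.get(KEY_NAME) or read_dotenv(dotenv_path).get(KEY_NAME)
    return key or None


def preflight(env=None, dotenv_path=".env"):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python 3.9+ is required")
    if resolve_api_key(env, dotenv_path) is None:
        problems.append(f"{KEY_NAME} is not set")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("OK" if not issues else "\n".join(issues))
```

The environment variable takes precedence over the .env file in this sketch, so a key exported in the shell overrides one stored in the project.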

What It Does

VideoDB Skills is a perception capability that exposes a See → Understand → Act loop for video and audio as an API. It gives agents like Claude Code, Codex, and Cursor the ability to execute server-side video workflows through one unified interface:

See - Capture desktop screens, microphone/system audio, RTSP streams, and ingest files, URLs, and YouTube content
Understand - Visual analysis, transcription, indexing, and searching moments with playable clips
Act - Stream results, trigger alerts, edit timelines, generate subtitles/overlays, and export clips

Why Use It

Video Workflows
Real-Time Perception
Search & Intelligence

Execute video operations without local ffmpeg installation:

Upload from YouTube, URLs, or local files
Trim, merge, clip, overlay text/images/audio
Transcode, reframe, adjust resolution and aspect ratio
Get instant playable HLS links via built-in CDN

Quick Start

Ask your agent to execute video tasks:

Upload [YouTube URL] and provide a shareable stream link

Extract clips from 10s-30s and 45s-60s and merge them

Generate background music and add to this clip

Add white text on black background subtitles to the original video

Capture my screen for two minutes and report my activities with insights

Monitor my IP Camera RTSP stream and log person detection alerts with timestamps
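
A request such as "Extract clips from 10s-30s and 45s-60s and merge them" ultimately reduces to an ordered list of time ranges. The sketch below shows one way that text could be turned into (start, end) pairs and validated before any video operation is issued. It is illustrative only: parse_ranges and merged_duration are hypothetical helpers, not VideoDB functions, and the call that actually builds the merged stream is left to the agent.

```python
import re

# Matches ranges written like "10s-30s" or "12.5s - 20s".
RANGE_RE = re.compile(r"(\d+(?:\.\d+)?)s\s*-\s*(\d+(?:\.\d+)?)s")


def parse_ranges(text):
    """Extract (start, end) second pairs from a prompt, in the order they appear."""
    ranges = []
    for start, end in RANGE_RE.findall(text):
        start, end = float(start), float(end)
        if end <= start:
            raise ValueError(f"range end must be after start: {start}s-{end}s")
        ranges.append((start, end))
    return ranges


def merged_duration(ranges):
    """Total length in seconds of the clip produced by joining the ranges."""
    return sum(end - start for start, end in ranges)


if __name__ == "__main__":
    prompt = "Extract clips from 10s-30s and 45s-60s and merge them"
    timeline = parse_ranges(prompt)
    print(timeline, merged_duration(timeline))
```

Validating ranges up front means a reversed or empty range fails before any upload or processing is attempted.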

Capabilities

Capability	What It Does
Capture	Desktop screen, microphone, and system audio for real-time processing
Upload	Ingest from YouTube, URLs, or local files
Context	Generate structured context from RTSP feeds or desktop streams
Search	Locate moments by speech, scenes, or metadata with playable evidence
Transcripts	Generate timestamped transcripts
Subtitles	Auto-generate, style, and burn-in subtitles
Edit	Trim, merge, clip, overlay text/images/audio; add dubbing/translation
AI Generate	Create images, video, music, sound effects, voiceovers
Transcode/Reframe	Adjust resolution, quality, aspect ratio, social crops server-side
Stream	Obtain instant playable HLS links via built-in CDN

Example: OpenClaw Monitoring

VideoDB Skills powers OpenClaw Monitoring, a “CCTV for AI agents” that monitors, records, and audits autonomous agent sessions. Every agent run becomes a live stream, a replayable recording, and a searchable archive.

OpenClaw Monitoring on GitHub

See how VideoDB Skills enables visual observability for autonomous agents

Next Steps

Capture SDK Overview

Deep dive: channels, permissions, client code, and event handling

Real-time Context

How real-time indexing and search works

AI Copilot Examples

Explore more AI copilot projects and use cases

Quickstart

Try desktop perception with a hosted OpenClaw agent
