
# Pair Programmer

> Turn your coding agent into a screen-aware, voice-aware, context-rich collaborator

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/dIvZoZr3DyM" title="Pair Programmer - Screen Aware AI Coding Assistant" allow="accelerometer; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />

<Card title="Pair Programmer on GitHub" icon="github" href="https://github.com/video-db/pair-programmer">
  Complete source code, installation guide, and configuration examples
</Card>

## What Is It?

Pair Programmer is an **agentic skill** that gives your AI coding assistant real-time perception.

It captures:

* **Screen** for visual context like terminals, editors, browser tabs, errors, and UI state
* **Microphone** for your spoken intent, ideas, and debugging notes
* **System audio** for tutorials, meetings, demos, and anything else your computer is playing

Once captured, that context becomes searchable.

So instead of re-explaining what was on screen, copy-pasting logs, or summarizing a 20-minute debugging session, you can ask:

* *What was I doing when the auth flow broke?*
* *What did I say about the database migration?*
* *Show me what was on screen when the test failed*
* *What happened in the last 10 minutes?*

<Tip>
  **The Missing Piece**: Pair Programmer is the perception layer that coding agents lack. It works with Claude Code, Cursor, Codex, and other skill-compatible agents.
</Tip>

**Why this matters**: Most coding agents operate in a text-only world. They can read your files and write code, but they can't see your terminal output, your browser's error messages, or your Figma mockups, and they can't hear you explain the problem out loud. As a result, you spend much of your time copy-pasting context, describing what's on screen, or re-explaining what you said 5 minutes ago.

Pair Programmer closes this gap. It gives your agent the same sensory context you have (screen, mic, and system audio), so collaboration feels natural instead of fragmented.

## Why This Is Useful

<Tabs>
  <Tab title="Context Aware" icon="eye">
    ### Stay Grounded

    Most coding agents can write code. Very few can stay grounded in the same context as you.

    Pair Programmer helps your agent stay on the same page by giving it access to what you saw, what you said, and what your machine was playing.
  </Tab>

  <Tab title="Natural Search" icon="search">
    ### Ask in Plain Language

    Search your session in natural language:

    * "What was I working on when I mentioned the auth bug?"
    * "What did I say in the last 5 minutes?"
    * "Show me what was on screen when the test failed"

    No more copy-pasting or repeated explanations.
  </Tab>

  <Tab title="Real-Time Recording" icon="video">
    ### Continuous Capture

    A lightweight overlay shows recording status, active channels, and elapsed time.

    Record your screen, mic, and system audio in real time, then search what happened when you need it.
  </Tab>
</Tabs>

## Use Cases

Pair Programmer is well suited to:

* **Debugging sessions**: Track what you tried and where it went wrong
* **Tutorial-driven development**: Build while following video tutorials
* **Bug reproduction**: Capture the exact steps that triggered the issue
* **Meeting follow-ups**: Search conversations and screen activity
* **Architecture walkthroughs**: Review code with full context
* **Voice-first coding workflows**: Speak your thoughts and code together

## Getting Started with Pair Programmer

The most common workflow:

1. **Start recording when you begin work.** Run `/pair-programmer record` and choose your screen. Let it capture in the background.

2. **Work normally.** Code, debug, browse Stack Overflow, watch tutorials. Don't think about the recording.

3. **Ask for context when you need it.** Stuck on a bug? Run `/pair-programmer search "when did the build error first appear?"`, and your agent sees the exact moment with full terminal output and code context.

4. **Let your agent act on spoken instructions.** Said "refactor this function to use async/await" into your mic 5 minutes ago? Run `/pair-programmer act` and your agent will do it, using your own words as the spec.

5. **Stop recording when done.** Run `/pair-programmer stop`. All context is saved and searchable.

Over time, you'll develop your own patterns: you might record only during debugging sessions, or keep it running all day for a complete work memory.

## Installation

<Info>
  **Prerequisites**

  * Node.js 18+
  * macOS 12+ or Windows 10+
  * [VideoDB API key](https://console.videodb.io) (free, no credit card required)
</Info>
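
If you want to confirm the Node.js prerequisite before installing, a small shell check along these lines may help. This is a minimal sketch, not part of Pair Programmer: `node_major` is a hypothetical helper name, and the only external call it relies on is `node --version`.

```bash
# Sketch: check the Node.js 18+ prerequisite.
# `node_major` is a hypothetical helper, not a Pair Programmer command.

# Print the major version from a string such as "v18.17.0".
node_major() {
  version="${1#v}"        # drop the leading "v" if present
  echo "${version%%.*}"   # keep everything before the first dot
}

if command -v node >/dev/null 2>&1; then
  major="$(node_major "$(node --version)")"
  if [ "$major" -ge 18 ]; then
    echo "Node.js $major detected: OK"
  else
    echo "Node.js $major detected: version 18 or newer is required"
  fi
else
  echo "Node.js not found: install version 18 or newer"
fi
```

The check only reports what it finds; it does not install or change anything.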

If you have an older version installed, remove it before upgrading.

<Tabs>
  <Tab title="Option 1: NPX" icon="terminal">
    ### Install with npx (Recommended)

    ```bash theme={null}
    npx skills add video-db/pair-programmer
    ```
  </Tab>

  <Tab title="Option 2: Marketplace" icon="store">
    ### Install from marketplace

    ```bash theme={null}
    /plugin marketplace add video-db/pair-programmer
    /plugin install pair-programmer
    ```
  </Tab>
</Tabs>

## Setup

<Steps>
  <Step title="Get API Key">
    Get a free VideoDB API key from [console.videodb.io](https://console.videodb.io).

    No credit card required.
  </Step>

  <Step title="Set API Key">
    Export your API key in your shell:

    ```bash theme={null}
    export VIDEO_DB_API_KEY=your-key
    ```

    Or add it to a `.env` file in your project root.
  </Step>

  <Step title="Run Setup">
    Install dependencies and complete local setup:

    ```bash theme={null}
    /pair-programmer setup
    ```
  </Step>
</Steps>
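
To check that the key from the steps above is in place, you could run something like the following in your project root. It is a sketch under stated assumptions: `check_api_key` is a hypothetical helper, and the `.env` lookup assumes the common `VIDEO_DB_API_KEY=your-key` line format, which this page does not specify.

```bash
# Sketch: confirm the API key is available before recording.
# `check_api_key` is a hypothetical helper, not a Pair Programmer command.
check_api_key() {
  if [ -n "${VIDEO_DB_API_KEY:-}" ]; then
    echo "VIDEO_DB_API_KEY is set in the environment"
  elif [ -f .env ] && grep -q '^VIDEO_DB_API_KEY=.' .env; then
    # Assumes a plain KEY=value line in .env (an assumption, see above).
    echo "VIDEO_DB_API_KEY found in .env"
  else
    echo "VIDEO_DB_API_KEY not found" >&2
    return 1
  fi
}

check_api_key || echo "Export the key or add it to a .env file in your project root"
```

The helper checks only that a value is present; it does not validate the key against VideoDB.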

## Quick Start

<Steps>
  <Step title="Start Recording">
    Start recording your screen, mic, and system audio:

    ```bash theme={null}
    /pair-programmer record
    ```

    A source picker will open so you can choose what to capture. Once recording starts, a lightweight overlay shows recording status, active channels, and elapsed time.
  </Step>

  <Step title="Work Normally">
    Continue your coding session. Pair Programmer captures everything in the background.
  </Step>

  <Step title="Search Your Session">
    Search your session in natural language:

    ```bash theme={null}
    /pair-programmer search "what was I working on when I mentioned the auth bug?"
    ```

    ```bash theme={null}
    /pair-programmer search "what did I say in the last 5 minutes?"
    ```

    ```bash theme={null}
    /pair-programmer search "show me what was on screen when the test failed"
    ```
  </Step>

  <Step title="Get Summary">
    Get a summary of recent activity:

    ```bash theme={null}
    /pair-programmer what-happened
    ```
  </Step>

  <Step title="Stop Recording">
    Stop recording when you're done:

    ```bash theme={null}
    /pair-programmer stop
    ```
  </Step>
</Steps>

## Commands

| Command                             | Description                                                  |
| ----------------------------------- | ------------------------------------------------------------ |
| `/pair-programmer record`           | Start recording and open the source picker                   |
| `/pair-programmer stop`             | Stop the active recording                                    |
| `/pair-programmer search "<query>"` | Search screen, mic, and audio context using natural language |
| `/pair-programmer what-happened`    | Summarize recent activity                                    |
| `/pair-programmer setup`            | Install dependencies and complete local setup                |
| `/pair-programmer config`           | Update indexing and recording settings                       |

## Real-World Examples

<AccordionGroup>
  <Accordion title="Debugging Complex Issues" icon="bug">
    You're chasing a bug across multiple files and terminals. Instead of documenting every step, just keep coding. Later, run:

    ```bash theme={null}
    /pair-programmer search "what was on screen when the test failed"
    ```

    Get instant context about terminal output, error messages, and which files you had open.
  </Accordion>

  <Accordion title="Learning From Tutorials" icon="graduation-cap">
    Following a video tutorial while coding? Pair Programmer captures both the tutorial (system audio) and your implementation (screen).

    ```bash theme={null}
    /pair-programmer search "build me the project from the video I was just watching"
    ```

    Your agent can see what was on screen and hear what was said in the tutorial.
  </Accordion>

  <Accordion title="Pair Programming Sessions" icon="users">
    In a meeting discussing code? Pair Programmer captures your screen and the conversation.

    ```bash theme={null}
    /pair-programmer what-happened
    ```

    Get a summary of what was discussed, what code was reviewed, and action items.
  </Accordion>

  <Accordion title="Voice-Driven Development" icon="mic">
    Speaking your thoughts while coding? Your microphone captures your debugging notes and ideas.

    ```bash theme={null}
    /pair-programmer search "what did I say about the database migration?"
    ```

    Find moments where you verbally explained your thinking.
  </Accordion>
</AccordionGroup>

## How It Works

Pair Programmer uses VideoDB's Capture SDK to:

1. **Record** — Continuously capture screen, microphone, and system audio
2. **Process** — Stream to VideoDB for real-time AI indexing
3. **Search** — Query across all captured context with natural language
4. **Retrieve** — Get timestamped results with relevant clips

All context is searchable in real time, giving your coding agent full perception of your workflow.

***

<Card title="Complete Setup Guide on GitHub" icon="github" href="https://github.com/video-db/pair-programmer">
  Detailed installation instructions, troubleshooting tips, and configuration examples
</Card>

## Related Tutorials

<CardGroup cols={2}>
  <Card title="Bloom" icon="video" href="/examples-and-tutorials/ai-copilots/bloom">
    Local-first screen recorder with AI-ready search and indexing
  </Card>

  <Card title="Focusd Productivity Tracker" icon="chart-line" href="/examples-and-tutorials/ai-copilots/focusd">
    AI-powered productivity tracking with automatic time insights
  </Card>

  <Card title="Call.md" icon="users" href="/examples-and-tutorials/ai-copilots/call-md">
    Real-time AI meeting assistant with live coaching
  </Card>
</CardGroup>
