# Create an Index

> Transform video into searchable data with spoken word and visual indexes

Indexes turn raw video into structured, searchable data. Create a spoken word index for dialogue and narration, or a scene index for visual content.

## Quick Example

<CodeGroup>
  ```python Python theme={null}
  import videodb

  conn = videodb.connect()
  coll = conn.get_collection()
  video = coll.get_video("m-xxx")

  # Index spoken content (dialogue, narration)
  video.index_spoken_words()

  # Index visual content (scenes, objects, actions)
  scene_index_id = video.index_scenes(
      prompt="Describe what's happening in the scene"
  )

  # Search both
  results = video.search("car chase through the city")
  results.play()
  ```

  ```javascript Node.js theme={null}
  import { connect } from 'videodb';

  const conn = connect();
  const coll = await conn.getCollection();
  const video = await coll.getVideo("m-xxx");

  // Index spoken content (dialogue, narration)
  await video.indexSpokenWords();

  // Index visual content (scenes, objects, actions)
  const sceneIndexId = await video.indexScenes({
      prompt: "Describe what's happening in the scene"
  });

  // Search both
  const results = await video.search("car chase through the city");
  await results.play();
  ```
</CodeGroup>

***

## Spoken Word Index

Transcribes audio into timestamped text using automatic speech recognition (ASR).

<CodeGroup>
  ```python Python theme={null}
  video.index_spoken_words()
  ```

  ```javascript Node.js theme={null}
  await video.indexSpokenWords();
  ```
</CodeGroup>

**What it captures:**

* Dialogue and conversations
* Narration and voiceovers
* Lectures and presentations
* Interviews and podcasts

### Language Support

English, Spanish, French, German, Italian, Portuguese, and Dutch are auto-detected. For other languages, pass the language code:

<CodeGroup>
  ```python Python theme={null}
  # Auto-detect (English, Spanish, French, German, Italian, Portuguese, Dutch)
  video.index_spoken_words()

  # Explicit language code
  video.index_spoken_words(language_code="hi")  # Hindi
  video.index_spoken_words(language_code="ja")  # Japanese
  video.index_spoken_words(language_code="zh")  # Chinese
  ```

  ```javascript Node.js theme={null}
  // Auto-detect (English, Spanish, French, German, Italian, Portuguese, Dutch)
  await video.indexSpokenWords();

  // Explicit language code
  await video.indexSpokenWords({ languageCode: "hi" });  // Hindi
  await video.indexSpokenWords({ languageCode: "ja" });  // Japanese
  await video.indexSpokenWords({ languageCode: "zh" });  // Chinese
  ```
</CodeGroup>

| Language           | Code                      |
| :----------------- | :------------------------ |
| English (Global)   | `en`                      |
| English (US/UK/AU) | `en_us`, `en_uk`, `en_au` |
| Spanish            | `es`                      |
| French             | `fr`                      |
| German             | `de`                      |
| Hindi              | `hi`                      |
| Japanese           | `ja`                      |
| Chinese            | `zh`                      |
| Korean             | `ko`                      |
| Russian            | `ru`                      |

***

## Scene Index

Analyzes video frames using vision models to describe visual content.

<CodeGroup>
  ```python Python theme={null}
  scene_index_id = video.index_scenes(
      prompt="Describe the scene in detail"
  )
  ```

  ```javascript Node.js theme={null}
  const sceneIndexId = await video.indexScenes({
      prompt: "Describe the scene in detail"
  });
  ```
</CodeGroup>

**What it captures:**

* Objects and people
* Actions and activities
* Environments and settings
* Visual transitions

### Prompt Shapes the Index

The prompt you provide determines what gets indexed:

<CodeGroup>
  ```python Python theme={null}
  # Focus on people
  video.index_scenes(prompt="Describe the people and their actions")

  # Focus on environment
  video.index_scenes(prompt="Describe the location and setting")

  # Focus on specific objects
  video.index_scenes(prompt="Identify all vehicles and their colors")
  ```

  ```javascript Node.js theme={null}
  // Focus on people
  await video.indexScenes({ prompt: "Describe the people and their actions" });

  // Focus on environment
  await video.indexScenes({ prompt: "Describe the location and setting" });

  // Focus on specific objects
  await video.indexScenes({ prompt: "Identify all vehicles and their colors" });
  ```
</CodeGroup>
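
Because each prompt produces its own index, it can help to record which index ID came from which prompt. The following is a minimal sketch, assuming `index_scenes` returns the index ID as in the examples above. `build_scene_indexes` and the `_DemoVideo` stand-in are hypothetical and exist only so the snippet runs without an API key.

```python
from typing import Dict


def build_scene_indexes(video, prompts: Dict[str, str]) -> Dict[str, str]:
    """Create one scene index per prompt and map each label to its index ID."""
    index_ids = {}
    for label, prompt in prompts.items():
        index_ids[label] = video.index_scenes(prompt=prompt)
    return index_ids


class _DemoVideo:
    """Stand-in for a real video object (hypothetical, for illustration)."""

    def __init__(self):
        self.prompts = []

    def index_scenes(self, prompt):
        # Record the prompt and hand back a fake index ID.
        self.prompts.append(prompt)
        return f"demo-index-{len(self.prompts)}"


prompts = {
    "people": "Describe the people and their actions",
    "setting": "Describe the location and setting",
    "vehicles": "Identify all vehicles and their colors",
}
index_ids = build_scene_indexes(_DemoVideo(), prompts)
```

Replace `_DemoVideo()` with a video fetched from your collection to run this against real media.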

### Extraction Configuration

Control how frames are sampled. Choose between time-based extraction (frame segmentation at regular intervals) and shot-based extraction (scene segmentation at automatically detected transitions):

<img src="https://mintcdn.com/videodb/6KL5X6-sIPSRpEUt/assets/indexing/scene-extraction-type.avif?fit=max&auto=format&n=6KL5X6-sIPSRpEUt&q=85&s=05d44b8c0a2ae53395f638015a60b6b2" style={{width: "auto", height: "auto"}} alt="Comparison of frame segmentation and scene segmentation extraction types" width="1766" height="724" data-path="assets/indexing/scene-extraction-type.avif" />

<img
  src="/assets/indexing/time-based-extraction.avif"
  style={{width: "auto", height: "auto"}}
  alt="Time-based extraction example showing consistent frame sampling at regular intervals"
/>

<CodeGroup>
  ```python Python theme={null}
  from videodb import SceneExtractionType

  # Time-based: every N seconds
  video.index_scenes(
      extraction_type=SceneExtractionType.time_based,
      extraction_config={"time": 10, "frame_count": 2},
      prompt="Describe the scene"
  )

  # Shot-based: detect visual transitions
  video.index_scenes(
      extraction_type=SceneExtractionType.shot_based,
      extraction_config={"threshold": 20, "frame_count": 1},
      prompt="Describe the scene"
  )
  ```

  ```javascript Node.js theme={null}
  // Time-based: every N seconds
  await video.indexScenes({
      extractionType: 'time',
      extractionConfig: { time: 10, frame_count: 2 },
      prompt: "Describe the scene"
  });

  // Shot-based: detect visual transitions
  await video.indexScenes({
      extractionType: 'shot',
      extractionConfig: { threshold: 20, frame_count: 1 },
      prompt: "Describe the scene"
  });
  ```
</CodeGroup>

| Method     | Best For                               |
| :--------- | :------------------------------------- |
| Time-based | Consistent sampling, dynamic content   |
| Shot-based | Edited videos with clear scene changes |

***

## Managing Indexes

### List All Scene Indexes

<CodeGroup>
  ```python Python theme={null}
  indexes = video.list_scene_index()
  for idx in indexes:
      print(f"{idx.id}: {idx.name} - {idx.status}")
  ```

  ```javascript Node.js theme={null}
  const indexes = await video.listSceneIndex();
  for (const idx of indexes) {
      console.log(`${idx.id}: ${idx.name} - ${idx.status}`);
  }
  ```
</CodeGroup>

<img src="https://mintcdn.com/videodb/6KL5X6-sIPSRpEUt/assets/indexing/list_scene_index.webp?fit=max&auto=format&n=6KL5X6-sIPSRpEUt&q=85&s=69c1616c1805a831c89ac2b87c655f80" style={{width: "auto", height: "auto"}} alt="List of scene indexes showing id, name, and status" width="1917" height="949" data-path="assets/indexing/list_scene_index.webp" />

### Get Index Details

<CodeGroup>
  ```python Python theme={null}
  scene_index = video.get_scene_index(scene_index_id)
  for scene in scene_index:
      print(f"{scene.start}-{scene.end}: {scene.description}")
  ```

  ```javascript Node.js theme={null}
  const sceneIndex = await video.getSceneIndex(sceneIndexId);
  for (const scene of sceneIndex) {
      console.log(`${scene.start}-${scene.end}: ${scene.description}`);
  }
  ```
</CodeGroup>
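
Once you have the scene records, ordinary Python is enough for simple filtering. The sketch below assumes each record exposes `start`, `end`, and `description` attributes as in the example above. The `Scene` dataclass and sample data are stand-ins, and `find_scenes` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Scene:
    """Stand-in for a scene record (hypothetical; mirrors the fields used above)."""
    start: float
    end: float
    description: str


def find_scenes(scenes: Iterable[Scene], keyword: str) -> List[Scene]:
    """Return scenes whose description mentions the keyword, case-insensitively."""
    needle = keyword.lower()
    return [s for s in scenes if needle in s.description.lower()]


sample = [
    Scene(0, 10, "A red car speeds down a city street"),
    Scene(10, 20, "Two people talk inside a cafe"),
    Scene(20, 30, "The car turns into an alley"),
]
matches = find_scenes(sample, "car")
for scene in matches:
    print(f"{scene.start}-{scene.end}: {scene.description}")
```

This is a plain substring match on descriptions you have already fetched; for semantic queries, use `video.search` as in the quick example.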

### Delete an Index

<CodeGroup>
  ```python Python theme={null}
  video.delete_scene_index(scene_index_id)
  ```

  ```javascript Node.js theme={null}
  await video.deleteSceneIndex(sceneIndexId);
  ```
</CodeGroup>
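
To clear out several indexes at once, listing and deletion can be combined. The sketch below assumes `list_scene_index()` yields objects with an `id` attribute, as in the listing example. `delete_scene_indexes_except` and the `_DemoVideo` stand-in are hypothetical.

```python
from types import SimpleNamespace


def delete_scene_indexes_except(video, keep_ids):
    """Delete every scene index whose ID is not in keep_ids; return deleted IDs."""
    keep = set(keep_ids)
    deleted = []
    # Copy the listing first so deletions do not affect iteration.
    for idx in list(video.list_scene_index()):
        if idx.id not in keep:
            video.delete_scene_index(idx.id)
            deleted.append(idx.id)
    return deleted


class _DemoVideo:
    """Stand-in for a real video object (hypothetical, for illustration)."""

    def __init__(self, ids):
        self._ids = list(ids)

    def list_scene_index(self):
        return [SimpleNamespace(id=i) for i in self._ids]

    def delete_scene_index(self, scene_index_id):
        self._ids.remove(scene_index_id)


video = _DemoVideo(["idx-a", "idx-b", "idx-c"])
deleted = delete_scene_indexes_except(video, keep_ids=["idx-a"])
```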

***

## Async Processing with Callbacks

For long videos, use callbacks to get notified when indexing completes:

<CodeGroup>
  ```python Python theme={null}
  scene_index_id = video.index_scenes(
      prompt="Describe the scene",
      callback_url="https://your-backend.com/webhooks/index-complete"
  )
  ```

  ```javascript Node.js theme={null}
  const sceneIndexId = await video.indexScenes({
      prompt: "Describe the scene",
      callbackUrl: "https://your-backend.com/webhooks/index-complete"
  });
  ```
</CodeGroup>
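
On the receiving side, the callback URL needs an endpoint that accepts the notification. The sketch below uses only the Python standard library and assumes the callback arrives as an HTTP POST with a JSON body. That delivery format and the payload fields are not described on this page, so the handler simply logs whatever it receives. `parse_callback`, `IndexCallbackHandler`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(body: bytes) -> dict:
    """Decode a JSON callback body; return {} if it is not a JSON object."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {}
    return payload if isinstance(payload, dict) else {}


class IndexCallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = parse_callback(self.rfile.read(length))
        # Payload fields are an unknown here, so just log what arrives.
        print("Indexing callback received:", payload)
        self.send_response(200 if payload else 400)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), IndexCallbackHandler).serve_forever()
```

Call `serve()` to start listening locally. In production you would typically put this behind HTTPS and your existing web framework.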

***

## What You Can Build

<CardGroup cols={2}>
  <Card title="Keyword Search Compilation" icon="search" href="/examples-and-tutorials/video-rag/keyword-search">
    Index spoken words, then search to create highlight reels
  </Card>

  <Card title="Multimodal Search" icon="brain" href="/examples-and-tutorials/video-rag/multimodal-search">
    Combine spoken word and scene indexes for powerful queries
  </Card>

  <Card title="Baby Crib Monitoring" icon="baby" href="/examples-and-tutorials/live-intelligence/baby-crib-monitoring">
    Scene indexing enables real-time infant monitoring
  </Card>

  <Card title="Intrusion Detection" icon="shield" href="/examples-and-tutorials/live-intelligence/intrusion-detection">
    Index camera feeds to detect unauthorized access
  </Card>
</CardGroup>

***

## Next Steps

<CardGroup cols={2}>
  <Card icon="list" title="Multimodal Indexing" href="/pages/understand/indexing-pipelines/multimodal-indexing">
    Extraction strategies for video + audio
  </Card>

  <Card icon="search" title="Multiple Indexes" href="/pages/understand/indexing-pipelines/multiple-indexes">
    Layer different perspectives on the same media
  </Card>
</CardGroup>
