VideoDB Documentation

Index Scenes

The versatility of scene indexing opens up a world of possibilities for finding visual information in videos. Vision models now enable useful extraction of information from videos that you can easily index using VideoDB.
Now, you can easily build RAG for queries like:
"Show me where birds are flying near a castle."
"Show me when the person took out the gun."
"Show me where people are running towards the sea."

index_id = video.index_scenes()
In just one command, the index_scenes function can index visual information in your video.

Optional Parameters

The index_scenes() function accepts a few optional parameters.
You can choose from different extraction algorithms to select scenes and frames.
Additionally, you can supply a prompt that a vision model uses to describe those scenes and frames.

from videodb import IndexType
from videodb import SceneExtractionType

index_id = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 10, "select_frames": ["first"]},
    prompt="describe the image in 100 words",
)

# Wait for indexing to finish
scene_index = video.get_scene_index(index_id)

# Search your video with index_id
# Default case: search across all indexes
res = video.search(
    query="religious gathering",
    index_type=IndexType.scene,
    index_id=index_id,
)

extraction_type - Choose the scene extraction algorithm.
extraction_config - Configuration for the chosen extraction algorithm.
prompt - Prompt used to describe each scene in text.
callback_url - Notification URL called when the indexing job is done.

Let’s look at each parameter in detail:

extraction_type and extraction_config

Visually, a video is a series of images on a timeline. A 60 fps video, for instance, shows 60 frames per second and feels higher in quality than a 30 fps video. The extraction_type parameter lets you experiment with scene extraction algorithms, which in turn choose the frames that are most relevant for describing the video. Check out the documentation for details.
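The extraction configuration can be sketched as plain dictionaries. The time-based keys below come from the snippet above; the shot-based keys ("threshold", "frame_count") are assumptions for illustration only.

```python
# Time-based sampling: cut a scene every N seconds and keep the
# chosen frame(s) from each scene (keys taken from the snippet above).
time_based_config = {"time": 10, "select_frames": ["first"]}

# Shot-based sampling: detect visual shot changes instead of fixed
# intervals. These key names are assumptions, not confirmed SDK keys.
shot_based_config = {"threshold": 20, "frame_count": 1}

# These would be passed as, e.g.:
# video.index_scenes(extraction_type=SceneExtractionType.shot_based,
#                    extraction_config=shot_based_config)
```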


prompt

The prompt tells the vision model the context and the nature of the output you want. For example, if you are interested in identifying running activity, you could use the following prompt to describe each scene:
“Describe clearly what is happening in the video. Add running_detected if you see a person running.”
If you are interested in experimenting with your own models and prompts, check out the documentation.
Scene indexing is currently best suited for semantic search, so design your prompts to output well-written prose that can be indexed for semantic search.
😎 Soon we are going to support JSON and SQL data extraction and indexing.
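As a small sketch, a prompt like the running example above can be composed with a helper function; the helper itself is hypothetical and not part of the SDK.

```python
def activity_prompt(marker: str = "running_detected") -> str:
    """Build a scene prompt that asks the model to flag an activity."""
    return (
        "Describe clearly what is happening in the video. "
        f"Add {marker} if you see a person running."
    )

# The prompt would then be passed to indexing, e.g.:
# index_id = video.index_scenes(prompt=activity_prompt())
```

Keeping the marker token configurable makes it easy to scan scene descriptions for matches later.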


callback_url

URL that receives a notification when the scene indexing job is completed.
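A minimal sketch of passing a callback, assuming a hypothetical webhook endpoint; the keyword names mirror the parameters listed above.

```python
# Hypothetical webhook endpoint: VideoDB notifies this URL once the
# scene indexing job completes, so you don't have to poll.
index_kwargs = {
    "prompt": "describe the image in 100 words",
    "callback_url": "https://example.com/webhooks/scene-index",
}

# index_id = video.index_scenes(**index_kwargs)
```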

Managing Indexes

List all the scene indexes created for a video.
scene_indexes = video.list_scene_index()
This function returns a list of available scene indexes, each with an id, name, and status.

Get a Specific Index
scene_index = video.get_scene_index(scene_index_id)
This function returns a list of indexed scenes, each with a start, end, and description.
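Each indexed scene is assumed to carry start, end, and description fields, so iterating the result might look like this sketch (format_scenes is a hypothetical helper, not an SDK function):

```python
def format_scenes(scenes):
    """Render indexed scenes as human-readable timeline lines."""
    return [
        f'{s["start"]:.1f}s-{s["end"]:.1f}s: {s["description"]}'
        for s in scenes
    ]

# scene_index = video.get_scene_index(scene_index_id)
# for line in format_scenes(scene_index):
#     print(line)
```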

Delete an index
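A sketch of index deletion; the delete_scene_index method name is an assumption based on the SDK's naming pattern, so verify it against the SDK reference.

```python
def delete_index(video, scene_index_id):
    # Assumed SDK call: removes the scene index only;
    # the video itself is untouched.
    video.delete_scene_index(scene_index_id)
```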

Create multiple indexes for one video

You can create multiple scene indexes for a video.
Use these indexes to search different layers of topics and concepts within a single video.
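For example, you might keep one index per concept layer. The prompts below are illustrative, and the commented calls are a sketch of how the separate indexes would be created and searched.

```python
# One prompt per "layer" of meaning you want to search separately.
layer_prompts = {
    "activities": "Describe any physical activity happening in the scene.",
    "objects": "List the prominent objects visible in the scene.",
}

# index_ids = {name: video.index_scenes(prompt=p)
#              for name, p in layer_prompts.items()}
#
# Later, search a specific layer by its index id:
# video.search(query="people dancing", index_id=index_ids["activities"])
```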

Deep Dive

If you want to bring your own scene descriptions and annotations, explore the Pipeline.
Experiment with extraction algorithms, prompts, and search strategies.
Check out our open and flexible APIs.
In our upcoming releases, we are introducing integration with numerous metadata stores. This will allow you to extract not just plain text, but also JSON or tabular information from videos. You can then index this data using the database of your choice. Currently, we only offer vector indexing, but we plan to expand this to include more methods for finding information, such as filters, searches, and queries.
Additionally, we will introduce integration with vision models of your choice.
