VideoDB Documentation


Index Scenes

Scene indexing lets you find visual information in videos. Vision models can now extract useful information from video, and you can easily index that information using VideoDB.

Now you can easily build RAG for queries like:

Show me where birds are flying near a castle.

Show me when the person took out the gun.

Show where people are running towards the sea.

index_id = video.index_scenes()

With a single call, the index_scenes() function indexes the visual information in your video.

Optional Parameters

The index_scenes() function accepts a few optional parameters.

You can use different extraction algorithms to select scenes and frames.

Additionally, you can use prompts to describe these scenes and frames using a vision model.

👉🏼 Read more about the Scene and Frame objects.

from videodb import IndexType, SceneExtractionType

# Placeholder value: replace with the endpoint that should be notified
callback_url = "https://example.com/callback"

index_id = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 10, "select_frames": ["first"]},
    prompt="describe the image in 100 words",
    callback_url=callback_url,
)

# Wait for indexing to finish
scene_index = video.get_scene_index(index_id)
print(scene_index)

# Search your video with a specific index_id.
# Default case (no index_id): search all indexes.
res = video.search(
    query="religious gathering",
    index_type=IndexType.scene,
    index_id=index_id,
)
res.play()

extraction_type - The scene extraction algorithm to use.

extraction_config - Configuration for the scene extraction algorithm.

prompt - Prompt used to describe each scene in text.

callback_url - URL to notify when the job is done.

Let’s look at each parameter in detail:

extraction_type

Visually, a video is a series of images on a timeline. A 60 fps video, for instance, shows 60 frames per second and feels higher in quality than a 30 fps video. Use the extraction_type parameter to experiment with scene extraction algorithms and, in turn, to choose the frames that are relevant for describing details. Check out Scene Extraction Algorithms for details.
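To make the time-based configuration concrete, here is a minimal sketch of how a video could be cut into fixed-length scenes, with frames picked from each one. It is an illustration only, not VideoDB's implementation: the helper name time_based_scenes, the "middle" and "last" options, and the exact frame timestamps are assumptions. Only the "time" and "select_frames": ['first'] settings come from the example on this page.

```python
def time_based_scenes(duration, time=10, select_frames=("first",)):
    """Split [0, duration) seconds into scenes of `time` seconds each.

    Returns a list of dicts with the scene start, end, and the
    timestamps of the frames picked from that scene.
    """
    if time <= 0:
        raise ValueError("time must be positive")
    scenes = []
    start = 0.0
    while start < duration:
        end = min(start + time, duration)
        # Hypothetical frame positions inside the scene
        picks = {"first": start, "middle": (start + end) / 2, "last": end}
        scenes.append({
            "start": start,
            "end": end,
            "frames": [picks[name] for name in select_frames],
        })
        start = end
    return scenes


# A 25-second video cut every 10 seconds, keeping the first frame of each scene
scenes = time_based_scenes(duration=25, time=10, select_frames=["first"])
for scene in scenes:
    print(scene)
```

A shorter "time" value produces more scenes and therefore more descriptions to index, at the cost of more vision model calls.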



prompt

The prompt tells the vision model the context and the kind of output you want. For example, if you are interested in identifying running activity, you can use the following prompt to describe each scene:

“Describe clearly what is happening in the video. Add running_detected if you see a person running.”
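One way to use such a marker afterwards is to filter the indexed scenes by it. The sketch below assumes each scene is available as a dictionary with start, end, and description keys, and it runs on made-up sample data. The helper name scenes_with_marker is hypothetical.

```python
def scenes_with_marker(scenes, marker="running_detected"):
    """Return (start, end) pairs for scenes whose description contains the marker."""
    return [
        (scene["start"], scene["end"])
        for scene in scenes
        if marker in scene.get("description", "")
    ]


# Made-up sample data in the assumed shape
sample_scenes = [
    {"start": 0, "end": 10, "description": "A man walks along the beach."},
    {"start": 10, "end": 20, "description": "A man sprints to the water. running_detected"},
]

print(scenes_with_marker(sample_scenes))  # [(10, 20)]
```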

If you are interested in experimenting with your own model and prompts, check out Advanced Visual Search Pipelines.

Currently, the scene index is best suited for semantic search, so design your prompts to produce well-written prose that can be indexed for semantic search.

😎 Soon we are going to support JSON and SQL data extraction and indexing.

callback_url

URL that receives a notification when the scene indexing process is completed.
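As a rough sketch of the receiving side, the endpoint behind callback_url could look like the following. It assumes the notification arrives as an HTTP POST with a JSON body. That is an assumption, and the payload fields are not documented here, so the handler only logs whatever it receives. The names parse_notification, CallbackHandler, and serve, and port 8000, are illustrative choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body: bytes) -> dict:
    """Decode a JSON request body; return {} if it is empty or not a JSON object."""
    try:
        data = json.loads(body.decode("utf-8") or "{}")
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client
        length = int(self.headers.get("Content-Length", 0))
        payload = parse_notification(self.rfile.read(length))
        print("Indexing notification received:", payload)
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling serve() starts the receiver; the publicly reachable address of that server is what you would pass as callback_url.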

👀 Check out the callback details here.

Managing Indexes

List all the scene indexes created for a video.

scene_indexes = video.list_scene_index()

This function returns a list of available scene indexes with id, name, and status.
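If you are not using a callback, the status in this listing can be polled until indexing finishes. The helper below is a sketch under stated assumptions: each entry is treated as a dictionary with "id" and "status" keys, and "done" is a guessed status value, so adjust both to what your listing actually contains. The helper name wait_for_index is hypothetical.

```python
import time


def wait_for_index(list_indexes, index_id, done_status="done", interval=5.0, timeout=600.0):
    """Poll list_indexes() until the entry for index_id reports done_status.

    Returns the matching entry, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        for entry in list_indexes():
            if entry.get("id") == index_id and entry.get("status") == done_status:
                return entry
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index {index_id} not ready after {timeout} seconds")
        time.sleep(interval)


def fake_listing():
    """Stand-in for video.list_scene_index so the sketch runs offline."""
    return [{"id": "idx-1", "name": "demo", "status": "done"}]


print(wait_for_index(fake_listing, "idx-1", interval=0))
```

With a real video object, the call would be wait_for_index(video.list_scene_index, index_id).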

Get Specific Index

scene_index = video.get_scene_index(scene_index_id)

This function returns a list of indexed scenes with start, end, and description.

Delete an index

video.delete_scene_index(index_id)

Create multiple indexes for one video

You can create multiple scene indexes for a video.

Use these indexes to search different layers of topics and concepts within a single video.
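A minimal sketch of this idea: create one index per prompt and keep the returned ids by topic. The helper name index_by_topic, the topic names, and the prompts are illustrative. StubVideo is a stand-in so the example runs without a VideoDB connection, and a real video object would take its place.

```python
def index_by_topic(video, prompts):
    """Create one scene index per prompt and return {topic: index_id}."""
    return {topic: video.index_scenes(prompt=prompt) for topic, prompt in prompts.items()}


class StubVideo:
    """Stand-in for a VideoDB video object so the sketch runs offline."""

    def __init__(self):
        self.count = 0

    def index_scenes(self, prompt):
        # Pretend each call creates a new index and returns its id
        self.count += 1
        return f"index-{self.count}"


prompts = {
    "actions": "Describe what the people in the scene are doing.",
    "setting": "Describe the location, weather, and time of day.",
}

index_ids = index_by_topic(StubVideo(), prompts)
print(index_ids)
```

Each id can then be passed as index_id to video.search() to query one layer at a time.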



Deep Dive

Check out Scene Extraction Algorithms.

Pass your metadata for search filters.

If you want to bring your own scene descriptions and annotations, explore the Custom Annotations pipeline.

Experiment with extraction algorithms, prompts, and search using the Playground for Scene Extractions.

Check out our open and flexible Advanced Visual Search Pipelines.

In our upcoming releases, we are introducing integration with numerous metadata stores. This will allow you to extract not just plain text, but also JSON or tabular information from videos. You can then index this data using the database of your choice. Currently, we only offer vector indexing, but we plan to expand this to include more methods for finding information, such as filters, searches, and queries.

Additionally, we will introduce integration with vision models of your choice.
