videodb
VideoDB Documentation
Visual Search and Indexing

Playground for Scene Extractions


Playground: Extract Scenes without Indexing

The number of scenes needed to describe a video varies with the type of video, so it is worth checking before you index. For instance, a podcast with two hosts tends to be less dynamic than a sports video.
To extract scenes from a video without indexing them, use the video.extract_scenes() function.
This lets you experiment with scene extraction and find a configuration that suits your video.

extract_scenes()

This function accepts an extraction_type and an extraction_config, and returns a SceneCollection object that holds information about the extracted scenes.
from videodb import SceneExtractionType

scene_collection = video.extract_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 30, "select_frames": ["middle"]},
)
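
To get a feel for how the "time" setting affects the result before running an extraction, a rough estimate can help. The sketch below assumes that time-based extraction cuts the video into consecutive windows of "time" seconds, with a shorter final window if needed; that behavior is an assumption, not something this page confirms. The helper estimate_scene_count and the 600-second duration are hypothetical and not part of the VideoDB SDK.

```python
import math


def estimate_scene_count(duration_seconds: float, time: float) -> int:
    """Estimate how many scenes a time-based extraction would produce.

    Assumption: the video is cut into consecutive windows of `time` seconds,
    with a shorter final window when the duration is not an exact multiple.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if time <= 0:
        raise ValueError("time must be positive")
    return math.ceil(duration_seconds / time)


# Compare a few candidate window sizes for a hypothetical 10-minute video
for window in (10, 30, 60):
    count = estimate_scene_count(600, window)
    print(f"time={window}s -> about {count} scenes")
```

A smaller window yields more scenes, which may suit fast-moving footage; a larger one may be enough for static content such as a podcast.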

SceneCollection: Viewing, Inspecting, and Deleting Scenes

For every scene extraction pipeline that you run on a video, a SceneCollection object is created.
You can use the following functions to view, inspect, and delete your SceneCollections.

list_scene_collection
scene_collections = video.list_scene_collection()

for scene_collection in scene_collections:
    print("Scene Collection ID:", scene_collection["scene_collection_id"])


Get SceneCollection by ID
scene_collection = video.get_scene_collection("scene_collection_id")

Inspecting SceneCollection
print("This is scene collection id", scene_collection.id)
print("This is scene collection config", scene_collection.config)


Playground: Play with Prompt

Before finalizing your prompt, experiment with different ones to see how search performs for your use cases.
The right prompt helps surface information that matches your domain knowledge and experience. To support this, describe(prompt=...) is available at both the Frame and Scene level.
# Describe a frame image using a vision LLM
frame.describe(
    prompt=str,
)

# Run the vision model at the scene level,
# primarily for activity detection
scene.describe(
    prompt=str,
)
Start with only a few scenes: run your prompt on them, refine it, and test search again after indexing. The first example tries a prompt at the frame level.

# Get the scenes from the collection
scenes = scene_collection.scenes

# Example prompt; replace it with one suited to your use case
prompt = "Describe the main activity in this image"

# Iterate through only the first 5 scenes
for scene in scenes[:5]:
    print(f"Scene Duration {scene.start}-{scene.end}")
    # Iterate through each frame in the scene
    for frame in scene.frames:
        print(f"Frame at {frame.frame_time} {frame.url}")
        frame.describe(prompt=prompt)

Experiment with prompt at scene level

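# Get the scenes from the collection
scenes = scene_collection.scenes

# Example prompt; replace it with one suited to your use case
prompt = "Describe the main activity in this scene"

# Iterate through the first 5 scenes
for scene in scenes[:5]:
    scene.describe(prompt=prompt)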
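
When weighing several prompts, it can help to collect their outputs side by side. The following is a minimal sketch, assuming that describe(prompt=...) returns the generated description as text; this page does not state the return value, so check it against the SDK. The helper compare_prompts and the StubScene class are hypothetical: the stub stands in for a real scene so the sketch runs without an API key.

```python
from typing import Dict, List, Sequence


def compare_prompts(
    scenes: Sequence, prompts: Sequence[str], limit: int = 5
) -> Dict[str, List[str]]:
    """Run each prompt on the first `limit` scenes and group results by prompt."""
    results: Dict[str, List[str]] = {}
    for prompt in prompts:
        # Assumption: describe() returns the generated description as text
        results[prompt] = [scene.describe(prompt=prompt) for scene in scenes[:limit]]
    return results


class StubScene:
    """Hypothetical stand-in for a scene; echoes the prompt instead of calling a model."""

    def __init__(self, start: float, end: float) -> None:
        self.start = start
        self.end = end

    def describe(self, prompt: str) -> str:
        return f"[{self.start}-{self.end}] {prompt}"


stub_scenes = [StubScene(0, 30), StubScene(30, 60)]
candidate_prompts = ["List the objects", "Name the activity"]

for prompt, descriptions in compare_prompts(stub_scenes, candidate_prompts).items():
    print(prompt)
    for description in descriptions:
        print("  ", description)
```

With real scenes, pass scene_collection.scenes in place of stub_scenes and keep limit small so that each experiment stays quick.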

Index and search scenes

# Give a name to your index for reference
index_id = video.index_scenes(scenes=scenes, name="")


from videodb import IndexType

# Search using the index_id
res = video.search(
    query="religious gathering",
    index_type=IndexType.scene,
    index_id=index_id,
)

res.play()
