Indexing and search involve trade-offs among speed, quality, and cost. This guide shows which settings drive each one so you can tune them for your priorities.

Quick Reference

| Factor | Lower Cost | Higher Quality |
|---|---|---|
| Frame count | 1 frame/scene | 3-5 frames/scene |
| Scene interval | 30+ seconds | 5-10 seconds |
| Extraction type | Time-based | Shot-based |
| Prompt complexity | Simple | Detailed |

Indexing Cost Factors

Frame Count

More frames = more vision API calls = higher cost.
from videodb import SceneExtractionType

# Economical: 1 frame per scene
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 30, "frame_count": 1},
    prompt="Describe the scene"
)

# Premium: 5 frames per scene
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 10, "frame_count": 5},
    prompt="Describe the activity and how it progresses"
)

| Config | Scenes/hour | Frames/hour | Relative Cost |
|---|---|---|---|
| 30s, 1 frame | 120 | 120 | 1x |
| 10s, 1 frame | 360 | 360 | 3x |
| 10s, 3 frames | 360 | 1,080 | 9x |
| 5s, 5 frames | 720 | 3,600 | 30x |
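
The arithmetic behind this table can be reproduced with a short helper. This is a minimal sketch that assumes cost scales linearly with the number of frames sent to the vision model, which is what the table implies. `frames_per_hour` and `relative_cost` are illustrative names and not part of the VideoDB SDK.

```python
def frames_per_hour(interval_s: int, frame_count: int) -> tuple[int, int]:
    """Return (scenes_per_hour, frames_per_hour) for a time-based config."""
    if interval_s <= 0 or frame_count <= 0:
        raise ValueError("interval_s and frame_count must be positive")
    scenes = 3600 // interval_s
    return scenes, scenes * frame_count


def relative_cost(interval_s: int, frame_count: int,
                  base_interval_s: int = 30, base_frame_count: int = 1) -> float:
    """Cost relative to a baseline config, assuming cost is proportional to frames."""
    _, frames = frames_per_hour(interval_s, frame_count)
    _, base_frames = frames_per_hour(base_interval_s, base_frame_count)
    return frames / base_frames


# Print one row per configuration from the table.
for interval_s, frame_count in [(30, 1), (10, 1), (10, 3), (5, 5)]:
    scenes, frames = frames_per_hour(interval_s, frame_count)
    factor = relative_cost(interval_s, frame_count)
    print(f"{interval_s}s, {frame_count} frame(s): {scenes} scenes, {frames} frames, {factor:g}x")
```

Actual billing may not be strictly linear in frame count (prompt length and response size can matter too), so treat the factor as a rough comparison between configurations.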

Scene Interval

Shorter intervals = more scenes = more processing.
# Long interval: fewer scenes, lower cost
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 60},  # 1 scene per minute
    prompt="Describe the main topic"
)

# Short interval: more scenes, higher cost
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5},  # 12 scenes per minute
    prompt="Describe what's happening"
)

Indexing Latency

Time-Based vs Shot-Based

| Method | Latency | Best For |
|---|---|---|
| Time-based | Predictable | Long-form content |
| Shot-based | Variable | Edited content |

Shot-based detection adds processing overhead but produces more natural boundaries.

Async Processing

For long videos, use callbacks to avoid blocking:
# Non-blocking with callback
scene_index_id = video.index_scenes(
    prompt="Describe the scene",
    callback_url="https://your-backend.com/webhooks/index-complete"
)

# Check status later
scene_index = video.get_scene_index(scene_index_id)
print(scene_index.status)  # "processing" or "completed"
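
If a webhook endpoint is not available, polling with backoff is a workable alternative. The sketch below is deliberately SDK-agnostic: `fetch_status` is a hypothetical callable that you supply (for example, a lambda wrapping the status check shown above), and the `"completed"` string is taken from the comment in that snippet. The delay and timeout values are arbitrary starting points.

```python
import time
from typing import Callable


def wait_for_index(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    initial_delay_s: float = 2.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_status() with exponential backoff until it returns "completed".

    Returns True on completion, or False once the sleep budget (timeout_s)
    has been used up. The budget counts time spent sleeping, not time spent
    inside fetch_status().
    """
    waited = 0.0
    delay = initial_delay_s
    while True:
        if fetch_status() == "completed":
            return True
        if waited >= timeout_s:
            return False
        step = min(delay, timeout_s - waited)  # never sleep past the budget
        sleep(step)
        waited += step
        delay = min(delay * 2, max_delay_s)  # double the delay, up to a cap
```

A call might look like `wait_for_index(lambda: video.get_scene_index(scene_index_id).status)`, assuming the status attribute behaves as in the snippet above. The `sleep` parameter exists so the loop can be tested without real delays.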

Search Latency

Single Video vs Collection

| Scope | Latency | Use Case |
|---|---|---|
| video.search() | Faster | Known video |
| coll.search() | Slower | Discovery |

Reduce Search Space

Metadata filters narrow the set of videos a query has to scan, which reduces search latency:
# Slower: search everything
results = coll.search("product demo")

# Faster: filtered search
results = coll.search(
    query="product demo",
    filter=[{"category": "marketing"}]
)

Limit Results

# Return fewer results for faster response
results = video.search(
    query="highlights",
    result_threshold=5  # Only top 5
)

Optimization Strategies

Tiered Indexing

Create multiple indexes at different quality levels:
# Fast, cheap index for preview/discovery
preview_index = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 60, "frame_count": 1},
    prompt="What is the main topic?",
    name="preview"
)

# Detailed index for deep search
detailed_index = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 10, "frame_count": 3},
    prompt="Describe people, objects, and actions in detail",
    name="detailed"
)
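
One way to use the two tiers at query time is to try the cheap index first and fall back to the detailed one only when it returns too few hits. The sketch below is a design illustration rather than SDK code: `search_preview` and `search_detailed` are hypothetical callables that you would wire to searches against each index, and `min_hits` is an arbitrary threshold.

```python
from typing import Callable, Sequence


def tiered_search(
    query: str,
    search_preview: Callable[[str], Sequence],
    search_detailed: Callable[[str], Sequence],
    min_hits: int = 3,
) -> tuple[str, Sequence]:
    """Return (tier_name, hits), using the detailed tier only as a fallback."""
    hits = search_preview(query)
    if len(hits) >= min_hits:
        return "preview", hits
    # Too few preview hits: pay for the slower, richer index.
    return "detailed", search_detailed(query)
```

Returning the tier name alongside the hits makes it easy to log how often the fallback fires, which indicates whether the preview prompt is too coarse.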

Index on Demand

Only index what you need:
# Index spoken content immediately (cheap)
video.index_spoken_words()

# Index visuals only when needed (expensive)
if user_requests_visual_search:
    video.index_scenes(prompt="Describe the scene")
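
When several requests can trigger visual indexing for the same video, a small guard avoids paying for the same index twice. This is a minimal sketch, assuming an in-memory dictionary is sufficient (a real service would likely persist this state) and that `index_fn` is a hypothetical callable that starts indexing for a video ID and returns an index ID.

```python
from typing import Callable, Dict


class LazySceneIndexer:
    """Trigger scene indexing at most once per video ID."""

    def __init__(self, index_fn: Callable[[str], str]):
        self._index_fn = index_fn
        self._index_ids: Dict[str, str] = {}

    def ensure_indexed(self, video_id: str) -> str:
        """Return the cached index ID, indexing first if none exists yet."""
        if video_id not in self._index_ids:
            self._index_ids[video_id] = self._index_fn(video_id)
        return self._index_ids[video_id]
```

The class is not thread-safe as written. Under concurrent requests, guard `ensure_indexed` with a lock or move the bookkeeping to a database.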

Batch Processing

For large libraries, process during off-peak hours:
# Queue videos for background indexing
for video in coll.get_videos():
    video.index_scenes(
        prompt="Describe the scene",
        callback_url="https://your-backend.com/webhooks"
    )
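
The loop above submits every video at once. To confine the work to off-peak hours and avoid flooding the queue, a scheduler can check the clock and submit in small batches. The helpers below are a sketch under stated assumptions: the 22:00 to 06:00 window and the batch size are arbitrary, and neither function is part of the VideoDB SDK.

```python
from datetime import datetime, time as dtime
from itertools import islice
from typing import Iterable, Iterator, List


def is_off_peak(now: datetime, start: dtime = dtime(22, 0), end: dtime = dtime(6, 0)) -> bool:
    """True if `now` falls in [start, end), including windows that wrap past midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk
```

A scheduler could then iterate `for batch in chunked(coll.get_videos(), 10):`, submit each batch only while `is_off_peak(datetime.now())` is true, and pause between batches.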

Cost Estimation

Factors

  1. Video duration - Longer = more scenes
  2. Frame extraction - More frames = more API calls
  3. Scene interval - Shorter = more scenes
  4. Collection size - More videos = more processing

Example Calculations

| Video | Config | Scenes | Frames | Cost Factor |
|---|---|---|---|---|
| 1 hour | 60s, 1 frame | 60 | 60 | 1x |
| 1 hour | 30s, 1 frame | 120 | 120 | 2x |
| 1 hour | 10s, 3 frames | 360 | 1,080 | 18x |
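
The same calculation extends to a whole library with mixed durations. The sketch below assumes time-based extraction, that a trailing partial interval still counts as one scene (hence the ceiling), and that cost is proportional to total frames relative to the 60s, 1 frame baseline used in the table. `estimate_library` is an illustrative name, not an SDK function.

```python
import math
from typing import Dict, Iterable


def estimate_library(
    durations_s: Iterable[float],
    interval_s: int,
    frame_count: int,
    base_interval_s: int = 60,
    base_frame_count: int = 1,
) -> Dict[str, float]:
    """Estimate total scenes, frames, and a cost factor for a set of videos."""
    if min(interval_s, frame_count, base_interval_s, base_frame_count) <= 0:
        raise ValueError("intervals and frame counts must be positive")
    durations = list(durations_s)
    scenes = sum(math.ceil(d / interval_s) for d in durations)
    frames = scenes * frame_count
    base_frames = sum(math.ceil(d / base_interval_s) for d in durations) * base_frame_count
    cost_factor = frames / base_frames if base_frames else 0.0
    return {"scenes": scenes, "frames": frames, "cost_factor": cost_factor}


# One 1-hour video at 10s, 3 frames: matches the last row of the table.
print(estimate_library([3600], interval_s=10, frame_count=3))
```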

Recommendations by Use Case

Media Archive (Cost-Sensitive)

video.index_spoken_words()  # Always index speech (cheap)

video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 60, "frame_count": 1},
    prompt="Describe the main content"
)

Security Monitoring (Quality-First)

video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5, "frame_count": 3},
    prompt="Identify all people, vehicles, and suspicious activity"
)

E-commerce (Balanced)

video.index_scenes(
    extraction_type=SceneExtractionType.shot_based,
    extraction_config={"threshold": 20, "frame_count": 2},
    prompt="Identify products, brands, and pricing"
)

Monitoring

Track indexing and search metrics:
import time

# Measure indexing time (perf_counter is monotonic, so it suits interval timing)
start = time.perf_counter()
video.index_scenes(prompt="Describe the scene")
indexing_time = time.perf_counter() - start
print(f"Indexing took {indexing_time:.2f}s")

# Measure search time
start = time.perf_counter()
results = video.search("query")
search_time = time.perf_counter() - start
print(f"Search took {search_time:.3f}s, found {len(results.get_shots())} results")
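
For ongoing monitoring, single measurements are less useful than aggregates. The sketch below wraps the same timing pattern in a reusable context manager and keeps per-label samples in memory. `LatencyTracker` is a hypothetical helper, and a production setup would more likely export these values to a metrics system.

```python
import statistics
import time
from contextlib import contextmanager
from typing import Dict, List


class LatencyTracker:
    """Collect wall-clock durations per label and summarize them."""

    def __init__(self) -> None:
        self.samples: Dict[str, List[float]] = {}

    @contextmanager
    def measure(self, label: str):
        """Time the enclosed block, recording the duration even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.samples.setdefault(label, []).append(elapsed)

    def summary(self, label: str) -> Dict[str, float]:
        """Return count, mean, and max (in seconds) for one label."""
        values = self.samples[label]
        return {
            "count": len(values),
            "mean_s": statistics.fmean(values),
            "max_s": max(values),
        }
```

Usage follows the snippet above: wrap a call in `with tracker.measure("search"):` and read `tracker.summary("search")` after a batch of queries.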
