Ask questions in plain English. VideoDB uses semantic search to understand intent and return relevant video segments.

Quick Example

import videodb

conn = videodb.connect()
coll = conn.get_collection()
video = coll.get_video("m-xxx")

# Natural language query
results = video.search("when does the speaker discuss climate change?")

# Play matching segments
results.play()

How It Works

  1. Query Understanding - Your query is transformed into a vector embedding
  2. Similarity Matching - Embeddings are compared against indexed content
  3. Relevance Scoring - Results are ranked by semantic similarity
  4. Timestamp Retrieval - Matching segments are returned with timestamps
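The four steps above can be sketched with a toy cosine-similarity ranking. This is illustrative only — VideoDB performs embedding and matching server-side, and the vectors and timestamps below are made-up values:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "index": segment embeddings with timestamps (hypothetical values)
index = [
    {"start": 0.0,  "end": 12.5, "vec": [0.9, 0.1, 0.0]},
    {"start": 12.5, "end": 30.0, "vec": [0.2, 0.8, 0.1]},
    {"start": 30.0, "end": 45.0, "vec": [0.1, 0.2, 0.9]},
]

query_vec = [0.85, 0.15, 0.05]  # step 1: query transformed into an embedding

# Steps 2-4: compare embeddings, rank by similarity, return timestamps
ranked = sorted(index, key=lambda s: cosine(s["vec"], query_vec), reverse=True)
best = ranked[0]
print((best["start"], best["end"]))  # most relevant segment: (0.0, 12.5)
```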

Search Types

Semantic Search (Default)

Understands meaning and intent, not just keywords.
from videodb import SearchType

# Semantic search (default)
results = video.search("How do I fix a leaky faucet?")

# Explicit semantic search
results = video.search(
    query="How do I fix a leaky faucet?",
    search_type=SearchType.semantic
)
Best for:
  • Questions (“What causes…?”, “How do you…?”)
  • Conceptual queries (“explain the theory”)
  • Fuzzy matching (“something about cars”)
Keyword Search

Exact substring matching. Finds literal occurrences of the query terms.
from videodb import SearchType

results = video.search(
    query="API",
    search_type=SearchType.keyword
)
Best for:
  • Technical terms
  • Proper nouns
  • Exact phrases

Comparison

Feature     Semantic Search              Keyword Search
Query       Natural language             Exact terms
Matching    By meaning                   By substring
Example     "How to repair pipes?"       "plumbing repair"
Scope       Single video or collection   Single video only

Index Types

Specify which index to search.
from videodb import IndexType

# Search spoken content (default)
results = video.search(
    query="discusses machine learning",
    index_type=IndexType.spoken_word
)

# Search visual content
results = video.search(
    query="person running through a park",
    index_type=IndexType.scene
)

# Search specific scene index
results = video.search(
    query="red car",
    index_type=IndexType.scene,
    index_id="scene-index-xxx"
)

Tuning Results

Result Threshold

Limit the number of results returned:
results = video.search(
    query="funny moments",
    result_threshold=10  # Return top 10 matches
)

Score Threshold

Filter out low-relevance results:
results = video.search(
    query="product demo",
    score_threshold=0.3  # Only results with score >= 0.3
)

Dynamic Score Percentage

Adaptive filtering based on score distribution:
results = video.search(
    query="key insights",
    dynamic_score_percentage=50  # Keep top 50% of score range
)
The dynamic threshold is calculated from the score range (max_score − min_score), with the percentage expressed as a fraction:
dynamic_threshold = max_score - (max_score - min_score) × (percentage / 100)
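A worked example of the formula (a plain-Python sketch, not the VideoDB implementation): with scores ranging from 0.9 down to 0.5 and dynamic_score_percentage=50, only results scoring at least 0.7 survive.

```python
def dynamic_threshold(scores, percentage):
    # Keep results within the top `percentage` of the score range
    max_score, min_score = max(scores), min(scores)
    score_range = max_score - min_score
    # round to suppress floating-point noise
    return round(max_score - score_range * percentage / 100, 6)

scores = [0.9, 0.8, 0.7, 0.6, 0.5]
threshold = dynamic_threshold(scores, 50)   # 0.9 - 0.4 * 0.5 = 0.7
kept = [s for s in scores if s >= threshold]
print(threshold, kept)  # 0.7 [0.9, 0.8, 0.7]
```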

Search Parameters Reference

Parameter                  Type        Default       Description
query                      str         required      Natural language query
search_type                SearchType  semantic      semantic or keyword
index_type                 IndexType   spoken_word   spoken_word or scene
result_threshold           int         5             Max results to return
score_threshold            float       0.2           Minimum relevance score
dynamic_score_percentage   float       20            Adaptive score filter
index_id                   str         None          Specific scene index ID
[Diagram: layers and parameters of semantic search — queries are transformed into vectors and matched against indexed content]

Query Examples

Spoken Content Queries

# Question format
video.search("What are the main benefits of solar energy?")

# Topic lookup
video.search("discussion about renewable energy")

# Speaker search
video.search("when the CEO mentions revenue")

Visual Content Queries

# Object detection
video.search("red car on the highway", index_type=IndexType.scene)

# Action detection
video.search("person running", index_type=IndexType.scene)

# Scene description
video.search("sunset over the ocean", index_type=IndexType.scene)

Multimodal Queries

Combine spoken and visual search for precise results:
from videodb import IndexType

# Search spoken content
spoken_results = video.search(
    query="talks about the solar system",
    index_type=IndexType.spoken_word
)

# Search visual content
visual_results = video.search(
    query="shows planets or galaxies",
    index_type=IndexType.scene
)

# Find intersection (both conditions met)
spoken_times = [(s.start, s.end) for s in spoken_results.get_shots()]
visual_times = [(s.start, s.end) for s in visual_results.get_shots()]
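One way to finish the intersection step is plain interval-overlap logic (a sketch, not a VideoDB API; the timestamp values below are hypothetical):

```python
def overlap(a, b):
    # Overlapping portion of two (start, end) intervals, or None
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None

# Hypothetical timestamps from the spoken and visual searches
spoken_times = [(10.0, 25.0), (40.0, 55.0)]
visual_times = [(20.0, 30.0), (50.0, 60.0)]

# Windows where the topic is both spoken about and shown on screen
both = [o for s in spoken_times for v in visual_times
        if (o := overlap(s, v))]
print(both)  # [(20.0, 25.0), (50.0, 55.0)]
```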
