How Accurate is Your Search?

Introduction

When you index your data and retrieve it with certain parameters, how do you measure the effectiveness of your search? This is where search evaluation comes in. By using test data, queries, and their results, you can assess the performance of indexes, search parameters, and other related factors. This evaluation helps you understand how well your search system is working and identify areas for improvement.

Example

To keep it super simple, let's use a 30-second video of a countdown timer.
Since there is no audio in this video, we can imagine the information in the video being indexed as documents of the form "timestamp + some textual information describing the visuals".
We can use the structure:
timestamp: (start, end), description: "string"
So, if we use the index_scenes function:
At (1, 2) - 29 seconds is displayed
At (2, 3) - 28 seconds is displayed
...
This continues until:
At (29, 30) - 1 second is displayed
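As a rough sketch of how such documents could be produced with the SDK (the upload URL and prompt below are placeholders, and the exact index_scenes options are covered in the Video Indexing Guide):

```python
from videodb import connect

conn = connect()  # assumes VIDEO_DB_API_KEY is set in the environment
video = conn.upload(url="https://example.com/countdown_30s.mp4")  # placeholder URL

# Index the visual content; each scene becomes a "timestamp + description" document
index_id = video.index_scenes(prompt="Describe the number displayed on screen")
scene_index = video.get_scene_index(index_id)

for scene in scene_index:
    print(scene["start"], scene["end"], scene["description"])
```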

Ground Truth

It is the ideal expected result. To evaluate the performance of search, we need some test queries and the expected results for each.
Let's say for the query "Six" the expected result documents are at the following timestamps:
We will call this list of timestamps our ground truth for the query "Six."
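For illustration only, the ground truth can be represented as a simple list of (start, end) timestamps. The values below are hypothetical placeholders; the real list comes from inspecting the video yourself:

```python
query = "Six"
# Hypothetical ground truth: timestamps where a number containing "six"
# is shown in the 30-second countdown. Replace with the timestamps you
# actually observe in your video.
ground_truth = [(4, 5), (14, 15), (24, 25)]
```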

Evaluation Metrics

To evaluate the effectiveness of our search functionality, we can experiment with our query "Six" using various search parameters. 📊
The search results can be categorized as follows:
Retrieved Documents 🔍:
Retrieved Relevant Documents: Matches our ground truth ✅
Retrieved Irrelevant Documents: Don't match our ground truth ❌
Non-Retrieved Documents 🚫:
Non-Retrieved Relevant Documents: In our ground truth but not in results 😕
Non-Retrieved Irrelevant Documents: Neither in ground truth nor results 👍
We can further classify these categories in terms of search accuracy:
True Positives (TP) 🎯: Retrieved Relevant Documents
We wanted them, and we got them 🙌
False Positives (FP) 🎭: Retrieved Irrelevant Documents
We didn't want them, but we got them 🤔
False Negatives (FN) 😢: Non-Retrieved Relevant Documents
We wanted them, but we didn't get them 😓
True Negatives (TN) 🚫: Non-Retrieved Irrelevant Documents
We didn't want them, and we didn't get them 👌
💡 This classification helps us assess the precision and recall of our search algorithm, enabling further optimization.
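To make these categories concrete, here is a minimal sketch (plain Python, not part of the VideoDB SDK) that splits retrieved timestamps into the four buckets, given the full set of indexed documents and the ground truth:

```python
def classify(retrieved, ground_truth, all_documents):
    """Split document timestamps into TP / FP / FN / TN buckets."""
    retrieved, ground_truth = set(retrieved), set(ground_truth)
    tp = retrieved & ground_truth                       # retrieved relevant
    fp = retrieved - ground_truth                       # retrieved irrelevant
    fn = ground_truth - retrieved                       # non-retrieved relevant
    tn = set(all_documents) - retrieved - ground_truth  # non-retrieved irrelevant
    return tp, fp, fn, tn
```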

Accuracy

Accuracy measures how well our search algorithm retrieves the required documents while excluding irrelevant ones. It can be calculated as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
In other words, accuracy is the ratio of correctly classified documents (both retrieved relevant and non-retrieved irrelevant) to the total number of documents. 📊
To get a more comprehensive evaluation of search performance, it's crucial to consider other metrics such as precision, recall, and F1-score in addition to accuracy. 💡🔬
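As a quick sketch, accuracy can be computed directly from the sizes of the four buckets returned by the classify helper above:

```python
def accuracy(tp, fp, fn, tn):
    """Correctly classified documents / total documents."""
    return (len(tp) + len(tn)) / (len(tp) + len(tn) + len(fp) + len(fn))
```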

Precision and Recall

Precision is the percentage of relevant retrieved documents out of all retrieved documents: Precision = TP / (TP + FP). It answers the question: "Of the documents our search returned, how many were actually relevant?"
Recall is the percentage of relevant documents that were successfully retrieved: Recall = TP / (TP + FN). It addresses the question: "Out of all the relevant documents, how many did our search find?" 🔍
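In the same spirit, a minimal sketch of precision, recall, and the F1-score that combines them, again using the bucket sets from the classify helper:

```python
def precision(tp, fp):
    return len(tp) / (len(tp) + len(fp)) if (tp or fp) else 0.0

def recall(tp, fn):
    return len(tp) / (len(tp) + len(fn)) if (tp or fn) else 0.0

def f1_score(p, r):
    return 2 * p * r / (p + r) if (p + r) else 0.0
```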

The Precision-Recall Trade-off

These metrics often have an inverse relationship, leading to a trade-off:
Recall 📈:
Measures the model's ability to find all relevant cases in a dataset.
Increases or remains constant as more documents are retrieved.
Never decreases with an increase in retrieved documents.
Precision 📉:
Refers to the proportion of correct positive identifications.
Typically decreases as more documents are retrieved.
Drops due to increased likelihood of including false positives.

Search in VideoDB

Let's understand the search interface provided by VideoDB and measure the results with the metrics above.
The search function performs a search on video content with various customizable parameters:
query: The search query string.
search_type: Determines the search method.
SearchType.semantic (default): Best suited for question-answering style queries; works across thousands of videos in a collection. Check out the Semantic Search page for a detailed understanding.
SearchType.keyword: Matches exact occurrences where the given query is present as a sub-string; it works on a single video only and returns all matching documents.
index_type: Specifies the index to search:
IndexType.spoken_word (default): Searches spoken content.
IndexType.scene: Searches visual content.
result_threshold: Initial filter for top N matching documents (default: 5).
score_threshold: Absolute threshold filter for relevance scores (default: 0.2).
dynamic_score_percentage: Adaptive filtering mechanism:
Useful when there is a significant gap between top results and tail results after score_threshold filter. Retains top x% of the score range.
Calculation: dynamic_threshold = max_score - (range * dynamic_score_percentage)
default: 20%
This interface allows for flexible and precise searching of video content, with options to fine-tune result filtering based on relevance scores and dynamic thresholds.
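Here is a minimal sketch of a call using these parameters. The video object comes from the earlier upload/indexing step, and the shot attribute names (start, end, text) follow the SDK's search results as used in the quickstart:

```python
from videodb import SearchType, IndexType

results = video.search(
    query="Six",
    search_type=SearchType.semantic,  # SearchType.keyword matches exact sub-strings (single video)
    index_type=IndexType.scene,       # or IndexType.spoken_word (default)
    result_threshold=5,
    score_threshold=0.2,
    dynamic_score_percentage=20,      # retain the top 20% of the score range (see above)
)

for shot in results.get_shots():
    print(shot.start, shot.end, shot.text)
```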

Experiment

Follow this notebook to explore experiments on fine-tuning search results and gain a deeper understanding of the methods involved.
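For a rough local reproduction of such an experiment (not the notebook's exact code), the sketch below reuses the classify/precision/recall helpers and the hypothetical ground_truth defined earlier, and rounds shot boundaries to whole seconds as a simple heuristic:

```python
from videodb import SearchType, IndexType

all_documents = [(s, s + 1) for s in range(1, 30)]  # the 29 countdown documents

for search_type in (SearchType.semantic, SearchType.keyword):
    results = video.search(query="six", search_type=search_type, index_type=IndexType.scene)
    retrieved = [(round(shot.start), round(shot.end)) for shot in results.get_shots()]
    tp, fp, fn, tn = classify(retrieved, ground_truth, all_documents)
    p, r = precision(tp, fp), recall(tp, fn)
    print(search_type, "precision:", p, "recall:", r)
```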

Here’s a basic outcome of the default settings for both search types on the query "six" for the above video:
1. Semantic Search Default:

2. Keyword Search:

Outcome

As you can see, keyword search is best suited for queries like "teen" and "six." However, if the queries are in natural language, such as "find me a 6", then semantic search is more appropriate.
Keyword search would struggle to find relevant results for such natural language queries.

Search + LLM

For complex queries like "Find me all the numbers greater than six", a basic search will not work effectively, since it merely matches the query against documents in vector space and returns the closest matches.
In such cases, you can apply a loose filter to get all the documents that match the query. However, you will need to add an additional layer of intelligence using a Large Language Model (LLM). The matched documents can then be passed to the LLM to curate a response that accurately answers the query.
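A sketch of this pattern: a loose score_threshold keeps many candidate documents, which are then passed to an LLM. The OpenAI client and model name below are illustrative choices, not something VideoDB prescribes; any LLM client works.

```python
from openai import OpenAI  # illustrative LLM choice
from videodb import SearchType, IndexType

# Loose filter: keep more candidate documents than usual
results = video.search(
    query="numbers greater than six",
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    score_threshold=0.1,
    result_threshold=50,
)
context = "\n".join(f"({shot.start}, {shot.end}): {shot.text}" for shot in results.get_shots())

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Answer using only the provided video documents."},
        {"role": "user", "content": f"Documents:\n{context}\n\nQuery: Find me all the numbers greater than six"},
    ],
)
print(response.choices[0].message.content)
```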
