
The Idea

Content moderation can be complex, often requiring multiple tools, manual timestamp extraction, and intricate integration work. Setting up these pipelines involves managing credentials, parsing responses, and stitching everything together.

VideoDB simplifies this into a “Prompt-and-Filter” workflow using native AI scene indexing. No external credentials needed. No manual timestamp extraction. Just prompt engineering that creates structured labels (CONTENT_SAFE/CONTENT_UNSAFE) from unstructured video content.

The innovation is simple: instead of generic video descriptions, we give the AI a strict moderation role with deterministic output labels. This turns unstructured video into structured, searchable data that can be filtered instantly. Want stricter moderation? Update the prompt. Need different criteria? Change a few lines. It’s content moderation reimagined for the prompt engineering era.
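To see why the deterministic labels matter, here is a minimal sketch using made-up scene records (not real VideoDB output): once every description starts with a fixed prefix, filtering reduces to a plain string check.

# Illustrative only: hypothetical scene records shaped like the ones
# VideoDB returns later in this guide (Step 3)
scenes = [
    {"start": 0.0, "end": 5.0, "description": "CONTENT_SAFE: title cards"},
    {"start": 20.0, "end": 25.0, "description": "CONTENT_UNSAFE: implied violence"},
]

# With a fixed label prefix, moderation becomes a simple string check
safe_scenes = [s for s in scenes if s["description"].startswith("CONTENT_SAFE")]
print(safe_scenes)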

Setup

Install Dependencies

pip install videodb

Connect to VideoDB

Get your API key from the VideoDB Console. It’s free for the first 50 uploads; no credit card is required.
from videodb import connect

conn = connect(api_key="YOUR_API_KEY")
coll = conn.get_collection()
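A small variation on the same setup: if you would rather not hardcode the key in a notebook, read it from an environment variable yourself. The variable name below is just a convention for this sketch, not a VideoDB requirement.

import os
from videodb import connect

# Assumes the key was exported beforehand, e.g. `export VIDEODB_API_KEY=...`
conn = connect(api_key=os.environ["VIDEODB_API_KEY"])
coll = conn.get_collection()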

Implementation

Step 1: Upload Video

We’ll use a Breaking Bad clip with mixed content to test the moderation workflow.
# Upload video from YouTube
video = coll.upload(url='https://www.youtube.com/watch?v=Xa7UaHgOGfM')
print(f"Uploaded Video ID: {video.id}")

# Preview the video
video.play()
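If your footage lives on disk rather than on YouTube, the upload call can also take a local file. A quick sketch; the path is a placeholder, and it’s worth confirming the `file_path` parameter against the VideoDB docs for your SDK version.

# Upload a local file instead of a YouTube URL (placeholder path)
local_video = coll.upload(file_path="clips/sample_episode.mp4")
print(f"Uploaded Video ID: {local_video.id}")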

Step 2: Index Scenes with Moderator Prompt

This is the core innovation. We give the AI a strict role as a Content Moderator with deterministic output labels. The prompt instructs the AI to analyze visual content for specific inappropriate elements and respond with either CONTENT_SAFE or CONTENT_UNSAFE. This structured labeling transforms unstructured video into searchable, filterable data.
from videodb import SceneExtractionType

# Define strict moderation instructions
moderation_prompt = """
You are a Content Moderator. Analyze the visual content for inappropriate elements:
1. Violence (fighting, hitting, shooting)
2. Weapons (guns, knives)
3. Blood or Gore
4. Drug use
5. Sexual content

If ANY of these are detected, your response must start with:
"CONTENT_UNSAFE: [brief reason]"

If the scene is clean and safe, your response must start with:
"CONTENT_SAFE: [brief description]"
"""

# Index video in 5-second chunks for granular moderation
scene_index_id = video.index_scenes(
    prompt=moderation_prompt,
    extraction_type=SceneExtractionType.time_based,
    extraction_config={
        "time": 5,       # Check every 5 seconds
        "frame_count": 3 # Analyze 3 frames per segment
    }
)

print("Moderation indexing complete!")
Why this works: By enforcing strict output formats (CONTENT_SAFE/CONTENT_UNSAFE), we can use simple keyword searches to filter content. No complex parsing or external API integration needed.
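For example, tightening the criteria is just a prompt edit followed by a re-index. The extra categories below are illustrative additions of our own, not anything VideoDB requires.

# Stricter variant: extend the category list, then re-index with the new prompt
stricter_prompt = moderation_prompt.replace(
    "5. Sexual content",
    "5. Sexual content\n6. Smoking or alcohol\n7. Disturbing or graphic imagery"
)

strict_index_id = video.index_scenes(
    prompt=stricter_prompt,
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5, "frame_count": 3}
)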

Step 3: Review Scene Indexes (Optional)

Want to see what the AI detected? Check the scene indexes to understand how content was labeled.
# Fetch scene indexes
scene_indexes = video.get_scene_index(scene_index_id)

# Print first 5 scenes
for i, scene in enumerate(scene_indexes[:5]):
    print(f"Scene {i+1}:")
    print(f"  Time: {scene['start']}s - {scene['end']}s")
    print(f"  Status: {scene['description']}\n")
Sample output:
Scene 1:
  Time: 0.0s - 5.005s
  Status: CONTENT_SAFE: The images display title cards with a smoky background...

Scene 2:
  Time: 5.005s - 10.01s
  Status: CONTENT_SAFE: Two men in indoor settings, no inappropriate elements...

Scene 5:
  Time: 20.02s - 25.025s
  Status: CONTENT_UNSAFE: Implied physical confrontation and aggressive interaction...
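Because every description begins with one of the two labels, you can also summarize the whole clip in a couple of lines of plain Python (a convenience sketch, not part of the VideoDB API):

# Tally how the moderator prompt labeled the clip
safe_count = sum(1 for s in scene_indexes if s["description"].startswith("CONTENT_SAFE"))
unsafe_count = sum(1 for s in scene_indexes if s["description"].startswith("CONTENT_UNSAFE"))
print(f"Safe segments: {safe_count}, flagged segments: {unsafe_count}")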

Step 4: Filter for Safe Content

Now the magic happens. Because we structured the AI’s responses with CONTENT_SAFE labels, we can use a simple keyword search to filter the entire video.
from videodb import SearchType, IndexType

# Search for safe content using keyword search
safe_results = video.search(
    query="CONTENT_SAFE",
    search_type=SearchType.keyword,
    index_type=IndexType.scene,
    scene_index_id=scene_index_id
)

# Get the safe segments
safe_shots = safe_results.get_shots()
print(f"Found {len(safe_shots)} safe segments")

# Inspect first few segments
for i, shot in enumerate(safe_shots[:3]):
    print(f"Segment {i+1} ({shot.start}s - {shot.end}s): {shot.text}")

Step 5: Play the Clean Version

The filtered results come with a stream URL ready for instant playback. No rendering, no waiting.
# Get the stream URL
print("Stream URL:", safe_results.stream_url)

# Play in notebook/browser
safe_results.play()
The result is a clean version of the video with the inappropriate content removed.
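If you ever need manual control over exactly which ranges make the final cut, for example after reviewing the flagged segments, a stream can also be built from explicit timestamps. A hedged sketch, assuming generate_stream accepts a timeline of [start, end] pairs in seconds (worth confirming against the VideoDB docs):

from videodb import play_stream

# Hand-picked safe ranges in seconds (placeholder values)
manual_timeline = [[0, 20], [25, 60]]

stream_url = video.generate_stream(timeline=manual_timeline)
play_stream(stream_url)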

What You Get

  • No external APIs or credentials required
  • Full control over moderation criteria through prompts
  • Instant filtering without video re-encoding
  • Granular 5-second scene analysis
  • Real-time playback of cleaned content
  • Customizable: change prompt to adjust moderation standards instantly

Perfect For

  • Educational platforms serving minor audiences
  • Family-friendly streaming services
  • Corporate training content libraries
  • Social media platforms with content policies
  • Broadcasting companies creating TV-safe edits
  • User-generated content platforms with safety requirements

The Result

What used to require multiple integrations, manual timestamp extraction, and complex video editing pipelines now works with just prompt engineering. Change your moderation criteria by updating the prompt and re-indexing; no video re-encoding or editing pipeline is needed. Pure simplicity powered by VideoDB’s native AI indexing.

Explore Full Notebook

Open the complete implementation in Google Colab with detailed explanations and working code.