AWS Rekognition and VideoDB - Effortlessly Remove Inappropriate Content from Video

Video content moderation is complex. Computer vision and AI have reduced the manual burden of detecting problematic content, but modifying video streams instantly and intelligently to remove it remains challenging.
VideoDB's next-gen technology lets application developers skip the tedious steps of conventional video editing, saving significant cost and time.
The key components of this blog are:
AWS Rekognition API: provides the Content Moderation features used to analyze the video.
VideoDB: stores videos in a database tailored for video content, which makes it possible to generate dynamic streams with the unsafe content removed instantly.

Setup

Install required packages:
boto3: AWS SDK for Python, used here for Rekognition and S3
pytube: Download YouTube videos
videodb: VideoDB Python SDK
!pip install boto3 pytube requests videodb

Helper Functions

We've prepared a handy download_video_yt function to download YouTube videos in high resolution.
import pytube
import os
import time

# Downloads a YouTube video to a local file and returns the file path
def download_video_yt(youtube_url, output_file="video.mp4"):
    youtube_object = pytube.YouTube(youtube_url)
    # Highest-resolution progressive stream (video and audio in one file)
    video_stream = youtube_object.streams.get_highest_resolution()
    video_stream.download(filename=output_file)
    print(f"Downloaded video to: {output_file}")
    return output_file

Downloading Media

Let's take this clip from the TV show "Breaking Bad".
video_url_yt = "https://www.youtube.com/watch?v=Xa7UaHgOGfM"
video_output = "video_breaking_bad.mp4"
download_video_yt(video_url_yt, video_output)

Configuration

We need to configure both AWS and VideoDB.

AWS Configuration

AWS Rekognition is a paid API, so choose your YouTube video carefully: a longer video may incur additional charges.
AWS credentials: aws_access_key_id, aws_secret_access_key and the AWS region
Ensure your AWS user has the necessary policies attached:
AmazonRekognitionFullAccess and AmazonS3FullAccess
import boto3

# Credentials and region are read from environment variables
aws_access_key_id = os.environ.get("AWS_KEY_ID", "")
aws_secret_access_key = os.environ.get("AWS_KEY_SECRET", "")
region_name = os.environ.get("AWS_REGION", "")

# S3 bucket names are globally unique; change this to a name you own
bucket_name = "videorekog"
rekognition_client = boto3.client(
    "rekognition",
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
s3 = boto3.client(
    "s3",
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
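An empty credential only surfaces later as an authentication error from AWS, so it can help to fail fast. The sketch below is one way to do that; `require_env` is a hypothetical helper written for this tutorial, not part of boto3, and the demo values are placeholders.

```python
import os


def require_env(names, environ=None):
    """Return {name: value} for the given variables, or raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Demo with an explicit mapping instead of the real environment (placeholder values)
demo_env = {"AWS_KEY_ID": "AKIA-EXAMPLE", "AWS_KEY_SECRET": "example-secret"}
settings = require_env(["AWS_KEY_ID", "AWS_KEY_SECRET"], environ=demo_env)
print(sorted(settings))
```

In a notebook you would call it without the `environ` argument, before creating the boto3 clients.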

Analyzing the Video for Inappropriate Content

Using the Rekognition API

Upload the video to an S3 bucket and start content moderation using the Rekognition API.
# Start an asynchronous content moderation job for a video stored in S3
def start_content_moderation(video_path, bucket_name):
    response = rekognition_client.start_content_moderation(
        Video={"S3Object": {"Bucket": bucket_name, "Name": video_path}}
    )
    return response["JobId"]


# Poll the job until it finishes, then collect moderation labels from every result page
def get_content_moderation(job_id):
    wait_for = 5
    pagination_finished = False
    next_token = ""
    response = {
        "ModerationLabels": []
    }
    while not pagination_finished:
        # Pass NextToken only when fetching a follow-up page
        kwargs = {"NextToken": next_token} if next_token else {}
        moderation_res = rekognition_client.get_content_moderation(JobId=job_id, **kwargs)
        status = moderation_res["JobStatus"]
        next_token = moderation_res.get("NextToken", "")
        if status == "IN_PROGRESS":
            time.sleep(wait_for)
        elif status == "SUCCEEDED":
            response["ModerationLabels"].extend(moderation_res["ModerationLabels"])
            if not next_token:
                pagination_finished = True
        else:
            # Any other status (for example FAILED) would otherwise loop forever
            raise RuntimeError(f"Content moderation job ended with status {status}")
    return response

# Upload the target video to the S3 bucket
# Outside us-east-1, create_bucket also needs CreateBucketConfiguration={"LocationConstraint": region_name}
s3.create_bucket(Bucket=bucket_name)
s3.upload_file(video_output, bucket_name, video_output)

# Start content moderation using the Rekognition API
job_id = start_content_moderation(video_output, bucket_name)
print(job_id)
moderation_res = get_content_moderation(job_id)
print(moderation_res)
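Each entry in `ModerationLabels` carries a `Timestamp` in milliseconds and a nested `ModerationLabel` with a `Name`, a `ParentName` and a `Confidence` score. If you only want to cut moments the model is reasonably sure about, you can filter on confidence before building timestamps. Here is a minimal sketch: `flagged_seconds` and its `min_confidence` default are our own illustration, and `sample_response` is hand-written data, not real Rekognition output.

```python
def flagged_seconds(moderation_response, min_confidence=50.0):
    """Return sorted, de-duplicated timestamps (whole seconds) of labels at or above min_confidence."""
    seconds = set()
    for item in moderation_response.get("ModerationLabels", []):
        label = item.get("ModerationLabel", {})
        if label.get("Confidence", 0.0) >= min_confidence:
            # Timestamp is reported in milliseconds
            seconds.add(round(item["Timestamp"] / 1000))
    return sorted(seconds)


# Hand-written sample in the shape of a GetContentModeration response (values are made up)
sample_response = {
    "ModerationLabels": [
        {"Timestamp": 4000, "ModerationLabel": {"Name": "Violence", "ParentName": "", "Confidence": 91.2}},
        {"Timestamp": 4600, "ModerationLabel": {"Name": "Graphic Violence", "ParentName": "Violence", "Confidence": 88.0}},
        {"Timestamp": 9000, "ModerationLabel": {"Name": "Smoking", "ParentName": "Tobacco", "Confidence": 35.5}},
        {"Timestamp": 12000, "ModerationLabel": {"Name": "Violence", "ParentName": "", "Confidence": 76.4}},
    ]
}

print(flagged_seconds(sample_response))  # [4, 5, 12]
```

The low-confidence label at 9 seconds is dropped, so that moment stays in the final video. Lower `min_confidence` if you would rather be strict.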


Preparing Clips Timestamps

The Rekognition API flags moments in a video that are inappropriate, unwanted, or offensive by returning their timestamps. Our objective is to consolidate timestamps that belong to the same sequence.
Although AWS Rekognition offers a method for this, we will use a more straightforward strategy.
If the gap between two consecutive timestamps is no more than a chosen threshold, they are combined into a single continuous scene.
To ensure thorough coverage, we also add padding on both the left and right sides of each scene.
Finally, we take the complement of these inappropriate clips over the video to get the appropriate, safe clips. Feel free to adjust the threshold and padding settings to optimize the results.
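timestamps = []
threshold = 1
padding = 1

# Rekognition reports timestamps in milliseconds; convert them to whole seconds
for label in moderation_res["ModerationLabels"]:
    timestamp = label["Timestamp"] / 1000
    timestamps.append(round(timestamp))

# Returns the safe clips as [start, end] pairs: the gaps between groups of flagged timestamps.
# Pass video_length (in seconds) to also keep the footage after the last flagged group.
def merge_timestamps(numbers, threshold, padding, video_length=None):
    grouped_numbers = []
    end_last_segment = 0
    if not numbers:
        return grouped_numbers if video_length is None else [[0, video_length]]
    current_group = [numbers[0]]

    # the trailing None closes the final group
    for number in numbers[1:] + [None]:
        # if the timestamp is within threshold of the previous one, keep it in the same group
        if number is not None and number - current_group[-1] <= threshold:
            current_group.append(number)
        # else emit the safe clip between the previous group's end and this group's start
        else:
            start_segment = max(0, current_group[0] - padding)
            end_segment = current_group[-1] + padding
            if start_segment > end_last_segment:
                grouped_numbers.append([end_last_segment, start_segment])
            end_last_segment = max(end_last_segment, end_segment)
            current_group = [number]

    # keep the tail of the video after the last flagged group when the length is known
    if video_length is not None and end_last_segment < video_length:
        grouped_numbers.append([end_last_segment, video_length])
    return grouped_numbers

shots = merge_timestamps(timestamps, threshold=threshold, padding=padding)
print(shots)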
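To see the strategy on small numbers, it can help to split it into two steps: first group and pad the flagged timestamps, then take the complement over the video. The sketch below is an illustration with made-up timestamps and an assumed video length of 40 seconds; `flagged_intervals` and `safe_intervals` are hypothetical helper names, not part of any SDK.

```python
def flagged_intervals(timestamps, threshold=1, padding=1):
    """Group nearby timestamps into padded [start, end] intervals (seconds)."""
    intervals = []
    for t in sorted(timestamps):
        if intervals and t - intervals[-1][1] <= threshold:
            intervals[-1][1] = t       # close enough: extend the current group
        else:
            intervals.append([t, t])   # too far: start a new group
    return [[max(0, start - padding), end + padding] for start, end in intervals]


def safe_intervals(flagged, video_length):
    """Complement of the flagged intervals within [0, video_length]."""
    safe, cursor = [], 0
    for start, end in flagged:
        if start > cursor:
            safe.append([cursor, start])
        cursor = max(cursor, end)      # also copes with overlapping padded groups
    if cursor < video_length:
        safe.append([cursor, video_length])
    return safe


flagged = flagged_intervals([4, 5, 12, 13, 14, 30], threshold=1, padding=1)
print(flagged)                      # [[3, 6], [11, 15], [29, 31]]
print(safe_intervals(flagged, 40))  # [[0, 3], [6, 11], [15, 29], [31, 40]]
```

A larger threshold merges more timestamps into one scene, and a larger padding trims more footage around each scene.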

Removing inappropriate content from video using VideoDB

The idea behind VideoDB is straightforward: it is a database built specifically for videos. Just as you upload tables or JSON data to a standard database, you can upload your videos to VideoDB.
You can also retrieve your videos through queries, much like accessing regular data from a database.
VideoDB lets you create clips from your videos instantly ⚡️, just like retrieving text data from a database.
Next, we will compile a master clip made of the smaller segments that contain only appropriate content (i.e., we filter out the clips with inappropriate content identified earlier).
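import videodb
from videodb import play_stream

# connect to VideoDB; the API key is read from the VIDEO_DB_API_KEY environment variable
conn = videodb.connect(api_key=os.environ.get("VIDEO_DB_API_KEY", ""))

# upload the video to VideoDB
video_url_yt = "https://www.youtube.com/watch?v=Xa7UaHgOGfM"
video = conn.upload(url=video_url_yt)

# generate a stream link for the safe clips by passing them as the timeline
stream_link = video.generate_stream(timeline=shots)

# play the video in browser/notebook
play_stream(stream_link)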
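Before generating a stream, it can be worth sanity-checking the clip list, since a negative start or an empty clip usually points to a threshold or padding value that needs adjusting. The check below is a sketch for this tutorial's use case, where the safe clips should be ordered and non-overlapping. `validate_timeline` is a hypothetical helper, and these rules are our own assumption rather than a documented VideoDB requirement.

```python
def validate_timeline(timeline):
    """Raise ValueError unless clips are [start, end] pairs in order, non-overlapping, with start < end."""
    previous_end = 0
    for index, (start, end) in enumerate(timeline):
        if start < 0 or end <= start:
            raise ValueError(f"clip {index} has an invalid range: [{start}, {end}]")
        if start < previous_end:
            raise ValueError(f"clip {index} overlaps or precedes the previous clip")
        previous_end = end
    return timeline


checked = validate_timeline([[0, 3], [6, 11], [15, 29]])
print(checked)
```

Because it returns the timeline unchanged, you can wrap the value you pass as `timeline` in a call to it.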
You can use the stream link with any video player (video.js, JW Player, or a simple HTML player) to embed it in your application. In upcoming versions, we are launching our own video player for your convenience, with built-in chapter and search features. Please check to see it in action. We would love to know your thoughts at 👉 contact@videodb.io.

Keep building awesome stuff 🤘