AWS Rekognition and VideoDB - Effortlessly Remove Inappropriate Content from Video

Video content moderation is complex. While today's computer vision and AI have alleviated the manual burden, modifying video streams instantly and intelligently for moderated content remains challenging.

VideoDB's next-gen tech lets application developers leave behind the tedious processes of conventional video editing and save significant cost and time.

Key components of this blog are:

AWS Rekognition API: Leveraging Content Moderation features for video analysis.

VideoDB: Storing videos in a database tailored for video content, thus enabling the generation of dynamic streams by instantly removing unsafe content.

Setup

Install required packages:

boto3: Use AWS S3 and AWS Rekognition

pytube: Download YouTube videos

videodb: VideoDB Python SDK

!pip install boto3 pytube requests videodb

Helper Functions

We've prepared a handy download_video_yt function to download YouTube videos in high resolution.

import os
import time

import pytube

# Downloads a YouTube video
def download_video_yt(youtube_url, output_file="video.mp4"):
    youtube_object = pytube.YouTube(youtube_url)
    video_stream = youtube_object.streams.get_highest_resolution()
    video_stream.download(filename=output_file)
    print(f"Downloaded video to: {output_file}")
    return output_file

Downloading Media

Let’s take this 10-minute video from the TV show "Breaking Bad".

video_url_yt = "https://www.youtube.com/watch?v=Xa7UaHgOGfM"

video_output = "video_breaking_bad.mp4"

download_video_yt(video_url_yt, video_output)

Configuration

We need to configure both AWS and VideoDB.

AWS Configuration

AWS Rekognition is a paid API, so please select your YouTube video carefully. Choosing a larger video may incur additional charges.

AWS secrets: aws_access_key_id, aws_secret_access_key, and region_name

Ensure your AWS user has access to the necessary policies: AmazonRekognitionFullAccess and AmazonS3FullAccess

import boto3

aws_access_key_id = os.environ.get("AWS_KEY_ID", "")
aws_secret_access_key = os.environ.get("AWS_KEY_SECRET", "")
region_name = os.environ.get("AWS_REIGON", "")
bucket_name = "videorekog"

rekognition_client = boto3.client(
    "rekognition",
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)

s3 = boto3.client(
    "s3",
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)
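
Because the configuration falls back to empty strings when an environment variable is unset, a quick pre-flight check can surface a missing setting before any paid API call is made. This is a minimal sketch that assumes the same environment variable names read in the configuration code; the helper name `missing_aws_settings` is hypothetical.

```python
import os

# Environment variable names as read in the configuration code (assumed unchanged).
REQUIRED_ENV_VARS = ("AWS_KEY_ID", "AWS_KEY_SECRET", "AWS_REIGON")


def missing_aws_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_aws_settings()
if missing:
    print("Missing AWS settings:", ", ".join(missing))
else:
    print("All AWS settings are present.")
```

Running this before creating the clients turns a confusing authentication error later on into an explicit message up front.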

Analyzing the Video for Inappropriate Content

Using the Rekognition API

Upload a video to an S3 bucket and start content moderation using StartContentModeration.

# Start a content moderation job for a video stored in S3
def start_content_moderation(video_path, bucket_name):
    response = rekognition_client.start_content_moderation(
        Video={"S3Object": {"Bucket": bucket_name, "Name": video_path}}
    )
    return response["JobId"]

# Poll the job until it finishes, then collect every page of moderation labels
def get_content_moderation(job_id):
    wait_for = 5
    next_token = ""
    response = {"ModerationLabels": []}
    while True:
        # Pass NextToken only when fetching a follow-up page
        kwargs = {"NextToken": next_token} if next_token else {}
        moderation_res = rekognition_client.get_content_moderation(JobId=job_id, **kwargs)
        status = moderation_res["JobStatus"]
        if status == "IN_PROGRESS":
            time.sleep(wait_for)
        elif status == "SUCCEEDED":
            response["ModerationLabels"].extend(moderation_res["ModerationLabels"])
            next_token = moderation_res.get("NextToken", "")
            if not next_token:
                return response
        else:
            # Stop instead of polling forever when the job fails
            message = moderation_res.get("StatusMessage", "")
            raise RuntimeError(f"Content moderation job ended with status {status}: {message}")

# Upload the target video to an S3 bucket
# Outside us-east-1, create_bucket also needs
# CreateBucketConfiguration={"LocationConstraint": region_name}
s3.create_bucket(Bucket=bucket_name)
s3.upload_file(video_output, bucket_name, video_output)

# Start content moderation using the Rekognition API
job_id = start_content_moderation(video_output, bucket_name)
print(job_id)

moderation_res = get_content_moderation(job_id)
print(moderation_res)
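
Each entry in `ModerationLabels` carries a `Timestamp` in milliseconds and a nested `ModerationLabel` with a `Name` and a `Confidence` score. As a rough sketch, the results could be filtered by confidence before any clips are built. The `flagged_seconds` helper, its `min_confidence` cutoff, and the sample data below are illustrative assumptions, not output from a real job.

```python
def flagged_seconds(moderation_labels, min_confidence=60.0):
    """Return sorted, de-duplicated timestamps (whole seconds) of labels
    whose confidence is at least min_confidence."""
    seconds = set()
    for item in moderation_labels:
        label = item["ModerationLabel"]
        if label["Confidence"] >= min_confidence:
            # Rekognition reports timestamps in milliseconds
            seconds.add(round(item["Timestamp"] / 1000))
    return sorted(seconds)


# Made-up sample shaped like the ModerationLabels list of a response.
sample = [
    {"Timestamp": 1200, "ModerationLabel": {"Name": "Violence", "Confidence": 91.5}},
    {"Timestamp": 1700, "ModerationLabel": {"Name": "Violence", "Confidence": 88.0}},
    {"Timestamp": 9400, "ModerationLabel": {"Name": "Alcohol", "Confidence": 42.3}},
    {"Timestamp": 15000, "ModerationLabel": {"Name": "Violence", "Confidence": 75.0}},
]

print(flagged_seconds(sample))  # [1, 2, 15]
```

Raising the cutoff keeps more of the original footage at the risk of letting borderline moments through; lowering it does the opposite.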

Preparing Clip Timestamps

The Rekognition API flags moments in a video that are inappropriate, unwanted, or offensive by providing timestamps. Our objective is to consolidate timestamps that belong to the same sequence.

Though the AWS Segment API offers a method for this, we will employ a more straightforward strategy.

If the gap between two consecutive timestamps is no more than a specific threshold, they will be combined into a single continuous scene.

To ensure thorough coverage, we'll also add padding on both the left and right sides of each scene.

Then, we take the complement of the inappropriate clips over the video to get the appropriate, safe clips. Feel free to adjust the threshold and padding settings to optimize the results.

timestamps = []
threshold = 1  # max gap (seconds) between timestamps in the same scene
padding = 1    # seconds added on each side of a flagged scene

# Convert Rekognition timestamps (milliseconds) to whole seconds
for label in moderation_res["ModerationLabels"]:
    timestamp = label["Timestamp"] / 1000
    timestamps.append(round(timestamp))

def merge_timestamps(numbers, threshold, padding):
    grouped_numbers = []
    end_last_segment = 0
    current_group = [numbers[0]]
    for i in range(1, len(numbers)):
        # If the timestamp is within the threshold of the previous one, keep it in the same group
        if numbers[i] - numbers[i-1] <= threshold:
            current_group.append(numbers[i])
        # Otherwise close the group and record the safe clip that precedes it
        else:
            start_segment = current_group[0] - padding
            end_segment = current_group[-1] + padding
            if start_segment > end_last_segment:
                grouped_numbers.append([end_last_segment, start_segment])
            end_last_segment = end_segment
            current_group = [numbers[i]]
    # Record the safe clip that precedes the final flagged group;
    # footage after the last flagged moment is not included
    start_segment = current_group[0] - padding
    if start_segment > end_last_segment:
        grouped_numbers.append([end_last_segment, start_segment])
    return grouped_numbers

shots = merge_timestamps(timestamps, threshold=threshold, padding=padding)
print(shots)
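
To make the merge, pad, and complement steps concrete, here is a self-contained sketch that also keeps the footage after the last flagged moment. It assumes the total video duration is known in seconds. The `safe_segments` name and its `video_length` parameter are hypothetical and not part of the tutorial's own code.

```python
def safe_segments(flagged, threshold, padding, video_length):
    """Return [start, end] pairs (seconds) covering everything except the
    flagged timestamps, after merging close timestamps and padding them."""
    if not flagged:
        return [[0, video_length]]

    flagged = sorted(flagged)

    # Step 1: merge timestamps at most `threshold` apart into unsafe ranges.
    unsafe = [[flagged[0], flagged[0]]]
    for t in flagged[1:]:
        if t - unsafe[-1][1] <= threshold:
            unsafe[-1][1] = t
        else:
            unsafe.append([t, t])

    # Step 2: pad each unsafe range and take the complement over the video.
    safe = []
    cursor = 0
    for start, end in unsafe:
        padded_start = max(start - padding, 0)
        padded_end = min(end + padding, video_length)
        if padded_start > cursor:
            safe.append([cursor, padded_start])
        cursor = max(cursor, padded_end)
    if cursor < video_length:
        safe.append([cursor, video_length])
    return safe


print(safe_segments([10, 11, 12, 30, 45, 46], threshold=1, padding=1, video_length=60))
# [[0, 9], [13, 29], [31, 44], [47, 60]]
```

Here the timestamps 10, 11, and 12 collapse into one unsafe range, 30 stands alone, and 45 and 46 form a third; each range is widened by one second before the gaps between them are returned as safe clips.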

Removing inappropriate content from video using VideoDB

The idea behind VideoDB is straightforward: It functions as a database specifically for videos. Similar to how you upload tables or JSON data to a standard database, you can upload your videos to VideoDB.

You can also retrieve your videos through queries, much like accessing regular data from a database.

VideoDB enables you to swiftly create clips from your videos, making the process ⚡️ fast, just like retrieving text data from a database.

Next, we will compile a master clip composed of smaller segments that depict only appropriate content (i.e., we filter out the clips with inappropriate content identified earlier).

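from videodb import connect, play_stream

# Connect to VideoDB; connect() reads the API key from the
# VIDEO_DB_API_KEY environment variable when api_key is not passed
conn = connect()

# Upload the video to VideoDB
video_url_yt = "https://www.youtube.com/watch?v=Xa7UaHgOGfM"
video = conn.upload(url=video_url_yt)

# Generate a stream link for the safe shots by passing them as the timeline
stream_link = video.generate_stream(timeline=shots)

# Play the video in the browser/notebook
play_stream(stream_link)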

You can use the stream link with any video player (video.js, JW Player, or a simple HTML player) to embed it in your application. In upcoming versions, we are launching our own video player for your ease. It has built-in chapter and search features. Please check this link to see it in action. We would love to know your thoughts at 👉 contact@videodb.io.

Keep building awesome stuff 🤘
