VideoDB’s Timeline Architecture makes it easy to personalize content to meet users’ requirements. If users prefer not to hear curse words, VideoDB lets you remove them or replace them with a sound overlay such as a beep. This task, typically complex for video editors, can be accomplished with just a few lines of code using VideoDB. The same technique can also serve as a valuable Content Moderation component for any social content platform, ensuring that content meets the preferences and standards of its audience. Let’s dive in!
For this tutorial, let’s take a Joe Rogan clip in which he tries to trick Siri into using curse words 🤣
```python
import videodb
from videodb import play_stream

# Connect to VideoDB (uses the VIDEO_DB_API_KEY environment variable by default)
conn = videodb.connect()

# Joe Rogan video clip
coll = conn.get_collection()
video = coll.upload(url='https://www.youtube.com/watch?v=7MV6tUCUd-c')

# Watch the original video
o_stream = video.generate_stream()
play_stream(o_stream)
```
We have a sample beep sound in this folder, beep.wav. For those looking to add a more playful or unique touch, replacing the beep with alternative sound effects, such as a quack or any other sound, can make the content more engaging and fun.
```python
# Import Editor SDK components
from videodb.editor import VideoAsset, AudioAsset, Timeline, Track, Clip

# Upload beep sound - this is just a sample; you can replace it with a quack or any other sound effect.
beep = coll.upload(file_path="beep.wav")

# Create an audio asset from the beep sound
beep_asset = AudioAsset(id=beep.id)
```
To ensure appropriate content management, it’s necessary to have a method for identifying profanity and applying a predefined overlay to censor it. In this tutorial, we’ve included a list of curse words. Feel free to customize this list according to your requirements.
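The pipeline later in this tutorial assumes a `curse_words_list` and a word-level `transcript` are available. Here is a minimal sketch of what those look like — the word list is a placeholder you should extend to match your own policy, and `sample_transcript` mimics the word-level entries (each a dict with `text`, `start`, and `end` keys, in seconds) that the VideoDB transcript provides:

```python
# A small sample list - extend it to match your moderation policy.
# Store root forms so lemmatized transcript words match directly.
curse_words_list = ["damn", "hell", "shit", "fuck"]

# Word-level transcript entries look like this (times in seconds):
sample_transcript = [
    {'text': 'hey', 'start': 0.0, 'end': 0.4},
    {'text': 'damn', 'start': 0.4, 'end': 0.9},
]

# Flag any word whose text appears in the list
flagged = [w for w in sample_transcript if w['text'] in curse_words_list]
print([w['text'] for w in flagged])  # → ['damn']
```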
We’ll use a few NLP techniques (lemmatization with spaCy) to identify all variations of any offensive word, eliminating the need to manually find and include each form. Additionally, by analyzing the transcript, you can see how these words are transcribed and account for possible transcription errors.
```python
# Install spaCy
!pip -q install spacy
# Download the English core model
!python -m spacy download en_core_web_sm

import re
import spacy

# Load the English corpus
nlp = spacy.load("en_core_web_sm")

def get_root_word(word):
    """Convert a word into its root (lemma) form."""
    try:
        # Clean punctuation
        cleaned_word = re.sub(r'[^\w\s]', '', word)
        # Process the word
        doc = nlp(cleaned_word)
        # Lemmatize (assuming single-word input)
        lemmatized_word = [token.lemma_ for token in doc][0]
        return lemmatized_word
    except Exception:
        print(f"Some issue with lemmatization for the word {word}")
        return word
```
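If installing spaCy isn’t an option in your environment, a crude suffix-stripping fallback conveys the same idea. `simple_root` below is a hypothetical stand-in, far less accurate than real lemmatization:

```python
import re

def simple_root(word):
    """Very rough stemmer: lowercase, strip punctuation, drop common suffixes.
    A stand-in for spaCy lemmatization, not a replacement for it."""
    w = re.sub(r'[^\w\s]', '', word).lower()
    for suffix in ('ing', 'ers', 'er', 'ed', 's'):
        # Only strip when enough of the stem remains
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[:-len(suffix)]
    return w

print(simple_root("Damns!"))    # → damn
print(simple_root("suckers"))   # → suck
```

For production moderation, stick with the spaCy version above; stemming mangles irregular forms.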
First we identify the timestamps to beep, then we create a timeline using the Track and Clip pattern: add the video clip to the main track, then loop through the transcript, muting the video and adding a beep overlay wherever a curse word is detected.
```python
from videodb.editor import Timeline, Track, VideoAsset, AudioAsset, Clip

# 1. Filter and prepare curse-word metadata, padding each interval slightly
padding = 0.15
curse_intervals = [
    {
        'word': w.get('text'),
        'start': max(0.0, float(w['start']) - padding),
        'end': min(float(video.length), float(w['end']) + padding),
        'raw_start': float(w['start']),
        'raw_end': float(w['end'])
    }
    for w in transcript
    if w.get('text') != '-' and get_root_word(w.get('text')) in curse_words_list
]

# 2. Build the timeline
timeline = Timeline(conn)
video_track = Track()
beep_track = Track()
current_time = 0.0

print(f"{'WORD':<15} | {'START':<8} | {'END':<8} | {'DURATION'}")
print("-" * 50)

for interval in curse_intervals:
    # A. Clean segment before the curse word
    if interval['start'] > current_time:
        clean_dur = interval['start'] - current_time
        video_track.add_clip(
            current_time,
            Clip(asset=VideoAsset(id=video.id, start=current_time), duration=clean_dur)
        )

    # B. Muted segment covering the curse word
    mute_dur = interval['end'] - interval['start']
    video_track.add_clip(
        interval['start'],
        Clip(asset=VideoAsset(id=video.id, start=interval['start'], volume=0.0), duration=mute_dur)
    )

    # C. Beep overlay on the audio track
    beep_dur = interval['raw_end'] - interval['raw_start']
    beep_track.add_clip(
        interval['raw_start'],
        Clip(asset=AudioAsset(id=beep.id, start=0, volume=2.0),
             duration=min(beep_dur, float(beep.length)))
    )

    # D. Log the censored word and its timing
    print(f"{interval['word']:<15} | {interval['raw_start']:<8.2f} | {interval['raw_end']:<8.2f} | {beep_dur:.2f}s")

    current_time = interval['end']

# E. Final clean segment after the last curse word
if current_time < float(video.length):
    video_track.add_clip(
        current_time,
        Clip(asset=VideoAsset(id=video.id, start=current_time),
             duration=float(video.length) - current_time)
    )

timeline.add_track(video_track)
timeline.add_track(beep_track)

stream_url = timeline.generate_stream()
print(f"\nProcessing complete. Stream URL: {stream_url}")
```
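One edge case worth noting: if two flagged words sit closer together than twice the padding, their padded intervals overlap and the muted clips would collide on the track. A small hypothetical helper, sketched below against the same interval dicts used above, can merge such intervals before building the timeline:

```python
def merge_intervals(intervals):
    """Merge padded intervals that overlap or touch, so track clips never collide.
    Keeps the earliest raw_start and the latest raw_end for the beep overlay."""
    merged = []
    for iv in sorted(intervals, key=lambda i: i['start']):
        if merged and iv['start'] <= merged[-1]['end']:
            last = merged[-1]
            last['end'] = max(last['end'], iv['end'])
            last['raw_end'] = max(last['raw_end'], iv['raw_end'])
            last['word'] += ' ' + iv['word']
        else:
            merged.append(dict(iv))
    return merged

# Two words 0.1s apart: with 0.15s padding their intervals overlap and become one
demo = [
    {'word': 'damn', 'start': 1.85, 'end': 2.55, 'raw_start': 2.0, 'raw_end': 2.4},
    {'word': 'hell', 'start': 2.35, 'end': 3.05, 'raw_start': 2.5, 'raw_end': 2.9},
]
print(merge_intervals(demo))  # → one merged interval covering both words
```

Run `curse_intervals = merge_intervals(curse_intervals)` before the timeline loop to apply this.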
If your videos are already uploaded and indexed, this beep pipeline runs in real time. So, based on your users’ choices or your platform’s policy, you can use the spoken content of a video to moderate it automatically.
Explore Full Notebook
Open the complete implementation in Google Colab with all code examples.