Genpeli — AI Video Editing Pipeline

Genpeli takes raw video footage and turns it into polished social-ready clips — automatically. It uses Whisper for transcription, a judge model for intelligent cut detection, FFmpeg for processing, and word-by-word caption rendering. The entire pipeline runs locally or on serverless GPUs.
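
As a rough illustration of the transcription step, the sketch below assumes the open-source openai-whisper package and its word_timestamps option. The function names, the "base" model choice, and the flattened word format are assumptions for illustration, not Genpeli's actual code.

```python
def transcribe(path, model_name="base"):
    """Run Whisper with word-level timestamps.

    Assumes `pip install openai-whisper`; the model size is an arbitrary choice.
    """
    import whisper  # imported lazily so flatten_words works without the package

    model = whisper.load_model(model_name)
    return model.transcribe(path, word_timestamps=True)


def flatten_words(result):
    """Collect every word from a Whisper-style result dict into one flat list."""
    words = []
    for segment in result.get("segments", []):
        for w in segment.get("words", []):
            words.append(
                {
                    "word": w["word"].strip(),  # Whisper words carry leading spaces
                    "start": float(w["start"]),
                    "end": float(w["end"]),
                }
            )
    return words
```

Calling flatten_words(transcribe("raw.mp4")) would give later stages a single list of timed words to work from, where "raw.mp4" is a placeholder path.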

The Problem

Content creators spend hours manually cutting videos, adding captions, and reformatting for different platforms. The process is repetitive, time-consuming, and requires expensive editing software. Short-form content demands high volume, yet each piece still needs manual polish.

The Solution

An automated post-production pipeline. Upload raw footage → Whisper transcribes → a judge model scores every potential cut point (speech energy, sentence boundaries, hook words) → FFmpeg renders with word-by-word captions and normalized audio → export in the right format for each platform.
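
To make the cut-scoring idea concrete, here is a minimal sketch that ranks the gaps between transcribed words. It is not the actual judge model: the weights, the hook-word list, and the use of pause length as a stand-in for speech energy are all assumptions made for illustration.

```python
# Hypothetical hook words; a real list would be tuned per content type.
HOOK_WORDS = {"but", "so", "imagine", "why", "how"}


def score_cut_points(words, w_pause=0.5, w_sentence=0.3, w_hook=0.2, max_pause=1.0):
    """Score every gap between consecutive words and return (time, score), best first.

    `words` is a list of dicts with "word", "start", and "end" keys.
    Pause length is used as a crude proxy for low speech energy (an assumption).
    """
    scored = []
    for prev, nxt in zip(words, words[1:]):
        pause = max(0.0, nxt["start"] - prev["end"])
        pause_score = min(pause / max_pause, 1.0)
        # A gap right after sentence-ending punctuation is a natural boundary.
        sentence_score = 1.0 if prev["word"].strip().endswith((".", "?", "!")) else 0.0
        # A hook word right after the gap makes a strong clip opening.
        hook_score = 1.0 if nxt["word"].strip(" ,.?!").lower() in HOOK_WORDS else 0.0
        score = w_pause * pause_score + w_sentence * sentence_score + w_hook * hook_score
        cut_time = (prev["end"] + nxt["start"]) / 2  # cut in the middle of the gap
        scored.append((round(cut_time, 3), round(score, 3)))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A weighted sum keeps each signal easy to inspect and tune. A learned judge model could replace it without changing the input or output shape.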

The Outcome

What used to take hours of manual editing now runs in minutes. Upload raw footage, get back polished clips with captions — ready to post on any platform.

Key Features

AI-powered cut detection — scores speech energy, sentence boundaries, topic changes
Word-by-word captions with speaker-level timing
Audio normalization and enhancement
Platform-optimized export (vertical/square/horizontal)
Serverless GPU processing via Modal.com
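
The caption, normalization, and export features above could be wired together roughly as follows. This is a sketch under stated assumptions: one SRT cue per word, a 16:9 landscape source, a subtitle path with no characters that need escaping, and illustrative loudness targets and output sizes. The crop, scale, subtitles, and loudnorm filters are standard FFmpeg filters (subtitles requires a build with libass); the function names and the layout table are hypothetical.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words):
    """Emit one SRT cue per word so captions appear word by word."""
    cues = []
    for index, w in enumerate(words, start=1):
        span = f"{srt_timestamp(w['start'])} --> {srt_timestamp(w['end'])}"
        cues.append(f"{index}\n{span}\n{w['word'].strip()}\n")
    return "\n".join(cues)


# Assumed output layouts for a landscape source; the sizes are illustrative.
VIDEO_FILTERS = {
    "vertical": "crop=ih*9/16:ih,scale=1080:1920",
    "square": "crop=ih:ih,scale=1080:1080",
    "horizontal": "scale=1920:1080",
}


def build_ffmpeg_args(src, dst, srt_path, layout):
    """Build an FFmpeg command that crops, burns in captions, and normalizes audio."""
    video_filter = f"{VIDEO_FILTERS[layout]},subtitles={srt_path}"
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", video_filter,
        "-af", "loudnorm=I=-16:TP=-1.5:LRA=11",  # illustrative loudness targets
        "-c:v", "libx264", "-c:a", "aac",
        dst,
    ]
```

The argument list can be passed to subprocess.run(args, check=True) once the SRT text has been written to srt_path.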

Technology Stack

Python
FastAPI
Whisper
FFmpeg
Modal.com
React
Tailwind CSS

Interested in this project?

I'd love to discuss the technical details, challenges overcome, or similar projects I could build for you.
