Vijay Kumar - Senior Software Engineer

Project Type:

Content Automation Tool

Core Technologies:

Python, GPT-3.5, FFmpeg

(Private Instance)

Technical Overview

A weekend project that got out of hand: it automates video creation by combining stock footage with AI-generated voiceovers and captions.

Workflow: User inputs a topic → Python scrapes news/RSS feeds → GPT-3.5 writes the script → AWS Polly generates the voiceover → FFmpeg stitches the clips → the finished video is uploaded to YouTube.
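To make the FFmpeg step of the workflow concrete, here is a minimal sketch of how the stitching could be driven from Python. It only builds the concat list and the command lines; nothing is executed. The function names and file names are hypothetical, and using the concat demuxer with stream copy is an assumption (it only works when all clips share the same codec and parameters), not necessarily what the project does.

```python
def concat_list_text(clips):
    """Return the body of an FFmpeg concat-demuxer list file for the given clips."""
    lines = []
    for clip in clips:
        # Single quotes inside a path are escaped as '\'' in concat list files.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def stitch_commands(list_file, voiceover, joined, output):
    """Build (not run) two FFmpeg invocations: join the clips, then add narration.

    All four arguments are file paths; `joined` is a hypothetical intermediate file.
    """
    # Step 1: concatenate the clips without re-encoding (assumes matching codecs).
    concat = ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
              "-i", str(list_file), "-c", "copy", str(joined)]
    # Step 2: take video from the joined file and audio from the voiceover,
    # stopping at whichever stream ends first.
    mux = ["ffmpeg", "-y", "-i", str(joined), "-i", str(voiceover),
           "-map", "0:v:0", "-map", "1:a:0",
           "-c:v", "copy", "-c:a", "aac", "-shortest", str(output)]
    return concat, mux
```

Each returned list can be passed to `subprocess.run(..., check=True)` once the list file has been written to disk.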

Cool Hack: Created a "visual rhythm" system that matches clip transitions to voiceover intonation using audio waveform analysis.
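The exact analysis behind "visual rhythm" is not spelled out here, so the following is only a sketch of one plausible approach: compute a short-time RMS energy envelope of the voiceover and place clip transitions where the speaker goes quiet. Using pauses as a stand-in for intonation, along with the names `rms_envelope` and `transition_times` and every threshold value, is an assumption made for illustration.

```python
import math


def rms_envelope(samples, frame_size):
    """Root-mean-square energy of each consecutive, non-overlapping frame."""
    envelope = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        envelope.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return envelope


def transition_times(samples, sample_rate, frame_ms=50,
                     quiet_ratio=0.2, min_gap_s=2.0):
    """Return times (seconds) where a quiet stretch begins.

    A frame counts as quiet when its energy is below quiet_ratio times the
    loudest frame. Cuts are kept at least min_gap_s apart.
    """
    frame_size = max(1, int(sample_rate * frame_ms / 1000))
    envelope = rms_envelope(samples, frame_size)
    if not envelope:
        return []
    threshold = quiet_ratio * max(envelope)
    cuts = []
    was_quiet = False
    for index, energy in enumerate(envelope):
        is_quiet = energy < threshold
        t = index * frame_size / sample_rate
        # Only the loud-to-quiet edge is a candidate transition point.
        if is_quiet and not was_quiet and (not cuts or t - cuts[-1] >= min_gap_s):
            cuts.append(t)
        was_quiet = is_quiet
    return cuts
```

The samples are expected as a plain sequence of floats (mono PCM); decoding the audio file into that form is left out of the sketch.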

Technical Challenges

Video Synchronization

Problem: Voiceover and clips drifted out of sync in longer videos.
Solution: Implemented a chunk-based processing system that handles the video sequentially in 60-second segments.
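A rough sketch of the chunking idea, assuming the point is that audio and video are re-aligned at the start of every segment so drift cannot accumulate past one chunk. The helper names are hypothetical, and the FFmpeg command is only built, not run. Note that with `-c copy` the cut points snap to keyframes, so a real pipeline might re-encode instead.

```python
def segment_bounds(total_seconds, chunk_seconds=60.0):
    """Split [0, total_seconds) into consecutive (start, duration) chunks."""
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    bounds = []
    start = 0.0
    while start < total_seconds:
        # The last chunk is shorter when the total is not a multiple of the chunk size.
        duration = min(chunk_seconds, total_seconds - start)
        bounds.append((start, duration))
        start += chunk_seconds
    return bounds


def segment_command(source, start, duration, output):
    """Build (not run) an FFmpeg command that cuts one chunk out of source."""
    return ["ffmpeg", "-y", "-ss", f"{start:.3f}", "-t", f"{duration:.3f}",
            "-i", source, "-c", "copy", output]
```

For a 150-second video, `segment_bounds(150)` yields two full 60-second chunks followed by a 30-second remainder, and each pair feeds one `segment_command` call.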

API Costs

Problem: GPT-3.5 costs added up quickly during testing.
Solution: Created a local cache of common responses and enforced strict character limits.
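The cache plus character limits could look something like the sketch below. `CachedGenerator` is a hypothetical wrapper, and `generate` stands for whatever function actually calls the model; it is passed in so the sketch makes no assumptions about the API client. Applying the limit to both the prompt and the reply, and the default limit values, are assumptions.

```python
import hashlib
import json
from pathlib import Path


class CachedGenerator:
    """Wrap a text-generation callable with a JSON file cache and length caps."""

    def __init__(self, generate, cache_path,
                 max_prompt_chars=500, max_reply_chars=1500):
        self._generate = generate          # hypothetical: prompt str -> reply str
        self._path = Path(cache_path)
        self._max_prompt_chars = max_prompt_chars
        self._max_reply_chars = max_reply_chars
        if self._path.exists():
            self._cache = json.loads(self._path.read_text(encoding="utf-8"))
        else:
            self._cache = {}

    def __call__(self, prompt):
        # Truncate first so the cache key matches what is actually sent.
        prompt = prompt.strip()[: self._max_prompt_chars]
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            reply = self._generate(prompt)[: self._max_reply_chars]
            self._cache[key] = reply
            # Persist after every miss so repeated test runs reuse paid responses.
            self._path.write_text(json.dumps(self._cache), encoding="utf-8")
        return self._cache[key]
```

Because the cache lives in a file, identical prompts cost nothing on later runs, which is where most of the savings during testing would come from.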

Lessons Learned

• Video processing is CPU-intensive, so offloading it to cloud functions is essential
• AI APIs require careful rate limiting
• Content moderation is harder than expected
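On the rate-limiting lesson, a sliding-window limiter is one simple way to stay under an API quota. This is a generic sketch, not the project's code: `RateLimiter` is a hypothetical name, the limits are placeholders rather than any provider's real quota, and the clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        if max_calls < 1 or period <= 0:
            raise ValueError("max_calls must be >= 1 and period > 0")
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._stamps and now - self._stamps[0] >= self._period:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return
            # Window is full: sleep until the oldest call expires, then re-check.
            self._sleep(self._period - (now - self._stamps[0]))
```

Calling `limiter.wait()` immediately before each API request is enough; the limiter itself never touches the network.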

Future Ideas

• Add custom avatar creation
• Integrate with TikTok API
• Local AI model deployment

Let's 👋 Work Together
