An AI-first media startup experimenting with synthetic content creation partnered with Oodles to build a minimal video generation platform. The client aimed to produce talking-head videos using deepfake technology, driven by synthetic audio from AWS Polly. They sought a backend MVP to run experiments, without a UI, focusing on real-time rendering precision and infrastructure readiness.
The project aimed to develop an experimental video rendering pipeline using deepfake models synced with AWS Polly speech marks. Working with Oodles, the client required a backend MVP that processes a source video, aligns it with generated audio, and produces a lifelike talking-head output. Core areas of work included audio-visual synchronization, face animation, infrastructure setup, and system automation.
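To illustrate the synchronization step at the heart of this pipeline: AWS Polly can return speech marks as newline-delimited JSON, where each viseme mark carries a `time` offset in milliseconds and a `value` identifying the mouth shape. A minimal sketch of mapping those marks to video frame indices might look like the following (the function name, frame rate, and sample mark values are illustrative assumptions, not the client's actual implementation):

```python
import json

def speech_marks_to_frames(marks_jsonl: str, fps: int = 25):
    """Map Polly viseme speech marks (newline-delimited JSON) to frame indices.

    Polly reports each mark's 'time' in milliseconds from the start of the
    audio stream; multiplying by the frame rate gives the video frame at
    which the corresponding mouth shape should appear.
    """
    frames = []
    for line in marks_jsonl.strip().splitlines():
        mark = json.loads(line)
        if mark.get("type") != "viseme":
            continue  # skip word/sentence marks if present
        frame_index = round(mark["time"] / 1000 * fps)
        frames.append((frame_index, mark["value"]))
    return frames

# Illustrative sample in Polly's speech-mark format (values are made up).
sample = """\
{"time": 0, "type": "viseme", "value": "k"}
{"time": 120, "type": "viseme", "value": "E"}
{"time": 280, "type": "viseme", "value": "t"}
{"time": 440, "type": "viseme", "value": "sil"}"""

print(speech_marks_to_frames(sample))
# [(0, 'k'), (3, 'E'), (7, 't'), (11, 'sil')]
```

In practice, a list like this would drive the face-animation model, telling it which mouth shape to render on which frame so the deepfake output stays lip-synced to the Polly audio.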
To help the client achieve this goal, a modular, cloud-ready backend was architected, focused entirely on processing speed and deepfake accuracy.
Key Features Implemented: