Audio Description Generator
Create descriptive audio tracks for YouTube videos within minutes.
What it does
The Audio Description Generator is a tool for creating descriptive audio tracks for short YouTube videos within minutes. Given a YouTube link, the app fetches the video along with its title and description, then splits the video into smaller chunks. These chunks, together with the YouTube data, are first passed to Gemini to create a "context file", which acts as a first pass to detect general details and identify any characters. Each chunk is then used to produce a "loudness file", which measures the volume at every interval, and a "transcript" (generated with Gemini), which lists the video's dialogue with timestamps. All of this information is fed to Gemini once more to create a "script" of observations with timestamps. The resulting scripts are then put through Google Cloud's Text-to-Speech, the audio is stitched back together, and the final result is presented to the user.
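The chunking and loudness steps can be illustrated with a minimal Python sketch. This is not the app's actual code: the function names `chunk_bounds` and `loudness_profile`, the 60-second chunk length, the 0.5-second interval, and the choice of RMS level in dBFS as the volume measure are all assumptions made for illustration. The Gemini and Text-to-Speech calls are left out on purpose.

```python
import math
from typing import List, Sequence, Tuple


def chunk_bounds(duration_s: float, chunk_s: float = 60.0) -> List[Tuple[float, float]]:
    """Split a video of duration_s seconds into consecutive (start, end) windows.

    Hypothetical helper: the chunk length used by the app is not stated.
    """
    if duration_s < 0 or chunk_s <= 0:
        raise ValueError("duration_s must be >= 0 and chunk_s must be > 0")
    count = math.ceil(duration_s / chunk_s)
    # Index-based starts avoid accumulating floating-point drift.
    return [
        (i * chunk_s, min((i + 1) * chunk_s, float(duration_s)))
        for i in range(count)
    ]


def loudness_profile(
    samples: Sequence[float],
    sample_rate: int,
    interval_s: float = 0.5,
    floor_db: float = -100.0,
) -> List[Tuple[float, float]]:
    """Return (start_time_s, level_dbfs) for each interval of a mono signal.

    Assumes samples are floats in [-1.0, 1.0]. Silent windows are reported
    at floor_db instead of negative infinity.
    """
    step = max(1, int(sample_rate * interval_s))
    profile = []
    for i in range(0, len(samples), step):
        window = samples[i:i + step]
        rms = math.sqrt(sum(x * x for x in window) / len(window))
        level = 20 * math.log10(rms) if rms > 0 else floor_db
        profile.append((i / sample_rate, max(level, floor_db)))
    return profile


if __name__ == "__main__":
    # A 150-second video yields two full chunks and one shorter final chunk.
    for start, end in chunk_bounds(150, 60):
        print(f"chunk {start:.0f}s to {end:.0f}s")

    # One second of full-scale signal followed by one second of silence.
    demo = [1.0] * 4 + [0.0] * 4
    for t, db in loudness_profile(demo, sample_rate=4, interval_s=1.0):
        print(f"{t:.1f}s: {db:.1f} dBFS")
```

A profile like this, written out per chunk, is one plausible shape for the "loudness file": it would let the script-writing step see where the video is quiet enough for narration to fit.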
Built with
- Web/Chrome
- Google Cloud: Text-to-Speech
Team
By
Ryan Baumgart
From
Canada