Last week, we launched Overdub, an AI voice generator that allows you to create a realistic clone of your own voice. To help you make the most of the software, we’re offering some tips to ensure a production-ready Overdub voice, with crisp fidelity, realistic intonation, and natural expressiveness.
Record your training data in a quiet, acoustically “dead” room, and use an external microphone.
While Overdub voices can be trained with as little as 10 minutes of audio, we recommend at least 30 minutes. The more training data you provide, the more likely you are to get a production-ready voice, with gains continuing all the way up to 90 minutes.
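If you keep copies of your recordings as WAV files outside Descript, a rough sketch like the one below can tell you where your total sits against the 10, 30, and 90 minute marks. This is not part of Descript or Overdub. The function names and the wording of the labels are our own assumptions for illustration, and the script only reads uncompressed WAV files through Python's standard `wave` module.

```python
import wave

# Thresholds taken from the guidance above, in minutes.
MIN_MINUTES = 10          # smallest amount a voice can be trained with
RECOMMENDED_MINUTES = 30  # recommended starting point
MAX_USEFUL_MINUTES = 90   # gains are described only up to this point


def wav_duration_seconds(path):
    """Return the length of one WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_minutes(paths):
    """Add up the length of several WAV files, in minutes."""
    return sum(wav_duration_seconds(p) for p in paths) / 60


def assess_training_minutes(minutes):
    """Label a training-data total (hypothetical labels, not Descript's)."""
    if minutes < MIN_MINUTES:
        return "below minimum"
    if minutes < RECOMMENDED_MINUTES:
        return "trainable, below recommended"
    if minutes < MAX_USEFUL_MINUTES:
        return "recommended range"
    return "at or beyond 90 minutes"
```

For example, `assess_training_minutes(total_minutes(["take1.wav", "take2.wav"]))` would label the combined length of two takes; the file names here are placeholders.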
If you need more training data, open your training data project and record one of our supplemental scripts.
Styles let you copy the various delivery styles of your real audio recordings. Every Overdub is generated using a Style; your voice comes loaded with one as a default. The default style might not be optimal for the content you are creating. To create a new Style, select a range of real audio (complete sentences are recommended) that’s three to twenty-five seconds long, right-click, and select “Save as Style.” Learn more about setting up Styles.
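The length rule for a Style selection is simple arithmetic, and a small sketch can make it concrete. The helper below is hypothetical and not a Descript feature. It assumes you know the start and end times of your selection in seconds, and that the three and twenty-five second limits are inclusive, which the guidance above does not state explicitly.

```python
# Limits from the Style guidance above, in seconds (assumed inclusive).
MIN_STYLE_SECONDS = 3
MAX_STYLE_SECONDS = 25


def is_valid_style_selection(start_seconds, end_seconds):
    """Return True if the selected range is a usable length for a Style."""
    length = end_seconds - start_seconds
    return MIN_STYLE_SECONDS <= length <= MAX_STYLE_SECONDS
```

A selection running from 10.0 to 12.5 seconds would be rejected as too short, while one running from 10.0 to 22.0 seconds would pass.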
Periods and commas affect Overdub intonation. Add and remove them to fine-tune delivery.
Right-click on an Overdub (or hover over the clip in the timeline) to convert it to normal Descript audio. Once it’s audio, you can fine-tune the word spacing and sentence boundaries just like any other clip. Learn more about Timeline Editing.
If you’re correcting a single word and the result sounds unnatural, undo the edit and try again with a selection that includes another word or two on either side.
If Overdub mispronounces a word, try a different (incorrect, but phonetic) spelling. Once you’ve got it sounding right, you can always convert the Overdub to audio (see the tip on converting to audio above) and then correct the spelling in the transcript.
Download Descript today and try Overdub for yourself. We have a feeling you’ll be impressed.