If you’re trying to get the audio from Azure’s Text-to-Speech Avatar feature to be higher quality, specifically at 24 kHz, here’s a simple solution that works around the current limitations. The main problem is that when you generate videos using the Avatar feature, the embedded audio doesn’t match the high quality you get from standalone Text-to-Speech (TTS) voices. This is because the Avatar API doesn’t offer options to change the sample rate or audio format.
Azure’s documentation confirms that the Avatar API only allows you to set video-related settings like resolution, codec, background, and bitrate. It doesn’t give options to adjust how the audio is formatted or its quality. In contrast, the standard TTS API lets you choose an output format such as riff-24khz-16bit-mono-pcm, which returns uncompressed 24 kHz, 16-bit mono PCM audio in a WAV (RIFF) container.
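To make the format choice concrete, here is a minimal Python sketch of how a request to the standard TTS REST endpoint could select the 24 kHz format through the X-Microsoft-OutputFormat header. The function names, the key and region placeholders, the User-Agent string, and the en-US-JennyNeural voice are assumptions for illustration; swap in the voice your avatar actually uses.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_tts_request(text, key, region,
                      voice="en-US-JennyNeural",  # placeholder voice (assumption)
                      output_format="riff-24khz-16bit-mono-pcm"):
    """Build (but do not send) a TTS REST request asking for 24 kHz PCM audio."""
    # Minimal SSML body; escape() protects characters such as & and <.
    ssml = (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice name='{voice}'>{escape(text)}</voice>"
        "</speak>"
    )
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        # This header is what selects the sample rate and container.
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "avatar-audio-swap-sketch",  # arbitrary name (assumption)
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def synthesize_to_file(request, out_path="tts_audio.wav"):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path
```

Something like synthesize_to_file(build_tts_request("Hello there", "YOUR_KEY", "YOUR_REGION")) would then write tts_audio.wav, assuming a valid key and region. The Speech SDK can do the same job if you prefer it over raw REST calls.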
So, how can you get around this? Here’s a simple step-by-step process:
- First, use the standard TTS API to generate your high-quality audio at 24 kHz. Use the same voice and the same text or SSML as the avatar video, so the timing should stay close to the avatar's lip movements.
- Next, create your avatar video as usual, using the Avatar feature. The video will include embedded audio, but it might not be the quality you want.
- Lastly, replace the audio track in your avatar video with the high-quality TTS audio you created earlier. You can do this with ffmpeg, a free command-line tool for processing audio and video.
Here is a simple ffmpeg command you can use:
ffmpeg -i avatar_video.mp4 -i tts_audio.wav -map 0:v:0 -map 1:a:0 -c:v copy -c:a aac -shortest final_output.mp4
This command copies the original video stream without re-encoding it, takes the audio from your TTS file, and encodes that audio to AAC, the usual audio codec for MP4. The -shortest option ends the output when the shorter of the two streams ends. Your final video will look the same but will carry the TTS audio you generated separately.
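If you need to do this for many videos, the same ffmpeg call can be scripted. Below is a rough Python sketch that builds the argument list and runs it. The function names are made up for illustration, and two things differ from a bare command line by assumption: -y overwrites an existing output file without asking, and -c:a aac sets the audio codec explicitly.

```python
import shutil
import subprocess


def build_mux_command(video, audio, output):
    """Return the ffmpeg argument list that swaps the audio track."""
    return [
        "ffmpeg", "-y",      # -y: overwrite output without prompting (assumption)
        "-i", video,         # input 0: avatar video
        "-i", audio,         # input 1: standalone TTS audio
        "-map", "0:v:0",     # take the first video stream from input 0
        "-map", "1:a:0",     # take the first audio stream from input 1
        "-c:v", "copy",      # copy video as-is, no re-encode
        "-c:a", "aac",       # encode audio to AAC for the MP4 container
        "-shortest",         # stop at the end of the shorter stream
        output,
    ]


def replace_audio(video, audio, output):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_mux_command(video, audio, output), check=True)
    return output
```

For example, replace_audio("avatar_video.mp4", "tts_audio.wav", "final_output.mp4") would produce the same result as running the command by hand.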
This approach gives you full control over the audio quality in your final video without being constrained by the Avatar API’s limitations.
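Before swapping the track in, it can be worth confirming that the TTS file really is 24 kHz. Here is a small sketch using Python's built-in wave module, assuming the audio was saved as an uncompressed WAV file. The helper names are hypothetical, and the duration is read from the file header, so treat it as approximate if the file was written from a stream.

```python
import wave


def describe_wav(path):
    """Read basic format details from a PCM WAV file header."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        return {
            "sample_rate": rate,                # samples per second
            "channels": w.getnchannels(),       # 1 = mono
            "bits": w.getsampwidth() * 8,       # bytes per sample -> bits
            "seconds": w.getnframes() / rate,   # duration from the header
        }


def is_24khz_16bit_mono(path):
    """True if the file matches riff-24khz-16bit-mono-pcm."""
    info = describe_wav(path)
    return (info["sample_rate"], info["bits"], info["channels"]) == (24000, 16, 1)
```

Calling is_24khz_16bit_mono("tts_audio.wav") should return True for audio requested in that format. The reported duration is also handy for spotting a large mismatch with the video length.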
If you want to learn more about customizing audio formats or other features, check out Microsoft’s documentation for the Speech service.
If this solution helps you, please remember to click “Accept Answer” and say yes if you found it helpful. Thanks for reaching out!