Microsoft’s new AI tool can mimic a human voice with 3 seconds of audio

Deepanker Verma January 11, 2023 Technology

Microsoft’s new voice AI tool, VALL-E, can mimic a person’s voice after analyzing a three-second audio sample. It also tries to preserve the speaker’s emotional tone. VALL-E can be used for text-to-speech applications and speech editing.

VALL-E is built on a technology called EnCodec, which analyzes a person’s voice. The model uses that analysis to work out how the voice would sound when speaking different phrases. Even with a three-second sample clip, the tool can replicate the speaker’s timbre and tone. With a longer recording, it could presumably produce even more realistic results.

The tool has been trained on 60,000 hours of English speech data from more than 7,000 speakers. Microsoft has also published dozens of audio examples of the AI model in action.

“VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experiment results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity,” says the official VALL-E website.

VALL-E has several possible uses in the production industry. For example, it could help voice-over artists do their work with far less effort.

But it can also be misused. It could be used to make fake call recordings of politicians and celebrities. It is also a security threat to applications that rely on voice passwords.

About the Author: Deepanker Verma

Deepanker Verma is the Founder and Editor-in-Chief of TechloMedia. He holds an engineering degree in Computer Science and has over 15 years of experience in the technology sector. Deepanker bridges the gap between complex engineering and consumer electronics. He is also a known security researcher acknowledged by global giants including Apple, Microsoft, and eBay. He uses his technical background to rigorously test gadgets, focusing on performance, security, and long-term value.
