Alarming AI Clones Both a Person’s Voice and Their Speech Patterns

Gates Keeper

Engineers at Facebook’s AI research lab created a machine learning system that can clone not only a person’s voice but also their cadence, an uncanny ability they showed off by duplicating the voices of Bill Gates and other notable figures.

This system, dubbed MelNet, could lead to more realistic-sounding AI voice assistants or voice models, the kind used by people with speech impairments. But it could also make it even more difficult to distinguish between real audio and deepfakes.

Format Change

Text-to-speech computer systems aren’t particularly new, but in a paper published to the pre-print server arXiv, the Facebook researchers describe how MelNet differs from its predecessors.

While researchers trained many previous systems using audio waveforms, which chart how a sound’s amplitude changes over time, the Facebook team used spectrograms, a format that is far more compact and informationally dense, according to the researchers.
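To get a feel for the difference, here is a minimal NumPy sketch, not Facebook’s actual code: it turns a raw waveform into a simple magnitude spectrogram by slicing the signal into overlapping windowed frames and taking the FFT of each. (MelNet itself works on mel-scaled spectrograms, a perceptually weighted variant; the function name and parameters below are illustrative.)

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: split the waveform into overlapping,
    Hann-windowed frames and take the FFT of each frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Each row is one time step; each column is one frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

# A one-second, 440 Hz sine wave sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(wave)
# The waveform is 16,000 amplitude samples; the spectrogram is a much
# smaller time-by-frequency grid, with the tone's energy concentrated
# in a single frequency bin.
print(wave.shape, spec.shape)
```

The compactness the researchers describe is visible in the shapes: thousands of raw samples collapse into a grid of a few hundred time steps by a hundred-odd frequency bins, with the signal’s structure laid out explicitly along the frequency axis.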

AI Fake Out

The Facebook team used audio from TED Talks to train its system, and they share clips of it mimicking eight speakers, including Gates, on a GitHub page.

The speech is still somewhat robotic, but the voices are recognizable, and if researchers can smooth out the system even slightly, it’s conceivable that MelNet could fool a casual listener into thinking they’re hearing a public figure say something they never actually uttered.
