Speaking fluent Gaelic could be only the push of a button away in the world of artificial intelligence.
Retired Army Gen. Michael Flynn offered a demonstration using his podcast, “Inside the SCIF,” in a March 20 post on X.
“This PODCAST features how we are applying Artificial Intelligence combined with innovative applications to broadcast to more diverse audiences in America and around the world,” wrote Flynn, who served as national security advisor to former President Donald Trump.
“You’ll see our network’s President, Brandon Howse in discussions with Colonel John Mills (all speaking Mandarin),” he said.
“Not only are the words translated realtime in audio, but the formation of the words on our mouths are also manipulated so that the viewer can essentially ‘read lips’ along with the spoken words,” Flynn said.
The AI video, which used the Feb. 4 episode of his show, was posted to YouTube on Feb. 21.
As advertised, it features the retired general and others appearing to speak Mandarin, with their lips moving in sync with the words.
Although Flynn noted that he uses the technology to spread the cause of freedom, others point to its potential for serious problems.
“On one hand. I really love this tech. I’m already addicted,” contributor Charlie Fink wrote on Forbes in discussing AI lip-dubbing applications that make a speaker appear to be speaking a foreign language, right down to manipulating the mouth and other facial features to mimic actual speech.
“On the other hand, if this is what the good guys can do, what might bad actors … do, not with the apps these companies offer, but with their own, similar AI apps, purpose-built for the spread of disinformation?” he said.
“I know, this takes some of the fun out of it for me, too. While we marvel at all these amazing new applications, I am fearful of their price,” Fink wrote.
According to the business site Fast Company, Hollywood is looking longingly at emerging AI lip-dubbing software because existing efforts to go global are not meeting expectations.
“But even as companies invest in quality script translations and better performances by voice actors, dubbed entertainment often still looks as cheesy as old kung fu films and Mr. Ed, turning audiences off. No matter how good the sound is, it seems wrong. Lips don’t lie,” Burt Helm wrote for the site.
“The lips are always, always the last piece that nobody’s solved for,” said Jonathan Bronfman, co-founder and CEO of the visual effects company Monsters Aliens Robots Zombies.
MARZ markets what’s called LipDubAI, which, in Helm’s words, promises that “Marlon Brando will mumble in Mandarin; Jim Carrey will gesticulate in German, and Arnold Schwarzenegger’s English … well. AI is making more progress every day.”
The technology behind the app merges audio analysis with video content to generate lips that move the way they should if the speaker were speaking the language being heard.
Jacob Ridley wrote on PC Gamer that the tech does not stop there.
“Google researchers have found a way to create video versions of humans generated from just a single still image,” he wrote. “This enables it to do things like, generate a video of someone speaking from input text, or changing a person’s mouth movements to match an audio track in a different language to the one originally spoken.”
“It also feels like a slippery slope into identity theft and misinformation, but what’s AI if not with a hint of frightening consequences,” Ridley said.
He said the technology is not perfect, but noted that if “it were a more perfect technology, it’d be even more worrying to think about how this technology could be used to create deep fakes, spread misinformation, or steal identities.”
“We’ll get there one day, and I for one hope we have some handle on how to deal with this stuff a bit more by then,” Ridley wrote.