Artificial Intelligence (AI) voice cloning technology was used to spread disinformation in the run-up to the Madhya Pradesh assembly elections last year, an investigation by Decode has found.
Decode reached out separately to two experts to analyse four videos that went viral in the run-up to the 2023 assembly elections in Madhya Pradesh and that we strongly suspected had been made with AI voice cloning technology.
The experts independently confirmed that the four videos contained AI generated audio.
It is not clear, however, who created the doctored videos.
The disinformation in question includes three videos of Bharatiya Janata Party leader Shivraj Singh Chouhan and one video of Congress party leader Kamal Nath, both of whom have previously served as chief minister of the state.
“Both algorithms confirmed the audio samples are AI-generated with significant confidence,” Siwei Lyu, a professor at the department of computer science and engineering at the University at Buffalo, New York State, told Decode.
Lyu’s team ran an analysis with two AI-synthesised audio detection algorithms, one developed by his group and the other a third-party state-of-the-art method.
“Our method also suggested that the audios share many statistical similarities with the audios created from ElevenLabs’ online text-to-speech generation system,” he said.
The findings are concerning as India counts down to a general election this April-May. They also offer a teaser of how political parties can abuse generative AI voice technology to mislead voters.
Decode earlier reported how AI voice clones impersonating the voices of celebrities in India are being used to dupe people online.
Deep Fakes Spell Deep Trouble For Democracy
The doctored videos, which pair old footage of the two politicians with their cloned voices in Hindi, were shared by party workers and supporters on social media between October and November 2023.
The AI voice clone in the Kamal Nath video claimed that he planned to cancel the Laadli Behna programme, a welfare scheme introduced by the BJP in the state.
Meanwhile the AI voice clones in the Shivraj Singh Chouhan videos purported to show him being unsure of the BJP’s prospects of returning to power in the state.
The AI-synthesised voices sound nearly identical to the voices of both leaders and cannot be told apart from the real ones merely by listening closely.
All four videos were debunked by BOOM Live and several other Indian fact-checkers last year.
The Madhya Pradesh assembly elections, which took place on November 17, 2023, saw several doctored videos circulate in the lead-up.
Chouhan was replaced by Mohan Yadav as chief minister after the BJP recorded a landslide victory, winning over 160 of the 230 seats.
BOOM Live reported how videos of popular game show Kaun Banega Crorepati were doctored to target Shivraj Singh Chouhan.
The first ever use of AI-generated deepfakes in a political campaign in India was by the BJP during the Delhi legislative assembly elections in 2020.
The Election Commission of India has so far not come out with specific guidelines that regulate the use of artificial intelligence in political campaigns.
“Compared To Images And Videos, AI Audios Are Harder To Detect”
Siwei Lyu is a SUNY Empire Innovation Professor and founding co-director of the Center for Information Integrity (CII) at the University at Buffalo, State University of New York.
Lyu agreed that there is a growing trend of AI-generated voice clones, with new services and algorithms, such as those of ElevenLabs, on the market.
However, he tempered concerns about the potential of AI-based voice clones to spread misinformation in 2024, an important election year in which over 50 countries will go to the polls.
“I am sure it will be used in this way, but we often get facts from many other channels. Hearing one AI-generated voice may not be enough to influence everyone's opinion, especially those who keep an open mind and will do fact check. Critical thinking is still the best defence,” Lyu told Decode.
New York City-headquartered ElevenLabs is a voice AI research and deployment company, and a start-up that is attracting huge interest in this space.
If the name sounds familiar, it’s because ElevenLabs’ technology was recently used by Pakistan’s incarcerated former prime minister Imran Khan’s party workers to craft a speech for his followers using his voice clone.
“Compared to images and videos, AI audios are harder to detect because human auditory perception is different from visual perception, with the artefacts subtler to notice and can be masked by environmental factors more easily (e.g., replayed over a noisy phone line),” Lyu said.
Decode also reached out to Loccus.ai, a Barcelona-based AI start-up that has built an AI voice detection tool. The company confirmed that all four samples were very likely AI-generated.
How Convincing Are AI Voice Clones? Hear It Yourself
Decode independently created two voice clones of Shivraj Singh Chouhan and Kamal Nath to demonstrate how easy it is to clone someone’s voice and how realistic cloned voices sound.
The entire process did not require any knowledge of coding. The voice clones were generated within seconds.
The cloned voices deliver a message in Hindi urging people not to believe everything they see online and instead to send any suspicious messages to BOOM's WhatsApp tip-line (+91 77009 06588).
Professor Lyu said that generative AI tool providers should consider providing countermeasures to detect the use of such technology.
“ElevenLabs have already provided a tool that can be used to detect AI generated voices from their systems, possibly based on some hidden watermarks. I think this is a welcoming move. I think providing such countermeasures is an option most genAI tool providers should consider,” Lyu said.
Decode has reached out to ElevenLabs for comment. The article will be updated if we hear back from them.