At least three artificial intelligence (AI) voice clones were used to peddle disinformation on social media just days before India’s capital went to vote.
Two videos, made with fake graphics and AI voice clones of Hindi news anchors and purporting to show Aam Aadmi Party (AAP) West Delhi candidate Mahabal Mishra ahead in opinion polls, were shared on Facebook and X (formerly Twitter).
Separately, a “leaked audio” purporting to be a phone call between Rajya Sabha MP Swati Maliwal and YouTuber Dhruv Rathee also went viral.
The fake call, made with synthetic voice clones of Rathee and Maliwal, was used to target the popular YouTuber as well as Delhi Chief Minister and AAP chief Arvind Kejriwal and his wife Sunita Kejriwal.
Maliwal has alleged that Kejriwal’s close aide Bibhav Kumar assaulted her at the chief minister’s residence earlier in May. The AAP has rejected Maliwal’s allegations and accused her of being a pawn of the Bharatiya Janata Party (BJP).
The three voice clones were shared separately on various social media platforms just days before Delhi went to vote on May 25, 2024, in the sixth phase of the 2024 Lok Sabha elections.
Decode also found a Facebook page named West Delhi Welfare Society promoting the two fake videos, which claimed AAP’s Mahabal Mishra would win his constituency by a huge margin. About ₹4,500-₹5,000 was spent on promoting each of the two posts.
A post on X that shared the fake Maliwal-Rathee synthetic audio had over 91,600 views. X’s Community Notes were also not visible on posts sharing the fake audio.
Voice Clones Are A Preferred Vector Of Disinfo
AI voice clones, which cost little to nothing to create, have been a popular vector of disinformation during the general election.
Decode earlier reported how several voice clones were used to spread disinformation during the Madhya Pradesh assembly elections late last year.
However, many industry experts working on deepfake detection say this election cycle has seen fewer deepfakes than they anticipated.
"We anticipated being extremely busy, processing around 50,000 to 100,000 deepfakes. However, over the entire three-month period, we managed to process only about 800 to 1,000 deepfakes, which was significantly below our expectations," Professor Mayank Vatsa, whose department at the Indian Institute of Technology Jodhpur has built a deepfake detection tool called Itisaar, told Decode.
"I believe the use of sophisticated deepfake technology is still relatively uncommon," he added.
"Most of the existing deepfake research and generation efforts, predominantly in the West, focus on the English language. For Hindi or other local languages, significant investment and dedicated teams are required to train the models, which not everyone can afford or accomplish. This likely impacted the overall generation of sophisticated deepfakes where synthetic or altered audio in local languages is seamlessly lip-synced, similar to what is achieved in English," Professor Vatsa explained.
Digvijay Singh, co-founder of Contrails AI, a company developing deepfake detection systems, had a similar take.
“A lot of people, including me, were anticipating a tsunami of deepfake content around misinformation. However that didn’t exactly happen,” Singh told Decode.
Deepfakes: No Code, No Problem
Divyendra Singh Jadoun, the Ajmer, Rajasthan-based founder of Polymath Synthetic Media Solutions, told Decode he stopped taking on political projects after the second phase of polling, citing ethical concerns about requests made by political parties.
But voice cloning technology has become so accessible that anyone can make a voice clone for free using any of the many websites offering the feature. No coding skills are required.
"Now anybody, even a 10 year old kid can create a voice cloning or call recording like the Dhruv Rathee and Swati Maliwal one (voice clones)," Jadoun, who goes by the moniker 'The Indian Deepfaker,' said.
"There are so many websites available, in less than five dollars you can clone the voices of ten people and the character limit is up to one lakh. If anyone wants to do face swapping it’s also very cost effective. You just have to upload a single image and a target video and you'll get a deepfake video in less than three minutes," Jadoun said.
"For these kinds of unethical things these political parties are not reaching out to any companies because it’s very accessible. They are creating (deepfakes) on their own or (through) other people who have bad intentions...I don’t know but anybody can create it," Jadoun added.
Political Parties Are Testing The Waters With AI
From using voice clones of popular leaders for voter and cadre outreach to creating deepfakes that promote their own candidates and target rivals, political parties in India are dipping their toes into the AI stream in myriad ways this election.
At the same time, increasingly convincing deepfakes have led to a few politicians trying to dismiss genuine videos as deepfakes.
In March this year, AAP functionaries created voice clones of Arvind Kejriwal in Hindi and English to deliver a message to his followers, supposedly written while he was in the custody of the Enforcement Directorate.
The Congress party was called out for peddling AI voice clones of actors Aamir Khan and Ranveer Singh, separately, to target Prime Minister Narendra Modi and the BJP.
Meanwhile, the BJP, which was first off the blocks to embrace deepfakes back in 2020, has relied on its sprawling network of shadow pages on Facebook and Instagram to create satirical AI-based videos targeting its rivals.
Ambiguous social media guidelines on AI-generated content have enabled these videos to circulate freely while evading fact-checking.
As a result, the deeper problem of disinformation, and the unwillingness of platforms such as Facebook and X to bring political parties to heel, has largely gone underreported.
India-Specific Deepfake Generative Algorithms Needed For Sophistication
The wide spectrum of synthetic content in India has made it hard to figure out whether deepfakes had any influence on the electorate; a longer-term study is needed to determine that.
“We should see this election as a budding ground for deep synthetic media manipulation as an attack vector. As sophistication and accessibility improves, I’m sure the same folks will come up with something much more impactful,” Contrails AI’s Singh said.
When asked if the fear of deepfakes has been overhyped, Professor Mayank Vatsa answered with an emphatic "no".
"If we have Indian specific deepfake generative algorithms in local languages then we will see more challenging outputs coming in," he explained.
"The Indian research community was slightly late due to computational resource constraints which limited the generation of foundational models and generative AI models tailored to the Indian context, specifically for audio and videos. However, we are making progress. While these technologies have positive applications, they can also be misused," Professor Vatsa said.
The professor said, "I will give an analogy of cricket because I love cricket. A bowler has six balls but only needs one good ball to take a wicket - he need not have all six good balls. Similarly, just one or two highly convincing deepfakes can cause significant disruption. Fortunately, we haven't seen this happen yet, which I again attribute to the lack of advanced text-to-speech and generative deepfake algorithms for the Indian context.”