
By CLAIRE DUFOURD
AI voice cloning technology has advanced rapidly over the last few years, showing up in fields from marketing to entertainment.
This technology uses artificial intelligence (AI) to replicate a person’s voice from only a few minutes of recorded audio. For local newsrooms facing staffing or financial shortages, voice cloning could offer new ways to deliver audio content to their audiences. But it also raises concerns about transparency and audience trust.
“So what’s happened within the past year is that we’ve gone way beyond just text-to-speech, which was already really helping people who are visually impaired,” says Sheena Rossiter, a professor at MacEwan University who co-leads a research project focused on voice cloning and its impact, along with Prof. Angela Misri. “And now you can truly make yourself a multilingual, expressive, emotional voice replication, which could really help transmit information much faster.”
With the help of this technology, newsrooms could convert written stories into audio quickly, publish them in multiple languages or make content more accessible for people with visual impairments.
In small newsrooms, where reporters already tend to take on multiple roles, AI-produced audio could save time and enhance the quality of publications. Misri says the opportunities could be significant if handled well, aiding in everything from creating theme music and stings to generating alternate-language versions of podcast episodes.
“I think at its core, voice-cloning AI will allow journalists to participate in more of the creative process around their stories,” says Dmitry Mironov, a research assistant on the project. “Where previously, one large story would require a journalist, a camera operator, and an audio person, we’re now at a point where all of these things can be done by one individual.”
When used with consent, voice cloning could let journalists keep their audio production consistent even when they are unavailable to record, offering a time-saving alternative to traditional recording sessions.
But with any use of AI, especially life-like replicas, the risk lies in trust and transparency between newsrooms and their audiences.
“I think disclosure and transparency are so crucial when it comes to AI use, and the audience’s trust in media has already eroded so much in the last decade,” Rossiter says. “Being transparent about how this technology is being used and how accurate the information is in order to maintain trust of the news consumer is vital.”
Voice cloning has already been widely used for scams, with UNESCO reporting instances of serious medical scams and EY Cybersecurity reporting that nearly 25 per cent of Canadians have encountered deep-fake fraud calls.
In a local context, a false recording attributed to a reporter or a public official could damage trust in the community they live in and report on. Even legitimate newsroom use could confuse audiences if reporters do not clearly disclose when audio has been made with AI.
“If we’re speaking from the perspective of journalists, I believe AI voice cloning software will be used in the near future to replace labour-intensive roles in the journalistic process. On paper, this is good for both the industry and the individual journalist. Stories come out faster & with less effort,” Mironov says. “However, funding has been scarce in the industry, and unless there’s a massive change soon, newsrooms are going to have to find a means to operate with a reduced budget, which could result in the displacement of even more journalists.”
Voice-cloning technology also raises the question of who owns a journalist’s voice when it comes to synthetically reproducing it for publishing purposes.
“Legislation and laws don’t work as fast as technology moves, so you’re always sort of stuck in that game of the hare and the tortoise,” Rossiter says. “I do see that there will be a great need for new policies around generative AI, but the question is really how fast it can go, and if it can catch up.”
Rossiter also points out that freelancers and contract reporters could be at greater risk, since newsrooms could potentially keep using their voices after their contracts end. The issue has already entered public awareness, with figures such as radio personality David Greene suing companies that licensed their voices without consent.
Without clear policies and ethical standards, newsrooms could replace human labour while still using a reporter’s voice for audio recordings. Rossiter recommends prioritizing consent and transparency when adopting the technology.
“I think when it comes to any kind of AI technology, transparency is always key, right?” she says. “So any policy or regulation should prioritize this to make sure no information gets lost between the primary source and the published piece.”



