Contact Amanda
Send a message and receive her reply immediately.
Feel free to write your questions and project details here - your information is safe.
It was a Thursday in August 2023 when somebody called me from Italy. I couldn't pick up the phone, so an email followed right away: a multinational company was considering launching a new service, and they wanted to consult me on its viability in Brazilian Portuguese.
The names of people and companies, along with anything that could identify them or cannot be disclosed, have been omitted.
We jumped on a meeting 30 minutes later and had a warm and exciting discussion about the new service they were considering: AI voice production and post-production, also known as artificial intelligence voice synthesis and editing.
I was recommended by someone inside the company (whom I hold very dear) to be the sound and Brazilian Portuguese (BRPT) language specialist to:
In summary, I was asked to combine my experience and knowledge as a linguist and as a sound engineer specializing in voice and voice-over in Brazilian Portuguese to assess the viability and the output quality of this new service.
It was an interesting business proposal from my perspective as an audio engineer and a linguist. I was going to:
So at this point, you might be convinced that this is not an article by a voice actor demonizing and criticizing the era of AI synthetic voices. It's an article by a sound engineer who specializes in voice and has an open mind and a true desire to understand the state of the industry and the real viability, risks, pros, and cons of AI voice services.
Only by being open, technical, and honest could I arrive at a lucid, unbiased conclusion (at the end of the article).
In September 2023, the company prepared a testing project - just like a real voiceover project, with the client's briefing, requirements, and script ready to be recorded.
In late September and throughout October and November 2023, we had our first real project, and this time the client required AI voiceover synchronization with the videos in English.
Based on the experience from both the testing phase and the real project, I wonder why they chose AI voice for such a project.
It could have been their initial aspiration, but it's not what they received at the end of the project. Currently, the AI voiceover sounds more like a weird, unbalanced voice actor than a genuine robot voice.
Additionally, if the goal was to have a Voice of God narration (neutral but not robotic, of course), a professional voice actor would have been the shortest way, or rather the only way.
By the way, the AI voice's timbre in Brazilian Portuguese didn't sound distant at all. In some instances (when there weren't glitches, tremolos, and weird cadences), a segment of the voice could have been perceived as human. It sounded like the voice of a real person from São Paulo (neutral Paulista accent) who lacked communication skills and whose recording had been messed up and truncated.
As you can see, timbre is no longer a technical issue for AI companies. They have reached a stage of development where an AI voice's timbre is quite similar to a human's.
The problem now is with pitch, consistency, the quality of vowels and consonants, and many other aspects that make the human voice so suitable, versatile, and desirable.
AI companies are also investing more and more heavily in a conversational, natural tone for their synthetic voices. So I don't think the final client was looking for a distant, robotic voice. I can't see how it could have benefited them in this project. And a robotic voice was not what they got as output, anyway.
No, because the math doesn't make sense. If the AI voice had been generated by the final client itself, the costs would surely have been lower for them. However, because of the middle company and the human QA, they were paying large amounts to a renowned company and to skilled professionals in each language so that we could fix the mistakes and improve the perceived quality of something that was poor from the start.
Absolutely not. The middle company still had to deal with about the same number of people: us, the audio engineers and linguists.
Possibly.
Possibly.
That's possible too. However, since AI voices are most commonly used in low-budget and amateur projects, there's no competitive advantage for serious businesses in being associated with this low-quality vocal aesthetic and crude speech.
I finish this article knowing that I have much more to report and share with you about artificial intelligence, generative AI, voice cloning, and speech synthesis.
This subject doesn't end here. Send me an email to subscribe to my newsletter and become an insider. Get original and fresh information and discover what nobody is saying about AI & voice.
See you soon,