Will Computers Ever Replace Human Voice Actors?

Technology has advanced tremendously in the past decade and continues to do so, with capabilities such as machine learning, artificial intelligence, text-to-speech (TTS), voice recognition, and robotics. These advances leave many of us wondering how much of this development, particularly in artificial intelligence, the human race is willing or able to take. In an interview with the Australian Financial Review, engineering genius Steve Wozniak, who co-founded Apple with Steve Jobs, warned that “computers are going to take over from humans, no question” and went on to say that “like people including Stephen Hawking and Elon Musk have predicted, I agree that the future is scary and very bad for people”. Some prominent technologists argue that “algorithmic regulation” should replace current public service models. Algorithmic regulation proposes that automated systems take over the role of human decision-makers and policy-makers because such systems are more efficient at comparing outcomes to desired objectives through big data analytics. These startling predictions and arguments all point in one direction: we may soon see the day when computers become more intelligent than humans and eventually take over our jobs. In fact, this unnerving prospect has led dozens of the world’s top artificial intelligence experts to sign an open letter encouraging researchers to focus on the huge potential benefits of artificial intelligence while avoiding potential pitfalls.

Moreover, an NBC News article entitled “Nine Jobs that Humans May Lose to Robots” reported that robots are currently analyzing documents, filling prescriptions, and handling other tasks that were once done exclusively by humans. The jobs at risk include pharmacists, lawyers and paralegals, drivers, astronauts, store clerks, soldiers, babysitters, rescuers, and sportswriters and other reporters. Add to this list the National Weather Service in Alaska, which has used computerized voices in place of real human voices for its radio forecast broadcasts since June of 2014. Similarly, in the entertainment industry, some of the biggest movies of all time are animated films. Movies like Frozen, Shrek, The Lion King, or the Toy Story series have probably used a sophisticated digital animation technique called “motion capture”, more popularly known as mocap. And while the voices of these digital characters have been performed by human voice over actors, some advertising agencies, production companies, and industrial companies such as mobile phone manufacturers have used digital voices to “speak” to their customers or users through Text-to-Speech (TTS) technology.

This trend in artificial intelligence and digital technology may actually pose a big threat to humans. We may soon find ourselves short of jobs while robots dominate the workforce and digital actors, given life by digital voices, dominate the movies. It’s a scary thought – and it could happen – but here’s the good news.

Rick Robinson, IT Director for Smart Data and Technology at Amey, one of the UK’s largest engineering and infrastructure services companies, and previously IBM UK’s Executive Architect for Smarter Cities, presented an exceptional argument, telling The Urban Technologist, “I’m convinced that the current generation of Artificial Intelligence based on digital technologies will not re-create anything we would recognize as conscious life and free will; or anything remotely capable of understanding human values or making judgments that can be relied on to be consistent with them”. He further stated, “I think that when most people think of what defines us as humans, as living beings, we mean something that goes further: not just the intelligence needed to take decisions based on knowledge against a set of criteria and objectives, but the will and ability to choose those criteria and objectives based on a sense of values learned through experience; and the empathy that arises from shared values and experiences”. If we believe, as he does, that “the human world and the things that we care about can’t be wholly described using logical combinations of computer programs and data”, then we can perhaps be assured that “digital technology cannot wholly replace human workers in our economy; it can only complement us”.

In the entertainment industry, for example, Ed Leonard, CTO at DreamWorks, told the BBC that they do not see mocap as a threat to the actor’s craft. “If you wanted to recreate and make an actor photo-real digitally, we have the technology to do that,” he said. “But it will never replace actors – that doesn’t make any sense. Talent is about the expression of that performance.” He went on to say that “generally the results of motion capture just aren’t good enough” and “you want expressiveness, you don’t want literal translation. It’s come a long way but in terms of using it for animated films it’s not what we’re looking for.”

Similarly, today’s technology hasn’t reached the point where it can match human voices. While there has been a huge improvement over the robotic voice, computer-generated voices are still a far cry from human voices. But this is just for animated films.

What about voice actors for radio and television commercials, narrations, and audiobooks? How long will it be before they can be replaced? We think that day is still a long way off. But let us know your thoughts below.

CM Serra

CM Serra has a long history of working with voice talent. Starting out as an assistant and working her way up to a Casting Director, she's seen it all. Her forte may be working with 'old school' agents, but she's witnessed the voice over industry move online in recent years.

  • Great post CM! I’ve thought about this before, and although voice technology is getting better and has taken the place of some real voice overs, specifically in the automated phone service sector, I don’t believe computers can replicate the emotion a human voice can create. I look at the human voice like an instrument. Computers have been able to simulate musical instruments for decades now but haven’t eliminated real musicians. You can use a computer to make a guitar solo, but I don’t believe you’d be able to replicate the emotion that a guitarist would put into it or the little intricacies that go into playing that instrument, like a Jimi Hendrix solo. You can probably create something similar, but it would definitely be missing something. Hopefully the same will always be true for voice overs as well.