In the customer service industry, your accent dictates many aspects of your job. It shouldn’t be true that there is a “better” or “worse” accent, but in today’s global economy (though who knows about tomorrow’s) it helps to sound American or British. While many workers take accent neutralization training, Sanas is a startup with another approach (and a $5.5 million funding round): using speech recognition and synthesis to change the speaker’s accent in near real time.
The company has trained a machine learning algorithm to quickly and locally (i.e. without using the cloud) recognize a person’s speech on one end and output the same words on the other end with an accent chosen from a list or detected automatically from the other party’s speech.
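The pipeline described above — recognize speech locally, re-synthesize it in a target accent, chunk by chunk to keep latency low — can be sketched in Python. Everything here (function names, the accent list, the chunk format) is an illustrative stand-in for how such a streaming loop could be structured, not Sanas’s actual code or API:

```python
# Illustrative sketch of a local, streaming accent-conversion loop.
# All names and behaviors here are hypothetical stand-ins.

TARGET_ACCENTS = ["american", "british", "indian", "filipino", "australian", "spanish"]

def recognize(chunk: bytes) -> str:
    """Stand-in for an on-device speech recognizer (no cloud round-trip)."""
    # In a real system this would run an ASR model on an audio buffer;
    # here we pretend the chunk already decodes to text.
    return chunk.decode("utf-8")

def resynthesize(text: str, accent: str) -> bytes:
    """Stand-in for an accent-conditioned speech synthesizer."""
    return f"[{accent}] {text}".encode("utf-8")

def convert_stream(chunks, accent: str = "american"):
    """Process audio chunk by chunk so output lags input by only one chunk,
    which is what keeps the conversion 'near real time'."""
    if accent not in TARGET_ACCENTS:
        raise ValueError(f"unsupported accent: {accent}")
    for chunk in chunks:
        yield resynthesize(recognize(chunk), accent)

out = list(convert_stream([b"hello", b"how can I help you"], accent="american"))
```

The key design point the sketch captures is that each chunk is recognized and re-synthesized independently as it arrives, rather than waiting for a full utterance, which is why the conversion can run between the microphone and the call software with little added delay.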
It fits right into the operating system’s sound stack, so it works right out of the box with just about any audio or video calling tool. Right now, the company is running pilot programs with thousands of people in locations ranging from the US and UK to the Philippines, India, Latin America and beyond. Supported accents will include American, Spanish, British, Indian, Filipino, and Australian by the end of the year.
To tell the truth, the idea of Sanas bothered me a little at first. It sounded like a concession to people who consider their own accent superior and look down on everyone else’s. Technology will fix the problem … by accommodating them. Great!
But even though I still have a bit of that feeling, I can see that there is more to it. Put simply, it’s easier to understand someone when they speak with an accent similar to your own. And customer service and technical support is a huge industry, performed largely by people outside the countries where the customers live. That basic disconnect can be addressed by putting the burden on the entry-level worker, or by putting it on the technology. Either way, the difficulty of making oneself understood remains and needs to be addressed — an automated system simply makes that easier and lets more people do their jobs.
It’s not magic — as you can hear in this clip, the character and cadence of the person’s voice are only partially preserved, and the result sounds considerably more artificial:
But the technology is improving, and like any speech engine, the more it is used, the better it gets. And for someone not used to the original speaker’s accent, the American-accented version may well be easier to understand. For the person in the support role, that likely means better outcomes for their calls — everyone wins. Sanas told me the pilots are just getting started, so there are no numbers from this deployment yet, but earlier testing has suggested a dramatic reduction in error rates and an increase in call efficiency.
In any case, it’s good enough to have attracted a $5.5 million funding round, with participation from Human Capital, General Catalyst, Quiet Capital and DN Capital.
“Sanas strives to make communication easy and frictionless, so that people can speak with confidence and understand each other, wherever they are and whomever they are trying to communicate with,” CEO Maxim Serebryakov said in the press release announcing the funding. It’s hard to disagree with that mission.
While the cultural and ethical issues around accents and power differentials are unlikely to go away, Sanas is trying something new that could be a powerful tool for the many people who need to communicate professionally and find that the way they speak is an obstacle to doing so. It’s an approach worth exploring and discussing, even if in a perfect world we would all simply understand one another better.