Artificial intelligence is surely the “thing” of our very near future, and it seems Microsoft is making real strides in this field.
Recently, Microsoft researchers succeeded in creating speech recognition software that, the company claims, “is able to hear language as accurately as humans.” The company broke its own record for most advanced speech recognition software, and the technology was described in detail in a paper published on Monday.
According to the paper’s findings, the software had a word error rate of 5.9 percent, which is equivalent to that of human beings.
Microsoft explained this in an article:
The research milestone doesn’t mean the computer recognized every word perfectly. In fact, humans don’t do that, either. Instead, it means that the error rate – or the rate at which the computer misheard a word like “have” for “is” or “a” for “the” – is the same as you’d expect from a person hearing the same conversation.
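For readers curious how such a figure is calculated, here is a minimal sketch of the standard word error rate formula: the number of substituted, dropped and added words, divided by the number of words actually spoken. The function name `word_error_rate` and the sample sentences are made up for illustration, and the article does not say exactly how Microsoft's paper scored its transcripts, so treat this as the textbook definition rather than the company's method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; not taken from Microsoft's paper.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word was dropped
                curr[j - 1] + 1,     # extra word was added
                prev[j - 1] + cost,  # word matched or was substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: "the" is misheard as "a" and "is" as "have".
spoken = "the cat is on the mat"
heard = "a cat have on the mat"
print(f"WER: {word_error_rate(spoken, heard):.1%}")
```

In this made-up example two of the six spoken words are misheard, so the error rate is one in three. A rate of 5.9 percent corresponds to roughly six mistakes per hundred words.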
Just over a month ago, Microsoft achieved an error rate of 6.3 percent by relying on deep neural networks. This technology works in a way loosely similar to how the human brain works and interprets information, and it runs on specialised graphics processing units (GPUs), which allow unprecedented learning speeds.
The milestone will have far-reaching implications for our day-to-day lives. For starters, Microsoft’s products like personal assistant Cortana and Xbox would be a whole lot better at understanding voice commands. Accessibility software, like instant transcription services, would also be more precise.
It could also add value to Microsoft’s productivity tools like Office. MS Word’s dictation feature reaching near-human accuracy is an enticing prospect indeed.
But in the larger scheme of things, this marks a major milestone for AI research. Geoffrey Zweig, from Microsoft’s Speech and Dialog research group, shed light on this prospect, pointing out that the next phase involves building software that not only transcribes human speech but also tries to make sense of it. This latest development in transcription is a huge step towards achieving that ultimate goal.
What are your thoughts on this incredibly accurate speech recognition technology?