Behind the scenes: voice tech history in the new millennium

Photo by Priscilla Du Preez on Unsplash

They say voice recognition technology has seen more progress in the last 50 months than in its first 50 years.

Now that you know how voice technology came to be, let’s take a look at what happened next, in the 21st century.

So far, it’s been a hell of a ride!

2001. Microsoft Speech

Starting in 2001, when Office XP came out, Microsoft provided limited speech recognition for Microsoft Office apps. You could use it to dictate text into any Office program and to select menus and other Office features. Microsoft Plus! for Windows XP let you navigate Windows Media Player using voice commands. The catch was that the speech recognition software didn’t come preinstalled. You had to install it as a separate component and then train Microsoft Speech to recognize your voice.

Only in 2007, when Vista was released, did Microsoft build speech recognition into the operating system itself.

2008. Google Voice Search

In 2008, Google released the Voice Search app for iPhones. It allowed users to perform a Google search by simply dictating a query (strange that this was such a big deal only a decade ago, huh?).

Since it was a mobile app, Voice Search made speech recognition available to millions of people. By the way, even in 2008 the app relied on cloud infrastructure: data was offloaded and processed at Google’s data centers, which made the app work really fast.

Not only that, by 2010 Google was training its language models on 230 billion words typed in search requests at Google.com to predict what a person uttering a query would say next.

2011. Siri

In 2011, Apple launched Siri on its iPhone 4s. Siri too relied on cloud computing, but unlike its predecessors it could recognize speech and act upon what it heard. Upon its release, Siri was praised for voice recognition capabilities and contextual knowledge of user information like calendar appointments, but received criticism for its stiff user commands and lack of flexibility.

Either way, this marked the dawn of voice assistants as we know them today.

For those of you who wonder how Siri got its name: it was chosen by Dag Kittlaus, one of Siri’s co-founders. In Norwegian, Siri means "beautiful woman who leads you to victory". By the way, Steve Jobs wasn’t a big fan of that name.

2011. IBM Watson and ‘Jeopardy!’

Something else happened in 2011. IBM Watson, a computer capable of understanding questions in natural language and answering them, beat two human champions at the game of ‘Jeopardy!’. 

Source: The New York Times

Although Watson proved to be imperfect, it could still tackle convoluted and often opaque statements, which had been IBM’s goal all along. If you want to see how it all went down, check out this video recap.

2014. Cortana and Alexa

What followed was an explosion of voice assistants. 

In April 2014, Microsoft announced Cortana, an assistant similar to Siri and named after a character in the Halo video game franchise. It uses Bing search to perform tasks such as setting reminders and answering questions.

Interesting fact: to develop its digital assistant, the Microsoft team interviewed human assistants. These interviews were an inspiration for several features in Cortana, including its "notebook" feature.

And in November 2014, Amazon introduced Alexa, a voice assistant that lives in the Echo, a voice-controlled speaker. While the other two assistants, Siri and Cortana, are just features of the devices they are used on, the Echo is entirely dedicated to Alexa.

Unlike its rivals, Alexa got its name for a very pragmatic reason: the X is a hard consonant, which allows the name to be recognized with higher precision.

2018. Full-fledged voice AI

Among all these technologies, a new stream has emerged and is evolving at breakneck speed. There is now voice artificial intelligence that employs speech recognition and speech synthesis to automate entire areas of voice communication. 

Ours is one such technology. What makes Dasha stand out is its proprietary tech that allows it to sound like a human. You can request a demo to find out what I’m talking about. 

Whether voice interfaces are the next big thing is a question open to debate.

But I know one thing for sure: voice has always been the most natural way to communicate. And voice technology has come all this way to make sure it’s here to stay. 

Voice technology timeline: the new millennium
