The Future of Bots: Voice as the Primary Computer Interface
Published on May 12, 2021

In my college days, I had serious doubts about voice ever becoming a viable computer interface, but the rapid advancements in bot and AI technology have me reconsidering that stance. I spoke with Microsoft Technical Evangelist Martin Beeby and Opearlo Co-Founder/CTO Oscar Merry about what they envision as the future of bots and AI. They both see voice-led computing happening sooner than many expect.

Watch the clip of our conversation here, or scroll below the video to read the transcript.

Voice as the Primary Computer Interface (Full Transcript)

Martin Beeby (Technical Evangelist at Microsoft): No, I suspect the only thing about this whole bot space is that I've only been involved really in building stuff for about six to seven months. And from the beginning of that time to now, things have massively changed about how we build bots, how we recommend it to customers to build bots. And I suspect that this is a really fast moving industry.

There's lots of different channels coming up all the time, there's lots of different capabilities. I think we're really on the cusp of what's gonna come next. And I suspect in six months time or maybe a year's time, things will look very, very different to what they look right now.

Sam Machin (Nexmo Developer Advocate & Alexa Champion): It's definitely moving very fast. What always surprises me is for years I've always thought speech was certainly never going to be an interface. I mean, how long have we had things like DragonDictate or that kind of idea that you could talk to your computer to...I remember when I was at school, I mean, that's 15 years ago or something, the idea that I wouldn't have to type out my essay. I could just read my essay to the computer and it would do the typing. And it never worked and things. And now…

Martin: That's exactly how, like that touch metaphor I was using about 2004, that's exactly how touch interfaces were. Lots of people would say you could never use touch as the main input for your system. You always have to have mouse and keyboard, it's just the way it is. And then that proved not to be true.

And I think that while speech is a much harder problem to solve, it will be solved. And it will be solved very quickly. And there will be a point in time when the input to systems will generally be led by voice. I think that's scary but also really interesting because that changes lots of things.

Like, if you don't need devices anymore with screens, then there's lots of different scenarios which open up. And it changes the way we interact with technology. It's just a matter of when that's gonna happen, not if, I think.

Sam: Interesting. So you actually think we'll end up driving our computers more, doing our work as well as just going via voice interfaces and AI-type assistants or…

Martin: Precisely. That's completely conceivable. And that's definitely the way that speech systems are being developed. They're trying to understand more and more natural language, but I think that they will become ambient computing devices which you don't necessarily have to have a screen in front of you. You don't necessarily have to have a touch interface. It will be much more voice-led. I don't see any reason why that couldn't be achieved in the next 10 or so years. And that will be the operating system, your speech will be your operating system.

Sam: Cool and scary at the same time, I think. But we're still...it's Star Trek.

Martin: That's why if you look at every major technology company, that's why they're investing so much in speech. Because that's the way that they see their operating systems being used in the future. That's why it's so important and critical and why billions of dollars of research are going into speech right now.

Sam: Oscar, did you...was there anything else you wanted to mention?

Oscar Merry (Co-Founder & CTO at Opearlo): I think we've covered it all off. I think one thing that I always talk about when I'm at meetups or on talks like this is just that I think the future that Martin's describing will come quicker than a lot of people realize. I mean, we'll get to a point very quickly where every home has a digital assistant in it and people just start using that as the first, kind of, interface to their brand. And I think that's gonna happen a lot faster than a lot of people realize.

Martin: Exactly. It happened with touch like that. Like it was literally a space of a year where touch stopped being rubbish to being excellent. And it's gonna happen with speech, I think.

Oscar: Yeah. And right now a lot of people don't realize or don't believe that that will happen because the reality is that the technology still isn't quite there yet. And people are still having these frustrated experiences that we've been talking about. So a lot of people are saying, "No, that technology’s rubbish." But there will be that moment where the technology suddenly gets good enough that the majority of people's experiences are really good. And that will be it.

[Editor’s Note: Watch the full one-hour discussion on the state of AI bot technology.]

Sam Machin, Vonage Alumni

Sam Machin has nearly 20 years of experience in the communications industry and a strong track record of innovation. His personal projects mostly revolve around Amazon Alexa. Sam created the AlexaPi project, which allows developers to build their own Alexa-powered devices, and he is recognised by Amazon as an Alexa Champion.
