The era of voice search and voice-operated software is upon us. As a developer I live and die by the keyboard, but I can already see the signs: Like many people, I talk to my Android phone ("Navigate to Lowes [or Starbucks or Harris Teeter]") to get directions.
Mary Meeker's 2016 Internet Trends Report notes that Google voice search queries have risen sevenfold since 2010. I've also noticed that my 12-year-old son does nearly all of his searches via voice -- and my girlfriend texts me this way on a regular basis. Also, the company I work for, Lucidworks, recently announced a new partnership with IBM to integrate Watson and text-to-speech capabilities into our enterprise search product.
The technology works a lot better than it used to, and it's easier to integrate into applications. If you develop for Android or iOS, you can easily hook into the APIs for speech recognition. But speech recognition doesn't begin and end with simple speech-to-text and voice commands.
Understanding the intent of a search is a highly contextual task, especially with spoken language. People tend to use more words when speaking naturally than when confronted with a search bar, so a spoken query carries more "noise words" than a typical typed search.
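To see what those extra words look like in practice, here's a minimal sketch of ordinary stopword removal applied to a spoken query. The stopword list is hand-rolled for illustration; a real system would use a curated list or a learned model, and would still need context to recover actual intent:

```python
# Minimal sketch: stripping "noise words" from a spoken query.
# The stopword list here is illustrative, not exhaustive.
STOPWORDS = {
    "hey", "please", "can", "you", "i", "want", "to", "the", "a", "an",
    "me", "for", "some", "show", "find", "of", "on", "um", "like",
}

def core_terms(spoken_query: str) -> list[str]:
    """Return the content-bearing terms of a spoken query."""
    tokens = spoken_query.lower().split()
    return [t for t in tokens if t not in STOPWORDS]

# A spoken query carries far more filler than a typed one:
print(core_terms("hey can you show me some deals on shoes please"))
# -> ['deals', 'shoes']
# A typed query is usually already terse:
print(core_terms("shoe deals"))
# -> ['shoe', 'deals']
```

Notice that the ten-word utterance and the two-word typed query boil down to the same terms -- the hard part, which this sketch ignores entirely, is recovering intent and context from what's left.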
These are significant AI challenges. But as we overcome the context problem, developers will learn that more can be done with voice than with text. Emotional context will play a role. If you're looking for a gas station, do you want the cheapest one or the closest one? The emotive content of your voice could imply that. Sure, you might clarify, but you might not have to.
Your talkative future
The voice-driven epoch isn't about search alone. It will affect the entire way we interact with computers. In the not-too-distant future, keyboards will be considered "quaint," as Scotty famously described them in "Star Trek IV."
But that shift also demands a whole new UI. Here's an ancient illustration of what I mean: When Windows 95 came out, IBM had integrated voice commands into its PCs. At the time, I was working as a salesperson at Office Depot, and it quickly became apparent how impractical voice commands were. The windowed interface didn't lend itself to this form of interaction at all.
I mean, how the hell do you move a window out of the way of another window and resize them both to fit on the screen in an efficient manner with voice commands? You don't. You ditch those windows (and probably Windows) altogether. A voice-driven UI doesn't use the same motifs. You never see a windowed interface on "Star Trek."
Speaking of "Star Trek," when people start coding or doing something technical, they always switch to a tactile interface (OK, not exactly tactile -- it looks more like a microwave keyboard overlaid with art nouveau renderings of a circuit board). But is the regression to "typing" necessary? True, I can't imagine using a voice interface to code in Scala. Maybe new languages (devoid of parentheses, unlike Scala -- and my articles) will be developed that are specially suited to voice.
Websites surely won't look the same and will offer new navigation paradigms. You'll say "show me deals on shoes," and what you get back will probably be better organized and more contextually sensitive than your average website ("deals" && "shoes"). Moreover, I won't want to scroll or say "next page" a lot, so the interactions will have to be personalized. The system should already know I want men's shoes and I don't want hard-heeled shoes due to my Achilles' tendonitis. Maybe it knows I prefer dark colors. Maybe I told it or maybe it analyzed my behavior.
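To make that concrete, here's a hypothetical sketch of how "show me deals on shoes" might become a structured, personalized query. Every name here -- the profile fields, the filter logic -- is an assumption for illustration, not a real search API:

```python
# Hypothetical sketch: turning a spoken request into a personalized,
# structured query. Profile fields and filters are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Profile:
    # Preferences the system learned from behavior or was told directly.
    gender: str = "men"
    excluded_styles: set = field(default_factory=lambda: {"hard-heeled"})
    preferred_colors: set = field(default_factory=lambda: {"black", "navy"})

def build_query(spoken: str, profile: Profile) -> dict:
    """Map a spoken request to a filtered, structured query."""
    # Drop filler words, keep the content terms.
    terms = [t for t in spoken.lower().split() if t not in {"show", "me", "on"}]
    return {
        "must": terms,                               # e.g. ["deals", "shoes"]
        "filter_gender": profile.gender,             # already knows men's shoes
        "exclude": sorted(profile.excluded_styles),  # no hard heels
        "boost_colors": sorted(profile.preferred_colors),
    }

query = build_query("show me deals on shoes", Profile())
print(query["must"])  # -> ['deals', 'shoes']
```

The point of the sketch is that the interesting work happens in the profile, not the query string: the same three content words produce very different results for different people.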
Is this a website at all? Sure, if I'm shoe shopping, I'll want a visual representation, but if I'm talking maybe the machine is talking back. Maybe it shows me shoes, then asks: "Are you looking for a particular type of shoe? What purpose are these shoes for? Are you wearing them hiking or to a party?"
The era of voice search will change everything from how we interact with machines to how we code. Many of the technologies we need are already available to us today, while others are yet to be invented. The effect on user interfaces could be more profound than the switch from punch cards to keyboards.
This sweeping change won't come all at once. Today isn't the day to throw out your keyboard. But it might be the day to start thinking about redesigning your website to be truly voice-accessible.