Our Voice-Controlled Computing Future: How Workers Use Voice Commands Today

Voice-controlled intelligent assistants offer a tantalizingly productive vision of end user computing. Using voice commands, users can extend the computing experience not just to mobile scenarios, but to hyper-mobile, on-the-go situations (such as while driving). With wearables like Google Glass, voice command promises even deeper integration into hyper-mobile experiences, as this video demonstrates. And voice-controlled intelligent assistants can also enable next-generation collaboration tools like MindMeld.

In spite of this promise, there remains a lurking sense that voice control is more of a gimmick than a productivity enhancer. (As of the time I posted this blog, a Google search for Siri+gimmick yielded… “about 2,430,000 results”). To see where voice control really stands, we surveyed information workers in North America and Europe about their use of voice commands.

Information workers’ use of voice control today:

In reality, many information workers with smartphones are already using voice commands – at least occasionally. Our survey revealed that:

  • Over a third of information workers with smartphones use voice commands at least occasionally. Some 37% say that they do use voice recognition on their smartphones – at least sometimes. But not many use it often; only 11% of information workers say they use it “all the time” or “regularly.” Instead, a majority of those who use voice commands describe their use as “occasional.”
  • Workers use voice for short-task, on-the-go computing scenarios. Topping the list of applications at 56% is texting – a frequent choice for on-the-go communications. Next on the list are search (46%) and navigation/directions (40%). All three of these fall into what we can call short-task computing activities – sending a text, executing a search, or finding directions each require little direct attention, and are often completed while on the go.
  • Taking/recording notes is the top productivity application. Fourth on the list is “taking/recording notes” (at 38%). While note-taking is a keyboard-substituting scenario, recording notes resembles using a tape recorder. Further down the list of productivity applications were composing full emails, creating calendar entries, and having the smartphone read something back (like an email or document).

Overall, our survey shows that a sizable percentage of workers are starting to embrace voice command. Yet it will be some time before voice controls join keyboards and mice as computing mainstays. Why? Although technical hurdles continue to fall, the market remains stifled by:

  • The platform wars. Voice control and the critical intelligent assistants associated with it are pawns in the OS platform wars, as Google, Apple, Microsoft, and others attempt to differentiate their experiences. As a result, what a user can achieve with Google Now differs from what she can do with Apple’s Siri for iOS, as this video demonstrates.
  • Other types of fragmentation. Confusingly, within the Android ecosystem itself there is competition – at least on market leader Samsung’s phones. Samsung S Voice offers a different experience from Google Now, though both can be used on a Samsung smartphone. This is exactly the kind of duplication that could confuse users, though recently Samsung created a new “Drive Mode” experience for the Galaxy S4 that could help differentiate the two.
  • Lack of continuity across form factors. Apple’s Siri isn’t available on the Mac; it came to the iPad later than it arrived on iPhones. Microsoft just introduced robust voice control into the forthcoming Xbox One, but it’s not identical to the experience in Windows. Nuance Communications’ Dragon fills part of the cross-platform gap by offering PC, Mac, and mobile versions, but it’s more of a dictation solution than an intelligent assistant. Until users know what to expect from voice control, adoption and acceptance will be hindered. (Note user frustrations with IVR systems!)

The bottom line: Expect voice control and intelligent assistants to continue to grow in popularity among workers and consumers. But expect speed bumps along the way as a variety of vendors compete to craft our voice-controlled computing future.

J. P. Gownder is a Vice President and Principal Analyst. Follow him on Twitter: @jgownder 

Comments

Voice input, text output

Your survey confirms what I have long predicted would result from the advent of multi-modal smartphones: the convenience and efficiency of speech for input when practical, and text for output when appropriate. Mobile user situations will dynamically determine what individual end users actually choose, e.g., in a meeting, driving a car, or in a noisy public environment.