At every stage of technological advancement, critics and commentators have endeavoured to predict its likely impact on society. Such predictions are often propelled by a fear of change, as best evidenced by the “Millennium Bug” era, when we were finally partying like it was 1999 in the shadow of a potential global technology meltdown.
A few years earlier, as the internet gained momentum and eroded our dependency on institutions such as the encyclopaedia and Yellow Pages, 3Com founder Robert Metcalfe predicted it would “soon go spectacularly supernova and in 1996 catastrophically collapse.”
Fast forward to the present day, and we’re on the cusp of a step change in how we interact with technology. This is led by our desire to engage with both our devices and our environment in a more natural way. Research from Futuresource suggests that uptake of, and demand for, devices featuring voice have continued to surge, with long-term forecasts pointing to 110 million units shipping in 2022.
It’s therefore important to clarify some key points about how our voice is used, and why the potential of our voice assistants will only be achieved with pinpoint accuracy and a high degree of contextual awareness. To do this, we must first dispel some common voice technology myths and misconceptions.
“It’s recording everything I do…”
Contrary to popular belief, voice-enabled devices in the home are not recording everything you say. There is a crucial distinction between ‘listening’ and ‘recording’.
Whilst voice-enabled devices are ‘always-on’, all they’re doing is waiting for the user to activate them using a specific wake-word. The wake-word (be that “Hey Google…” or “Alexa…”) is their cue to capture the voice string (the command) that follows. Unless the wake-word has been spoken in the vicinity of the device, the voice string is not recorded.
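For readers who think in code, the distinction between ‘listening’ and ‘recording’ can be sketched roughly as follows. This is a simplified illustration only, not how any particular vendor implements it: plain text tokens stand in for audio frames, and the names `capture_commands`, `WAKE_WORDS` and the `<silence>` end marker are all hypothetical.

```python
from typing import Iterable, List, Optional, Set

# Illustrative wake-words only; real devices match audio, not text.
WAKE_WORDS: Set[str] = {"alexa", "hey google"}


def capture_commands(
    frames: Iterable[str],
    wake_words: Set[str] = WAKE_WORDS,
    end_marker: str = "<silence>",  # hypothetical end-of-command signal
) -> List[str]:
    """Return only the voice strings that follow a wake-word."""
    captured: List[str] = []
    current: Optional[List[str]] = None  # None means listening, not recording

    for frame in frames:
        token = frame.strip().lower()
        if current is None:
            # Listening only: every frame is discarded unless it is a wake-word.
            if token in wake_words:
                current = []
        elif token == end_marker:
            # The command has ended: keep it and go back to listening.
            captured.append(" ".join(current))
            current = None
        else:
            # Recording: this frame is part of the command.
            current.append(token)

    if current:  # the stream ended in the middle of a command
        captured.append(" ".join(current))
    return captured


stream = [
    "what", "a", "day", "<silence>",
    "alexa", "play", "jazz", "<silence>",
    "private", "chat",
    "hey google", "set", "a", "timer",
]
print(capture_commands(stream))
```

In this toy model, everything said before a wake-word is heard is dropped without being stored, and only the command that follows is kept.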
Part of the problem is a lack of understanding about precisely how the technology functions, which leads to misplaced fears around data privacy. Certainly, further education is needed as we transition to a world of ‘ambient technology’, a future in which unseen enablers exist all around us.
Before long, voice-enabled devices will exist everywhere and will be usable by everyone. As digital assistants evolve to become part of the fabric of daily life, the trust issue will fall away, with consumers safe in the knowledge that data shared with the ambient computing environment is secure.
It’s also worth remembering how comfortable we are with mobile phones, which have more sensors in them than any smart speaker today. In the same way mobile phones became integral to the operation of our everyday lives, ambient technology will be at the heart of the way we communicate and function.
“My voice assistant never understands my commands”
Whilst speech recognition isn’t perfect, the latest far-field voice capture technologies detect voice commands with an extremely high degree of accuracy. This is the case even when a command is delivered from across the room in a busy environment, and when the user speaks softly.
Accuracy rates will only improve as the technology is refined and matures. After all, voice is still in its relative infancy, which is often forgotten due to the success of the global smart speaker market.
Soon, voice assistants will understand not only what you’re saying, but precisely who is speaking, and even why. The AI-fuelled digital twin will observe the user, anticipate knowledge gaps and idiosyncrasies, and coordinate itself with the user’s preferences, schedule and specific needs. Relevant information will be delivered to the user before they even know they need it.
No longer will we be tied to a device that is deaf to the specific way we’d like to use it. As devices become more intelligent, they’ll gain a contextual understanding of the user’s desires and requirements, meaning that the effects of imperfect voice capture won’t be felt as acutely as they might be today.
“Voice assistants are just glorified speakers”
There is no question that public consciousness has woken up to the power of voice, and smart speakers have played a key part in that growth. Futuresource’s latest estimates for home consumer electronics with “built-in” voice assistants have been revised upwards from 2017-18 to 75.6 million units, based on increased shipments in the media streamer category.
The current level of growth is undeniably positive for the industry, but this doesn’t mean that voice is living up to its full potential. Media streaming devices such as smart speakers are responsible for a considerable majority of the revenue generated by the voice-enabled market, which means voice is penetrating only a fraction of its potential pool of applications.
With the latest silicon, it’s easier than ever for manufacturers to integrate voice functionality into any smart home device. Advances in engineering are spurring a wave of product implementations that are transforming the way humans interact with their devices, from healthcare companion robots to set-top boxes. As more and more devices are developed with in-built voice functionality as standard, today’s narrow focus will widen, and the public will embrace the full scope of voice technology’s potential.
The various myths and misconceptions orbiting voice technology stem from a fundamental misunderstanding about how the technology works, and what it’s trying to achieve. This is not a position that will change overnight, but rather will shift incrementally over the coming years as we transition to a world of ambient computing, in which unseen enablers exist all around us, accessible to everyone. This world will be driven by devices that achieve a contextual understanding of users and their idiosyncrasies, and work behind the scenes to optimise the everyday. As voice technology blends into the background, it will become as humdrum and as widely used as the mobile phone.