Get ready to talk with Smart Devices

Aju Kuriakose

Controlling devices at home and workplace with voice was what we saw only in SciFi movies a few years ago. However with the advancement in AI, NLP etc. this has become a reality now. The number of devices with which we can interact via voice at home and office has grown in past 3-4 years.

Amazon has an undisputed leadership in this and its ecosystem is way ahead in market penetration compared to its competitor. There are many more companies in the foray including OK google and Cortana from Microsoft. There are also many smaller companies like Cyberon, Conexant all working towards enabling voice controlled devices.

As per Strategy Analysis voice could capture up to 12% of Industrial-IoT applications by 2022 and In the consumer segment, voice has the potential to capture up to 18% of application in the 2020 to 2022 timeframe.

Using voice to control devices will dominate and become the most preferred way to interact with IoT devices in coming years. It is because voice is the most natural way to communicate. Amazon and Google have made it very easy to voice enable smart devices. It is also very affordable today as devices don’t need to add much processing power because they leverage the cloud infrastructure to do the heavy lifting. Many companies are going to leverage this model, but there are companies like sensory which is trying to push the voice enablement to the edge. Its TrulyNatural is an embedded large vocabulary continuous speech recognizer system for devices which may not be cloud connected like a toaster or a space heater or a coffee maker.

Device manufacturers will have to pick bets on which technologies to integrate into their product for voice control as the landscape is still evolving. Most likely companies will end up integrating 2-3 leading voice enablement API’s into their product lines.

The key factors to be considered while making the technology choice is (a) Signal to noise ratio. These smart devices will be used in a variety of environment and the ability to capture voice by reducing background sounds is very important (b) Identification & Isolation – The system should have the capability to separate command from other ambient sounds. (c) Capture – Most times the source could be moving or at distance from the mic ( 1 meter to 10 meters) based on the environment in which it is installed. The devices should have the capability to accommodate this.

However, there are privacy and security issues which will have to be addressed as the popularity grows because voice-enabled devices are always listening for the wake-up keywords. So if someone hacks into the system, they will be able to listen to your private conversations. Also, ethical usage of the information obtained becomes important because companies trying to build on their NLP and AI algorithms may decide to listen to all our conversations to strengthen their capabilities.

Interacting with devices via voice reduces multiple steps in our daily lives when we control them. ( Example controlling a thermostat at home, in past we had to get up, go to the thermostat and press buttons multiple times to set the desired temperature. Today we can do it just with a simple statement ” Set temperature to 72 Degrees.”) . So irrespective of the challenges it may bring, we will continue to expand the boundaries of voice capability to control devices. As speech changed humans forever, enabling voice-based commands to communicate with everyday devices will change the world forever.  

Link to article on Linkedin