Guest blog from our single-core friends
With all the hype around Motorola’s new smart phone, we wanted to sit down with our DSP analytics guru, Milind Borkar, and find out more about the buzz surrounding this phone and the speech recognition capabilities associated with it. In this Q&A, Milind answers questions related to speech recognition in the market today and possibilities in the future.
What is different about the capabilities in the recently released Moto X?
[MB] Unlike typical speech recognition solutions in the market today, which require ‘push to start’, the Moto X phone brings ‘always listening’ speech recognition technology into the hands of the consumer. This is enabled by a technology called ‘phrase spotting’, where the speech recognition algorithm can detect predefined utterances amid noise and other speech. It is the electronic equivalent of your own response when you hear someone mention your name in a noisy environment like a train station. This enables the Moto X phone to be used in a truly hands-free manner, eliminating the need to push a button to activate the speech recognition functionality.
Is this a trend you can envision catching on in other end equipment? If so, what are some examples or possibilities?
[MB] This is certainly a trend beyond smart phones, enabling a convenient way to interact with various objects in our everyday lives. Multiple smart television manufacturers have already released voice-controlled TVs. Microsoft has also announced voice command features on the upcoming Xbox One game console, which is due for release later this year. The possibilities are limitless. Imagine the impact on home automation: the ability to control your blinds and lighting, or your TV, set-top box and music player, all by simply speaking to them. Imagine a new generation of toys that respond to your voice. And the usage of this technology is not limited to convenience; it also addresses health concerns. Wouldn't it be nice to avoid touching light switches and door knobs in places like hospitals and public restrooms, and use your voice instead?
What type of processor is ideal for this application?
[MB] Phrase spotting implementations for speech recognition typically require a fully embedded solution, as it enables low-latency operation. Additionally, some end equipment may not have internet connectivity, so a fully embedded implementation becomes a requirement. If the end equipment is portable and battery powered, then low power consumption is a key requirement to ensure that an ‘always listening’ implementation does not drain the battery in an unreasonably short time. These features are enabled by a combination of (i) a processor architecture that is optimized for real-time audio processing to allow a low-MIPS implementation with low-latency response, (ii) large on-chip memory to reduce system power consumption and system cost by eliminating the need for external memory and (iii) low active power consumption.
Does TI have a DSP that is capable of enabling speech recognition?
[MB] TI’s C55x family of DSPs is very well suited for low power fully embedded speech recognition applications due to the following features and benefits:
- Active power consumption as low as 0.15mW/MHz, an architecture optimized for speech and audio processing, and integrated hardware accelerators allow voice wake up implementations consuming less than 5mW in always listening mode
- 320KB integrated on-chip memory eliminates the need for external memory while implementing up to 40 voice commands in phrase spotting mode
- Scalable family with a broad range of performance and memory options, along with an efficient C compiler, allows customers to add other functionality beyond speech recognition
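As a rough back-of-the-envelope check on the figures above, the short sketch below works out what clock rate a 5mW budget would allow at 0.15mW/MHz, and how long a battery might last at that load. It assumes, simplistically, that the whole budget goes to core active power, ignoring memory, accelerators, peripherals, leakage and regulator losses. The battery capacity and voltage are hypothetical values chosen only for illustration; they do not come from the interview.

```python
ACTIVE_MW_PER_MHZ = 0.15   # from the text: "as low as 0.15mW/MHz"
POWER_BUDGET_MW = 5.0      # from the text: "less than 5mW in always listening mode"

def max_clock_mhz(power_budget_mw, mw_per_mhz):
    """Upper bound on clock rate if the entire budget were core active power."""
    return power_budget_mw / mw_per_mhz

def battery_life_hours(capacity_mah, voltage_v, load_mw):
    """Idealized run time: battery energy (mWh) divided by a constant load (mW)."""
    energy_mwh = capacity_mah * voltage_v
    return energy_mwh / load_mw

clock = max_clock_mhz(POWER_BUDGET_MW, ACTIVE_MW_PER_MHZ)
# Hypothetical battery: 2000 mAh at 3.7 V (an assumption, not from the source).
hours = battery_life_hours(2000, 3.7, POWER_BUDGET_MW)

print(f"Clock budget: about {clock:.1f} MHz")
print(f"Idealized always-listening run time: about {hours:.0f} hours")
```

Under these assumptions the budget corresponds to roughly 33 MHz of processing. The run-time figure covers only the voice wake-up function in isolation, so the rest of the system would dominate real battery life.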
If customers are interested in TI DSPs for speech recognition, how can they get started?
[MB] TI is enabling low-power speech recognition solutions on the C55x family of DSPs by working closely with Sensory Inc., an industry leader in low-power speech recognition software. Due to the customizations required for each engagement, we are presently only able to engage with select high-volume opportunities. To discuss potential engagements, please contact your local TI or Sensory sales office.
Additional resources:
- More information on TI’s embedded analytics capabilities is available here.
- View blog posts on this topic from TI partner Sensory.
What are some cool products you can envision using speech recognition features?