Speak Freely – Part I

by Eyal Inbar, Marketing, NEC Unified Solutions

100,000 years ago humans started to speak so they could free up their hands. Is it going to happen again? I’d love to get rid of my keyboard…

A few weeks ago I wrote this short sentence on Twitter and got many comments back. I thought it would be a good idea to share a bit more of my view on speech and voice technology and how I believe it will change the way we communicate.

In some odd way, it turns out that technology took us back in time to some very old habits where we find ourselves communicating again with our hands, using the keyboards on our computers or cell phones. I believe that the same reasons that made humans move away from using their hands 100,000 years ago will be the main driver for going through that evolution once again – productivity and improved speed of communication. In the 21st century this will be driven by voice technologies that will extend our ability to speak not only with humans, but also with machines.

Why Speech? Why now?

The ability to speak is one of the most basic capabilities that a human being has. A person with an IQ as low as 50 can speak. A child starts learning words almost from his first day and will usually start speaking by the time he is 2 years old.

Considering that personal computers took off in 1975 with the introduction of the first PC kit (MITS), typing is almost second nature for many of us. But ask an older person, or someone who has never seen a computer, and he will tell you that there is no way typing is easier than talking.

Although typing is not natural, the evolution of technology and the integration of our personal computers with the network, the internet and business processes made it much more productive and efficient than speaking.

Speech technology, on the other hand, has been here for a long time as a promise that was never fulfilled, up until recently. Over the last few years speech has taken off in a major way. There probably isn’t a single day you go without putting it to use, be it in your car and GPS system, in your corporate office directory, when you call the airline booking desk to check your flight status or, more recently, on your cell phone through smartphone applications such as Google Voice Search. It seems as if it took off overnight.

There are many technical reasons that made this possible, including innovations in speech recognition technology, standards such as VXML, increased CPU power, mobile technology and more. I will not talk about any of those in this blog. Instead, I’d like to share with you my view on the business reasons that have made, and will make, speech and voice solutions so important for customers and businesses in the upcoming years.

Next time I’ll discuss how to connect with customers on a deeper level.

Communications-Enabled Business Needs Better Interfaces

NoJitter

Last week I was in Japan at NEC’s iExpo, and a few of the demonstrations shown there made me think about the way we interface with today’s systems. These weren’t the typical lame demos that many vendors show at conferences, like simply demonstrating that when you pick up the phone the presence status of the user changes to “on phone.” NEC ran a series of demonstrations showing how unified communications, mixed with things like facial recognition systems, ID scanners and biometrics, can radically change the way users and systems work together in many of the business processes we have.

I also want to make a comment on NEC as a company. What NEC sells outside Japan is only about 30% of the product portfolio available in Japan. For a company that’s been trying to raise its US profile, it would seem that expanding its portfolio would really help. A solution that comes from a single source in Japan would require several vendors in the US to deliver all of the products required for the same solution here. Expanding the portfolio would obviously require NEC to learn to sell solutions to line-of-business managers, and many of the products would be sold by being pulled through the solution, but I see that as being a requirement in a few years anyway.

As far as the demos go, one example was a two-foot-tall robot that NEC created (I don’t remember the name of it, but perhaps someone from NEC could post it here and people could look it up). The robot’s eyes were cameras, it was able to recognize speech commands in Japanese (which is no easy task) and it was capable of acting on a set of commands initiated by “triggered” events. NEC is also working on emotion-detecting software that goes a level past facial recognition, where the expression on someone’s face would be interpreted. A practical example of the use of this robot could be having it be part of the staff in a day care center. The robot could mingle among the kids, and if a child needed someone or something, the child could make a request. The request would be interpreted and the child’s face scanned to identify the child. This information could then be combined with the presence status of the day care staff (or maybe even the parents), and a request or message sent off using any of a number of different communications methods (hence the importance of the tie-in to unified communications). This could easily be applied to elderly care, hotel environments, airports, retail stores or any other environment where services need to be provided to someone without having to use a keyboard or mobile phone as an input device.
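To make the day care example a bit more concrete, here is a rough sketch in Python of how such a request might be routed. This is only an illustration of the idea, not anything NEC showed: the presence states, the channel names and the `route` function are all my own assumptions, and the child ID and request text stand in for whatever the face and speech recognition would actually produce.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class StaffMember:
    name: str
    presence: str       # assumed states: "available", "on phone", "away"
    channels: Set[str]  # channels this person can receive messages on


@dataclass
class Request:
    child_id: str  # assumed output of a face recognition step
    text: str      # assumed output of a speech recognition step


# Hypothetical policy: which channels make sense for each presence state,
# listed from most reachable state to least reachable state.
CHANNELS_BY_PRESENCE = {
    "available": ["voice", "im", "sms"],
    "on phone": ["im", "sms"],
    "away": ["sms"],
}


def route(request: Request, staff: List[StaffMember]) -> Optional[Tuple[str, str, str]]:
    """Pick the most reachable staff member and a channel that suits them.

    Returns (staff name, channel, message), or None if nobody can be reached.
    """
    message = f"{request.child_id}: {request.text}"
    for presence, allowed in CHANNELS_BY_PRESENCE.items():
        for member in staff:
            if member.presence != presence:
                continue
            for channel in allowed:
                if channel in member.channels:
                    return member.name, channel, message
    return None


staff = [
    StaffMember("Aiko", "on phone", {"im"}),
    StaffMember("Ben", "available", {"sms", "voice"}),
]
print(route(Request("child-7", "I am thirsty"), staff))
```

In this sketch the staff member who is "available" is preferred over the one who is on the phone, and the message goes out over the first channel that both the presence state and the person allow. A real system would of course pull presence from the unified communications platform rather than from a hard-coded list.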

There are many ways to input information into a system, with the most common ones today being through a keyboard, dial pad on a mobile phone and, in some cases, using SMS on a mobile device. These are obviously widely used for entering information, but they can be slow, difficult to use when mobile or just not practical in certain situations (like toddlers at a day care). The most common additional types of input devices I see are:

* Speech recognition. This works well for users who are in hands-constrained environments or when the user isn’t capable of using a device. This could be a worker on a factory floor, a mobile worker in a car, a small child or a bedridden individual. It’s true that speech recognition systems built in the past were, I admit, not very good, but the technology will only continue to get better.

* ID badge scanners or proximity cards. This is a bit big-brotherish, but these can be used to track workers as they move around a building or campus. These can also be loyalty cards or ID bracelets given out by airlines, schools, hotels or casinos for people to carry with them. Mostly people think of these devices as authentication devices, but the ability to provide a customer’s or worker’s location can add another level of intelligence to a business process.

* Facial recognition. This would be used primarily for dual authentication, or to identify people where identification serves security or other purposes, such as healthcare.

* Environmental systems. The environmental systems in schools can be used to provide additional information on the temperature of the room, noise level or other conditions that can alter the comfort or situation of a person.

None of these on their own is the “next big thing.” It’s the combination of these, integrated with traditional input methods and unified communications, that can significantly alter the way people interact with each other and the way people interact with machines. We’re still a number of years away from this being mainstream, but new ways of getting information into systems are sorely needed to drastically change the way we live and work.

Written by Zeus Kerravala, The Yankee Group

Get Your Complimentary Report! ($399 value)

Andrew Borg, Senior Research Analyst in charge of the Wireless & Mobility Practice at Aberdeen Group, is conducting an online survey for the upcoming benchmark report on the next inflection point in enterprise wireless LANs.  Andrew asked that we share this survey with our customers and associates.  Please feel free to participate at your convenience.


In appreciation for your taking the 15–20 minutes to answer this survey, we will provide you with a complimentary copy (each a $399 value) of:
– The findings of this study when they are available on December 1, 2008
– The Voice Over WiFi in the Enterprise Aberdeen Benchmark Report, available immediately after your submission of this survey