Voice recognition and control on a budget - and all the things which can go wrong!


We've all been impressed by Apple's launch of their 'Siri' voice interrogation technology in the new iPhone 4S. But it should be borne in mind that something along the same lines (though admittedly nowhere near as adaptable) has been possible for ages on Symbian, even on extreme budget hardware. Just as a reminder, and with some comments on whether this is the way forward for smartphones generally, here's a demo of the free Vlingo in action on an old S60 3rd Edition handset.

Notes:

  1. This is NOT an attempt to 'show that voice on Symbian can match the iPhone 4S'. FAR from it. Siri on the iPhone 4S is far more intelligent, far more adaptable and far more future-proof. This IS an attempt to remind people that they can play with voice recognition without spending a fortune on a new iPhone. And yes, I know that Android has a number of voice interrogation functions too. As does Windows Phone. This is All About Symbian though. I was using a Nokia N86, but it could have been the N95 or even a budget 6120c etc.
      
  2. This was filmed in windy conditions outdoors - on purpose. I could have tested voice control in a nice quiet living room, but I didn't want to make things too easy for the software! Apologies for the gusty wind every now and again in the E7's microphone (which was filming the segment...!)

 

In the video, I refer to caveats to voice interrogation. There are, in fact, quite a number of caveats and disadvantages. Which is not to say that voice control isn't the future - or at least a crucial part of the future (watch an upcoming Phones Show for my vision of where we'll be in 2015). But, right now, I see the following as impediments to voice interrogation of our smartphones in daily life:

  • Background noise. Away from cosy reviewer office tests, there's traffic noise, other people talking nearby, glasses clattering and music playing down the pub, wind gusts, in-car rumble and engine noise. Although both Vlingo (above) and Siri seem to cope surprisingly well with some degree of background noise, it's ultimately a problem and a factor in recognition results being less than perfect.
     
  • Broadcast speech. TV sets and radios are often playing, mostly speech and usually at reasonable volume, and this plays havoc with Vlingo, Siri or similar. Can you get away from the TV and radio?
     
  • Accents. Current voice systems just about recognise classical American, English and some European tongues. But even I have trouble understanding many of the dialects and accents in my own country, and I'm sure it's the same for many others. How on earth will voice software cope? Given huge resources and enough samples, it could be done. But not... yet.
     
  • Social factors. Testing all this in our reviewer office is fine. Testing it at home is fine - and may well save you some time. But what about on the train, on the bus, in the airport lounge, on the plane, in the open plan office, and so on? Your chatter into your phone will be bad enough for your companions. What about their chatter on all their voice interrogated phones too? It'll be an audio/aural nightmare!
     
  • Bandwidth. All current systems farm out most of the recognition to servers online. Which means you need a fairly fast Internet connection on your phone for voice interrogation to work. Away from home/office wi-fi, you'd better be in an urban area and hope for a decent 3.5G connection. Anything less and it'd be far faster to type what you wanted to do or to enter.
     
  • Mischievous offspring. By which I mean your kids. Whenever there's any kind of voice recognition going on, they love, simply love, messing around, talking nonsense ("Fifteen, chickens, octopus...") at the same time as you, to try and throw the software off. Usually successfully!
     
  • It can be less efficient. We've all seen the Siri demos by Apple and some of them (e.g. setting a reminder) will save time. But quite a lot of voice interrogation uses (however cool) don't. For (a trivial) example, it's far quicker to look at the weather widget on my homescreen than to painstakingly ask voice recognition software "What's the weather going to be like today?" and then wait for an answer too...

That's quite a list, as I'm sure you'll agree. Can you think of other potential problems?

I'm also sure that voice interrogation will make its way into mainstream phones at some point in the future, but as just one input/control option, optimised for true hands-free use (e.g. when driving), alongside more traditional, more silent, more hands-on and less demanding methods of interaction with our smartphones.

Comments welcome if you've had experience of voice on Symbian or another mobile platform. Oh, and if you haven't tried Vlingo yet on Symbian, it's free in the Nokia Store!

Steve Litchfield, All About Symbian, 18 October 2011