Steve looks at a novel 'voice to everything' solution...
I have to confess to being a little cautious about 'speech to...' utilities - the theory is great but then I think of offices (and trains) full of people all telling their phones what to do and what to enter and the world quickly becomes a very noisy place!
Enter Vlingo, launching on both S60 3rd Edition (officially for just the
Designed to run in the background for quick access (a possible problem on the N97 with its current limited RAM, but that's not Vlingo's fault and should be less of an issue when the N97's v20 firmware appears), there's also a token N97 widget that gets created when you run the application for the first time. Mind you, it's only a shortcut to the running application and this could have been done by simply setting Vlingo as an application shortcut on one of the existing panels. A missed opportunity - the widget would only make sense if you could tap and hold the homescreen panel to 'hold and speak' - though I suspect limitations in the N97's widget architecture made this impossible, sadly.
The full application has an ultra-simple user interface: two buttons. 'What you can say' brings up a handy help screen, showing the commands that are recognised and giving some examples, while 'Hold and speak' is the way 'into' Vlingo's functionality.
There are half a dozen use cases, in terms of the input that Vlingo supports, and they produced somewhat different results, depending on the amount of text being recognised. My gut feel is that the speech to text engine struggles once more than a handful of words need processing in quick succession.
Search the web
Even trying slightly tricky input, e.g. "music instructors" and "washing machine repairs", didn't faze Vlingo and the results were brought up on Google (in a customised instance of Web) within a couple of seconds. This is the application at its most impressive, although you do have to then tap (or click) away to get further - you can't just speak again to select a match and get more information that way. So the voice input only helps for the initial query. For devices without a convenient qwerty keyboard though, Vlingo has the potential to be quite a bit faster than other ways of entering a search query.
Voice dialling
This feature is already part of most S60 phones, of course (on the standby screen, hold down Green/Dial or the Left or Right softkeys, or use the dedicated key, depending on interface and model), but it's taken to a healthier and more accurate level here with the ability to say 'Call Rafe Blandford at Home', for example: the software looks up the contact, parses the phone number and initiates the call. This worked well in my tests and, potentially, this is the most useful aspect of Vlingo (for me, at least). Voice dialling using the built-in S60 function produces much more erratic results and the recognition engine in Vlingo is definitely superior.
Sending a text (or email)
Here's where things start to get trickier. Not only is there a name to recognise, but we get to see how well (or how badly) Vlingo copes with general spoken text (and note that all my testing was done in a whisper quiet office and with me leaving a clear gap between clearly enunciated words). The results, I have to say, were mixed. Interestingly, recognition was often better if I spoke freely, with no gaps, implying that the speech engine is looking at patterns of word combinations and not just words on their own. However, in the results there were plenty of wrong words and 2 out of 10 times the recipient name was guessed wrongly as well - I'd hesitate to recommend this feature for real world, practical use. Here are two examples of things going badly wrong:
In the left example, the wrong Simon was picked from my N97 Contacts; on the right, as for the text - well, even I can't work out some of what I said...
Creating a new To-do or Note
This function works (seemingly) somewhat better because there's no name guessing having to be performed at the same time and because the resulting text isn't going to be viewed immediately by a recipient, so it can be a tiny bit inaccurate, i.e. you can fix it later and at least you'll have got the gist of your text input into the device. There are usually enough errors that you'd want to do any editing fairly soon after speaking though, before you forget what the text involved.
Adding a Facebook status update
This is a particularly interesting and 'joined up' function. The theory's the same as above, except that what you dictate goes into a Facebook 'status' update. The first time you use the 'Facebook update' command, you're led through a slightly clumsy set of log in and authorisation screens, but thereafter the application is known by Facebook and it's actually very easy and painless to dictate something. For a chatty system like Facebook, dictated text may be more acceptable, though it's disappointing that there's no way of including a 'dictated using Vlingo voice recognition' message or similar at the end of the update - as it is, as with texts and emails, you'll always be tempted to mention the medium of your text input as an excuse for the inevitable wrong words.
Still, a useful facility and one which I haven't seen on any other phone voice system.
Launching an application
Again, this functionality duplicates something that's built into every S60 device, but I'll give Vlingo a pass here because most S60 owners don't even know their phone can do this - at least by installing Vlingo the user will have some expectation that this functionality is possible, so that's a step in the right direction.
Overall, I was impressed by Vlingo's ambition: the idea of (say) driving along in my car and dictating a complete email or text for sending with one tap is simply stellar. However, there are problems to bear in mind:
- You have to keep the touchscreen button pressed continually while speaking. If the whole point of speech input is to keep your hands free as much as possible then this isn't a good way of proceeding. And have you ever tried keeping your finger steady on a touchscreen while driving with your other hand on a typical imperfect road surface?
- The name recognition isn't perfect, especially when you've got multiple contacts with the same Christian name. This is one bit which HAS to be right. After all, you can send a message with a few errors and with a line at the bottom (e.g. "Generated by Vlingo") and maybe the recipient will take the errors into consideration and work out what you were trying to say. But when the recipient himself/herself is wrong in the first place, your message is going to totally the wrong person - which could be inappropriate - and nothing whatsoever is going to the person you really wanted. Yes, you can spot that the name is wrong, but then you've got to correct it manually and... well, you might as well have picked it manually in the first place.
- General text recognition isn't really good enough. Considering that I was testing Vlingo under 100% perfect circumstances, the recognition rate was disappointing. Take Vlingo out into the world, into an air-conditioned, people-filled office, or onto a train or bus or taxi, and the results are going to be quite a bit worse.
Add in that the basic 'say a name' and 'launch an application' voice functionality is already built into your phone anyway and I'd find it hard to recommend Vlingo across the board. However, it's fun (it's very Sci-Fi to speak to a 'computer' and watch while it tries to interpret your speech!) and totally free to try out - maybe you'll have better results than me and fall in love with the utility? Comments welcome on its performance or on any interesting use cases!
Steve Litchfield, All About Symbian, 6 October 2009