Time to pick up the pace as we get closer to WWDC. It’s once again time to take a look at areas that I hope Apple will address in iOS 7. In the last installment, we looked at Apple Maps, which was not exactly the company’s finest hour. In fact, this botched feature rollout is probably what prompted Scott Forstall’s departure from Apple, and the subsequent restructuring of the company’s management. In this installment, we move on from the disaster of Maps to mere disappointment: Siri and Dictation.
The Sunny Side of Siri
Siri was the hot new feature demoed at the release of the iPhone 4S a year and a half ago. The initial reviews were generally positive and it, along with the new voice dictation feature that debuted at the same time, saw a lot of early use as people got familiar with their new iPhones. While both Google and Windows Phone offered some level of voice search and dictation at the time, Siri’s virtual assistant features were novel, and the Dragon dictation engine that Apple licensed from Nuance was state of the art.
Some other positives of Siri and Dictation include:
- Siri is fully and continuously aware of contextual relationships between both people and things. It also remembers previous statements, allowing for a conversational style.
- It is capable of spotting scheduling conflicts and allows users to reschedule meetings and appointments on the fly.
- It’s a great tool for text messaging, especially while driving or while your hands are occupied.
- The easy to follow voice prompts and ability to change the text of a message or appointment that isn’t correct make the service ideal for novice users.
- The inside jokes and the sometimes cheesy, sometimes humorous responses feel contrived after a while, but show an effort from the engineering team to create a service that doesn’t feel or interact like a computer.
- Dictation is available anywhere in iOS that you can use the keyboard.
- Apple’s Dictation includes Dragon’s custom editing commands, adding a lot of power when it comes to using voice to accurately dictate things like proper names, product names, acronyms, and alternate spellings.
A big reason Siri got those initial positive reviews was that it was a strong first step. The service was released in beta, which was a different move for Apple. They typically strive for more polish with new products, but the release of the iPhone 4S had already been pushed back to the Fall. They needed to get the software shipped, and attaching the beta tag let users know to expect occasional issues, as well as a few loose ends. Again, for how ambitious Siri was at release, it worked rather well in my opinion.
However, once the novelty faded, so did a lot of people’s interest. While some of us still find Siri useful, the problem is that, even if you don’t have issues with it, it’s never right there in front of you. While it’s fully integrated into iOS, it isn’t essential for anything other than voice commands. Once the newness wears off, a lot of users end up forgetting that it’s there, and rarely use it. This weakness has been underlined even more by the release of Google Now a year ago. In contrast to Siri, that service performs some parallel user assistant functions, but does so through predictive notifications. This is currently a big advantage, because it keeps Google Now in front of the user.
By Apple’s own definition of Siri as beta software, they still have work to do. The question is, what is required for Apple to remove that beta tag? And better yet, what can Apple do to take Siri and Dictation to the next level, making them an indispensable part of the iOS experience? Let’s take a look at some of the issues, and potential solutions.
“I’m sorry, but I can’t do that right now. Try again later.”
If you’ve used Siri since its release, odds are that you’ve gotten this response a few times. At least she has the decency to let you know what’s going on. When Dictation fails, the familiar three dots just disappear, along with your voice submission. Both responses get old fast.
Unfortunately, there isn’t a way to tell whose end the problem is on. I can tell you from my own experience that Siri rarely fails me when I am on WiFi at home or at work, or have a solid LTE cellular connection. I think I could count the number of times on my fingers, and most of those were close to the launches of the iPhone 4S and 5, when usage was at its highest. On the other hand, I’ve had plenty of failures on 4G connections and below. However, sometimes requests will go through with one bar, and will fail with five, making it difficult to pinpoint the exact cause. Is it the network, or is it Apple? I think the network is usually the culprit, but Apple has definitely had issues with all of the cloud services going down at different times.
Down on the Farm
All that said, it falls to Apple to do everything they can to remedy this situation, because if users can’t depend on Siri, they won’t use it. First of all, they are already beefing up their cloud presence with new server farms currently under construction in Reno, NV and Prineville, OR. These facilities will join Apple’s Maiden, NC center, which was the subject of much discussion after the announcement of iCloud in 2011.
These server farms are a step in the right direction, but if Apple wants to be considered in the same class as Google, it’s going to take a lot more than three. Currently, Google is known to have at least 37 worldwide (20 in the US, 1 in Canada, 12 in Europe, 3 in Asia, and 1 in South America), and there may very well be more. If you have ever wondered why Google’s voice search responds faster than Siri, look no further for a reason. Google has more total servers, and those servers are distributed throughout the world.
Unfortunately for Apple, there is no short term fix for this problem. They are hard at work, and they have the cash to follow through, but server farms don’t just jump out of the ground. They take time to build and commission, not to mention the process of selecting a site and going through all of the mandatory design and approval processes. So, for right now, Apple needs to maximize the server capacity it has and find alternate solutions to this issue, without taking their focus off of future growth.
Offline is In
One thing that Apple can do to both take load off its servers and make Siri and Dictation more reliable would be to make as many functions as possible available locally on iOS devices. In iOS 3, Apple introduced Voice Control, which let users perform a handful of voice activated functions, such as dialing of contacts and music lookup. In fact, it’s still there and can be used if you disable Siri under Settings-Siri.
This three and a half year old feature is built into the iPhone version of iOS, and has no ties to the Internet, which confirms that basic voice tasks can definitely be handled by iOS devices. Add in the option to use offline voice dictation, which Google added to Android last year in their Jelly Bean update, and you would really have something. The bottom line is, the more Apple can offload tasks from the servers, the more capacity they’ll have for the more difficult, Internet-dependent tasks and for peak traffic times. This is something Apple should definitely implement in iOS 7 if they are serious about building reliable cloud services that users will trust. Whether they can do this or not is a different question that I will get to in a moment.
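To make the local-first idea concrete, here is a minimal sketch in Python. It is purely illustrative and not how Apple does anything: the function names, the list of "simple" command prefixes, and the `send_to_server` callable are all hypothetical stand-ins. The point is only the ordering: try the device first, and touch the network only when the device can't handle the request.

```python
# Hypothetical sketch only: none of these names are real iOS APIs.

# Commands assumed simple enough for the device to resolve on its own,
# in the spirit of Voice Control (dialing contacts, music lookup).
LOCAL_PREFIXES = ("call ", "dial ", "play ")

# The failure message quoted earlier in this article.
FAILURE_MESSAGE = "I'm sorry, but I can't do that right now. Try again later."


def handle_locally(utterance):
    """Return a result string if the device can handle this alone, else None."""
    text = utterance.strip().lower()
    for prefix in LOCAL_PREFIXES:
        if text.startswith(prefix):
            return "local: " + text
    return None


def handle_request(utterance, send_to_server):
    """Try the device first; use the server only when the device cannot help."""
    local_result = handle_locally(utterance)
    if local_result is not None:
        return local_result
    try:
        return send_to_server(utterance)
    except ConnectionError:
        return FAILURE_MESSAGE


def offline_server(utterance):
    """Stand-in for a server call made with no usable connection."""
    raise ConnectionError("no network")


if __name__ == "__main__":
    # Handled on the device, so the dead connection does not matter.
    print(handle_request("Call Mom", offline_server))
    # Needs the server, so the user gets the failure message.
    print(handle_request("What's the weather in Reno?", offline_server))
```

With a setup like this, every request the device answers by itself is one the server farms never see, and one that can't fail because of a weak signal.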
The Dragon Slayer
Remember earlier, when I mentioned what an advantage it was for Apple to use Nuance’s Dragon voice tech? Well, it was in some respects. Licensing their state of the art turnkey voice recognition system for iOS prevented Apple from stumbling while trying to cobble together an engine of their own. It was also ready for immediate deployment on mobile, as Nuance has had their own Dragon voice dictation and voice search apps in the iOS App Store for some time now.
So Apple using Dragon was a big win for both companies two years ago. What about now? Unfortunately for Apple, the landscape of mobile voice services has changed a LOT since then, and not in their favor. Google has really flexed the muscles of its server capacity advantage, making their super fast voice search features available on both Android and iOS, and now even on Chrome on both mobile and the desktop. What exactly does this have to do with Nuance? More than you might think.
While there is no definite word from the parties involved, there have been reports that Apple is limited in what it can do with voice search in Siri by the terms of their licensing agreement with Nuance. To illustrate this point, think about how disjointed Apple’s implementation of search is with Siri and Dictation right now.
First, you have the iOS Spotlight Search screen. Here is a blank canvas that’s just dying to be exploited further. You can use the keyboard or voice dictation to search for items on your iOS device, but a web or Wikipedia search for the same term has to be triggered separately. The results show up in Safari, rather than in the Spotlight screen.
In another odd twist, Siri is not able to search for some of the items that a Spotlight Search can. For instance, it can play music, but it currently cannot search for it. Siri also can’t search for specific Text Messages or Emails.
Then, you have the fact that, even though you can edit a Siri search or command request with the keyboard after it has been sent, there is no way to type out and send such a request. As much as I love having access to quality voice recognition tools in iOS while in the car, or when my hands are occupied, it isn’t always the best method for search. There are also times when using voice recognition is inappropriate. As such, it makes absolutely no sense for Apple to limit Siri’s capabilities to voice only.
This is where the agreement with Nuance is potentially rearing its ugly head. According to some unsubstantiated reports, this agreement prevents Apple from allowing text input into the system, as well as any kind of on-device dictation capabilities. If true, this is a nasty 1-2 punch that really holds Siri and Dictation back.
If you need proof of this, just take a look at the way Google Now and Search have merged together on Android over the last year. It is now a seamless system that can take voice or text input, and use it to give the user control over device functions and web search. While Google Now and Siri are fundamentally different, Siri would benefit from a similarly unified interface. It would also help to cure the service’s lack of visibility within iOS. Merging it with Spotlight would tie it to a core element of the OS, not just a long button press.
It’s pretty clear that both Apple and Google view voice technology as critical to the future of their ecosystems. If Apple is truly committed to taking Siri to the next level, then they are going to have to either renegotiate the terms of their agreement with Nuance, or preferably just bypass that little detail by buying them. Neither of these options would be easy to pull off. As much size, money, and power as Apple has, Nuance still has the upper hand. They and Google are pretty much the only major players left in consumer voice technology. Apple has to have a voice engine for Siri, so Nuance has all the leverage in the world on this playing field. Apple would have to offer something very special to gain a more beneficial arrangement.
As for buying Nuance, that would be a tough pill for any company to swallow. Sure, Apple has billions of dollars seemingly just lying around (not really, but you’d think so listening to a lot of tech analysts and writers), but laying out enough cash to buy a successful company that has a market cap of $6.05 billion would be a tough sell to investors and the board. Whether it is worth that price comes down to another question: just how critical is voice technology to Apple? What would they gain? Well, if you’ve ever used Dragon’s mobile apps and then tried Dragon NaturallySpeaking on a PC, you have a pretty good idea. Suffice it to say, there is a MAJOR difference in what these two products do. I bought a discounted copy of the previous version for Windows, and was really surprised at how well it worked, and how much flexibility it added. Apple’s version of Dictation is just the smallest tip of that iceberg.
At the end of the day, owning is far different than licensing, and if Apple did buy Nuance, then their engineers would be free to integrate state of the art voice tech into every corner of iOS and OS X. Siri won’t be a mainstream success if Apple doesn’t figure out a way to get it out in front of users and make it more productive. Buying Nuance would instantly give them a clear path to get there. Would it be worth it?
The Tangled Web
While I have always thought that Siri was effective in most of its limited scope of features, the way it handles web search has always left me a little flat. It is capable of launching a search in Safari using your chosen default search provider, but this falls far short of what Google is now offering on Android. There, users are quickly presented with results both from their devices and from the web. It is a much smoother process, making Siri seem disjointed and slow in comparison.
So why did Apple set Siri up this way? I won’t go into detail, since I discussed this at length in an earlier installment of this series, but Apple spent a lot of time and effort removing Google from the core of iOS. One of the few ties that remains is their status as default web search provider (which is well deserved). However, before Apple sends a query that isn’t specifically spoken as, “Do a web search for…” they run it by Wolfram Alpha first. If Siri can return an answer from there, it will. I don’t think that’s an accident.
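For the curious, the routing order described above can be sketched in a few lines of Python. This is my own rough model of the behavior as seen from the outside, not Apple’s implementation: `route_query`, `stub_answer_engine`, and the destination labels are all made-up names, and the answer engine here is a toy lookup table standing in for a service like Wolfram Alpha.

```python
# Hypothetical model of the routing order described above.
# None of these names come from Apple or Wolfram Alpha.

WEB_SEARCH_PREFIX = "do a web search for "


def route_query(query, ask_answer_engine):
    """Return a (destination, payload) pair for a spoken query.

    ask_answer_engine is any callable that returns an answer string,
    or None when it has nothing to offer.
    """
    text = query.strip()
    # An explicit request skips the answer engine entirely.
    if text.lower().startswith(WEB_SEARCH_PREFIX):
        return ("web_search", text[len(WEB_SEARCH_PREFIX):])
    # Everything else is offered to the answer engine first.
    answer = ask_answer_engine(text)
    if answer is not None:
        return ("answer_engine", answer)
    # Only unanswered queries reach the default search provider.
    return ("web_search", text)


def stub_answer_engine(text):
    """Toy stand-in for an answer engine: a one-entry lookup table."""
    known = {"how many ounces are in a pound": "16 ounces"}
    return known.get(text.lower().rstrip("?"))


if __name__ == "__main__":
    print(route_query("Do a web search for iOS 7 rumors", stub_answer_engine))
    print(route_query("How many ounces are in a pound?", stub_answer_engine))
    print(route_query("WWDC keynote time", stub_answer_engine))
```

Seen this way, the default search provider only gets the queries that were either explicitly addressed to it or that the answer engine passed on.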
Again, this puts Apple in a tight spot. Do they want to bring Siri’s web search capabilities up to date? If so, will they still try and limit the data from iOS users going to Google? Can they? If they do allow Google web search data to be automatically presented, can they do it in such a way that it works as well as the Google Search app for iOS? I’m glad I don’t have to answer these questions, because Apple is in a difficult position between keeping a competitor at a distance, and making their products work as well as possible.
A couple of months ago, I wrote an article about rumors of Apple and Yahoo working on new partnerships. At the time, I suggested that Yahoo’s Marissa Mayer might be a very willing ally against their common competitor. Could it be possible that Apple would try to sidestep Google Search for Yahoo? That would be a bold move. It could make sense for both companies, but only if Apple implemented such a thing in the right way. For example, they could automatically serve up high ranking search recommendations from Yahoo via Siri, but still offer the user the option to launch a Safari web search, just like they do now. Whatever happens, don’t be surprised if you hear of a new Yahoo partnership of some kind at the WWDC keynote. Who knows, maybe we’ll even see Marissa Mayer on stage.
“Siri, What does the future look like?”
Change, of some sort. What kind? I wish I knew. So where does that leave Siri? There are tons of small scale services and features that Apple can add, like they did with sports scores and info and OpenTable last year. And there are still annoying inconsistencies to fix in what is already offered, such as only being able to create an appointment in the default calendar, and Siri’s inability to read any email messages to you. What I focused on here are the big picture issues, which, in truth, will probably be addressed further out than iOS 7. However, I will be keeping my fingers crossed during the keynote for a Spotlight/Siri mashup. This is the week to have hope!
As for Siri herself, if she knows anything about the future, she’s keeping it to herself.
What do you think about Apple’s Siri and Dictation? What features would you like to see added to Siri? How well does Dictation work for you? Where does Apple need to go with voice integration? I would love to hear your thoughts and suggestions. Feel free to let me know in the comments below, or on Twitter @jhrogersii, or Google+.