“OK Google …”
I whispered this for the first time into my laptop using the new Chrome Extension last November. It allows you to activate voice search on your laptop or desktop without lifting a finger or having to put down that slice of pizza.
With the release of the plugin, Google is obviously training us to get used to searching by talking, not typing. The voice recognition was impressive, yet for a moment I thought “eh … big deal”.
But, I also wondered:
- How does the voice recognition work?
- Will typing “DIE”?
- If typing dies, will keywords gradually shift towards “natural language”? (And away from caveman queries)
- Has Google already adjusted their search engine to account for more natural language keywords?
- As search marketers, is there anything we should be doing now to adapt?
My mind was spinning with questions. But before I attempt to answer them, we need to jump back to September 27th 2013 and clarify a few things.
“OK Google… Show Me…”
We’ve all looked like this since the day Hummingbird came out:
I want to take you back to Google’s 15th Birthday Celebration. They talked about several things that day, such as:
- Conversational interaction with the Knowledge Graph
- Conversational interaction with Google Now
- An improved and more streamlined design across Google products
- Users interacting more and more with devices by talking, not typing
- Hummingbird – a full re-write of their search algorithm
Why is this important to voice search? Because they are all different topics and different technologies, and talking about them at the same event doesn’t make them any more related than pretzels and caramel, even though both made it into my pint of Ben & Jerry’s ice cream.
To understand voice search, we have to understand what voice search is not.
Gotta Keep ‘Em Separated
So let’s clearly define what each of the following elements of Google is. This is extremely important: to understand voice search, we need to know what voice search isn’t.
Google.com - You know. The search engine. It’s the interface which you access in a web browser to search Google. Enter a query in the search bar and google.com returns web results, images, map listings, videos, ads, knowledge graph results and more. You can type or speak your searches.
The Google App for Android and iOS - Here you can do a variety of Google things. You can search as if you were on Google.com, and you can access your calendar, Google Maps, Google Voice, etc. The app is a mobile program that essentially houses everything Google has to offer on mobile devices and tablets.
The Knowledge Graph – This is the massive database of “entities” or “things (not strings)” that Google has sourced from hundreds of places around the web, the most popular of which are Wikipedia, Freebase, IMDB, etc. The Knowledge Graph itself is not any one UI. It is the data and the relationships within that data, which get returned and displayed in a variety of ways within Google search and also Google Glass.
Google Now – This is sort of an add-on to Google search on mobile devices only (at the moment, although it’s rumored to be coming to desktop). Google Now does not return traditional web results. Rather, it returns “cards” of info such as the weather, reminders, nearby events, directions and current commute time to work, alerts and lots more.
Google Now is not the same thing as the Knowledge Graph, and it is not the same as Google search, although in terms of UX it sits just below the search form in Google mobile apps (which is why I call it an “add-on”). Google Now’s goal is to give you contextual (I’d even say hyper-contextual), timely information and to predict your needs.
Voice Search – Sometimes called “hands free” search. You can search Google this way on all devices. Voice search is not synonymous with the Knowledge Graph or Google Now, despite the fact that people often lump them together.
Voice search is simply an alternative input method for entering text into the search box. Everything you can trigger in Google Now or the Knowledge Graph with a spoken command, you can also trigger by typing. In other words, voice search is only an input method.
Search Commands – THIS is what can be associated with Google Now and certain Knowledge Graph results. A search command is, in general, a phrase or search sequence that triggers specific types of results (such as Knowledge Graph comparisons) or that lets Google “remember” you have been talking about the Eiffel Tower across a series of adjacent searches.
Commands are also very much how Google Now functions. For instance when you say “remind me to …” this triggers the reminder Card.
Here’s the thing. I call these search commands rather than “voice commands” because technically you can trigger them by typing, too.
Thus, it’s more accurate to think of commands as “layman’s search operators”. You have to enter them as they are designed to function or they don’t work, similar to intitle: or inurl:
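To make the “layman’s search operators” idea concrete, here is a minimal sketch in Python. It is not how Google implements commands; the two patterns and the `interpret` function are hypothetical, made up purely for illustration. The point it shows is that a command is matched against the query text, so it makes no difference whether that text was typed or spoken, and a phrase that doesn’t fit the expected shape simply falls through to a normal web search.

```python
import re

# Hypothetical command patterns, for illustration only; these are not
# Google's actual rules.
COMMANDS = [
    ("reminder", re.compile(r"^remind me to (?P<task>.+)$", re.IGNORECASE)),
    ("comparison", re.compile(r"^compare (?P<a>.+?) (?:and|vs\.?) (?P<b>.+)$", re.IGNORECASE)),
]


def interpret(query: str):
    """Return (kind, captured fields) for a command, else ("web_search", {})."""
    text = " ".join(query.split())  # collapse extra whitespace
    for kind, pattern in COMMANDS:
        match = pattern.match(text)
        if match:
            return kind, match.groupdict()
    # Anything that does not fit a command's exact shape is a plain search.
    return "web_search", {}
```

Under these assumptions, `interpret("Remind me to buy milk")` is treated as a reminder, while `interpret("please remind me to buy milk")` is not, because the phrase was not entered the way the command is designed to function.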
Voice Activation – Otherwise known as “OK Google” or “OK Glass”. This is also not the same thing as “commands” or “voice search”. Voice activation is purely a process in which you “wake up” Google with a phrase it is listening for in the background. It then is ready for you to perform a voice search or use a command within your voice search.
The only thing the Chrome “OK Google” extension does is make it possible to initiate a voice search with your voice. Searching with your voice vs. typing gives you the exact same search results.
Hummingbird – Let’s. Count. The. Many. Interpretations. Of. Hummingbird. Again, many people based their analysis of what Hummingbird was on other topics that happened to be talked about at Google’s birthday party. But I believe those topics are not directly related to Hummingbird at all.
- Hummingbird is NOT Google Now
- Hummingbird is NOT voice search
- Hummingbird is NOT the Knowledge Graph
- Hummingbird is NOT search commands
- Hummingbird is NOT even entirely semantic understanding or query re-writing
Listen again to Amit Singhal’s statements from the birthday bash. Pretend they are independent statements that did NOT follow anything else said at the event. This comes AFTER everything he says about voice search, the Knowledge Graph, etc.:
[30:30]“Now, before we wrap up I wanted to just give you one more update on a change that we have already made and most of you probably haven’t noticed. We have changed Google’s engines mid-flight – again.
I’ve been working on search for over twenty years and I have seen first hand how search is all about the technology that powers it. When I first joined Google, we created something that you all know today as PageRank, a radically new algorithm to assign importance to various web pages. Over the years, we improved PageRank quite a bit, but in 2010 we completely changed our algorithm to something that we called “Caffeine”.
And very recently, we did it again. Having worked on search for all these years, I can honestly say back in 1998 I would never have imagined that someone could build an algorithm as powerful and as advanced as what we have just rolled out. Internally we are calling it Hummingbird.
Hummingbird makes results even more useful and relevant, especially when you ask Google long complex questions. And it affects about 90% of our searches worldwide. Yes, Hummingbird affects about 90% of our searches worldwide.”
That’s it! Nothing else in the video before or after that was about Hummingbird at all. In short, I think Hummingbird is a sophisticated combination of:
- Social and other non-link “popularity” signals (a modern day PageRank)
- Hyper-contextual understanding
- Advanced personalization (the switch to SSL by default one week prior was no coincidence in my opinion)
- The Panda / User Metric / Quality algos rolled into the main algorithm
- Topic classification of web pages and sites
My point is - I do not believe Hummingbird has much to do at all with “understanding queries”, “re-writing queries” or anything like that.
IMO, it’s more like the creation of a “super algorithm” far more sophisticated than PageRank combining all of the elements above in one system.
Tell Me, How Does Voice Search Work?
Bill Slawski has talked about this many times, showing us patents that have to do with Google and voice search. And I suggest you check out his articles. I’d like to mention what seem to be two big components of voice search, more specifically voice recognition technology:
- Voice search is context and session based
- Voice search relies heavily on “learning”
Voice Search And Context – Not (Entirely) What You Think
I want to emphasize here that there is an element of “context” that has nothing to do with ranking and returning search results. It has everything to do with figuring out what you’ve said, so that Google can then give you the right search results. Check out figure 2 from the patent “Geotagged environmental audio for enhanced speech recognition”:
This is a process which:
- Detects where you are located – your car, outside, in a city etc.
- Then using “acoustic modeling” it filters out common extraneous noises from your voice “utterance” (what Google calls them – sounds kinda caveman to me).
Yes … Google Now benefits you with event recommendations and reminders when you get to work, all based upon allowing Google to see where you are at all times.
But at the same time, Google needs you to ‘disclose’ where you are with your mobile device to accurately recognize what you speak into the device.
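The two steps above (work out the environment, then filter out its typical noise) can be sketched roughly as follows. This is a toy simplification and not the method in the patent: the environments, the three frequency bands and every number in `NOISE_PROFILES` are made up, and `denoise` is a hypothetical helper that just subtracts an expected noise level from each band.

```python
# Hypothetical per-environment noise profiles: average noise energy in three
# frequency bands, as if learned from geotagged background audio.
# All values are invented for illustration.
NOISE_PROFILES = {
    "car": [0.5, 0.25, 0.0],
    "street": [0.25, 0.5, 0.25],
    "indoors": [0.0, 0.0, 0.0],
}


def denoise(utterance_bands, environment):
    """Subtract the environment's expected noise from each band, clamped at 0."""
    # Unknown environments fall back to the "indoors" profile (no subtraction).
    profile = NOISE_PROFILES.get(environment, NOISE_PROFILES["indoors"])
    return [max(0.0, band - noise) for band, noise in zip(utterance_bands, profile)]
```

The design point is the lookup: the same raw audio gets cleaned up differently depending on where the device says it is, which is why location matters before any search results are involved.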
If you don’t believe location based noise modeling has gotten very sophisticated, check out this video from two years ago on the Google Voice Search YouTube page:
The first Google Voice Search Underwater
Learning To Recognize Spoken Searches
The patent “Selecting speech data for speech recognition vocabulary” seems to rely heavily on prior query logs, tracking things like vocabulary size and how recently a particular word was searched, to continuously learn how to recognize speech more accurately:
The point of showing you all of this is to convey how much processing happens just to recognize what you’ve said before even taking the query and returning the search results.
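As a rough illustration of learning a recognition vocabulary from query logs, here is a small Python sketch. The scoring rule (each occurrence of a word counts for less the older it is, with a 30-day half-life) is my own assumption for the example, not something taken from the patent, and `select_vocabulary` is a hypothetical function name.

```python
from collections import defaultdict


def select_vocabulary(query_log, now, max_words, half_life_days=30.0):
    """Pick the max_words highest-scoring words from (day, query) log entries.

    Each occurrence of a word adds 0.5 ** (age_in_days / half_life_days) to
    its score, so words that are searched often and recently rank highest.
    The scoring rule is an assumption made for illustration.
    """
    scores = defaultdict(float)
    for day, query in query_log:
        weight = 0.5 ** ((now - day) / half_life_days)
        for word in query.lower().split():
            scores[word] += weight
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:max_words]]
```

With a cap on vocabulary size, a word searched twice today outranks a word searched once a month ago, which is the kind of frequency-plus-recency tradeoff the query logs make possible.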
Voice Recognition – Bringing It All Together
The last patent I’m going to share, “Speech Recognition Based Upon Variable Length Context”, seems to bring context and learning together, as you can see in this diagram.
So it is important to remember that context in search is also used to figure out what the heck you said, not just to realize you’re in NYC and that when you say “pizza” you want pizza shops in NYC and not Kentucky.
Will Typed Search Die?
Amit Singhal believes that typing searches will soon be as archaic as “dialing up with AOL”. He believes that with the increase in devices without keypads, we are going to rely heavily on voice interaction.
Typing Is Not The Modern Day “AOL”
While I do see the increase in mobile devices, wearable devices and so forth, I think the analogy of “typing” to “AOL” is weak and that typing will be here for a long time (although I can see the blog posts coming, “Typing Is Dead”).
- First of all, adoption of high speed Internet was easy because it required no behavioral change on the part of the user. It was a clear benefit and cost savings (as far as I am aware). It was an improvement (speed) and simplification (always connected).
- Learning to talk instead of type, though, requires a change in behavior and a learning curve, offers less compelling benefits (yet) and depends on more complex technology.
Lack Of Search Suggest & Instant Is A HUGE Disadvantage
You know what we all rely on WAY more than we realize? This:
Aside from the major source of content ideas search suggest provides, we don’t even realize how much we take this for granted when searching. Search suggest (as far as we know it now) is not possible with voice search. It might be, but we haven’t seen it.
Strings, Not Things
In a recent article, David Amerland does a great walk-through of his thoughts on OK Google. However (and I love David’s work, I think it’s fantastic), in this case he stipulates in three examples that by finally adding “show me” to his query, Google understands the intent of his search and returns the best result. I have found otherwise, though, and I do NOT believe Google has rolled out much, if any, of this sort of processing to adapt to the more natural searches expected from voice search.
Let’s look at a few examples of me attempting to “talk naturally” to search Google, intentionally not using known search commands.
Above: This seems natural enough to say if I were on my mobile phone “talking naturally” to Google. But Google fails to understand my intent here. A friend would know I probably want tour dates or ticket listings. I don’t even get his official website.
Above: I removed “tell me” but still didn’t have luck.
Above: I try another phrase that seems natural to say. “I want to see John Legend perform”. We start getting close with Ticketmaster, but Google is returning three YouTube videos.
This isn’t half bad, as technically they are videos of him performing (albeit not live, so still not that great). But Google doesn’t seem to understand my intent.
Above: Finally! This gets me two ticket listing sites up top. It seems I definitely still need to clearly specify “buy John Legend tickets” (and the “I want to” is a bit extraneous). (Is Google synonym-matching “I like” for “I want”? See the bolding. If so, it hasn’t thrown away that string and re-written the query.)
Above: It’s interesting to note here that as I try to type a “natural” search, Google Suggest doesn’t even complete “per…” with “perform”. Still some weak understanding happening here.
Above: Here I am trying a known “command” … “show me”. Natural enough to think a user talking into their phone or even desktop might attempt to converse with Google in a natural way. However, this result isn’t even close! The results are some sort of book.
Above: If I remove “show me” now I get better results. Google is even matching “Mother’s Day” when the prior inclusion of “show me” didn’t get even close.
Above: Look at all the string matching here for “show me”! Where is the “natural language” understanding? Again, I don’t see evidence of query adjustment outside of known
Above: Here I’ve added “for sale” to the end. I am thinking this clear transactional extension will trigger shopping results (or at least clarify my intent). But despite slightly better results, Google is still bolding “show” as if I am searching for that string.
Above: Google! I thought you were my BSEFF (think about it). But, WHAT’S UP? I politely and casually tell you I need a great dentist nearby and you give me that.
Above: Not even close. PPC is killing organic. They get it. I want to buy ink for my printer. Nice Rap Genius result poking out of the bottom there.
I Don’t See A Dramatic Shift Towards “Natural Language” Searches
Not for your everyday, NON Google Now, NON Knowledge Graph, “normal” searches. One major point of the whole exercise above was to demonstrate that:
- Google is not currently providing results for “talking naturally” that are good enough to entice me, as a user, to perform all searches in a natural way.
- We may adapt to using the known search commands, but beyond that, if the results aren’t helpful, I’m simply going to speak “John Legend tickets” instead of typing it. Speaking is faster, but I’m still going to say the same search I know will get the best results, unless compelling evidence shows otherwise.
Everyone’s equating conversational search with voice search, as if that will actually happen across the board, when it is really designed for specific commands. Google Now commands and Knowledge Graph commands are totally separate from organic search results. The data being returned is largely not webpages but rather just that, “data”, which has been sourced and organized apart from the organic web page engine.
What Can We Do To Prepare For Voice Search?
I do not believe there is anything immediately actionable (like we all wish there were) such as “go put more long tail keywords in there”. Not just yet. But there are still ways to prepare for what could come.
1. Use The Technology – I believe as search marketers, it is our responsibility to be active users of new technology. We need to get intimate with it to really fully understand how it looks and feels to experience from a user’s perspective. Use Google Now. Experiment with voice search and Knowledge Graph Commands.
2. Stay Heads Up For Possible Changes – IF voice search does change the way we search, and if it does force Google to improve how it processes natural queries, this is going to change a lot of things:
- New metrics – if voice search takes off, I wouldn’t be surprised to see things like voice search volume, or voice search as a segment of organic traffic sources. Google wants us to please users, right? It would make sense for them to give us this data so we can segment voice vs. typed searches.
- A separate voice algo – wouldn’t it make MORE sense to switch to a voice algo when someone speaks his or her searches? If typed searches are supposedly going to be different from spoken searches, it makes no sense to me that Google would run the same algo for both.
- Adoption of voice searching will not change typed searches when we are on a desktop/laptop device.
3. Think For Yourself - I do think it is our responsibility to constantly question our initial assumptions and beliefs about what all this technology means and what Hummingbird is. There is a LOT of echo-chamber-ish effect happening.
I mean, I’m guilty of it. Until I studied the Google patents, Google statements, and search results for myself, I was pretty compelled to believe any of the dozens of Hummingbird explanations I’d heard.
Remember: don’t take my OPINION (because that’s what it is) blindly. QUESTION it, do your own research and form your OWN opinion.
We need MORE people willing to question things assumed to be facts. That’s what I’ve done here. I could be completely wrong. And if you think I am, please tell me why below.