Google's new 'Hummingbird' algorithm is about queries, not just SEO

Tweaks to the way the search engine processes questions as queries show Google's evolution into a natural-language AI system

HTML coders inevitably panic when they hear word of another tweak to Google's search algorithms. Small wonder much dander was raised when Google announced Thursday it had silently rolled out a new algorithm, code-named "Hummingbird."

Even though Google is tight-lipped about the exact technical details of Hummingbird, TechCrunch noted that the most important element of the changes revolves around how search queries are processed. Instead of simply parsing searches one word at a time, the new algorithm is tuned to parse questions posed by users, then rank the results so that the most valuable answers to those questions come first.

Phrasing a search as a question clearly biases results toward Q&A-type pages -- not just sites like Stack Exchange, Quora, or Yahoo's Answers site, but any page where a question is featured prominently at the top and answers are ranked below. One- or two-word search terms still return basic information; searches of three or four words return more what-if or how-to results.

Because the changes were deployed without warning, it should be fairly easy to tell whether Hummingbird affected your own site's page rankings: simply look at the results for the last couple of weeks. Google estimates that some 90 percent of searches are affected.
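For site owners who want to put a number on that comparison, here is a minimal Python sketch. It assumes you can export one value per day from your analytics tool, such as search-referred visits, and that you pick a cutoff date yourself. The function name, the sample figures, and the cutoff are all made up for illustration; none of them come from Google.

```python
from datetime import date
from statistics import mean


def relative_shift(daily_values, cutoff):
    """Return the relative change in the daily average from `cutoff` onward.

    daily_values: dict mapping datetime.date -> number (hypothetical export,
                  e.g. search-referred visits per day).
    cutoff: first date counted as "after" (an assumed date, chosen by you).
    """
    before = [v for d, v in daily_values.items() if d < cutoff]
    after = [v for d, v in daily_values.items() if d >= cutoff]
    if not before or not after:
        raise ValueError("need data on both sides of the cutoff")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline average is zero; relative change undefined")
    return (mean(after) - baseline) / baseline


# Made-up sample data: two days before and two days after an assumed cutoff.
sample = {
    date(2013, 9, 1): 100,
    date(2013, 9, 2): 100,
    date(2013, 9, 3): 80,
    date(2013, 9, 4): 80,
}
shift = relative_shift(sample, cutoff=date(2013, 9, 3))
print(f"{shift:+.0%}")
```

A sustained shift in either direction is only a hint, not proof: seasonality, site changes, and ordinary day-to-day noise can produce the same pattern.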

So far the effects haven't been anywhere near as drastic as those of Google's "Panda" update back in February 2011. Panda was designed to penalize pages with attributes common to link farms or other low-quality sites: keyword stuffing, heavily duplicated content within the same domain, many links on one page to the same pages, and so on. While Panda did push many junk sites farther down in Google's rankings, it also forced a great many legitimate site owners to retool much of their own content.
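Google has not published how Panda measures any of those attributes, but keyword stuffing is the easiest one to eyeball yourself. The Python sketch below computes a simple keyword density: the share of a page's words taken up by its most frequent terms. The function name and the sample text are hypothetical, and this is a rough self-check, not a reconstruction of anything Google does.

```python
import re
from collections import Counter


def keyword_density(text, top_n=3):
    """Return the top_n most frequent words with their share of all words.

    A crude heuristic only: it does no stemming and does not filter out
    common words such as "the" or "and".
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return []
    total = len(words)
    counts = Counter(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


# Made-up example of an obviously stuffed snippet.
page_text = "cheap shoes cheap shoes buy cheap shoes"
for word, share in keyword_density(page_text, top_n=2):
    print(f"{word}: {share:.0%}")
```

On real pages the top entries will usually be common words, so a practical check would filter those out before looking for a single term that dominates the copy.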

The most revealing thing about Hummingbird is how it reflects an ongoing change in the demands people are making on search engines. Keyword queries are giving way to longer, more sophisticated questions, driven at least in part by voice searches. The more Google enables such things, the more people expect the results to be useful and complete, so further ramp-ups of this kind are almost certainly on the way.

Google's main "skunkworks" projects at this point all seem to revolve around natural-language processing of one kind or another. The company is using data harvested from the Web as a whole to aid in language translation; with Hummingbird, it may well be using the click-through behavior on results returned for questions to retool Google all the more into an AI that knows at least as much about the real world as we do, if not more.

It's also important not to lose sight of how closely Google keeps all this tech to its chest. That Google was tight-lipped about the internal details of what changed with Hummingbird shouldn't come as much of a surprise. Google's business model has always been about both monetizing search results and keeping a tight grip on just how those search results are processed, ranked, and returned -- the better to fend off real competition.

