Google Artificial Intelligence and Hummingbird


In a previous post, I stated that Google named its new algorithm after a hummingbird because a hummingbird is quick and precise and has the tremendous ability to recall information about every flower it has ever visited. The name implies a more robust and faster search engine algorithm.

But besides extraordinary powers of recall, I overlooked the important fact that hummingbirds are also very smart. Hummingbirds have been observed learning new behaviors in the wild. Also, a hummingbird’s brain is 4.2% of its body weight. That percentage is the largest proportion in the bird kingdom. Maybe all of the above reasons explain why Google named the new algorithm Hummingbird. But above all, just like a hummingbird, the new search engine algorithm is pretty smart.

Many have pointed out Hummingbird’s new powers of semantic search. But I believe there is another component to Hummingbird. It’s an area of artificial intelligence that researchers have been seeking to perfect for some time.

Almost four years ago, Amit Singhal, the head of Google’s core ranking team, spoke to Engadget about his dreams for search. His number one and most difficult goal was to include information that comes not from text but from images. At that time, Singhal described “computer vision algorithms” as still being in a basic form. Most of the information about an image still came from the text surrounding it.

Peter Norvig alluded to the importance of images in his interview with Marty Wasserman for Future Talk in the fall of 2013, not long after Hummingbird was implemented at the end of August 2013.

Marty Wasserman: So it sounds like replicating vision is one of the most important things? Having a camera look at an object and interpreting what that object is?

Peter Norvig: I think that’s right and I think it’s a useful task and it connects you to the world so we have a broader connection than just typing at a keyboard. Now if a computer can see, it can interact a lot more and be more natural. And it’s also important in terms of learning. Because we have been able to teach our computers a lot by having them read text. There is a lot of text on the internet so you can get a lot out of that. Make a lot of connections and know that this word goes with this other word and other words don’t go together. But they are still just words and you would really like your computer to interact with the world and understand what it is like to live in the world. You can’t quite have that but it seems like video is the closest thing.

Marty Wasserman: I think Google’s worked on this problem a lot. You’re trying to interpret words. Basic search, and you’re an expert on search, doesn’t know what a word means but it can tell how frequently it occurs. But the next level of search would be to have a better understanding of what the word means so it can figure out the nuances of what the person is asking for.

Peter Norvig: That’s right. So we, you know, the first level is just they ask for this word and show me the pages that have that word. The next level is to say what did that word really mean and maybe there’s a page that talked about something but uses slightly different words that are synonyms or related words. So we’re able to do that – figure out which other related words count and which ones don’t. And then the next level is saying well you asked me a string of words and it’s important what the relationships are between those words. And figuring that out. So we have to attack understanding language at all levels and understanding the world at all levels, what these words actually refer to in the world.
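To make the first two levels Norvig describes a little more concrete, here is a minimal Python sketch. It is only an illustration: the pages, the synonym table, and the function names are all made up for this post, and nothing here reflects how Google actually implements search.

```python
# Toy data, invented for illustration only.
PAGES = {
    "page_a": "cheap flights to paris",
    "page_b": "inexpensive airfare to paris",
    "page_c": "paris hotel reviews",
}

# Hand-written synonym table (an assumption; a real system would learn these).
SYNONYMS = {
    "cheap": {"inexpensive"},
    "flights": {"airfare"},
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def exact_match(query, pages):
    """Level 1: return pages that contain every query word verbatim."""
    terms = tokenize(query)
    return sorted(
        name for name, text in pages.items()
        if all(term in tokenize(text) for term in terms)
    )


def synonym_match(query, pages, synonyms):
    """Level 2: a query word also matches any of its listed synonyms."""
    terms = tokenize(query)
    results = []
    for name, text in pages.items():
        words = set(tokenize(text))
        # Each query term must appear itself or via one of its synonyms.
        if all(words & ({term} | synonyms.get(term, set())) for term in terms):
            results.append(name)
    return sorted(results)


print(exact_match("cheap flights", PAGES))
print(synonym_match("cheap flights", PAGES, SYNONYMS))
```

With this toy data, the exact match finds only the page that literally says "cheap flights", while the synonym match also finds the page about "inexpensive airfare". The third level, understanding the relationships between the words in a query, is much harder and is not attempted here.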

In the interview, Peter Norvig stresses the importance of deriving meaning not only from text but also from images. This requires a computer to look at the pictures on a website and determine what they show. Object identification sounds simple and is easy for humans to do, but it is extremely difficult for computers. However, Google has recently made significant advancements in this area.

In June 2013, the Google research blog announced that by using deep learning, Google had moved a step towards the toddler stage.

Images no longer have to be tagged and labeled to be identified.

This is powered by computer vision and machine learning technology, which uses the visual content of an image to generate searchable tags for photos combined with other sources like text tags and EXIF metadata to enable search across thousands of concepts like a flower, food, car, jet ski, or turtle… We took cutting edge research straight out of an academic research lab and launched it, in just a little over six months. You can try it out at– Chuck Rosenberg, Google Image Search Team

What does this mean for search? Images become more important because Google can label them more precisely, with or without your help, and associate them with the textual concepts on your website and in the Knowledge Graph. Doing so lets Google classify websites better and match a searcher’s intent, including the sometimes subtle nuances of a query, to the best webpage.
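As a rough illustration of the idea in the Google quote above, where tags generated from an image's visual content are combined with text tags and EXIF metadata, here is a small Python sketch. The photo records, field names, and functions are hypothetical, and the "visual tags" are simply typed in by hand. A real system would produce them with a trained image recognition model.

```python
from collections import defaultdict

# Hypothetical photo records; the visual tags stand in for a model's output.
PHOTOS = [
    {"id": "img1", "visual_tags": ["flower"], "text_tags": ["garden"],
     "exif": {"camera": "nexus 4"}},
    {"id": "img2", "visual_tags": ["turtle"], "text_tags": [], "exif": {}},
    {"id": "img3", "visual_tags": ["flower", "car"], "text_tags": ["road trip"],
     "exif": {}},
]


def build_tag_index(photos):
    """Map every tag, whatever its source, to the photos that carry it."""
    index = defaultdict(set)
    for photo in photos:
        all_tags = (
            list(photo["visual_tags"])
            + list(photo["text_tags"])
            + list(photo["exif"].values())
        )
        for tag in all_tags:
            index[tag.lower()].add(photo["id"])
    return index


def search(index, tag):
    """Return the ids of the photos matching a single tag, sorted."""
    return sorted(index.get(tag.lower(), set()))


index = build_tag_index(PHOTOS)
print(search(index, "flower"))
print(search(index, "turtle"))
```

Note that the second photo has no text tags at all, yet a search for "turtle" still finds it through its visual tag. That is the practical point for site owners: an image can be found even when nobody labeled it.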