Google is the biggest search engine in the world. The interface that people use to perform a search on Google is called Google Search. Right now, Google processes over 40 000 search queries every second. This is more than 3.5 billion searches every day.
Undoubtedly, that’s a very impressive number. But how does Google Search work? What happens when you search for something on Google? How does Google choose which results to display to whom? This post briefly explains how Google Search works. To keep the post simple, we won’t go into too much technical detail.
Matt Cutts explains Google Search
Even though Matt is no longer working at Google, the information remains accurate to this day – and probably will forever.
This is a short 3-minute video from 2010, but it contains a lot of useful information. Google Search evolves over time, and the number and types of questions Google asks change, but the principles of Google Search remain the same.
Matt touches on the following topics:
Google’s index of the web
Google has a machine (actually a bunch of machines) called Googlebot that scans (or crawls) the web repeatedly to find new or updated content published on websites. It saves any content it finds in Google’s Index. After the content is indexed, Google scans the information Googlebot found.
Google dissects the content in several ways to figure out exactly what it is about, why it exists, and how it can be helpful to searchers. Google also scores the page that contains the content using PageRank. More on PageRank later.
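To make the idea of an index a little more concrete, here is a minimal sketch of an inverted index, a common way to map words to the pages that contain them. This is an illustration only: the video does not describe how Google's index is built, and the page URLs, the page texts and the function names build_index and search are all made up for this example.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> text content (invented for illustration).
TOY_PAGES = {
    "a.example/": "fresh coffee beans",
    "b.example/": "coffee brewing guide",
    "c.example/": "tea brewing guide",
}

def build_index(pages):
    """Map every lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow down word by word.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(TOY_PAGES)
print(search(index, "coffee"))         # pages mentioning "coffee"
print(search(index, "brewing guide"))  # pages mentioning both words
```

A real search index stores far more than word-to-page mappings, but the lookup idea is the same: the work of reading pages is done ahead of time, so answering a query is a matter of looking words up.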
Search engine spiders
As mentioned above, Google has a search engine spider called Googlebot that crawls the web in search of new and updated content. It crawls a page that Google already knows about or a page that you manually added to Googlebot’s crawl list.
Googlebot finds all links on that page (linking to other pages) and schedules those pages to be crawled. In most cases, you don’t need to tell Google about a new page – Googlebot will find the page by following links to it.
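The link-following behaviour described above can be sketched as a simple breadth-first crawl. This is a toy model under stated assumptions, not how Googlebot is actually implemented: the "web" is a small in-memory dictionary with invented URLs, and the function name crawl is made up for this example.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}

def crawl(seeds, web):
    """Crawl breadth-first, scheduling every newly discovered link."""
    frontier = deque(seeds)  # pages scheduled to be crawled
    seen = set(seeds)        # pages already scheduled, so nothing is queued twice
    order = []               # the order in which pages were crawled
    while frontier:
        url = frontier.popleft()
        order.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

crawl_order = crawl(["a.example/"], TOY_WEB)
print(crawl_order)
```

Only "a.example/" is given to the crawler as a starting point. The other three pages are found purely by following links, which is why you usually don't need to tell Google about a new page.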
When Google started, crawling, indexing and publishing the index was a long process. Finding fresh content and returning it in search results could take several weeks. Today, Google can crawl most of the web several times in a single day. Google has also learned which types of websites have which types of content, and how often that content changes.
For example, Google knows that news sites regularly publish new content and archive websites mostly keep content for historical or research purposes. Google also knows that documents (PDFs such as white papers and specification sheets) don’t change too often, but active pages change more frequently.
Google’s questions for Google search results
If you think someone buying a new car has a lot of questions, you haven’t seen the Google Search Engine analysing a web page. Google has a lot of questions, about everything on a page.
Obviously, when you properly mark up the content of your web pages, you can help answer a bunch of those questions right off the bat. You can help tell Google what your page is about, and Google can show it to people who are looking for what your page has to offer.
Not all of the specific questions that Google asks when ranking a page are public knowledge. Some of them are, however, and it is important to understand them. Knowing some of the questions can help shed some light on the topic, but going through them is outside the scope of this post.
PageRank – Google’s view on a page’s importance
PageRank is Google’s formula for ranking a page’s importance compared to other pages. It basically comes down to the number of links to a page (which Google counts as votes). Every link to a page is a vote of confidence by the referring page to the linked page. The more links (or votes) a page has, the more important it must be.
It gets a little more interesting though, because Google also looks at WHO is linking to (or voting for) a page. A single link from an important website (that already has lots of links pointing to it) can be much more valuable than 10 000 links from unimportant or spammy websites.
With every link, a percentage of PageRank flows from the original page to the linked page. How much each link passes on depends on how many links you have on the page. The more links you have, the less PageRank each link gets from your page.
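The voting idea above can be sketched with a simplified version of the PageRank calculation. This follows the textbook form of the algorithm, not whatever Google runs today. The damping factor of 0.85, the handling of pages without outgoing links, the toy link graph and the function name pagerank are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a dict mapping each page to the pages it links to.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its rank evenly across its links:
                # the more links it has, the less each link passes on.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph, invented for illustration.
TOY_LINKS = {
    "hub": ["popular", "niche"],
    "fan1": ["popular"],
    "fan2": ["popular"],
    "popular": ["hub"],
    "niche": [],
}

ranks = pagerank(TOY_LINKS)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "niche" has a single incoming link, but it comes from "hub", a page that itself receives rank from "popular". That one link is enough to rank "niche" above "fan1" and "fan2", which have no incoming links at all.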
The Snippet of a page in Google Search Results
For those who don’t know what a snippet is, this is what a snippet looks like in the Google Search Engine Results Pages (SERPs).
The snippet is displayed to the searcher to give the searcher information about the page. In the snippet, the words that the searcher searched for are usually bolded to indicate the relevance of the page. Also, pages that you have visited previously appear in a different colour. Google does this so that you know you have already visited that page in the past.
The searcher usually decides (largely based on the snippet) whether the result is what they are looking for, so the snippet of your page is very important. Usually, you can define what the snippet should look like with your title and description. However, Google might show other information based on the page’s content if it chooses to. Mostly, Google only does this if you didn’t enter a title and description.
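As a rough illustration of how a snippet could be put together, here is a small sketch that builds one from a title and description and bolds the words the searcher typed. It is a guess at the general idea, not Google's actual logic: the function name build_snippet, the use of HTML bold tags, and the fallback to the first 160 characters of the page text are all assumptions made for this example.

```python
import re

def build_snippet(title, description, page_text, query, max_len=160):
    """Return a two-line snippet with the query words wrapped in <b> tags."""
    # Assumption: with no description, fall back to the start of the page text.
    body = description if description else page_text[:max_len]
    terms = [re.escape(term) for term in query.split()]
    if terms:
        # Match whole words only, ignoring upper and lower case.
        pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
        body = pattern.sub(r"<b>\1</b>", body)
    return f"{title}\n{body}"

snippet = build_snippet(
    "How Google Search works",
    "A short guide to how Google crawls and ranks pages.",
    "",
    "google ranks",
)
print(snippet)

# With no description, the snippet is taken from the page text instead.
fallback = build_snippet(
    "Untitled",
    "",
    "Googlebot follows links to find new pages.",
    "links",
)
print(fallback)
```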
Google Ads in Google Search results
Google is a company that does many different things. Most people think of Google only as a search engine. However, Google is also an advertising company, an e-mail provider, storage service, maps provider, software developer, video and media company – and many other things. To give you an idea of Google’s reach, just look at the Google products page. Google is massive and has a hand in most things related to technology.
As an advertising company, Google shows ads (related to your search) in the search results. Companies buy this ad space from Google to display their advertisements. I was very surprised when I heard that nearly half of searchers can’t tell ads from search results. In the video, Matt makes it quite clear which are paid advertisements and which are organic search results.