How Google Works?
Posted on 4th Apr 2016 09:09:11 in Digital Marketing
By: Elamurugu Sundararaj
Google search is built on 4 major functions:
- Web Crawling
- Indexing
- Performing the Search
- Search Display or Results
- When we search using Google, we are actually searching through an index of web pages. To gather the raw material for the index, Google’s web crawling robot, called Googlebot, sends a request to a web server for a web page and then downloads the page.
- When Googlebot downloads a web page, it finds all the links on it and adds them to a queue, so that each linked page is crawled in turn and its information gathered. It repeats this process for every new page it finds and stores the data. This technique is called deep crawling.
- To keep Google’s index as up to date as possible, Googlebot needs to crawl the same pages repeatedly. Sites that change frequently, such as news sites, need to be crawled constantly throughout the day, while sites that rarely change may only need to be crawled once a month. Googlebot performs calculations on the pages it crawls to determine how often they change, and based on that decides how often to crawl each site. Crawls of pages that change frequently, and so must be visited frequently, are called fresh crawls.
- Googlebot extracts the full text of every page it visits and sends that information to the indexer.
- The indexer receives the text and stores it in the database. The index is sorted alphabetically by search term. Each index entry contains the list of pages on which the search term or keyword appears. The indexer doesn’t index commonly used words, called stop-words, such as the, on, is, or, of, and, and why. It also doesn’t store single digits, single letters and some punctuation marks.
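The crawling and indexing steps above can be sketched in a few lines of Python. This is only a toy illustration, not Google's implementation: the `PAGES` dictionary is a made-up, in-memory stand-in for the web (no real HTTP requests are made), the function names `crawl` and `build_index` are hypothetical, and the stop-word list contains only the examples named in the text.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("the quick brown fox", ["b.example", "c.example"]),
    "b.example": ("fox and the hound", ["c.example"]),
    "c.example": ("quick start guide", ["a.example"]),
}

# Illustrative stop-word list, limited to the examples given in the text.
STOP_WORDS = {"the", "on", "is", "or", "of", "and", "why"}


def crawl(seed):
    """Visit every page reachable from seed, following links via a queue."""
    queue = deque([seed])
    seen = {seed}
    fetched = {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]  # stands in for requesting and downloading a page
        fetched[url] = text
        for link in links:
            # Queue each newly discovered link exactly once.
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched):
    """Map each search term to the sorted list of pages that contain it."""
    index = {}
    for url, text in fetched.items():
        for word in text.lower().split():
            # Skip stop-words and single characters, as described above.
            if word in STOP_WORDS or len(word) == 1:
                continue
            index.setdefault(word, set()).add(url)
    # Store terms in alphabetical order, mirroring the index layout in the text.
    return {term: sorted(urls) for term, urls in sorted(index.items())}


index = build_index(crawl("a.example"))
print(index["fox"])
print(index["quick"])
```

Looking up a term in `index` returns the pages it appears on, which is the lookup the search steps below rely on. A real crawler would also need politeness rules, error handling and a revisit schedule, all of which are left out here.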
Performing the Search
- When we visit Google, the page we see is delivered to us by a normal web server. When we type in our query, it’s sent back to the web server. The web server then takes the query and forwards it to Google’s index servers.
- Google’s index servers receive our request and match it to the most relevant documents. The method Google uses to match queries to documents is Google’s “secret sauce”, the key to its ability to return the most relevant results. Google uses hundreds of factors to decide which documents are most relevant, including how popular the page is (a measure called Google PageRank), where the search term is found within the page and, if we use multiple search terms, how close those terms are to one another on the page. PageRank doesn’t stop at examining the sheer popularity of a page. If a page is linked to from popular pages, that page will have a higher rank than if it is linked to from unpopular pages.
- Once the index servers have determined the results of the search, they send the query to Google’s doc servers. The doc servers retrieve the stored documents, which include the site name, links and snippets that summarize each page.
- The doc servers send the results back to the web server, which in turn sends the results to the person doing the search. The user browses through the results and can click a link to get to any page.
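The idea that a link from a popular page counts for more than a link from an unpopular one can be illustrated with a simplified version of the published PageRank algorithm. This is a sketch under stated assumptions, not Google's production ranking: the link graph `LINKS` is invented, the damping factor of 0.85 and the 50 iterations are conventional illustrative choices, and the hundreds of other ranking factors mentioned above are ignored.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: "hub" is popular (three pages link to it) and links to "a";
# "loner" has no inbound links and links to "b".
LINKS = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["a"],
    "loner": ["b"],
    "a": [],
    "b": [],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

Pages `a` and `b` each have exactly one inbound link, yet `a` ends up ranked higher because its single link comes from the popular `hub` page rather than from `loner`.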