Since Tim Berners-Lee published the world’s first web page back in August 1991, the Internet has grown exponentially to become the world’s greatest source of data and information. The World Wide Web has transformed almost every aspect of modern life, allowing levels of interaction and forms of communication that were unimaginable just 30 years ago. By conservative estimates, the Internet is now thought to consist of at least 1,200 petabytes of data. In reality, the web is much bigger, but with servers and web services being added all the time, it’s nearly impossible to ascertain a truly accurate figure. Rather, this estimate is based on the amount of data thought to be stored by the four big tech companies: Microsoft, Amazon, Google and Facebook. Nonetheless, even just working with these figures, it’s hard to get your head around the size of the web. To put 1,200 petabytes into perspective, it’s estimated that the human brain can store around 2.5 PB of memory data. 1 PB is equivalent to taking over 4,000 digital photos per day for your entire life. A high-end laptop these days features a 1 TB hard drive (or 1,000 gigabytes of storage), and one million gigabytes equals a petabyte. 1,200 petabytes is a truly colossal amount of data.
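To make the scale concrete, the unit arithmetic above can be checked in a few lines of Python. This is only a rough sketch using the decimal units quoted in the text (1 TB = 1,000 GB, 1 PB = 1,000,000 GB); the function and variable names are made up for illustration.

```python
# Decimal storage units, as quoted in the text above.
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000

def petabytes_to_terabytes(petabytes):
    """Convert petabytes to terabytes by going through gigabytes."""
    return petabytes * GB_PER_PB / GB_PER_TB

web_estimate_pb = 1_200  # the conservative estimate quoted above
laptop_drives = petabytes_to_terabytes(web_estimate_pb)  # one 1 TB drive each

print(f"{web_estimate_pb:,} PB is about {laptop_drives:,.0f} one-terabyte laptop drives")
```

In other words, the estimate works out to roughly 1.2 million high-end laptop drives.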
The Unstoppable Rise of Search
Amongst this sea of data, search engines like Baidu, Bing and Google have become invaluable tools to find, organize and distribute data. Search is widely considered to have been the driving force behind the growth of the Internet, making sense of the immeasurable swathes of content online and helping us identify and locate information in a timely fashion. Without search engines, the Internet would amount to an incomprehensible jumble of sites and pages: a largely unusable collection of unstructured and unindexed data.
As well as the large, generic search engines, recent years have seen huge growth in specialized, industry-specific search engines for everything from flights to holiday accommodation. Sites like Airbnb.com, for vacation rentals, and Octopart.com, an electronic parts search engine, fill the gap that the more generic search engines simply can’t. That said, it’s quite likely you’ll first find these specialist indexes through an initial search on Google.
Crawling and Indexing
Search engines reference and organize data on the Internet by sending out crawlers (often referred to as bots or spiders) to traverse the web, find sites and pages and follow links to new content. The bots then return this data to a centralized index, containing key information including:
- The site’s URL (i.e. web address) and the names and addresses of its individual pages
- The topic of a site or page, ascertained by ‘reading’ its content and looking for keywords
- How recently a page or site has been updated (i.e. is the content current)
- How many inward links point to a particular site
- The popularity of a website
- User engagement levels
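The crawl-and-index loop described above can be sketched in Python. This is a toy illustration only, not how any real search engine works: the "web" is a hypothetical in-memory dictionary rather than live HTTP requests, the names `TOY_WEB` and `crawl` are invented for the example, and only three of the listed signals (address, keywords, inward links) are recorded.

```python
from collections import Counter, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("cheap flights and hotel deals", ["b.example/", "c.example/"]),
    "b.example/": ("electronic parts search", ["c.example/"]),
    "c.example/": ("holiday flights", ["a.example/"]),
}

def crawl(seed, web):
    """Follow links breadth-first from a seed URL and build a small index."""
    index = {}
    inbound = Counter()      # how many crawled links point at each URL
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # "Read" the page by counting its words as crude keywords.
        index[url] = {"keywords": Counter(text.split())}
        for link in links:
            inbound[link] += 1
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    # Attach the inward-link counts once the crawl has finished.
    for url in index:
        index[url]["inbound_links"] = inbound[url]
    return index

index = crawl("a.example/", TOY_WEB)
for url, entry in index.items():
    print(url, entry["inbound_links"], dict(entry["keywords"]))
```

A real crawler would also need to fetch pages over the network, respect each site's crawling rules, and revisit pages to track how recently they were updated; those parts are deliberately left out here.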
Search Engines and Algorithms
Amassing this level of detail about sites and their pages allows the search engine to build up a complex database. By applying an algorithm to this database, search engines can then generate specific results based on user searches. Without question, the defining factor in the outstanding growth of Google in its early days was the search engine’s unique algorithm, initially based on the idea of PageRank. Google’s founders, Sergey Brin and Larry Page, were the first to devise this more advanced algorithm, which gauged the perceived popularity and quality of a site’s content by studying its inward links. The rest is history: Google quickly went on to become the world’s most popular search engine, and it is still the most visited website in the world today, accounting for around 92% of all online searches.
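To give a feel for how inward links can be turned into a score, here is a minimal sketch of the simplified, textbook form of PageRank. It is not Google's production algorithm. The four-page link graph is made up, and the damping factor of 0.85 and the fixed 50 iterations are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to;
    every linked page must also appear as a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page receives a small baseline share of the total score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its damped score evenly among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C collects links from three other pages and ends up with the highest score, while page D, which nothing links to, ends up with the lowest.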
And Just When You Think You Figured Out Search… Google Changes the Game
Google’s algorithms are a complex system used to retrieve data from its search index and instantly deliver the best possible results for a query. The search engine uses a combination of algorithms and numerous ranking signals to deliver webpages ranked by relevance on its search engine results pages (or “SERPs”). In its early years, Google only made a handful of updates to its algorithms. Now, Google makes thousands of changes every year. Most of these updates are so slight that they go completely unnoticed. However, on occasion, the search engine rolls out major algorithmic updates that significantly impact SERPs and the marketers who rely on Google to drive traffic to their sites. Its most recent core update was rolled out in May 2020. Broad core updates like this are designed to produce widely noticeable effects across search results in all countries and all languages. Sites will inevitably notice drops or gains in search rankings when a core update rolls out. Changes in search rankings are generally a reflection of content relevancy: if content has gained relevancy since the last update, it will be moved higher up in the rankings, and if it has lost relevancy, it will be moved lower. Then there’s newly published content that didn’t exist at the time of the last update, all of which has to be assessed against previously existing content. To put it simply, rankings can move around quite a bit!