Well, I’ll start by answering my own question: Google is a brilliant search engine, a fantastic invention when it came on the scene 15 years ago, and never surpassed. It’s a huge and very valuable company, and search is still its core business. But…
In certain respects it has got worse. 1. The cache is harder to find and sometimes absent. 2. The buying-up of Usenet newsgroups has helped to preserve them, but at the expense of a bastardisation of the system and some problems of data retrieval. 3. Attempts to diversify have only had a 50-50 success rate. Buzz buzzed off, Wave came and ebbed, G+ is still a niche product, Chrome is okay. 4. Its purity has been compromised by the ownership of YouTube, which it promotes ahead of other video sites, and by the continually increasing amount of advertising on the main search results page, which makes it especially difficult for children to use Google effectively.
But all that’s by the by. The topic I come back to time and again is Results. When you click Search, the first thing you see is something like “About 10,000,000 results”, and these numbers are quoted far and wide, often as an excuse for doing no real research whatsoever. The problem is that this figure is plucked out of thin air. Well, I exaggerate, but the algorithm that spews it out is little better than woo. It’s a homeopathic product, a placebo designed to comfort the user. This is a shame, and I’m at a loss to understand why Larry Page treats his customers in this way. I understand that coming up with an accurate figure within a fraction of a second is difficult, but is it unreasonable to expect a ballpark figure?
The wanton inaccuracy is partly a by-product of Google’s increasing desire to second-guess the user by correcting our typos and eggcorns for us. The hit count in a way reflects all the possible hits for any part of our search string and possible variations thereof. As you can imagine, this inflates the results massively.
Example 1: dadge – “228,000 results”. Actual results: 525.
Example 2: Az a baj – “20,200,000 results”. Start going through the pages of results and you quickly find that most of the pages don’t contain this whole phrase at all. (It means “That’s the problem”, by the way.) To correct this, you have to click “More search tools” on the left and then click “Verbatim”. The revised figure is 61,000,000 results(!) but at least the required phrase exists in the listed webpages. Actual results: 454.
So, people, don’t forget your grain of salt when you’re using Google.
Update: I’ll add more examples as I find them “in the wild”.
“the ongoing war in Iraq”, quoted figure: over 700,000,
current figure: “269,000 results”,
actual results: 440.
Downton Abbey, quoted figure: 109 million,
downton abbey: 59.2 million,
“downton abbey”: 19.7 million,
downton abbey (Verbatim): 101 million,
“downton abbey” (Verbatim): 68.7 million,
actual number of webpages listed: 757, (Verbatim) 415.
Every single one of these numbers is wrong!
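To put a rough size on the gap, here is a minimal Python sketch that divides the headline figure by the number of results you can actually page through. The numbers are copied from the examples above; the function name `inflation_factor` is my own invention, and pairing the “Az a baj” count of 454 with the Verbatim figure is an assumption on my part. I have left out Downton Abbey because it isn’t clear which of the four searches the two page counts belong to.

```python
# Figures copied from the post: (query, headline hit count, results actually listed).
# Pairing "Az a baj" with its Verbatim figure is an assumption, not something
# the headline numbers themselves confirm.
FIGURES = [
    ("dadge", 228_000, 525),
    ("Az a baj (Verbatim)", 61_000_000, 454),
    ('"the ongoing war in Iraq"', 269_000, 440),
]


def inflation_factor(estimated: int, listed: int) -> float:
    """Return how many times larger the headline figure is than the listed count."""
    return estimated / listed


for query, estimated, listed in FIGURES:
    factor = inflation_factor(estimated, listed)
    print(f"{query:28} {estimated:>12,} / {listed:>4} = roughly {factor:,.0f}x")
```

On these three searches the headline figure comes out somewhere between a few hundred and over a hundred thousand times the number of pages Google will actually show you.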