Search Engine Showdown

Search Engines Statistics: Database Overlap
by Greg R. Notess

Data from March 6, 2002

Pie chart: Little overlap despite database growth!

Searches Used: 4 small ones
Total Hits: 334
Specific Pages: 141

This analysis compares the results of four small searches run on ten different search engines. The four searches found a total of 334 hits, 141 of which represented specific Web pages. Of those 141 hits, 71 were found by only one of the ten search engines while another 30 were found by only two.

Several of the largest search engines have grown significantly since Feb. 2000, when the overlap comparison was last run. Even so, about half of all pages found (71 of 141) were found by only one of the search engines, and not always the same one. Over 78% were found by three search engines at most. Each pie slice in the chart represents the number of hits found by the given number of search engines. For example, the "by 1 (71)" slice represents the hits found by one (and only one) search engine.
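The tally behind such a chart can be sketched in a few lines of Python. This is only an illustration of the counting idea, not the method actually used for this analysis: the function name `overlap_distribution`, the engine names, and the page identifiers are all hypothetical, and the toy data is invented for the example.

```python
from collections import Counter


def overlap_distribution(results):
    """Return a Counter mapping 'number of engines that found a page'
    to 'number of pages found by exactly that many engines'.

    results: dict of engine name -> set of page URLs that engine returned.
    """
    engines_per_page = Counter()
    for pages in results.values():
        # set() guards against an engine listing the same page twice
        for url in set(pages):
            engines_per_page[url] += 1
    return Counter(engines_per_page.values())


# Hypothetical toy data (NOT from the study): three engines, five pages.
toy_results = {
    "EngineA": {"p1", "p2", "p3"},
    "EngineB": {"p2", "p3", "p4"},
    "EngineC": {"p3", "p5"},
}

distribution = overlap_distribution(toy_results)
total_pages = sum(distribution.values())

# One line per pie slice, labeled in the same "by N (count)" style.
for n_engines in sorted(distribution):
    count = distribution[n_engines]
    print(f"by {n_engines} ({count}): {count / total_pages:.1%} of pages")
```

In the toy data, three of the five pages are found by a single engine, one by two engines, and one by all three. Applied to the figures reported here, the same arithmetic gives 71 / 141, or roughly 50%, for the "by 1" slice.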

Even among the three Inktomi-based databases (iWon, MSN Search, and HotBot), the results did not overlap completely. However, the Inktomi databases are relatively similar.

Search Engines Analyzed:

  • AltaVista
  • AllTheWeb
  • Direct Hit
  • Google
  • NLResearch
  • Teoma
  • WiseNut
  • And the Inktomi crew:
    • iWon
    • MSN Search
    • HotBot

See the more detailed analysis of unique hits to gain a sense of how the 71 pages found by only one search engine were distributed.

Previous Comparisons:

  • Feb. 2000: Five searches on fourteen search engines. 795 hits, 298 unique pages. 110 found by only one search engine.
  • Sept. 1999: Five searches on thirteen search engines. 326 hits, 140 unique pages. 66 found by only one search engine.
  • May 1999: Five searches on eleven search engines. 267 hits, 122 unique pages. Over half found by only one search engine.
  • March 1999: Four searches on ten search engines. 202 hits, 97 unique pages. None found by more than five search engines.
  • Jan. 1999: Four searches on ten search engines. 176 hits, 83 unique pages. None found by more than six search engines.
  • August 1998: Four searches on five search engines. 103 hits, 70 unique pages. None found by all five search engines.
  • May 1998: Four searches on five search engines. 95 hits, 77 unique pages. None found by all five.
  • Feb. 1998: Four searches on five search engines. 103 hits, 62 unique pages. Three found by all five search engines.
  • October 1997: Four different searches on four search engines. 220 hits, 12 found by all four.
  • September 1997 and June 1997 found no pages in common among four small searches on the four largest search engines at those times. (No charts available.)

While decisions about which Web search engine to use should not be based on size alone, this information is especially important when looking for very specific keywords, phrases, and areas of specialized interest. See also the following statistical analyses: