Search Engine Showdown


All the Web moves into first
Search Engine Statistics: Relative Size Showdown
by Greg R. Notess

Data from search engine analysis run on Oct. 9, 2000.

[Bar chart (5K): Fast's All the Web finds most; Google a close second; Northern Light passes AltaVista]

This special analysis compared only the five largest search engines at the time of Fast's launch of its newly enlarged database, with iWon's advanced search representing the Inktomi GEN3 database. For these 25 completely new single-word queries, All the Web found more total hits than any other search engine. Google came in a close second. Northern Light passed both AltaVista and iWon to move into third place.

When analyzed using the total number of verified search results from all 25 searches, the Fast database at All the Web ranked first. The exact total number of hits for each of the search engines is as follows:

All the Web: 4,929
Google: 4,720
Northern Light: 3,768
iWon Advanced Search: 3,608
AltaVista: 2,904
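As a quick check, the ranking above follows directly from sorting these verified totals. The following is a minimal sketch; the figures are the per-engine totals reported in this article:

```python
# Verified total hits across all 25 searches, as reported in this article.
totals = {
    "All the Web": 4929,
    "Google": 4720,
    "Northern Light": 3768,
    "iWon Advanced Search": 3608,
    "AltaVista": 2904,
}

# Rank engines from most to fewest verified hits.
ranking = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
for rank, (engine, hits) in enumerate(ranking, start=1):
    print(f"{rank}. {engine}: {hits:,}")
```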

However, just because Fast found more total hits does not mean that it will always find more hits on individual searches. On some of the 25 searches, other search engines found more than Fast. Google actually had more first-place finishes on individual searches than Fast, even though it had fewer total hits. In two cases, there were ties for first place.

Google: Top score on 13 out of 25 searches, with two ties (one with Northern Light and one with iWon)
All the Web: Top score on 10 out of 25 searches, with no ties
iWon Advanced: Top score on 3 out of 25 searches, with one tie with Google
Northern Light: Tied for top score with Google on 1 out of 25 searches
AltaVista: No top scores on any of these 25 searches

This comparison is based on the reported number of hits from each database, verified by visiting the last page of results whenever possible. The number of records that many search engines can display is often different from the number that the search engine first reports. While this comparison is not a measure based on precision, recall, or relevance, it is an important indicator of the number of records that a searcher can find. It measures the effective database size. For earlier size showdown winners, see the links to older reports and the top three from each at the bottom of this page.

Specific Database Notes

Fast is available at several sites, most notably All the Web and Lycos. Since the new Fast database was first made available at All the Web, that search engine was used for this comparison. According to Fast, the new database was scheduled to be made available on Lycos by Oct. 15, 2000.

Google includes some results (URLs) that it has not actually indexed. When it counts all the indexed and unindexed URLs, it claims over one billion. But as these examples show, the effective size is considerably less, since most searchers will see very few of the unindexed hits. These URLs that have not been crawled can be readily identified by the lack of an extract and the "cached" link. Google also clusters results by site and will only display two pages per site, with additional hits available under the [ More results from . . . ] link. The numbers used here were painstakingly derived by checking all hits for each site, not just the ones that Google displayed initially. In addition, when Google finds fewer than 1,000 results, a note after the last record states:

In order to show you the most relevant results, we have omitted some entries very similar to those already displayed. If you like, you can repeat the search with the omitted results included.

Clicking the "repeat the search" option will bring up more pages, some of which are near or exact duplicates of pages already found while others are pages that were clustered under a site listing. To get the true total count of all pages that Google can retrieve, these extra pages were also included. However, from a practical searching perspective, most searchers would never find these pages.
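The bookkeeping behind these Google totals amounts to simple arithmetic: the initially displayed hits, plus the pages tucked under each clustered site's [ More results from . . . ] link, plus the pages revealed by repeating the search with omitted results included. The function below is a hypothetical illustration of that tally, not Google's or this study's actual tooling, and the example numbers are invented:

```python
def google_effective_count(initial_hits, more_from_site, omitted_repeats):
    """Hypothetical tally of Google's effective hit count for one query.

    initial_hits    -- results shown on the default results pages
    more_from_site  -- extra pages found under each "[ More results from ... ]"
                       link, one count per clustered site
    omitted_repeats -- additional pages shown after choosing to "repeat the
                       search with the omitted results included"
    """
    return initial_hits + sum(more_from_site) + omitted_repeats

# Illustrative numbers only (not figures from this study):
total = google_effective_count(180, [4, 2, 7], 11)
print(total)  # 204
```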

iWon Advanced Search pulls its records from the Inktomi GEN3 database. On the basic iWon search, only one page per Web site is shown, with no access to the additional pages that might have been found on that site. The Advanced Search shows all pages, unclustered by site. Since iWon found the most hits of any Inktomi partner in the July 2000 comparison, it was used in this analysis.

Northern Light automatically recognizes and searches English-form word variants and plurals. For that reason, only non-plural terms were used. Only the Web portion of Northern Light was searched, not its Special Collection. Northern Light also clusters hits by site, with no ability to disable the site clustering. The number of reported hits was used, rather than trying to verify the number under each site. Northern Light is typically fairly accurate in its counts and presents both the total number of hits and the number of sites.

AltaVista clusters results, but this analysis used the Advanced Search with the option set so that results were not clustered by site. AltaVista is notorious for inconsistencies in reporting the number of hits it finds. (Although to be fair, most of the other search engines compared this time had similar inconsistencies in the number of hits reported and what they actually displayed.) Each search result set was checked and only the number of hits available for display was counted. Since the advanced search can only display the first 1,000 results, none of the search terms found more than that number. Because AltaVista can time out on a search and not give a full results set, their total database size may be under-represented here. However, it does reflect what searchers can find when using AltaVista.

Other search engines beyond the top five were not included in this study. Since Excite has had some stronger showings in the past, the first few searches were also run on its English-language database, but since Excite was not finding nearly as many hits as the top five, it too was excluded from this analysis.

Disclaimer: This special size analysis was funded in part by Fast Search and Transfer, Inc. in conjunction with the launch of their new database. It uses the same techniques and methodology as the regular, unfunded Search Engine Showdown comparisons. The 25 search terms were all different from the usual ones but were chosen without input from Fast. Fast did determine the timing of the comparison (right after their new database was launched) and did grant permission to publish these results. In no way did this funding influence the results.

More details on the study's methodology provide an example of the comparison process used here.

Older Reports with Largest Three at that Time
July 2000: iWon, Google, AltaVista
April 2000: Fast, AltaVista, Northern Light
Feb. 2000: Fast, Northern Light, AltaVista
Jan. 2000 (supplement): Fast, Northern Light, AltaVista
Nov. 1999: Northern Light, Fast, AltaVista
Sept. 1999: Fast, Northern Light, AltaVista
Aug. 1999: Fast, Northern Light, AltaVista
May 1999: Northern Light, AltaVista, Anzwers
March 1999: Northern Light, AltaVista, HotBot
January 1999: Northern Light, AltaVista, HotBot
August 1998: AltaVista, Northern Light, HotBot
May 1998: AltaVista, HotBot, Northern Light
February 1998: HotBot, AltaVista, Northern Light
October 1997: AltaVista, HotBot, Northern Light
September 1997: Northern Light, Excite, HotBot
June 1997: HotBot, AltaVista, Infoseek
October 1996: HotBot, Excite, AltaVista

While decisions about which Web search engine to use should not be based on size alone, this information is especially important when looking for very specific keywords, phrases, and areas of specialized interest. See also the following statistical analyses: