Greg R. Notess
ON THE NET
Searching for Current News
DATABASE, June 1999
...the general Web search engines are not very effective for searching news sites.
Most of the portals, search engines, and subject directories offer top headlines, stock quotes, sports scores, and other popular news items. Finding breaking news and popular topics is generally straightforward on the Web of today. But what about a searchable database of news? What is available on the Internet for free that provides the most current and some portion of a back file of news articles? And how do these free Web products compare to the commercial news databases?
Instead, searching for news beyond today's top headlines calls for the use of specialized, searchable news databases. TotalNEWS, News Index, Excite's NewsTracker, Northern Light's Current News, and the news sections on Yahoo!, HotBot, and Infoseek all offer such a searchable database.
All these news databases index freely available news stories published on the Web by local newspapers, network television news, or other recognized news sources. In many cases, these Web-accessible stories duplicate what would be found in the print (or other) media, but in some cases these sites offer Web-only stories.
These news databases are built just as the general Web search engines build their databases. Spider programs visit all the designated news sites and index the available pages. The news database spiders are programmed to visit far fewer sites, but they visit them much more frequently to be able to keep up with the most current news stories. Some news databases cover newswire stories. Rather than being indexed by a spider traveling to external sites, the wire stories may be loaded directly and then indexed.
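The trade-off described above, fewer sites visited more often, can be pictured as a simple revisit schedule. The following is only a rough sketch in Python, not how any of these services actually runs its spider. The site names and revisit intervals are hypothetical, and no pages are fetched; the function just lists when each visit would fall.

```python
import heapq

def visit_schedule(interval_minutes, horizon_minutes):
    """Return (minute, site) pairs for every planned visit before the horizon.

    interval_minutes maps a site name to its revisit interval (must be > 0).
    Both the sites and the intervals are hypothetical illustration values.
    """
    # Every site is due for its first visit at minute 0.
    queue = [(0, site) for site in interval_minutes]
    heapq.heapify(queue)
    visits = []
    while queue:
        minute, site = heapq.heappop(queue)
        if minute >= horizon_minutes:
            continue  # past the horizon: drop it and do not reschedule
        visits.append((minute, site))
        # Schedule the next visit to the same site.
        heapq.heappush(queue, (minute + interval_minutes[site], site))
    return visits

# Hypothetical sites: one revisited every 30 minutes, one every hour.
plan = visit_schedule({"news-a.example": 30, "news-b.example": 60}, 120)
for minute, site in plan:
    print(f"minute {minute:3d}: visit {site}")
```

Over two hours the half-hourly site is visited four times and the hourly site twice, which is the sense in which a short list of sources can be kept far more current than a general Web index.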
TotalNEWS defaults to a Boolean AND operation on searches with more than one term. No OR, nesting, or + or - symbols can be used. Phrase searching is available with the unofficial standard of surrounding the phrase with double quotes. TotalNEWS also exhibits some peculiar behavior. After finding several records on one search, I tried searching for a phrase displayed by one of the results. That secondary search found nothing, not even the record found by the previous search. Apparently, the TotalNEWS spider does not successfully index the full content of all the pages in its database.
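To make the search rules concrete, here is a minimal Python sketch of a default AND with double-quoted phrases. It is an illustration of the behavior described above, not TotalNEWS's actual code; the function names and the sample story text are made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def parse_query(query):
    """Split a query into quoted phrases and the remaining single terms."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]+"', " ", query)
    return [tokenize(p) for p in phrases], tokenize(remainder)

def matches(query, text):
    """True only if every term AND every quoted phrase occurs in the text."""
    phrase_tokens, terms = parse_query(query)
    words = tokenize(text)
    # Pad with spaces so a phrase only matches on whole-word boundaries.
    padded = " " + " ".join(words) + " "
    has_terms = all(term in words for term in terms)
    has_phrases = all(" " + " ".join(p) + " " in padded for p in phrase_tokens)
    return has_terms and has_phrases

# Hypothetical story text, for illustration only.
story = "Mongolia signs a maritime trade agreement with its neighbors"
print(matches("mongolia maritime", story))    # both terms present
print(matches("mongolia kalamazoo", story))   # one term missing
print(matches('"trade agreement"', story))    # exact phrase present
print(matches('"agreement trade"', story))    # words present, wrong order
```

Under these rules, a phrase copied straight from a displayed record should always retrieve that record if the full text was indexed, which is why the failed secondary search points to incomplete indexing.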
News Index does not display dates, but the documentation specifically states that it is not an archive. News Index is designed to help find current articles. Even so, it appears to have about a week's worth of stories. Given the lack of a list of sources and only a relatively recent time frame, News Index can deliver surprisingly high numbers of hits when compared to the archive available from TotalNEWS.
Searching under the Current News tab works like other searching in Northern Light. Full Boolean and phrase searching, the + and - symbols, and truncation are all supported. There are no stopwords and no case sensitivity. Results can be sorted by date or relevance, and broad subject categories can be used as limits. With its 15-minute updates, Northern Light's Current News is the most up-to-date of these searchable news databases, at least for the newswire stories.
On the back end, it covers date searching up to a month of archived news. Search features include almost full Boolean searching, supporting AND, OR, and nesting. Despite the advice above the search box, the NOT operator results in an error message. Truncation, phrase searching, and date sorting are available. So while the HotBot News Channel covers neither the most sites nor the oldest material, it has good searching capability and is the most current of the news Web site indexes.
The search features are like the general Infoseek features. It supports case recognition, the + - system, and phrase searching, but has no Boolean operators or truncation. Unlike most of the other news databases, there are no stopwords in Infoseek's databases. The Advanced News search offers several breakdowns of the News databases, including separate searching of Reuters, PR, and Business Wire. There is an option of All News Sources which sounds like it would combine both the News Wires and the National News databases. Unfortunately, it actually just searches the News Wires database and not the National News sites.
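For readers less familiar with the + - system, the Python sketch below shows one common reading of it: plain terms are ORed, a + term must appear, and a - term must not. This is an assumption about the general convention rather than a description of Infoseek's internals, and it leaves out case recognition and phrase searching. The function name and sample text are hypothetical.

```python
import re

def matches_plus_minus(query, text):
    """Apply +required / -excluded terms, ORing any plain terms."""
    words = set(re.findall(r"\w+", text.lower()))
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    if any(term in words for term in excluded):
        return False  # an excluded term is present
    if not all(term in words for term in required):
        return False  # a required term is missing
    if required:
        return True   # assumption: plain terms would then only affect ranking
    # No + terms: at least one plain term must match (implicit OR).
    return any(term in words for term in optional)

# Hypothetical story text, for illustration only.
text = "Tosco refinery in Kalamazoo reports maritime shipping delays"
print(matches_plus_minus("mongolia kalamazoo", text))   # OR: one term is enough
print(matches_plus_minus("+mongolia kalamazoo", text))  # required term missing
print(matches_plus_minus("+tosco -maritime", text))     # excluded term present
```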
More recently, Excite added newswires to the search as well. These are separated and displayed above the news Web sites. The wires include only very recent news. By putting them on the page with the Web News for older coverage, Excite offers a broad spectrum of news resources.
Searches on Yahoo! News default to an AND, but no Boolean operators can be used. The + and - symbols and phrase searching are available. Yahoo!'s News section is designed more for browsing than searching, but it functions as both.
Some of these news databases are integrated, to a degree, with general search engines. Excite and Yahoo! present easy access to their news search engine results from the results of a general search. Infoseek takes a slightly different approach in that a news search can be chosen from the main Infoseek search box on the top of the Go page. Northern Light and HotBot have separate links for their news searches.
In short, the news search engines show no consistency in quantity of results. Since they index different sources and cover different periods of time, one might expect to find one with considerably more coverage than the others. But if these four search examples are representative of the larger databases, the database with the most results varies depending on the search term and the date of the search as well.
For one reason, some of the Web-based news resources are not indexed by the commercial databases. And they are free. While certainly not yet up to the standards of their commercial cousins, they are improving in both search features and in database size, scope, and coverage. Northern Light offers very current newswires while HotBot's News provides frequent indexing of Web-based news sites. Both can sometimes be more current than many of their commercial cousins.
One of the biggest drawbacks with these searchable databases of news is their slim coverage of older material. Yet more of the news sites, from Pathfinder's Time to local newspapers, are establishing ever-deeper archives. Since none of the search engines covers all the archives, the archives need to be searched separately. Just go directly to each news site's own Web site.
For a metasite providing links to individual news archives on the Web, try the SLA News Division's "Newspaper Archives on the Web" (http://metalab.unc.edu/slanews/internet/archives.html). Arranged by state, the basic table format features the name of the paper, its city, links to the archives, date range of the archive, and cost, if any.
News is a popular information commodity on the Web. While many sites offer top headlines for browsing, researchers looking for more details can delve into these news search engines for a look beyond the headlines.
Excite's NewsTracker
http://nt.excite.com
HotBot News Channel
http://news.hotbot.com
Infoseek News
http://www.infoseek.com/news
News Index
http://www.newsindex.com
Newspaper Archives on the Web
http://metalab.unc.edu/slanews/internet/archives.html
Northern Light's Current News
http://www.northernlight.com/news.html
TotalNEWS
http://www.totalnews.com
Yahoo! News
http://dailynews.yahoo.com
Database | Default | Boolean | Case | Phrase | Dates | Updates | Source |
---|---|---|---|---|---|---|---|
TotalNEWS | AND | AND | No | Yes | Year | hours | Web sites |
News Index | OR | AND, OR | No | No | Week | hour | Web sites |
Excite's NewsTracker | OR | Full, +, - | No | Yes | Months | hours | Wires & Web |
Northern Light's Current News | AND | Full, +, - | No | Yes | 2 weeks | 15 minutes | Newswires |
HotBot News Channel | Phrase | AND, OR, ( ) | No | Yes | Month | 30 minutes | Web sites |
Infoseek News | OR | +, - | Yes | Yes | Month | hours | Wires & Web |
Yahoo! News | AND | +, - | No | Yes | Week | hours | Wires & Web |
search term: | mongolia | kalamazoo | maritime | tosco |
---|---|---|---|---|
TotalNEWS | 94 | 31 | 43 | 16 |
News Index | 6 | 19 | 58 | 20 |
Excite's NewsTracker | 13 | 42 | 69 | 7 |
Northern Light's Current News | 42 | 12 | 117 | 20 |
HotBot News Channel | 37 | 18 | 79 | 38 |
Infoseek National News | 7 | 36 | 20 | 6 |
Infoseek Wires | 23 | 3 | 61 | 11 |
Yahoo! News | 7 | 5 | 40 | 25 |
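As a quick check on the hit counts in the table above, the short Python sketch below picks out which database returned the most results for each search term. The numbers are copied from the table; the variable and function names are just for illustration.

```python
TERMS = ("mongolia", "kalamazoo", "maritime", "tosco")

# Hit counts per database, in the same column order as TERMS.
HITS = {
    "TotalNEWS": (94, 31, 43, 16),
    "News Index": (6, 19, 58, 20),
    "Excite's NewsTracker": (13, 42, 69, 7),
    "Northern Light's Current News": (42, 12, 117, 20),
    "HotBot News Channel": (37, 18, 79, 38),
    "Infoseek National News": (7, 36, 20, 6),
    "Infoseek Wires": (23, 3, 61, 11),
    "Yahoo! News": (7, 5, 40, 25),
}

def leaders(hits, terms):
    """Map each search term to the database with the highest hit count."""
    result = {}
    for column, term in enumerate(terms):
        result[term] = max(hits, key=lambda name: hits[name][column])
    return result

for term, database in leaders(HITS, TERMS).items():
    print(f"{term}: {database}")
```

Each of the four terms has a different leader: TotalNEWS for mongolia, Excite's NewsTracker for kalamazoo, Northern Light's Current News for maritime, and HotBot News Channel for tosco.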
Communications to the author should be addressed to Greg R. Notess, Montana State University Libraries, Bozeman, MT 59717-0332; 406/994-6563; greg@notess.com ; http://www.notess.com.
Copyright © 1999, Online Inc. All rights reserved.