History of Search Engines

Last updated: July 25, 2005

(1) Search-Database Service
(2) Search Engines: Crawlers, Robots, Spiders
(3) Timeline


(1) Search-Database Service


There are three major players in search-database services in the U.S. as well as in Japan.  Yahoo! is an important player because it is one of the oldest directory services on the World Wide Web (since February 1994) and is still the most popular.  Google is not as old as Yahoo!.  It started in September 1998 and now has an index of more than 8 billion web pages (as of November 2004).  MSN Search also started in September 1998.  Its relative importance comes from the facts that Microsoft has incorporated Internet Explorer into the Windows operating system and that Internet Explorer's default home page is www.msn.com (for North America).  MSN Search is also popular because of Microsoft's Internet service.  MSN as an Internet service provider has 9 million subscribers in the U.S. and is the second largest, although it is much smaller than Time Warner's America Online, which has 26.5 million subscribers.  It is no surprise that msn.com is the second most frequently visited site in the U.S., after yahoo.com.
     One of the most powerful and oldest searchable directories came into existence in 1994.  David Filo and Jerry Yang, Ph.D. candidates in electrical engineering at Stanford University, created a searchable directory to organize a collection of their favorite web sites, and they founded Yahoo!.  As the number of search requests increased, they developed a searchable database with descriptions and categories.  For a long time, Yahoo! manually entered and categorized the sites that were submitted.  As a result, its number of listings is far smaller than Google's, although Yahoo! now automates some of the tasks.

     Yahoo!'s free submission service has ended for commercial sites and, in some countries including Japan, even for personal sites.  There have been many changes for commercial sites since Yahoo!'s acquisition of Inktomi and Overture.  Overture brought to Yahoo! a pay-per-click advertising service, which allows business owners to bid for the placement of their web sites in the sponsored search listings.  Inktomi's Site Match, which has since been renamed Search Submit Express, accepts submission of a web site for review for a one-time fee of $49 plus a cost-per-click fee.  Upon acceptance, it guarantees a revisit by Inktomi's Slurp (crawler) every 48 hours.
     Digital Equipment Corporation's AltaVista, which went online in 1995, was the first search directory to use advanced search techniques such as natural language queries and Boolean operators (AND, OR, NOT).  AltaVista's natural language query let the user enter a sentence, from which the right keywords were picked out.  AltaVista also allowed the user to access newsgroups and retrieve their articles on the web.  Infoseek also began its directory service in 1995.  At the beginning it offered nothing more than Yahoo! did.  It then added services such as UPS tracking and news and became a popular online site.
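
The Boolean operators mentioned above can be illustrated with a small sketch.  It is not AltaVista's implementation; the documents, the function names and the word-to-documents index are all made up for illustration, and each query takes just two keywords.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_and(index, a, b):
    """Documents that contain both words."""
    return sorted(index.get(a, set()) & index.get(b, set()))

def search_or(index, a, b):
    """Documents that contain at least one of the words."""
    return sorted(index.get(a, set()) | index.get(b, set()))

def search_not(index, a, b):
    """Documents that contain the first word but NOT the second."""
    return sorted(index.get(a, set()) - index.get(b, set()))

# Hypothetical three-document collection.
docs = {
    1: "web search engine history",
    2: "web directory history",
    3: "search engine spider",
}
index = build_index(docs)
print(search_and(index, "web", "search"))        # [1]
print(search_or(index, "directory", "spider"))   # [2, 3]
print(search_not(index, "web", "search"))        # [2]
```

Each operator maps onto a set operation (intersection, union, difference), which is why Boolean queries are cheap once a word index exists.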

Once an important listing provider, LookSmart went online in October 1996.  It provided manually edited listings to such major companies as AltaVista, HotBot and MSN in the late 1990s.  Its importance as a major directory provider quickly diminished when its tie with MSN ended in late 2003.  LookSmart also lost standing because it made numerous changes to its submission service fees after the beginning of 2000 and then introduced pay-per-click rates.
     NorthernLight.com went online in 1997 with 30 employees.  It once had one of the largest sets of search directories, offering full-text documents from magazines, journals, books and newswires.  Its diverse online business library is called the Special Collection.  Northern Light ended its free, publicly accessible web search in January 2002.
     Northern Light, which is now part of Northern Light Technology, Inc., offers several members-only, up-to-date, categorized search services focusing on business and industry.
     There isn't much to say about MSN Search.  It was originally launched in September 1998 for Microsoft's Internet Explorer and is used on Internet Explorer's default home page.
     A public search directory service, the Open Directory Project (ODP), also started in 1998.  According to ODP, its human-edited directory is the largest of all, and as of July 2005 it has more than 4 million listings that are edited by about 70,000 people.  Its listings are distributed to hundreds of companies including AOL Search, Gigablast and Google.  ODP is actually owned by AOL/Time Warner.  In keeping with its original spirit, submitting web sites and using the directory data are free of charge.

 

(2) Search Engines: Crawlers, Robots, Spiders

 

In 1993, Matthew Gray, a student at MIT, wrote a Perl script called the World Wide Web Wanderer to measure the size (the number of servers) and the growth of the World Wide Web.  It was later used to capture URLs, and the resulting index of URLs was called the Wandex.  Because it degraded performance across the network (it sometimes accessed the same site hundreds of times a day), Matthew Gray's Wanderer was quite a controversial project.  Even so, the Wanderer was the first robot on the web.
     Martijn Koster's ALIWEB also appeared in 1993, following the Wanderer.  ALIWEB also indexed web sites, but it let webmasters of participating sites post their own page descriptions.  This process did not slow down the network because it did not require large bandwidth.  The problem with this indexing process was that it required the submission of a special indexing file, which many people did not know about.

     Excite, once a popular search engine, has its roots in the development of search software.  In early 1993, six undergraduate students at Stanford University started a project called Architext.  They used statistical analysis of word relationships to make Internet searches more effective, and they released their search software, Architext, for webmasters to use on their own web sites.
     In 1994, Brian Pinkerton, a student at the University of Washington, developed desktop software called WebCrawler.  While other bots stored titles, URLs and the first 100 words or so, WebCrawler indexed entire pages, so users could search the full text of each document.  WebCrawler became so popular that the University of Washington's network was overwhelmed during daytime hours.  WebCrawler's early success was followed within a year by such search engines as Lycos, Infoseek and OpenText.
     Lycos came out of a research project at Carnegie Mellon University in 1994.  Dr. Michael Mauldin's original research aimed to calculate the size of the web using a spider robot called 'wolf spider,' which is Lycosidae in Latin.  The wolf spider walked from site to site through page links, and in no time the Lycos catalog grew to a size that was unimaginable at the time.  By November 1996, Lycos had indexed more than 60 million documents.
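
How a spider walks from site to site through page links can be sketched in a few lines.  This is only an illustration under simplifying assumptions, not the Lycos spider: the web is a small hypothetical dictionary of links (no network access), and the function name crawl is made up.  Keeping a set of pages already seen also avoids fetching the same page over and over, the behavior that made the early Wanderer controversial.

```python
from collections import deque

def crawl(link_graph, start):
    """Follow links breadth-first from start; return pages in visit order."""
    visited = [start]
    seen = {start}            # pages already queued, so none is fetched twice
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                visited.append(target)
                queue.append(target)
    return visited

# Hypothetical link structure: each page maps to the pages it links to.
link_graph = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}
print(crawl(link_graph, "a.example"))
# ['a.example', 'b.example', 'c.example', 'd.example']
```

A real crawler would fetch each page over HTTP and extract its links; the traversal logic stays the same.
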
     Infoseek started providing search results to Netscape in December 1995.  In late 1996, it introduced its full indexing engine, called Ultra, which collected 25 million URLs.
     By mid-1995 there was a problem in pulling search results: different search engines came up with different sets of results.  So, in 1995, Erik Selberg, a Master's student, and Oren Etzioni, an associate professor of computer science, developed MetaCrawler at the University of Washington.  This search engine accessed Lycos, AltaVista, Yahoo!, Excite, WebCrawler and Infoseek simultaneously to come up with the best results.  It gathered the results from the various search engines onto one page and then reformatted them.  SavvySearch, which was developed at Colorado State University, is similar to MetaCrawler.  SavvySearch, which is faster but less reliable than MetaCrawler, accesses up to 20 search engines at a time.
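
The idea of gathering results from several engines onto one page can be sketched as follows.  How MetaCrawler actually ordered its merged results is not described here, so the scoring rule below (adding 1/rank for each engine that returned a URL) is an assumption chosen for illustration, as are the engine names, the URLs and the function name merge_results.

```python
from collections import defaultdict

def merge_results(result_lists):
    """Merge ranked URL lists from several engines into one deduplicated list.

    Assumed scoring rule: a URL earns 1/rank from every engine that lists it,
    so URLs ranked highly by several engines come first.
    """
    scores = defaultdict(float)
    for results in result_lists.values():
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Sort by descending score; break ties alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical ranked results from three engines for the same query.
results = {
    "engine_a": ["x.example", "y.example", "z.example"],
    "engine_b": ["y.example", "x.example"],
    "engine_c": ["y.example", "w.example"],
}
print(merge_results(results))
# ['y.example', 'x.example', 'w.example', 'z.example']
```
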
     In 1996, Inktomi introduced an important spider, HotBot, to the web.  HotBot was originally developed at UC Berkeley by Eric Brewer, an assistant professor of computer science, and Paul Gauthier, a Ph.D. candidate.  It was a powerful search engine at the time because it was said to index 10 million documents per day.
     Although it is a latecomer, Google is now one of the most powerful and innovative search engines.  Larry Page and Sergey Brin, two Ph.D. candidates in computer science at Stanford University, had a collaborative research project in the mid-1990s and developed a search engine called BackRub.  BackRub used an algorithm, novel at the time, that analyzed the back links pointing to a web site.  Today, a more advanced version of this inbound-link calculation is called PageRank.  In a sense, PageRank counts the votes that are cast for a web site, also taking into account the quality of the voters.
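
The "votes weighted by the quality of the voters" idea can be sketched with the textbook iterative form of PageRank.  This is not Google's production algorithm; the damping factor of 0.85, the number of iterations, the tiny link graph and the function name pagerank are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page ranks by repeatedly passing each page's rank to its targets."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph, c ranks highest because it receives three votes, and a outranks b and d even though each of them has at most one inbound link, because a's single vote comes from the highly ranked c.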

(3) Timeline 

 

March 2004 AltaVista switches to Yahoo! Search
March 2004 AlltheWeb switches to Yahoo! Search
March 2004 Yahoo!'s Site Match enabled by Overture
February 2004 MSN Search starts using MSNbot
February 2004 Yahoo! discontinues Google's listings
January 2004 New MSN Search Beta introduced
December 2003 MSN-LookSmart contract expires
October 2003 Yahoo! acquires Overture
June 2003 Google starts its AdSense for web site owners
April 2003 Overture acquires AltaVista
April 2003 Overture acquires Fast (AlltheWeb)
March 2003 Yahoo! acquires Inktomi
October 2001 GoTo changes its name to Overture
December 2000 Google's Toolbar introduced
October 2000 Google's AdWords starts
June 2000 Yahoo!'s search enabled by Google
August 1999 AlltheWeb starts
July 1999 Disney acquires Infoseek
January 1999 At Home acquires Excite
? 1999 The Mining Company renamed as About.com
September 1998 Google starts
September 1998 Microsoft starts MSN Search
June 1998 GoTo starts its sponsored search service
June 1998 Open Directory Project (ODP) starts
May 1998 Yahoo!'s search enabled by Inktomi
September 1997 GoTo starts
August 1997 Northern Light starts
February 1997 The Mining Company starts
October 1996 LookSmart starts
December 1995 AltaVista starts
October 1995 Excite starts
September 1995 Inktomi starts
June 1995 MetaCrawler developed
May 1995 SavvySearch developed at Colorado State University
February 1995 Infoseek starts
July 1994 Lycos starts
April 1994 Yahoo! starts
April 1994 WebCrawler developed by Brian Pinkerton
November 1993 Aliweb developed by Martijn Koster
June 1993 The World Wide Web Wanderer developed by Matthew Gray
February 1993 Architext developed by Stanford University students


References 


A History of Search Engines by John Wiley & Sons, Inc.
History of Search Engines & Directories by CommerceFriends.com
Search Engine Players: A Brief History by Search Engine World