Thursday, December 4

Federated Search Blog Part I of Erik Selberg Interview

Last Friday Federated Search Blog posted part I of an interview with Erik Selberg. Erik created MetaCrawler, one of the first meta-search engines. He wrote it for his master's project at the University of Washington (UW) and continued to work on meta-search for his dissertation.

The interview reminds me of the article I wrote back in 2006 on the beginning of metasearch featuring MetaCrawler. Maybe sometime I'll get around to part II.

One quote from the interview struck me because it deals with the problem of extracting interesting research questions from engineering tasks. Erik writes,
Fundamentally, a Web service that simply sends a query to a number of search engines and brings back results isn’t all that interesting for a researcher. That’s an engineering problem, and not a difficult one. But there are a number of questions that ARE interesting — such as how do you optimally collate results? How do you scale the service?... Oren pushed me to answer those questions.
The ability to abstract the interesting problems in a system and focus on those is a skill I'm still in the process of acquiring.

Erik solved the problem of combining a bunch of unreliable search engines to create one that was very useful, and in the process he pioneered early research on meta-search. It's amazing how far web search engines have come, from unreliable early prototypes developed by grad students to today's multi-billion dollar industry.

I look forward to reading part II.

3 comments:

  1. If you really want to go back to earliest metasearch results, I suggest looking at the following references from TREC1:

    [1] E. A. Fox, M. Koushik, J. Shaw, R. Modlin, and D. Rao. Combining evidence from multiple searches. In D. K. Harman, editor, TREC 1, pages 319–328, Gaithersburg, Maryland, November 1992. Department of Commerce, National Institute of Standards and Technology.
    [2] P. Thompson. Description of the PRC CEO algorithm for TREC. In D. K. Harman, editor, TREC 1, pages 337–342, Gaithersburg, Maryland, November 1992. Department of Commerce, National Institute of Standards and Technology.

    Interested readers should watch out for a post about Thompson's CEO model on Probably Irrelevant some time in 2009.

  2. Fernando, thanks for pointing out the pre-web search work.

    I look forward to the post on Thompson's model.

  3. Jeff,

    I appreciate your coverage of my interview with Erik Selberg as well as your historical coverage of SavvySearch and other early search technology. Part II of the Erik Selberg interview is now available:

    http://federatedsearchblog.com/2008/12/05/erik-selberg-federated-search-luminary-part-ii/

    I invite you to leave a comment at the Federated Search Blog referring to your coverage of the interview; it may send you some readers.
