Monday, September 15

Beyond Relevance in evaluation

Over at the Google Blog, Scott Huffman writes an entry on Search Evaluation at Google:
"Traditional search evaluation has focused on the relevance of the results, and of course that is our highest priority as well. But today's search-engine users expect more than just relevance. Are the results fresh and timely? Are they from authoritative sources? Are they comprehensive? Are they free of spam? Are their titles and snippets descriptive enough? Do they include additional UI elements a user might find helpful for the query (maps, images, query suggestions, etc.)? Our evaluations attempt to cover each of these dimensions where appropriate."
One of my biggest issues with TREC and similar environments is their exclusive focus on topical relevance. For example, a spam post that is relevant to a topic would be judged acceptable, even though you would never want to read it in real life. See my previous post on the TREC blog track. It's time we moved beyond the basics and found ways to tackle the more challenging aspects of retrieval quality in a way that is still amenable to cost-effective measurement.
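To make the spam example concrete, here is a rough sketch of how the same ranking scores under a topical-only judgment versus one that also rejects spam. Everything in it is hypothetical: the `Judgment` record, the `is_spam` label, and the toy ranking are my own illustrations, not TREC's judgment format or Google's evaluation method.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """Hypothetical assessor labels for one retrieved post."""
    doc_id: str
    topically_relevant: bool
    is_spam: bool


def precision_at_k(ranking, k, is_good):
    """Fraction of the top-k results that satisfy the is_good predicate."""
    top = ranking[:k]
    if not top:
        return 0.0
    return sum(1 for j in top if is_good(j)) / len(top)


# Toy ranking (made-up data): two of the three on-topic posts are spam.
ranking = [
    Judgment("post-1", topically_relevant=True, is_spam=False),
    Judgment("post-2", topically_relevant=True, is_spam=True),
    Judgment("post-3", topically_relevant=False, is_spam=False),
    Judgment("post-4", topically_relevant=True, is_spam=True),
]

# Topical-only view: spam counts as a hit as long as it is on topic.
topical_only = precision_at_k(ranking, 4, lambda j: j.topically_relevant)

# Quality-aware view: a hit must be on topic and not spam.
quality_aware = precision_at_k(
    ranking, 4, lambda j: j.topically_relevant and not j.is_spam
)

print(topical_only, quality_aware)  # 0.75 0.25
```

The measure itself is unchanged; only the definition of a "good" result differs. The extra cost is one more label per judged document, which is the kind of trade-off a cost-effective evaluation would have to weigh.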

Note: I also highly recommend What People Think About When Searching by Daniel Russell, who analyzes user intent and behavior at Google.


  1. Not to be predictable, but a nice way to address the problem of multifaceted user needs is to offer a faceted navigation interface, rather than try to seek one ranking function that somehow meets all of them. Just sayin'.

  2. Daniel, I think that faceted interfaces are a fantastic tool.

    There are scenarios where I may want to explore data, but often I just want the best answer, now.

    The search engine takes care of spam, freshness, topical and source diversity, etc., so that the user doesn't have to be bothered with them.