Friday, July 17

TREC: Not the right forum for interactive retrieval evaluation

See also yesterday's post on SIGIR workshop papers.

In his latest post, William Webber argues that TREC is the wrong forum for the Human Performance Prediction (HPP) evaluation proposed by Mark Smucker at the Future of Evaluation workshop. William argues that when we get away from the Cranfield paradigm, an online setup not tied to a yearly conference with real users would work better than using TREC. I think his argument has merit.

Going back to the tenets of HPP evaluation:
  • Evaluation should be predictive of user performance.
  • Evaluation should concern itself with both the user interface and the underlying retrieval engine.
  • Evaluation should measure the time required for users to satisfy their information needs.
In his paper Mark proposes,
This effort would in effect create an interaction pool with possibly many participants plugging different retrieval engines into a canonical UI.
Because it involves user interaction and the interplay between the UI and task completion, it's not clear that TREC would be the best host for this. It instead favors a less formal forum with significant web development experience that could facilitate rapid iteration on UI and system changes. This type of setup would also make it easy to change what interactions are logged.
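To make the logging point concrete, here is a minimal sketch of what an interaction pool's log and a time-based measure (HPP's third tenet) might look like. The record fields and event names are my own illustration, not anything from Mark's paper:

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical interaction-log record; fields are illustrative only.
@dataclass
class Interaction:
    user_id: str
    topic_id: str
    event: str        # e.g. "query", "click", "satisfied"
    timestamp: float  # seconds since session start

def time_to_satisfaction(log):
    """Mean elapsed time from a user's first event on a topic to the
    event marking the information need as satisfied."""
    starts, ends = {}, {}
    for rec in log:
        key = (rec.user_id, rec.topic_id)
        starts.setdefault(key, rec.timestamp)
        if rec.event == "satisfied":
            ends[key] = rec.timestamp
    return mean(ends[k] - starts[k] for k in ends)

log = [
    Interaction("u1", "t1", "query", 0.0),
    Interaction("u1", "t1", "click", 8.5),
    Interaction("u1", "t1", "satisfied", 30.0),
    Interaction("u2", "t1", "query", 0.0),
    Interaction("u2", "t1", "satisfied", 50.0),
]
print(time_to_satisfaction(log))  # 40.0
```

The appeal of a flat event log like this is exactly the flexibility mentioned above: adding a new event type (say, "scroll" or "abandon") requires no schema change, so an informal web-based forum could iterate on what gets logged without coordinating a yearly release.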


  1. I once ran a study where we gave subjects queries and they had to use a search engine to find answers as fast as possible. We used TREC Q&A track queries. It's good for something, maybe :)

  2. How about trying to develop inside the framework of games, like Luis von Ahn's Phetch?