In his latest post, William Webber argues that TREC is the wrong forum for the Human Performance Prediction (HPP) evaluation proposed by Mark Smucker at the Future of Evaluation workshop. His point is that once we move away from the Cranfield paradigm, an online setup with real users, not tied to a yearly conference, would work better than TREC. I think his argument has merit.
Going back to the tenets of HPP evaluation:
- Evaluation should be predictive of user performance.
- Evaluation should concern itself with both the user interface and the underlying retrieval engine.
- Evaluation should measure the time required for users to satisfy their information needs.
This effort would in effect create an interaction pool, with possibly many participants plugging different retrieval engines into a canonical UI. Because it involves user interaction and the interplay between the UI and task completion, it's not clear that TREC would be the best host. The work instead favors a less formal forum with significant web development experience, one that could facilitate rapid iteration on UI and system changes. This type of setup would also make it easy to change which interactions are logged.
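
To make the "canonical UI plus pluggable engines" idea concrete, here is a rough sketch of how such a setup might be wired together. Everything in it is an assumption for illustration: the names `RetrievalEngine`, `InteractionLog`, and `CanonicalUI`, and the event kinds, are hypothetical and do not come from the HPP proposal or from any TREC infrastructure.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol


class RetrievalEngine(Protocol):
    """Hypothetical interface each participant would implement."""

    def search(self, query: str) -> list[str]: ...


@dataclass
class InteractionLog:
    """Timestamped event log; event kinds are free-form strings,
    so what gets logged can change without touching this class."""

    clock: Callable[[], float] = time.monotonic
    events: list[tuple[float, str, str]] = field(default_factory=list)

    def record(self, kind: str, detail: str = "") -> None:
        self.events.append((self.clock(), kind, detail))

    def time_to_completion(self) -> Optional[float]:
        # Time from the first "task_start" to the last "task_done",
        # or None if the task was never started or never finished.
        starts = [t for t, kind, _ in self.events if kind == "task_start"]
        ends = [t for t, kind, _ in self.events if kind == "task_done"]
        if not starts or not ends:
            return None
        return ends[-1] - starts[0]


class CanonicalUI:
    """Stand-in for the shared front end: it only forwards queries
    to whichever engine is plugged in and logs the interaction."""

    def __init__(self, engine: RetrievalEngine, log: InteractionLog) -> None:
        self.engine = engine
        self.log = log

    def submit_query(self, query: str) -> list[str]:
        self.log.record("query", query)
        results = self.engine.search(query)
        self.log.record("results_shown", str(len(results)))
        return results
```

Because the UI depends only on the `search` method, swapping engines means passing a different object to `CanonicalUI`, and the time-based measure falls out of the same log that records every other interaction.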