Last Friday was SIGIR workshop day. First up was the workshop on CrowdSourcing for Search Evaluation. It focused on using Amazon's Mechanical Turk (MT) and similar services to provide judgments. I did not attend this workshop, but I heard positive things from the attendees. The workshop was organized by Matt Lease, Vitor Carvalho, and Emine Yilmaz.
The presentations and papers in the program are available online. Here are a few I want to highlight:
A main highlight was the CrowdFlower keynote:
Better Crowdsourcing through Automated Methods for Quality Control
CrowdFlower provides commercial support for companies performing tasks on Mechanical Turk. Everyone had great things to say about this talk, which kept people enthralled even though it was the end of the day; some said it was the best talk of the conference.
The other keynote was:
Design of experiments for crowdsourcing search evaluation: challenges and opportunities by Omar Alonso. Don't miss the slides from Omar's ECIR tutorial. Omar and colleagues also had a paper at the workshop,
Detecting Uninteresting Content in Text Streams, which looked at using crowdsourcing to evaluate the 'interestingness' of tweets. They found that most tweets (57%) were not interesting. They also found that tweets containing links tend to be interesting (81% accuracy), and that interesting tweets without links generally contained named entities.
Omar, Gabriella Kazai, and Stefano Mizzaro are working on a book on crowdsourcing that will be published by Springer in 2011.
My labmate, Henry Feild, presented a paper, Logging the Search Self-Efficacy of Amazon Mechanical Turkers.
Be sure to read over the rest of the program, because there are other great papers that I haven't had a chance to feature here.