The annual TREC conference was held a little over a week ago in Maryland. So far, there have been no public reports on how it went. It would be useful to have the sessions videotaped and broadcast over the web, along with other ways for non-attendees to take part.
Rodrygo from Glasgow has a post covering the blog track workshop, focusing mainly on the discussion around the 2009 track. Notably, the opinion finding and polarity tasks are being discontinued.
Rodrygo writes,
It was a consensus among the attendees that opinion retrieval and polarity detection are still open, relevant problems. Yet a few groups managed to deploy interesting techniques that achieved consistent opinion retrieval performances across several strongly performing baselines in the track this year, polarity detection approaches looked rather naive.
In their place, new tasks for 2009 were discussed.
Faceted distillation task
The goal for this task will be to identify desirable blogs on a topic, for example non-spam blogs that have a recurring interest in a given topic. It's 'faceted' because the topic can specify desired properties of the blog, such as non-spam, satirical, Republican, last month, etc. Personally, I'm encouraged because it's more realistic and goes beyond topicality as the sole criterion for 'relevance'.
Story tracking task
The task will investigate how stories emerge and evolve over time. It may be linked to a parallel news corpus to show connections between blog news and the media. This reminds me of a previous discussion I had about using the blog corpus for topic detection and tracking (TDT) tasks.
There will also be a new blog corpus. See also the previous related discussion on the ICWSM 2009 data challenge for similar tasks on a different blog corpus.
Hopefully, I'll hear more about TREC 2008 as well as the future for TREC 2009 beyond just the blog track. I hear exciting rumors about the possibility of resurrecting the web search track with a new corpus.