Thursday, April 23

NSF CLuE Award for Mining Semantic Word Relationships

Google congratulated the projects that were awarded 2009 CLuE grants, which include access to the Google/IBM cluster. Our lab received a grant to work on mining word relationships from large corpora.
The particular focus is on techniques that create and use Web-based corpora of "comparable" sentences and text chunks for estimating word and phrase translation probabilities, and on techniques that derive relationships from "context vectors" that represent word and phrase meanings.
Part of the project will also upgrade Trevor's work on TupleFlow to work with Hadoop.
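
To make the "context vector" idea concrete, here is a minimal sketch in Python. It is only an illustration, not the project's method: the function names, the window size, and the use of raw co-occurrence counts with cosine similarity are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict
import math


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it.

    Hypothetical helper: whitespace tokenization and raw counts are
    simplifying assumptions, not the project's actual pipeline.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

On a toy corpus such as "the cat drinks milk", "the dog drinks water", and "the car needs fuel", "cat" and "dog" share more context words than "cat" and "car", so their cosine similarity comes out higher. A real system would presumably weight the counts and run over far larger corpora.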

Wednesday, April 22

WWW 2009 Papers and Workshops

This week WWW 2009 is happening in Madrid. The papers and many presentations are available on eprints.

For web search, the AIRWeb Workshop (aka Web Spam) proceedings are also online.

SIGIR 2009 accepted papers published

The list of accepted papers is now available.

Here are a few of the interesting-looking papers I look forward to reading soon:

  • Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback
  • Extracting Structured Information from User Queries with Semi-Supervised Conditional Random Fields
  • The Impact of Crawl Policy on Web Search Effectiveness
  • Web-Derived Resources for Web IR: From Conceptual Hierarchies to Attribute Hierarchies

The only paper affiliated with the lab is Matt's:

  • An Improved Markov Random Field Model for Supporting Verbose Queries