One workshop that I would like to attend is Exploiting Semantic Annotations in Information Retrieval, organized by Omar Alonso from A9 and Hugo Zaragoza from Yahoo! Barcelona. From the description:
By semantic annotations we refer to linguistic annotations (such as named entities, semantic classes, etc.) as well as user annotations such as microformats, RDF, tags, etc. We are not interested in the annotations themselves, but on their application to information retrieval tasks such as ad-hoc retrieval, classification, browsing, textual mining, summarization, question answering, etc... There have been some recent efforts on automatically semantically annotating Wikipedia. For example, Hugo and other Yahoo researchers made available a Semantically Annotated Snapshot of the English Wikipedia (SW1).
In particular, techniques have been developed to ground named entities in terms of geo-codes, ISO time codes, Gene Ontology ids, etc. Furthermore, the number of collections which explicitly identify entities is growing fast with Web 2.0 and Semantic Web initiatives...
Despite the growing number and complexity of annotations, and despite the potential impact that these may have in information retrieval tasks, annotations have not yet made a significant impact in Information Retrieval research or applications. Further research is needed before we can unleash the potential of annotations.
Also, in the paper Autonomously Semantifying Wikipedia, Fei Wu and Daniel Weld from the University of Washington describe the KYLIN system, which automatically extracts semantic information from Wikipedia, with two main goals:
- Automatically generating "infoboxes", the concise tabulated summaries of the subject's attributes
- Autonomously linking articles to create useful structure between articles