Wednesday, May 2

ACM SIGKDD Webcasts Online

ACM SIGKDD has started hosting webcasts to improve data mining education and share expertise.

Two webcasts are currently online, and a third is scheduled for May. Here is information on the first two:

Web Content Mining
By Bing Liu, University of Illinois at Chicago (UIC)
Web content mining aims to extract/mine useful information or knowledge from Web page contents. Apart from traditional tasks of Web page clustering and classification, there are many other Web content mining tasks, e.g., data/information extraction, information integration, mining opinions from the user-generated content, mining the Web to build concept hierarchies, Web page pre-processing and cleaning, etc.
Bing Liu's website also has slides and other data mining material to accompany his new book, Web Data Mining (December 2006).

Towards Web-Scale Information Extraction
By Eugene Agichtein, Emory University
Data mining applications over text require efficient methods for extracting and structuring the information embedded in millions, or billions, of text documents... First I will briefly review common information extraction tasks such as entity, relation, and event extraction, indicating the main scalability bottlenecks associated with each task. I will then review the key algorithmic approaches to improving the efficiency of information extraction, which include applications of randomized algorithms, ideas adapted from information retrieval, and recently developed specialized indexing techniques.
Eugene has a web page that accompanies the webcast. It has lots of good resources, including links to similar tutorials.
