Friday, February 6

I Want You To Want Me by Sep Kamvar

Danah Boyd pointed me to the work of Sep Kamvar. In honor of mid-February, I'll highlight I Want You To Want Me, an interactive art/visualization project that analyzes the descriptions used in online dating listings. I think the project is still on display at MoMA, but I could be wrong.

There is an incredible video highlighting some example interactions.

Sep also has We Feel Fine, an exploration of human emotion in online blogs.

Amazing projects.

Wednesday, February 4

MIT-Yahoo Human Computer Interaction Seminar Series Online

The MIT UI blog highlights a new seminar series on HCI at MIT sponsored by Yahoo!.

The first talk, by Ed Chi, is titled "Augmented Social Cognition: Using Web2.0 technology to enhance the ability of groups to remember, think, and reason" and is available via Yahoo!.

Here at UMass we have a Machine Learning Lunch seminar every semester sponsored by Yahoo!, but the series is not being recorded.

Thanks to everyone at Yahoo! for the support of seminars like these.

Tuesday, February 3

Evri creating the "Entity Web"

Matthew Hurst has a post on Evri, a company that has created an "entity web", a massive data graph connecting entities.

They have a good post on entity disambiguation titled Blue.

Evri has a REST-based preview API available for developers.

Cool stuff.

Monday, February 2

Seb's Back: The Greatness and Perils of Following Suit

Seb Paquet, a professor of CS in Quebec, is blogging again after a long hiatus.

His first post weighs the perils of popularity and quality. One interesting point in the article is that quality is often hard to recognize.

Welcome back Seb, keep it up! Thanks to Daniel for the pointer.

SIGIR 2009 Industry Day Details

Daniel wrote a post detailing the speakers for the upcoming SIGIR 2009 Industry Day he's organizing.

It's an impressive lineup including Matt Cutts from Google. Given Daniel's experience at Endeca, it should be no surprise that Enterprise Search is a main theme, with two panels. The first panel is of leading vendor technologists and the second is for industry analysts.

The great news is that the track is free for those attending the main SIGIR conference.

Alon Halevy Talk on Structured Data at Google

Alon Halevy, head of the "Deep Web" surfacing project at Google, gave a talk at the New England Database Day Conference at MIT. Details of his talk are still sparse, but ReadWriteWeb has a partial writeup.

In the article, RWW outlines ways that Google hopes to improve search by leveraging structured data, leading towards improved semantic and product search. One way Google will gather the data is by harvesting it from tables: There are 14 billion such tables on the web, and, after filtering, about 154 million of them are interesting enough to be worth indexing.
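Those numbers mean only about 1% of tables survive the filter (154 million out of 14 billion), so most HTML tables are presumably layout rather than data. To make the idea concrete, here is a minimal sketch of harvesting and filtering tables from a page. To be clear, this is not Google's method: the `TableExtractor` class, the `looks_relational` heuristic, and its thresholds are all my own assumptions for illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> on a page as rows of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables
        self._stack = []   # tables currently open, innermost last
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._stack.append({"rows": [], "has_header": False})
        elif not self._stack:
            return
        elif tag == "tr":
            self._stack[-1]["rows"].append([])
        elif tag in ("td", "th"):
            if tag == "th":
                self._stack[-1]["has_header"] = True
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        if tag in ("td", "th") and self._cell is not None:
            rows = self._stack[-1]["rows"]
            if rows:
                rows[-1].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "table":
            self.tables.append(self._stack.pop())


def looks_relational(table, min_rows=3, min_cols=2):
    """Hypothetical filter: header cells, enough rows, uniform width."""
    rows = [r for r in table["rows"] if r]
    if len(rows) < min_rows or not table["has_header"]:
        return False
    width = len(rows[0])
    return width >= min_cols and all(len(r) == width for r in rows)


def harvest(html):
    """Return the rows of every table that passes the filter."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return [t["rows"] for t in parser.tables if looks_relational(t)]
```

Calling `harvest(page_html)` on a page returns only the tables that look like data; a one-cell navigation table, for example, is dropped. A real system would need far richer signals than this, but the shape of the problem (extract everything, keep very little) is the same.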

He reportedly also outlined some of the key DB application challenges Google is working on: schema auto-complete, synonym discovery, creating entity lists, association between instances and aspects, and data-level synonym discovery...

I really like Alon's research. See also "Google's Deep Web Crawl" and "Web-scale Data Integration: You can only afford to Pay As You Go".