Monday, November 16

New Yahoo! Research Demo: Quest for NLP based Q&A exploration

Hugo Zaragoza let me know that Yahoo! Research has a new demo out, Quest. Quest is a faceted navigation interface on Q&A data. It lets you browse using key phrases, nouns, and verbs extracted from a dependency parse of the questions.

For their description, you can read the announcement on the Y! Sandbox. The demo uses a set of 8 million Q&A documents from Yahoo! Answers collected in 2007. Here's their description of some of the challenges they faced:
The first one is to select the right "lexical units" of the collection in order to produce meaningful browsing suggestions. The next challenge is to develop interesting list suggestions, on the fly, for whatever query the user may submit. Lastly, we had to invent an interface that would allow users to interact with the suggestions and the results, and enable a natural browsing experience.
They used the DeSR dependency parser to extract terms and phrases, and then used a forward index with Archive4J to count and sort the terms in the questions that are returned by a query.
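
To make the forward-index idea concrete, here is a rough sketch of how counting and sorting facet terms over a result set might work. This is not Quest's code: the real system uses Archive4J, while this is plain Python, and the toy documents, the function names (build_inverted_index, facet_counts), and the top_k parameter are all made up for illustration. It also assumes the terms have already been extracted by a parser.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: a forward index maps each question id to the
# terms extracted from it (in Quest these would come from the parser).
FORWARD_INDEX = {
    1: ["pasta", "salad", "olives", "make"],
    2: ["pasta", "sauce", "tomato", "cook"],
    3: ["pasta", "salad", "chicken", "make"],
    4: ["bread", "yeast", "bake"],
}


def build_inverted_index(forward_index):
    """Map each term to the set of question ids that contain it."""
    inverted = defaultdict(set)
    for doc_id, terms in forward_index.items():
        for term in terms:
            inverted[term].add(doc_id)
    return inverted


def facet_counts(query_terms, forward_index, inverted_index, top_k=5):
    """Count the terms in questions matching every query term.

    Returns up to top_k (term, count) pairs, most frequent first,
    with ties broken alphabetically. Query terms are excluded.
    """
    if not query_terms:
        return []
    # Questions that contain all of the query terms.
    matching = set.intersection(
        *(inverted_index.get(term, set()) for term in query_terms)
    )
    counts = Counter()
    for doc_id in matching:
        # The forward index gives each question's terms without re-parsing.
        counts.update(set(forward_index[doc_id]) - set(query_terms))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


INVERTED_INDEX = build_inverted_index(FORWARD_INDEX)
print(facet_counts(["pasta"], FORWARD_INDEX, INVERTED_INDEX))
print(facet_counts(["pasta", "salad"], FORWARD_INDEX, INVERTED_INDEX))
```

Sorting by raw frequency is the simplest choice, and it tends to favor general terms. A ranking that favors terms more specific to the result set would need a different scoring function.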

I tried it for "pasta" and then filtered to "pasta salad". I was hoping that some of the nouns would include common ingredients: bacon, chicken, olives, onion, pepperoni, mozzarella cheese, etc. However, most of the nouns and verbs are more general and somewhat redundant given my selected filters. I think the algorithm to select the terms could still be improved.

Faceted search interfaces are important browsing tools, and automatically extracting and selecting facets is a challenging problem. It's good to see first steps applying NLP to the task. I look forward to seeing how Quest evolves.

Be sure to check out the Correlator demo if you haven't seen it.

1 comment:

  1. Nice post! I agree that there are issues with term extraction. There are also huge issues with the data set. Out of several queries I tried, only a few listed sensible results. :-/