Wednesday, November 3

Components of Compelling Vertical Search

In this post, I will discuss key components of successful topic-specific vertical search. I was motivated to write it by the launch of blekko earlier this week.

Blekko is marketing its ability to slice the web up into verticals using slashtags. Blekko's slashtags define a list of hosts or pages to focus a search. But that is not enough to be successful. Search in a vertical needs to provide a significantly different experience from general web search. A compelling vertical search engine has the following key components:
  1. Vertical-specific ranking. A focused topic should define and utilize ranking features unique to the vertical. It may be as simple as the topical classification score for a page. It often requires applying information extraction to identify meaningful document fields. It should also leverage vertical-specific static rank features, such as topic-specific PageRank or an author/source popularity score.

  2. Rich results. The result objects should be presented in a way that uses the structured and semantic information from the topic. Simple examples include presentations that use data from Google Rich Snippets and SearchMonkey. This may include topic-specific metadata like authors, political perspectives, addresses, or aggregated user rating scores.

  3. Faceted UI. A vertical should exploit structured metadata for exploratory search. It should allow you to flexibly combine keyword search with structured attribute restrictions that limit the search space by price, airline, manufacturer, genre, date, etc. See the CHI 2006 tutorial and the relevant section from Marti Hearst's Search UI book on eBay Express.

  4. Domain knowledge. A restricted topical domain should model important relationships between objects and concepts to improve retrieval. For example, it should use a Freebase-like knowledge base of objects and their attributes. In a recipe search engine, it would model ingredients and relationships such as contains:gluten or is kosher.

  5. Task Modeling. A key benefit of a narrow domain is that it should allow users to accomplish complex tasks more easily. It should provide tools and interfaces that let users get things done more directly.

Of course, a vertical search engine also needs to keep up with web search engines in ranking, comprehensiveness, and freshness, which are all key components of search quality.
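
To make a few of these components concrete, here is a minimal sketch in Python of how vertical-specific ranking, faceted restriction, and domain knowledge might fit together in the recipe example. Everything in it is hypothetical and chosen for illustration: the `INGREDIENT_FACTS` table stands in for a real knowledge base, the `Recipe` fields and the `search` function are invented names, and the 0.2 weight on the user rating is an arbitrary assumption rather than a tuned value.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical mini knowledge base: ingredient -> attributes it implies.
# A real system would draw on something like a Freebase-style store.
INGREDIENT_FACTS = {
    "wheat flour": {"contains:gluten"},
    "barley": {"contains:gluten"},
    "butter": {"contains:dairy"},
    "rice": set(),
    "chicken": set(),
}


@dataclass
class Recipe:
    title: str
    ingredients: List[str]
    cuisine: str
    rating: float  # aggregated user rating on an assumed 0-5 scale

    def attributes(self) -> Set[str]:
        """Explicit metadata plus attributes inferred from ingredients."""
        attrs = {f"cuisine:{self.cuisine}"}
        for ingredient in self.ingredients:
            attrs |= INGREDIENT_FACTS.get(ingredient, set())
        return attrs


def search(recipes, keywords, require=(), exclude=()):
    """Keyword search restricted by facets, ranked with a vertical feature."""
    results = []
    for recipe in recipes:
        attrs = recipe.attributes()
        # Faceted restriction: all required attributes, no excluded ones.
        if not set(require) <= attrs or set(exclude) & attrs:
            continue
        text = (recipe.title + " " + " ".join(recipe.ingredients)).lower()
        matches = sum(1 for k in keywords if k.lower() in text)
        if keywords and matches == 0:
            continue
        # Toy ranking: keyword matches plus a rating boost (assumed weight).
        score = matches + 0.2 * recipe.rating
        results.append((score, recipe.title))
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in results]


RECIPES = [
    Recipe("Chicken fried rice", ["chicken", "rice"], "chinese", 4.5),
    Recipe("Chicken pot pie", ["chicken", "wheat flour", "butter"], "american", 4.8),
    Recipe("Barley chicken soup", ["chicken", "barley"], "american", 4.0),
    Recipe("Rice pudding", ["rice", "butter"], "american", 3.9),
]

print(search(RECIPES, ["chicken"], exclude=["contains:gluten"]))
```

With this toy data, excluding contains:gluten leaves only the fried rice, even though no recipe is labeled with that attribute directly. The restriction works because the ingredient knowledge is modeled, which is the kind of leverage a general web search engine does not get from keywords alone.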

For more of my thoughts on these issues, you can see the slides from my ECIR 2008 Industry Day talk, The Challenge of Engineering Vertical Search.

Overall, creating a compelling vertical experience currently requires a lot of hard work and painstaking curation. It requires a deep understanding of the tasks that users perform. It requires modeling the topic and domain objects in meaningful ways. Combining these elements is difficult to do well. It is extremely hard to do at the scale of the entire web across all topics.

1 comment:

  1. Hi Jeff - you might enjoy checking out - I think we hit pretty much all of your points, and I think you've got the right ones here. I'd add one further point - if you're willing to restrict yourself to a particular domain, you can move beyond the "one url = one search result" paradigm and consolidate information around "known entities" in your domain. For example, a books search engine might find lots of blog posts that are about a particular result, and roll those URLs up into a master "book" entity.