Thursday, August 9

Data driven decision making and controlled experiments

Most of the major web companies use controlled experiments to test new features and products on their websites.

First, if you want to learn how these work you should read a recent paper published by Ron Kohavi from Microsoft:
Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO, from SIGKDD 2007 (Knowledge Discovery and Data Mining), which is happening next week in San Jose. Essentially, he provides guidance on running experiments based on his experience running controlled tests at Amazon and Microsoft.
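
To make the idea concrete, here is a minimal sketch of how the result of a simple two-variant test might be checked for statistical significance. It uses a pooled two-proportion z-test, which is a common textbook choice and my own assumption here, not necessarily the procedure the paper recommends. The function name and the sample counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates
    between control (a) and treatment (b), using a pooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 200 of 10,000 control users converted vs. 250 of 10,000 treatment users.
z, p = two_proportion_z_test(200, 10_000, 250, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value only says the difference is unlikely to be noise; it says nothing about whether the metric being compared is the right one.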

First, I have to say that creating good tests is HARD. One of the hardest parts in testing is creating what Ron calls the Overall Evaluation Criterion (OEC) that maximizes not only short term profits, but also long term profits and customer satisfaction. The long-term impact is often overlooked in favor of short-term gain.

For example, spamming a customer with lots of promotions now may be great for short term sales, but in the long term it is not the best decision: it increases unsubscribe rates, which limits the reach of future promotions. All too often it is hard for profit-driven companies to take a long view and think beyond this quarter's (or if you are lucky, this year's) bottom line.
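
As a toy illustration of the promotion example, here is one way an OEC could fold a long-term cost into a short-term metric. The formula, the function name, and every number below are hypothetical assumptions of mine, not taken from the paper; a real OEC would need far more care.

```python
def oec(revenue_per_user, unsubscribe_rate, future_value_per_subscriber):
    """Short-term revenue per user minus the expected future revenue
    lost when users unsubscribe. All inputs are per-user averages."""
    return revenue_per_user - unsubscribe_rate * future_value_per_subscriber

# Hypothetical numbers, for illustration only.
control = oec(revenue_per_user=1.00, unsubscribe_rate=0.01,
              future_value_per_subscriber=20.0)
heavy_promo = oec(revenue_per_user=1.10, unsubscribe_rate=0.03,
                  future_value_per_subscriber=20.0)

print(f"control OEC:     {control:.2f}")
print(f"heavy promo OEC: {heavy_promo:.2f}")
```

With these made-up inputs the heavy-promotion variant wins on immediate revenue but loses on the combined score, which is exactly the trap a short-term-only metric would miss.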

The authors have some good advice, suggesting that if you don't get the results desired you should drill down and look at many metrics, slicing and dicing by user segment, in order to understand what happened in more detail and learn from what didn't work. The devil is in these oft-overlooked details.
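
Slicing by segment can be sketched in a few lines. The record layout, segment names, and data below are hypothetical and chosen only to show how an overall result can hide opposite effects in different groups.

```python
from collections import defaultdict

def conversion_by_segment(records):
    """records: iterable of (segment, variant, converted) tuples.
    Returns {segment: {variant: conversion_rate}}."""
    # Each cell holds [conversions, users] for one segment/variant pair.
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for segment, variant, converted in records:
        cell = counts[segment][variant]
        cell[0] += int(converted)
        cell[1] += 1
    return {
        seg: {var: conv / users for var, (conv, users) in variants.items()}
        for seg, variants in counts.items()
    }

# Hypothetical data: the treatment helps new users but hurts returning ones.
records = (
    [("new", "control", True)] * 1 + [("new", "control", False)] * 3
    + [("new", "treatment", True)] * 3 + [("new", "treatment", False)] * 1
    + [("returning", "control", True)] * 2 + [("returning", "control", False)] * 2
    + [("returning", "treatment", True)] * 1 + [("returning", "treatment", False)] * 3
)

rates = conversion_by_segment(records)
for segment, variants in rates.items():
    print(segment, variants)
```

Here the treatment looks only slightly better overall (4 of 8 versus 3 of 8), yet the per-segment view shows it hurting returning users. Real segments would need enough users each for the differences to be meaningful.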

Erik Selberg raises this issue when he comments on the 'data driven decision process.' He writes,
A data-driven approach has to start with the right question, followed by experiments that provide data that is properly interpreted to provide the answer. Typically, people fail in either starting out with the wrong question, or by conducting poor experiments that produce flawed data. An evil twin of flawed data is what I call Executive Data Bias. A decision-maker will have a certain bias on what to do, and is looking for data to back up that decision. Thus, flawed data that backs up the decision is accepted without much probing, while good data and the implications are rejected, typically by asking for “more experiments” or “more data,” or questioning assumptions made in the experiment or question.
He goes on to say that it can work, but it must be done very, very carefully.

I sometimes hear, 'we tested that, it didn't work', but too often when I ask what exactly didn't work, and why, I don't get satisfactory answers. This incomplete test data is then used to dismiss projects that may still hold great potential, if only we understood in more detail what failed in the test that was run. Unfortunately, more often than I would like, I find myself in Erik's camp.

Hopefully, Ronny's team at MS and others will help all of us become more educated on this important topic.

Erik Selberg's SIGIR 2007 wrap-up

Erik, a senior developer at MSN search labs, has his wrap-up of SIGIR.

Definitely worth reading the papers on his list.

Monday, August 6

Semantic Web Progress: Radar Networks and MetaWeb

The recent news on Hypertext Solutions prompted me to check the status of two other Semantic Web based startups, MetaWeb and Radar Networks.

Radar Networks
First, a recent Business 2.0 article: Web 3.0: No Humans Required, finally provides some details on the application they are building:

The first consumer app Radar plans to launch is a sort of personal data organizer. It will allow you to bring in e-mail, contacts, photos, video, music --anything digital, really -- from anywhere on the Web, turn it into RDF, and access it in one place.

Semantic tags are added manually, or automatically if the item is a photo from Flickr or a video from YouTube. "We add a new level of order to connect and interact with these things at a higher level than is possible today," Spivack says. "We are letting you build a little semantic Web for your project, your group, or your interest."... When it's done, it should be like the best wiki you've ever used.

There is also a less thorough (and less accurate) BusinessWeek article. In the comments founder Nova Spivack clarifies:
Radar Networks is actually combining human and machine intelligence, leveraging social networks and user-generated content as well as artificial intelligence. We're not attempting to overlay a lot of new structure on the Web. We're actually trying to make sense of the structure that is already there. By combining the semantic web with social networks, a more powerful level of collective intelligence can be achieved. Our focus is not only on organizing information but also in helping people collaborate more intelligently around interests and activities. We'll be sharing more as we head towards our beta launch in the fall of 2008.
FreeBase (by MetaWeb)
FreeBase is a centralized, open database of semantically tagged data. “We’re trying to create the world’s database, with all of the world’s information,” says founder Danny Hillis.

There is a recent IT Conversations interview with MetaWeb co-founder Robert Cook. Apart from that, there has been little news on FreeBase or MetaWeb in several months, but you can still read the March NY Times article.

It is exciting to see progress in this area, evident in the phenomenal growth of mashups as people start using RSS, XML technologies, and microformats to blaze the first trails connecting structured data across services. I will conclude with a quote from Tom Coates of Yahoo, from the Business 2.0 article, that captures it nicely:
"It's in the combination that the real power of this comes out. The mashup is an early example of the Web that is to come...The goal is the most important thing: reusable, repurposable, and reconnectable data. How we get there is not as important."
Yahoo Pipes is a platform for creating mashups; it is leading the way in the next phase of this development.

Whether specific technologies (RDF, OWL, etc.) are adopted is inconsequential in the long run. The promise of a web of data that can be combined and cross-referenced, instead of mere HTML, holds too much opportunity to go unrealized.

Hypertext Solutions: Another NLP based search startup

Hypertext Solutions is a stealth-mode company working on a next-generation 'web 3.0' (search?) platform. It recently purchased Insightful's InFact NLP text analytics platform for 3.65 million dollars (see the press release). As part of the transfer, Hypertext is getting Insightful's Director of Engineering for Text Analysis and Search, Deep Dhillon (see the post on his blog). According to his LinkedIn profile, Dhillon was the:
Project and Technical Lead for the design, development and deployment of InFact, a scalable natural language based next generation search engine.
The now-defunct InFact product page says:
Advanced entity and relationship extraction capabilities enable researchers and analysts to quickly uncover the critical facts and trends contained within extensive collections of unstructured textual information. Results are sorted and summarized according to user parameters, with hidden relationships between people, places, objects, and organizations highlighted.
Hypertext Solutions is one of several semantic search startups, including PowerSet, Hakia, and Spock.

Sunday, August 5

SIGIR 2007 coverage including Karen Spärck Jones video

I lamented last week about the lack of SIGIR coverage on blogs. However, some is finally beginning to surface.

Andre Vellino has a brief overview, including the best paper awards:

Best Student Paper
Relaxed Online SVMs for Spam Filtering by D. Sculley, G. M. Wachman (Tufts University)
Best Paper
Neighborhood Restrictions in Geographic IR

Karen Spärck Jones Address Video
In other coverage, SIGIR has posted links to Karen Spärck Jones' acceptance of the ACM Athena Award on the University of Cambridge's website (as well as the ACM Portal). It is 33 minutes long and was recorded not long before her death.