Sunday, June 15

Google I/O Presentations, Google Infrastructure

The Google I/O presentations are now online via Google Sites.

Underneath the Covers at Google: Current Systems and Future Directions by Jeff Dean. James Hamilton has a summary online. [Found via Greg].

A single query currently hits between 700 and 1000 servers. Slide 38 of his deck lays out the basic query serving architecture (index shards, plus replicas for redundancy and query volume). I'm sure there are more gems here.
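
To make the shards-plus-replicas idea concrete, here is a minimal sketch of a scatter-gather query in Python. It is not Google's implementation; the names (ShardReplica, search_all_shards, build_demo_index), the toy postings, and the random replica choice are all assumptions made for illustration. The index is split into shards, each shard has several identical replicas, and a query goes to one replica of every shard before the partial results are merged.

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor


class ShardReplica:
    """One copy of one index shard (hypothetical): term -> [(doc_id, score)]."""

    def __init__(self, postings):
        self.postings = postings

    def search(self, term, k):
        # Return this shard's k best-scoring hits for the term.
        hits = self.postings.get(term, [])
        return heapq.nlargest(k, hits, key=lambda hit: hit[1])


def search_all_shards(shards, term, k, rng=random):
    """Scatter the query to one replica of every shard, then merge the top k.

    `shards` is a list of replica lists. Picking a replica at random is an
    assumed stand-in for real load balancing and failover.
    """
    def query_shard(replicas):
        return rng.choice(replicas).search(term, k)

    # Query the shards in parallel; every shard must be asked because the
    # documents are partitioned across them.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial_results = list(pool.map(query_shard, shards))

    # Gather: merge the per-shard results into one global top-k list.
    all_hits = [hit for hits in partial_results for hit in hits]
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[1])


def build_demo_index():
    """Two shards with two replicas each, using made-up documents and scores."""
    shard0 = {"java": [("doc1", 0.9), ("doc2", 0.4)]}
    shard1 = {"java": [("doc7", 0.7)], "geo": [("doc8", 0.5)]}
    return [
        [ShardReplica(shard0), ShardReplica(shard0)],
        [ShardReplica(shard1), ShardReplica(shard1)],
    ]
```

Calling search_all_shards(build_demo_index(), "java", 2) asks both shards and merges their answers. Adding shards lets the index grow, while adding replicas per shard adds redundancy and query capacity, which is the split the slide describes.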

I also ran across a few more presentations I'd like to watch:
Effective Java Reloaded by Josh Bloch.
How to Index Your Geo Data by Lior Ron and Mano Marks

How Microsoft Live Search Plans to Differentiate Itself

Robert Scoble posted a short video interview with Brad Goldman, the General Manager of Windows Live Search.

Robert's title for the interview, "Microsoft Search: Will It Use Mahalo Techniques to Compete With Google?" is blatant link bait, trying to invoke the controversy over Jason Calacanis's Mahalo. I'll save you the time of watching the video. The short answer is a resounding no. MS is focused on scalable algorithmic solutions (Search is still about the long tail; 50% of MS's queries are distinct within a given week). Below are my other notes from the interview.

For users, the most important factors are relevance and speed (the time to find information, not simply search responsiveness). Most users still think that relevance is the most important factor in search engine usage. There's still a lot of room for improvement.

Brad claims that all three of the search engines are about even on relevance. However, he acknowledges the fact that there is no objective third party measuring service and that this is still very subjective. [I don't believe it. I want them to put their data where their mouth is so that we can publicly decide. I suspect there is bias in their evaluation methodology.]

Focus on relevance
- A big milestone was surpassing an index of 20 billion documents (last fall around Searchification 2007). They are now confident that they can keep up with the growth of the web and have a fresh and comprehensive search index.
- Last fall they had 85% of their search team working on relevance. It's a really important priority.
- A goal for the coming year is to continue to focus on great relevance.

Task-centric search
Brad claims that search is becoming more task centric. He categorizes searches into four broad categories:
  1. Users are looking to be entertained
  2. Users are looking to buy something (commercial queries)
  3. Users are looking to find an article or other piece of information (research)
  4. Users are looking to navigate (navigational queries)
[It's interesting that he broke out entertainment as a query type. He didn't clearly explain what this really meant or what its characteristics were. I don't buy it as a top-four label for search query behavior.]

Differentiation plans: Better Shopping Experience
Brad talked about Microsoft's plans to differentiate itself by focusing on no. 2 above, the commercial queries. One push on the consumer side is to turn Windows Live Shopping into Live Search Cashback. [It's not clear how Live Product Search will be incorporated.] Brad briefly mentioned a plan to better incorporate reviews and to keep improving the user experience for this class of queries. For advertisers, they want to move from a CPC (cost-per-click) model to a more productive CPA (cost-per-acquisition) model.

Wrap-up and analysis
Overall, I expected more interesting information from the interview. Scoble rambled on about Twitter and Friendfeed, which he clearly loves, but which didn't produce any interesting discussion. The interview did clarify Microsoft's push on shopping as a path for differentiation. It was also comforting to see that MS is still focused on improving relevance.

It makes sense for MS to focus on shopping first; follow the Benjamins. Creating a better user experience in shopping could lead to a greater share of monetizable searches, providing greater revenue opportunity per search.

Live Search's plan of differentiating itself in commercial search reminds me of the shopping vertical search engine, searches over 5.6 billion web pages and uses its patent-pending AIR (Affinity Index Ranking) search technology to provide the Internet's most useful product reviews and guides, and then makes it easy to find and buy products from brand name retailers at the best prices. With over 25 million products from 5000 merchants, provides the Web's most robust and easy to use combination of relevant product research and comparison shopping.
Will Microsoft buy

Personally, I don't find better shopping a compelling vision for search. But maybe the team will prove me wrong.