- Karen Spärck Jones (26 August 1935 – 4 April 2007) died last Wednesday from cancer; the press release is available here. She was a pioneer in computational linguistics and information retrieval. Karen pioneered the ideas behind IDF, BM25, and automatic document summarization. She was also instrumental in helping to create the TREC evaluation model. Karen will be greatly missed. If you get the chance, look over her recent papers, which run right into 2007.
- SEOmoz posted an article on Search Engine Ranking Factors, Version 2. It incorporates replies from 37 leading SEO practitioners. The top factors are not all that surprising: 1) Keyword use in the title, 2) Site/Domain popularity or "authority", 3) Anchor text.
I believe one of the most underrated factors in their article is user behavior (relative click-through rates, time spent on the page, popularity measured via toolbars, number of times bookmarked, etc.). It is hard to measure unless you are the search engine, so I can see how it might be overlooked. However, I believe it is more important than most people realize.
User-behavior-based ranking is a hot research area; one important researcher in the field is Susan Dumais at Microsoft Research (MSR). For example, two recent MSR papers are "Learning User Interaction Models for Predicting Web Search Result Preferences" and "Improving Web Search Ranking by Incorporating User Behavior Information". Google has not been publishing in this area, but you can bet that they have built similar user behavior models to measure and improve the relevancy of their search results.
- David Sifry, founder and CEO of Technorati, posted their quarterly State of the Live Web Report. It has some interesting information on the growth and size of the blogosphere.
- LingPipe is moving to Java 1.5 for their 3.0 release. They have an interesting write-up on their experience using generics in LingPipe 3.0.