Friday, July 2

Google Scholar and Microsoft Academic Search Updates

Google Scholar announced yesterday a new feature that lets you search within the articles citing a given paper.

Microsoft has also re-released Microsoft Academic Search from MSR Asia. It has a faceted search interface that allows filtering by author, conference, and journal. In addition, it goes beyond searching papers by providing a page for each author. This is quite useful, and it has interesting visualizations that show citations over time and connections to other authors.

It's great to see improvements in this area. Scholarly research is still far too difficult and we have a long way to go in making the search tools more capable.

Thursday, July 1

2010 Hadoop Summit

Yesterday was the 2010 Y! Hadoop Summit. Be sure to read Ryan Rosario's coverage. Many of the presentations are linked from the Agenda page and via the YDN SlideShare channel. I'll highlight a few presentations that caught my attention:

XXL Graph Algorithms - by Sergei Vassilvitskii. Connected component analysis in large graphs.

Mining Billion-node Graphs: Patterns, Generators and Tools - Jimmy Lin's presentation on experience computing PageRank on a section from the ClueWeb09 web graph.

Hopefully, the morning talks will also be made available online.

Wednesday, June 30

Eclipse Helios Release - Improved C++ and Web tools

Every June we look forward to a new version of Eclipse. Earlier this month we had the release of Eclipse 3.6, aka Helios.

Get started with a tour from IBM developerWorks. A few participating plugins worth highlighting:

EGit - An Eclipse plugin for the Git version control system.

Linux Tools - A full-featured C and C++ IDE building on the older CDT development toolkit.

JavaScript Development Tools - Updated and improved JavaScript development and debugging.

Tuesday, June 29

ICML 2010 and Yahoo! Learning to Rank Workshop

Last week was ICML 2010 in Haifa. You can read Hal's coverage on NLPers. The conference also had two workshops of note: the Yahoo! Learning to Rank Workshop and the Machine Learning Open Source Software (mloss) workshop. I'm going to focus mainly on the LTR workshop, but be sure to check out the mloss site for more details.

One highlight of YLTR was Chris Burges' MSR team winning Track 1 with LambdaMART. They give an overview of their method in a recent tech report:

From RankNet to LambdaRank to LambdaMART: An Overview
LambdaMART is the boosted tree version of LambdaRank, which is based on RankNet. RankNet, LambdaRank, and LambdaMART have proven to be very successful algorithms for solving real world ranking problems: for example an ensemble of LambdaMART rankers won Track 1 of the recent Yahoo! Learning To Rank Challenge.
The other winning teams were from the Russian search company Yandex. See the company blog post on the topic (via Google Translate). You can also read the presentations from the top leaders:
The two teams' methods are related to those used by Yandex for its ranking.