Saturday, August 29

Hadoop: Major Platform Upgrades Coming Soon

The Hadoop world is evolving rapidly. Tom White has a presentation called Hadoop Futures, available on SlideShare, that outlines some of the next major directions.

There are some important changes to keep your eye on. In the next month we will see major releases that will change the Hadoop landscape.

First up is the Hadoop 0.20.1 release. Although it is a point release, it is an important one: it will (likely) be used as the basis for the next Y! and Cloudera distributions. Hadoop 0.20 was released in April, but it has not been widely adopted while its early bugs were being worked out. Hadoop 0.20.x introduces a new API that will be used going forward. The upcoming point release has a lot of fixes and features, including the new TFile format. It is critical because it opens the way for releases from the sub-projects.

The 0.20.1 release paves the way for the Pig 0.5.0 release. Pig 0.4.0 and 0.5.0 will have significant performance and other improvements that have been developed over the past months. One key change is that 0.5.0 will likely include the Pig SQL support that is now in the trunk.

The release of HBase 0.20 is getting very close. There were great presentations on the new release at the recent HBase User Group Meeting at StumbleUpon. Again, one of the key new features is a new HFile format, based on the TFile format that will be in the 0.20.1 release.

In the very near future we will also see a bug-fix release of the Avro serialization system, Avro 1.0.1.

In short, by the middle to end of September we should see the adoption of a new and radically improved Hadoop platform. We can ditch the aging 0.18.x platform. We will finally be able to use the new scheduling systems and the simplified API, and take advantage of significant performance and reliability improvements.

Tuesday, August 25

Quick News: Google Traffic update, Yahoo! Search UI changes, and more

  • Google is using location data from phones running Google Maps for mobile with My Location enabled to monitor traffic. It is now integrating this data into Google Maps. This means traffic coverage expands beyond main roads.
    Imagine if you knew the exact traffic speed on every road in the city — every intersection, backstreet and freeway on-ramp — and how that would affect the way you drive, help the environment and impact the way our government makes road planning decisions...
  • Yahoo! is testing new search UI changes.

  • Matt Cutts has posted his WhiteHat SEO Tips for Bloggers. It's a useful presentation because Matt shares which WordPress plugins he uses and what tweaks he makes to make his blog more search engine friendly.

  • In case you missed it, the MSN toolbar is now powered by Bing.