Tuesday, July 14

Knowing your data (and your domain) matters

Greg has an article for CACM, "The biggest gains come from knowing your data." The article argues that carefully crafted features and a judicious selection of algorithms to match your data provide significant performance gains over off-the-shelf algorithms and generic features. He highlights the Netflix contest, and the insights into user rating behavior that came out of it, as a key example.

He also posted his summary of the Collaborative Filtering with Temporal Dynamics paper.

  1. Nice find. I particularly like Linden's emphasis upon having a metric for success and a way of testing against that metric as being essential for ongoing progress (but then I would...).