Update 3/3/2010: Added Mahout
Background from Amazon
Since we are talking about recommendations, the first stop is Amazon and the creator of its original system, Greg Linden, author of Geeking with Greg. (I had the opportunity to meet Greg at SIGIR this past summer, and we had some great discussions during the poster session.) Greg's "Early Amazon" posts provide fascinating insight into Amazon's early days. The Amazon recommendation system started as a side project that he wasn't supposed to be working on. Read the full story, and don't miss his earlier story on his first attempt at a system, BookMatcher.
Current Systems
Recently, a lot of work on distributed recommendation systems has been happening in Apache Mahout, a distributed machine learning library that runs on Hadoop. The Taste recommender was incorporated into it. The first version was originally started as work on the Netflix contest (via Greg). The Mahout library has support for KNN, SVD, and Frequent Pattern Mining using Parallel FP-Growth. Some of the recommendation algorithms are more mature than others, so expect to get your hands dirty getting some of them to work. Despite its lack of maturity, this would be my first stop if I were building a system today.
A simple content-based recommender could be built on top of a search system: take an object, convert it into a query, and treat the top-ranked results as the recommendations. See the open-source search engines.
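To make the "object into a query" idea concrete, here is a minimal sketch in Python. It is not how any particular search engine does it: the tiny in-memory TF-IDF index stands in for a real search system, and the names `build_index`, `item_to_query`, and `recommend` are my own, made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a deliberately crude analyzer).
    return [t for t in text.lower().split() if t.isalpha()]


def build_index(docs):
    """docs maps item id -> text. Returns per-item term counts and IDF weights."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n = len(docs)
    idf = {term: math.log(1 + n / df) for term, df in doc_freq.items()}
    return counts, idf


def item_to_query(doc_id, counts, idf, max_terms=5):
    # The "query" is just the item's highest-weighted TF-IDF terms.
    weights = {t: tf * idf[t] for t, tf in counts[doc_id].items()}
    return sorted(weights, key=lambda t: (-weights[t], t))[:max_terms]


def recommend(doc_id, docs, k=3):
    """Run the item's query against every other item and return the top k ids."""
    counts, idf = build_index(docs)
    query = item_to_query(doc_id, counts, idf)
    scores = {}
    for other, term_counts in counts.items():
        if other == doc_id:
            continue
        score = sum(term_counts[t] * idf[t] for t in query)
        if score > 0:
            scores[other] = score
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


docs = {
    "a": "hadoop machine learning library",
    "b": "distributed machine learning with hadoop",
    "c": "cooking pasta recipes",
}
print(recommend("a", docs))
```

With a real search engine, `recommend` would shrink to building the query and handing it to the engine, which already does the indexing and ranking.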
Other Related Work
Another specialist in this area is Daniel Lemire, a researcher at the University of Quebec in Montreal. He wrote this paper on a simple and effective recommendation engine using SQL and PHP; the code is available on the site. There is a related PHP project, Vogoo, which appears to be actively maintained. Daniel also wrote a Java version of the item-based recommender engine, Cofi.
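For a feel of how little code a simple item-based recommender needs, here is a rough Python sketch of weighted Slope One, a scheme Daniel co-authored. I am assuming it is representative of the approach in the paper above; I have not checked it against the PHP/SQL or Cofi code, and the function names and the toy ratings are made up for illustration.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """ratings maps user -> {item: rating}.

    Returns (dev, freq), where dev[i][j] is the average of (rating_i - rating_j)
    over users who rated both items, and freq[i][j] is how many such users exist.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i == j:
                    continue
                diff_sum[i][j] += r_i - r_j
                freq[i][j] += 1
    dev = {
        i: {j: diff_sum[i][j] / freq[i][j] for j in diff_sum[i]}
        for i in diff_sum
    }
    return dev, freq


def predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction of `item` for one user, or None if no overlap."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j == item or j not in dev.get(item, {}):
            continue
        n = freq[item][j]
        numerator += (dev[item][j] + r_j) * n
        denominator += n
    return numerator / denominator if denominator else None


ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev, freq = train_slope_one(ratings)
print(predict(ratings["lucy"], "A", dev, freq))
```

The training step only needs pairwise sums and counts, which is the kind of aggregation that is easy to express in SQL.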
CoFE (Collaborative Filtering Engine) is another open-source Java-based engine, created by Jon Herlocker from Oregon State University, but I don't believe it is being maintained; it looks like it hasn't been updated since 2004.
Ray Mooney at the University of Texas has also been working on recommendation research; his main specialty is information extraction and machine learning. Here are some of his department's publications, and here are some introductory-level slides from a recent course he taught on Information Retrieval.
That pretty much covers recommender systems for today. You can always check the Wikipedia article on Collaborative Filtering (CF) for updates. Again, many of these systems use machine learning and classification, which fits nicely with my previous post on text mining.