Tuesday, January 22

Top Content of 2007: Open Source is the theme

My top 10 posts from 2007, in descending order of traffic according to Google Analytics:
  1. Java Open source Text Mining and Information Extraction tools
    (By far the most popular, with almost half of my traffic)

  2. Current Open Source Search Engine Libraries

  3. Search at Ebay Part I: Faceted Search and Ebay Express

  4. HBase: Powerset's BigTable

  5. Open source collaborative filtering and recommendation systems

  6. Query Expansion: an alternative to static stemming

  7. Open Source Scraping (Wrapper Generation) Tools

  8. Octopart and SupplyFrame: Part Search Engines

  9. Integrating a Database of Everything with Web Search

  10. SIAM Data Mining Proceedings, LingPipe 3.0, and fun with Pig, Sawzall, and DryadLinq

A major theme was how many visitors came looking for information on open source text mining and search engine tools. This year I would like to get more hands-on and dig deeper into some of these tools.

I would be interested to hear any feedback on what you would like to see done differently or improved upon in 2008.