Thursday, January 8

Lemur Toolkit 4.8 / Indri 2.8 Release

More catching up on end-of-year software releases. At the end of December, version 4.8 of UMass and CMU's Lemur language modeling toolkit was released, including version 2.8 of the Indri search engine.

The new version has important fixes and enhancements. You can read the full release notes, but here are some highlights:
  • A new Java GUI for the ireval package
  • Critical fixes to the TextTokenizer (critical if you created indexes with Indri 2.7!)
  • The ModifyFields application to add or remove field data from an Indri Repository without complete reindexing
  • Improvements to the Query Log Toolbar
See the Lemur SourceForge page for downloads and the wiki for documentation.

Wednesday, January 7

Eclipse PHP Development Tools (PDT) 2.0 Release

If you develop PHP applications, you should look at the Eclipse PDT project. The long-awaited PDT 2.0 was released over the holidays.

PDT 2.0 is the first version that works with Eclipse 3.4. It supports local and remote JavaScript debugging, greatly improved code completion, and object-oriented PHP development. It's a major improvement over PDT 1.0, which was slow and somewhat buggy. PDT is the basis for several commercial products, including Zend Studio.

I like PDT because it gives me a single IDE for developing PHP/Java web applications using Quercus, a Java implementation of PHP. I've found running Quercus on GlassFish to be a successful combination.

Monday, January 5

CIKM 2008 Videos Not Yet Available

CIKM 2008 was held back in mid-October, and many of the talks were recorded by VideoLectures. However, despite news that the videos would be available in November, and then in early December, they are still unavailable.

I heard rumors that this was due to presenters not submitting the necessary release forms.

Does anyone have an update on when they will be available?

Terrier 2.2 Release

Happy new year! It's been a nice holiday break, but it's good to get back to work. First, some catching up.

The IR Group at Glasgow released Terrier 2.2 right before Christmas; see their blog post for details. It's a major update because it adds support for distributed MapReduce index creation using Hadoop. There is currently no equivalent for distributed retrieval, but they are working on it.

I haven't seen formal performance numbers, but from what I hear it is quite scalable; it should be able to index billions of documents on a commodity cluster.
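
For anyone who hasn't seen MapReduce-style indexing before, here is a toy, single-machine sketch in Python of the general idea: the map step emits term postings per document, and the reduce step merges them per term. To be clear, this is not Terrier's code and it doesn't use the Hadoop API; the function names (`map_document`, `shuffle`, `reduce_term`, `build_index`) and the whitespace tokenizer are my own assumptions for illustration.

```python
from collections import Counter, defaultdict


def map_document(doc_id, text):
    """Map step: emit (term, (doc_id, term_frequency)) pairs for one document."""
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    counts = Counter(text.lower().split())
    return [(term, (doc_id, tf)) for term, tf in counts.items()]


def shuffle(pairs):
    """Group emitted postings by term, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for term, posting in pairs:
        grouped[term].append(posting)
    return grouped


def reduce_term(term, postings):
    """Reduce step: merge one term's postings into a list sorted by document id."""
    return term, sorted(postings)


def build_index(docs):
    """Run map, shuffle, and reduce in-process over a {doc_id: text} dict."""
    pairs = [pair
             for doc_id, text in docs.items()
             for pair in map_document(doc_id, text)]
    grouped = shuffle(pairs)
    return dict(reduce_term(term, postings) for term, postings in grouped.items())


# Hypothetical two-document collection.
docs = {
    1: "terrier indexes documents",
    2: "hadoop indexes documents in parallel",
}
index = build_index(docs)
```

On a real cluster the map calls would run in parallel over splits of the collection and the framework would handle the grouping, which is where the scalability comes from.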