Thursday, January 8

Lemur Toolkit 4.8 / Indri 2.8 Release

More catching up on end-of-year software releases. At the end of December, UMass and CMU released version 4.8 of the Lemur language modeling toolkit, which includes version 2.8 of the Indri search engine.

The new version has important fixes and enhancements. You can read the full release notes, but here are some highlights:
  • A new Java GUI for the ireval package
  • Critical fixes to the TextTokenizer (critical if you created indexes with Indri 2.7!)
  • The ModifyFields application to add or remove field data from an Indri Repository without complete reindexing
  • Improvements to the Query Log Toolbar
See the Lemur SourceForge page for download and the wiki documentation.

4 comments:

  1. Any further information about the TextTokenizer bug?

  2. FD -- I was hit by this bug. It was caused by an overly greedy regex for finding the end of a comment block in HTML.

    See:
    http://sourceforge.net/tracker/index.php?func=detail&aid=2080937&group_id=161383&atid=819615

  3. Thanks Jon.

    The links to the details for all the bug fixes are in the release notes.

    I got hit by it while indexing GOV2 back in September.

  4. Hi Jeff,
    How can I build an inverted file (postings file) using Indri/Lemur? Can you provide some sample data and instructions? Do you have any references other than the lemurproject website?
