Friday, July 24

Ivory: A New MapReduce Indexing and Retrieval System

To coincide with the SIGIR MapReduce Tutorial, Jimmy Lin announced the release of Ivory, an open-source MapReduce search platform. It is a web-scale indexing and retrieval system built on top of Hadoop and, like Hadoop itself, written in Java.

For retrieval, it uses Don Metzler's Java implementation of Searching using Markov Random Fields (SMRF). You can read his publications on the topic. It's exciting to finally get a chance to play with the implementation of one of the state-of-the-art retrieval tools. To my knowledge, this is the first time Don's Java MRF toolkit for retrieval has been available to the public.

Ivory is aimed at IR researchers as a platform for experimentation. This is an early release with a lot of rough edges.

Jimmy is using Ivory to index the ClueWeb09 dataset, whose 500 million English documents are used in the TREC Web Track.

