Roughly speaking, realtime search means:
- documents are available to queries immediately after indexing, without any reindexing or index merging;
- later documents are more important than earlier documents.
Whistlepig takes these principles to an extreme.
- It returns documents only in the reverse of the order in which they were
added (LIFO, newest first), and performs no ranking, reordering, or scoring.
- It only supports incremental indexing. There is no notion of batch indexing or index merging.
- It does not support document deletion or modification (except in the
special case of labels; see below).
- It only supports in-memory indexes.
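To make these constraints concrete, here is a minimal Python sketch of an append-only, in-memory index that returns matches newest-first. It is only an illustration of the idea, not Whistlepig's actual code or API: the class `ToyRealtimeIndex` and its methods `add` and `search` are hypothetical names, and tokenization is assumed to be a plain whitespace split.

```python
from collections import defaultdict


class ToyRealtimeIndex:
    """Append-only, in-memory inverted index (illustrative, not Whistlepig's code)."""

    def __init__(self):
        # term -> list of doc ids, in ascending (insertion) order
        self.postings = defaultdict(list)
        self.next_doc_id = 1

    def add(self, text):
        """Index one document; it is searchable as soon as this returns."""
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        # Assumption: naive lowercase whitespace tokenization.
        for term in set(text.lower().split()):
            self.postings[term].append(doc_id)
        return doc_id

    def search(self, term):
        """Yield matching doc ids newest-first, with no scoring or reordering."""
        yield from reversed(self.postings.get(term.lower(), []))


index = ToyRealtimeIndex()
index.add("hello world")
index.add("goodbye world")
index.add("hello again")
newest_first = list(index.search("world"))  # doc 2, then doc 1
```

Because doc ids only ever grow and posting lists are only ever appended to, there is nothing to merge or rebuild: walking a posting list backwards already gives the newest-first order.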
Features that Whistlepig does provide:
- Incremental indexing. Updates to the index are immediately available to
- Fielded terms with arbitrary fields.
- A full query language and parser with conjunctions, disjunctions, phrases, negations, grouping, and nesting.
- Labels: arbitrary tokens which can be added to and removed from documents at any point, and incorporated into search queries.
- Early query termination and resumable queries.
- A tiny, <>
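Labels and resumable queries can be sketched in the same spirit. The Python below is a hypothetical illustration under stated assumptions, not Whistlepig's interface: `LabeledIndex`, `add_label`, `remove_label`, and the `start_below` cursor are invented names, and the search is a simple newest-first scan rather than a real posting-list traversal.

```python
class LabeledIndex:
    """Toy index with mutable labels and resumable, early-terminating search."""

    def __init__(self):
        self.doc_terms = []  # position i holds the term set of doc id i + 1
        self.labels = {}     # doc id -> mutable set of labels

    def add(self, text):
        self.doc_terms.append(set(text.lower().split()))
        doc_id = len(self.doc_terms)
        self.labels[doc_id] = set()
        return doc_id

    def add_label(self, doc_id, label):
        self.labels[doc_id].add(label)

    def remove_label(self, doc_id, label):
        self.labels[doc_id].discard(label)

    def search(self, term, label=None, start_below=None, limit=10):
        """Scan newest-first and stop after `limit` hits (early termination).

        Returns (hits, cursor). Pass cursor back as `start_below` to resume;
        cursor is None once every older document has been examined.
        """
        doc_id = len(self.doc_terms) if start_below is None else start_below - 1
        hits = []
        while doc_id >= 1 and len(hits) < limit:
            has_term = term in self.doc_terms[doc_id - 1]
            has_label = label is None or label in self.labels[doc_id]
            if has_term and has_label:
                hits.append(doc_id)
            doc_id -= 1
        cursor = doc_id + 1 if doc_id >= 1 else None
        return hits, cursor


idx = LabeledIndex()
for i in range(5):
    idx.add(f"message {i}")          # doc ids 1 through 5
idx.add_label(2, "unread")
idx.add_label(4, "unread")

page1, cursor = idx.search("message", limit=2)
page2, cursor = idx.search("message", start_below=cursor, limit=2)
unread, _ = idx.search("message", label="unread")
```

Since results are always newest-first, a cursor only needs to remember the oldest doc id examined so far. Documents added between pages get higher ids, so they never disturb a query that is being resumed.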
Tuesday, March 1
Whistlepig: A minimalist real-time search engine
William Morgan recently announced the release of Whistlepig, a real-time search engine written in C with Ruby bindings. It is now up to release 0.4. Whistlepig is a minimalist in-memory search system that returns results in reverse chronological order rather than by relevance ranking. You can read William's blog post for his motivations for writing it. Here is a description from the current README: