Monday, February 28

Palantir: Next Gen Platform for Information Analysis

Palantir is a very ambitious new tech company building a high-powered information analysis platform. They currently have products targeted at government and the financial industry. Their product is a highly specialized enterprise data system that supports intelligence and business analysts.

What does Palantir do?
... the most central hard problem that we address in trying to enable the analyst is data modeling, the process of figuring out what data types are relevant to a domain, defining what they represent in the world, and deciding how to represent them in the system. At Palantir we make sure our data model (ontology) is both flexible and dynamic, and that it mirrors the concepts people naturally use when reasoning about the domain.
The platform handles both structured and unstructured information and performs extraction and data integration. See their infrastructure page, white papers, and videos for a few more details.

Their data platform handles objects. An object in their platform has four components:
- Properties: text object attributes
- Media: images, video, and binary data
- Notes: free text
- Relationships: links between objects

Clients can specialize this generic object into specific types by using their "Dynamic Ontology" tool to define the semantics. Their platform has one fixed schema with five tables: object, property, notes, media, and object-object. An object is linked to one or more data sources, which is critical for data lineage and access controls.
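To make the five-table idea concrete, here is a minimal sketch in Python using SQLite. Only the five table names come from the description above. Every column name, the `build_demo_db` and `related` helpers, and the sample rows are my own assumptions for illustration, not Palantir's actual schema. I also left out the link from objects to data sources, since I don't know how they store it.

```python
import sqlite3

# Hypothetical columns: only the five table names come from the post.
SCHEMA = """
CREATE TABLE object (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL              -- e.g. 'person'; would be defined by the ontology
);
CREATE TABLE property (
    object_id INTEGER NOT NULL REFERENCES object(id),
    name      TEXT NOT NULL,
    value     TEXT NOT NULL         -- text attributes
);
CREATE TABLE notes (
    object_id INTEGER NOT NULL REFERENCES object(id),
    body      TEXT NOT NULL         -- free text
);
CREATE TABLE media (
    object_id INTEGER NOT NULL REFERENCES object(id),
    mime_type TEXT NOT NULL,
    content   BLOB NOT NULL         -- images, video, binary data
);
CREATE TABLE object_object (
    source_id    INTEGER NOT NULL REFERENCES object(id),
    target_id    INTEGER NOT NULL REFERENCES object(id),
    relationship TEXT NOT NULL      -- link between two objects
);
"""


def build_demo_db():
    """Create an in-memory database with two made-up objects and one link."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    person = conn.execute("INSERT INTO object (type) VALUES ('person')").lastrowid
    org = conn.execute("INSERT INTO object (type) VALUES ('organization')").lastrowid
    conn.execute("INSERT INTO property VALUES (?, 'name', 'Alice')", (person,))
    conn.execute("INSERT INTO property VALUES (?, 'name', 'Acme')", (org,))
    conn.execute("INSERT INTO notes VALUES (?, 'Met at a conference.')", (person,))
    conn.execute(
        "INSERT INTO object_object VALUES (?, ?, 'employee_of')", (person, org)
    )
    conn.commit()
    return conn


def related(conn, object_id):
    """Return (relationship, target_id) pairs for links leaving an object."""
    return conn.execute(
        "SELECT relationship, target_id FROM object_object "
        "WHERE source_id = ? ORDER BY target_id",
        (object_id,),
    ).fetchall()
```

The point of the generic layout is that adding a new object type means adding rows, not tables: the schema stays fixed while the ontology changes.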

A key component of the platform is search over the objects. According to their blog, their scenario has two differentiating features from web search:
  • Realtime indexing and querying – we need information to be available immediately as it changes in the system.
  • Leak-proof access controls – we need the search engine to help us make sure that we don’t have information leaking across access control boundaries.
They go into more detail on their modifications to Lucene for their use cases in two blog posts, Search with a Twist Part I and Part II. From the comments, it sounds like they are using a custom branch of Lucene 2.4.
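The two requirements above can be illustrated with a toy in-memory index in Python. This is not Palantir's Lucene modification, which I have not seen. It is only a sketch of the general idea, and the `SecureIndex` class, its method names, and the group-based access model are all assumptions of mine. Documents become searchable as soon as `index` returns, and every hit is checked against the caller's groups before it is returned.

```python
from collections import defaultdict


class SecureIndex:
    """Toy inverted index with per-document access control (hypothetical)."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of documents containing it
        self._terms = {}                   # doc id -> terms, needed for re-indexing
        self._acl = {}                     # doc id -> groups allowed to read it

    def index(self, doc_id, text, allowed_groups):
        """Add or replace a document; it is searchable as soon as this returns."""
        self.delete(doc_id)
        terms = set(text.lower().split())
        for term in terms:
            self._postings[term].add(doc_id)
        self._terms[doc_id] = terms
        self._acl[doc_id] = frozenset(allowed_groups)

    def delete(self, doc_id):
        """Remove a document and all of its postings, if present."""
        for term in self._terms.pop(doc_id, ()):
            self._postings[term].discard(doc_id)
        self._acl.pop(doc_id, None)

    def search(self, query, user_groups):
        """Return sorted ids of documents that match every query term
        and that at least one of the user's groups may read."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        groups = set(user_groups)
        return sorted(d for d in hits if self._acl[d] & groups)
```

For example, after `idx.index("r1", "wire transfer report", {"finance"})`, a caller in the `finance` group finds `r1` by searching for "wire", while a caller with no matching group gets an empty result. Filtering inside `search` rather than in the caller is the design choice that matters here: no code path returns unfiltered hits.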

Palantir's platform combines data processing over large heterogeneous datasets, filtering, mapping, visualization, and search in unique ways to create a compelling toolset. It built an intelligence platform that the government could not build itself, by recruiting a team of uber-geek talent lured by hip Silicon Valley panache worthy of James Bond.


  1. They were founded in 2004, so I'm not sure I'd describe them as "new". Their technology is very interesting, however!

  2. It's the same Palantir as the one associated with HB Gary!

  3. There's a fun discussion about Palantir over at David Kellogg's blog: