Tuesday, March 14

Vortals -- Spiraling upwards, or down the drain

Raul Valdes-Perez, from Clusty / Vivisimo, has been making the rounds claiming that vortals (vertical portals) are making a comeback. While I have seen specialized search take off, I have to disagree: I haven't seen vortals themselves take off. He is going to give a talk on vortals at the upcoming Infonortics search technology conference. However, the talk hardly looks like anything new for the conference.

Raul has outlined the five reasons vortals are making a comeback in several recent media outlets, and he released a short document on the Clusty website. To summarize, here are the five reasons:
  1. Bandwidth and hardware were expensive, now they're cheap.
  2. Web crawling was hard (and expensive), now it's easy (and cheap).
  3. Search engines were unreliable, so you had to create your own in-house index. Now that they are reliable (most of the time), all you need is metasearch. (Ironically, one reason metasearch originated was that search engines were unreliable, as I discuss in my metasearch history post.)
  4. You don't have to build a big, complex directory because web search is now relevant.
  5. CPC ads will pay for all this, so it all works out.
And of course, don't forget to cluster the results using Vivisimo's whiz-bang technology. As cool as clustering technology is, I don't think it's as much of a help here as is claimed. Don't get me wrong, Clusty is really cool and it has some interesting technology. If you haven't tried it, you should. I hope to write more about it in my series on metasearch. But I wouldn't put as much importance on it as he predictably does.

The problem I see is this: people like Google, and getting them to switch is hard. To compete, you need something new, remarkable, and very compelling. The above isn't. At best, what he proposes is an incremental improvement over Google. That's not enough to get people to switch, in my opinion.

Oh, and creating a really good vortal isn't as easy as it sounds! According to Raul, "The entire process can take as little as a week or two." I don't buy it. It takes blood, sweat, and tears. It also takes time: time to perfect the product and to build partnerships with information vendors to get "hidden web" content. Creating something unique and useful, real innovation, isn't done overnight.

So, what is the future? One possible set of answers is personalized search and topical search, though not mashed up as easily as in Raul's solution. Google has been rolling out personalized and topical search quietly but aggressively. An old demo of topical search is the "site flavored search." Take a look. It correctly identifies my page as Internet and Programming related (out of the DMOZ categories?). Not too shabby! It is very coarse-grained, as Greg Linden has noted, but then again so are most topical search engines. Looking at the results of my first search, however, I'm not delighted. My query for "eclipse sunshade" didn't produce the most relevant results: it confused my search for an Eclipse IDE plugin with a car windshield accessory.

This technology, and much of the rest of Google's personal search tech, leads back to Kaltix, as Greg Linden points out. He points to this fascinating paper on "Scaling Personalized Web Search." It was written by Kaltix's founders at Stanford, who most probably work at Google now, following Kaltix's acquisition. Their paper also references a very interesting paper on "Topic-Sensitive PageRank," which is relevant to this discussion. Some readers may find it as interesting as I did.
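
For the curious, here is a minimal sketch of the core idea as I understand it from the Topic-Sensitive PageRank paper: run ordinary PageRank, but let the random "teleport" jump land only on pages known to belong to a topic. This is a toy illustration, not Google's or Kaltix's implementation. The function name, the five-page link graph, and the page names are all made up for the example.

```python
def topic_sensitive_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step jumps only to topic_pages.

    links: dict mapping a page to the list of pages it links to.
    topic_pages: non-empty set of pages (present in the graph) that define the topic.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Biased teleport vector: uniform over the topic pages, zero elsewhere.
    teleport = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0) for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: hand its rank back through the teleport vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Hypothetical toy web: one ambiguous hub linking to both senses of "sunshade".
LINKS = {
    "hub": ["ide_plugin", "car_shade"],
    "java_docs": ["ide_plugin", "hub"],
    "ide_plugin": ["java_docs"],
    "auto_parts": ["car_shade", "hub"],
    "car_shade": ["auto_parts"],
}

programming = topic_sensitive_pagerank(LINKS, {"java_docs"})
cars = topic_sensitive_pagerank(LINKS, {"auto_parts"})

for name, scores in (("programming", programming), ("cars", cars)):
    ordered = sorted(scores, key=scores.get, reverse=True)
    print(name, "->", ordered)
```

On this toy graph, the same two candidate pages swap places depending on which topic seeds the teleport vector, which is the kind of disambiguation I was hoping for with my "eclipse sunshade" query.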

Perhaps I'm wrong about all of this, and Clusty-inspired vortals shall conquer the land. For now, however, I'll stay alert and keep looking for something truly remarkable, like a purple cow.