Some people think the "next big thing" is tagging. It is part of the "Web 2.0" way of doing things, where users generate the content. The most famous examples of this model are Wikipedia, Flickr, and Del.icio.us.
The question is, can this be extended to the web as a whole? Search engines crave high quality meta-data about web pages. They start with sophisticated computer algorithms, like clustering, to derive meta-data. However, sometimes humans can provide more insightful data. Users can generate this data explicitly by tagging URLs directly, or implicitly as a by-product of using the service, even by playing a game. One of the coolest examples I have seen of this type of system is the ESP Game, an attempt by CMU researchers to get users to label image data. In fact, it is titled "The ESP Game: Labeling the web". The incentive is very compelling: addictive fun.
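To show how a game can produce labels as a by-product, here is a minimal sketch of the agreement idea as I understand it: two players look at the same image, and a word becomes a label only when both have typed it. The function name, the turn-by-turn interleaving, and the `taboo` list are my own assumptions for illustration, not the researchers' actual code.

```python
from itertools import zip_longest


def first_agreed_label(guesses_a, guesses_b, taboo=()):
    """Return the first word both players have typed, or None.

    guesses_a / guesses_b: each player's guesses, in the order typed.
    taboo: words that do not count (hypothetical parameter).
    """
    taboo_set = {word.lower() for word in taboo}
    seen_a, seen_b = set(), set()

    # Interleave the two guess lists to roughly simulate taking turns.
    for guess_a, guess_b in zip_longest(guesses_a, guesses_b):
        for guess, own, other in ((guess_a, seen_a, seen_b),
                                  (guess_b, seen_b, seen_a)):
            if guess is None:
                continue  # this player has run out of guesses
            word = guess.strip().lower()
            if not word or word in taboo_set:
                continue  # ignore empty and disallowed words
            if word in other:
                return word  # the partner already typed it: agreement
            own.add(word)
    return None


if __name__ == "__main__":
    print(first_agreed_label(["dog", "grass", "ball"],
                             ["puppy", "ball", "dog"]))
```

Neither player sees the other's guesses, so the only way to "win" is to type what the image obviously shows, which is exactly the kind of label a search engine wants.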
One group trying to build a high quality social, tag-based search engine is RawSugar. There is an interesting interview with its founder over on Free Internet Radio (thanks to Topix.net's Weblog). At first glance RawSugar may appear to be another Delicious rip-off. However, it is more than a social bookmarking platform -- it is the first real tag-based social search engine. A RawSugar employee provides a good description of this differentiation over on TechCrunch:
"Most importantly, our search is not the same as del.icio.us and most (though not all) of the other sites in the tagging space–we search the tags, notes and full text of pages saved into our system while del.icio.us, at least for now, only searches tags and, i think, notes."

RawSugar is angel funded, with about ten engineers working on the engine. They have just made some very interesting service upgrades; check out their blog for details. According to a recent interview with CEO Ben-Shachar, they are using an interesting mix of technologies, including PostgreSQL and Lucene. Lucene is an Apache project -- a very popular open source indexing library, written in Java and ported to other languages.
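To make the difference in the quote concrete, here is a toy sketch of tags-only search versus search over tags, notes and full text. It is an in-memory illustration only; the `Bookmark` class, the field names, and the example URLs are made up, and it says nothing about how RawSugar or del.icio.us actually implement search.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    """A saved page (hypothetical structure for illustration)."""
    url: str
    tags: set = field(default_factory=set)
    notes: str = ""
    full_text: str = ""


def tokenize(text):
    """Lowercase the text and split it into a set of bare words."""
    words = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    return {word for word in words if word}


def search(bookmarks, query, fields=("tags",)):
    """Return URLs whose selected fields contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    results = []
    for bookmark in bookmarks:
        searchable = set()
        if "tags" in fields:
            searchable |= {tag.lower() for tag in bookmark.tags}
        if "notes" in fields:
            searchable |= tokenize(bookmark.notes)
        if "full_text" in fields:
            searchable |= tokenize(bookmark.full_text)
        if terms <= searchable:  # all query terms must be present
            results.append(bookmark.url)
    return results


if __name__ == "__main__":
    saved = [
        Bookmark("http://example.com/a", {"search"}, "good intro",
                 "This page explains Lucene indexing."),
        Bookmark("http://example.com/b", {"lucene"}),
    ]
    print(search(saved, "lucene"))  # tags only
    print(search(saved, "lucene", fields=("tags", "notes", "full_text")))
```

With tags only, the first page is invisible to a "lucene" query because nobody tagged it that way; once the full text is searched as well, it turns up. That is the whole pitch in a dozen lines.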
Right now I would say RawSugar is more of an experiment than anything else -- it has only about 135,000 pages indexed (my own estimate, based on stop-word tests, is about 170k) and an undisclosed number of users. If it can scale and attract a sizable user base, it could be something to watch. At the very least, it is an experiment to learn from.
Rollyo is another search engine, one that takes a more implicit approach to tagging. It allows people to create their "own custom verticals" by performing restricted searches across a collection of sites organized into a "Roll". One of the by-products of creating a roll is a human-created cluster of sites organized under an informative title and keywords. One of the biggest questions I have about Rollyo is: Can it scale? Users are currently limited to 20 sites in a roll, and you can only search one roll at a time. Is being able to restrict a keyword search to a list of websites enough incentive to use the service? I'm not convinced -- I think there is a lot of potential, but will it catch on? What compelling new features does it offer to get people to switch?
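Here is a rough sketch of what a roll amounts to as a data structure: a titled, keyworded list of at most 20 sites, plus a filter that keeps only search results from those sites. The `Roll` class and both helper functions are hypothetical names of mine; I have no idea how Rollyo does this internally.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

MAX_SITES_PER_ROLL = 20  # the per-roll limit mentioned above


@dataclass
class Roll:
    """A user-made collection of sites (hypothetical structure)."""
    title: str
    keywords: list = field(default_factory=list)
    sites: list = field(default_factory=list)

    def add_site(self, domain):
        if len(self.sites) >= MAX_SITES_PER_ROLL:
            raise ValueError("a roll can hold at most 20 sites")
        self.sites.append(domain.lower())


def in_roll(url, roll):
    """True if the URL's host is one of the roll's sites or a subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == site or host.endswith("." + site)
               for site in roll.sites)


def restrict_to_roll(result_urls, roll):
    """Keep only the results that come from sites in the roll."""
    return [url for url in result_urls if in_roll(url, roll)]


if __name__ == "__main__":
    roll = Roll("Search blogs", keywords=["search", "tagging"])
    roll.add_site("example.com")
    hits = ["http://www.example.com/post", "http://other.org/page"]
    print(restrict_to_roll(hits, roll))
```

Note that the `Roll` object is itself the interesting by-product: the title and keywords are human-written labels attached to a set of sites, whether or not anyone ever searches the roll.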
The questions that these and other companies are trying to answer are: How can search engines get users to tag web pages with usable content as a by-product of their daily surfing? What incentives motivate users to provide reliable and useful tags? And lastly, how can search engines handle spam in these tagging systems?
To sum it all up, I'm not sure if tagging will work. Right now I have more questions than answers -- and the questions are still fairly nebulous. I hope to refine them at the WWW 2006 conference, where I also hope to attend the Collaborative Web Tagging workshop on May 21st. RawSugar, Yahoo, and other major players will be taking part, so I have high hopes for an interesting discussion. More on the confirmed speakers at the RawSugar Blog...
Social Consequences of social tagging
There is also a paper on the ESP Game available via the ACM:
"Labeling images with a computer game"