Three key challenges
- Optimizing task-aware relevance (model long-running user tasks)
- Grid-based content analysis (new computing algorithms)
- Measure/predict/generate engagement
- Anything involving PageRank
- Algorithms for supercomputers
- Folksonomies and tag analysis (at least not yet)
- Friend of a friend
- Unsupervised user-facing techniques (showing result clusters)
o Dawn of search:
- navigation and packets of information
- increasing migration of content online
- new forms of media available online
- infrastructure for payment more comfortable for users
o Moving away from 2.7-word queries and 10 blue links
- more structured results
- more satisfaction without clicking
- more interaction with web services
- much richer page structure
o The resources people search are changing
- search engines may or may not be the hub
2. Storage trends
o Storage is cheap: any company with tens of employees can store all
text produced by all humans on the planet
- multimedia is another story
o Move away from scale to deep understanding
o Richer models about what's on a page
- page semantics
- user consumption patterns
- aggregate properties
- how do we search it??
3. The problem is bigger than search -- Understanding the user
o why do people lurk versus participate?
o why do people create new personas?
o why are Facebook/YouTube/etc. so successful?
o what new genres are emerging?
- for content creation?
- what tools are appropriate?
o haven't really gotten started
- many proxy measures based on views/clicks/etc., but these are too low level
o some contributions
- click prediction
- dynamics of social network analysis
- models of viral marketing
o predictions of engagement still "embryonic"
- generation of engagement remains an art form
o need new science of engagement
- this is not a substitute for creativity
- scientific basis
Stay tuned for coverage of the machine learning challenges...