Thursday, September 10

Yahoo! Key Scientific Challenges Coverage I: Challenges in Search

Yahoo! put out a press release on the Key Scientific Challenges Summit. My labmate, Henry, attended the summit and presented work on detecting searcher frustration. He took some great notes to share. First up are his notes on the talk by Andrew Tomkins, Chief Scientist of Yahoo! Search, on key challenges in search.

Three key challenges
  1. Optimizing task-aware relevance (model long-running user tasks)
  2. Grid-based content analysis (new computing algorithms)
  3. Measure/predict/generate engagement
Many non-issues:
  1. Anything involving PageRank
  2. Algorithms for supercomputers
  3. Folksonomies and tag analysis (at least not yet)
  4. Friend of a friend
  5. Unsupervised user-facing techniques (showing result clusters)

Challenges (details)
1. Task-aware
o Dawn of search:
- navigation and packets of information

o Today:
- increasing migration of content online
- new forms of media available online
- infrastructure for payment more comfortable for users

o Moving away from 2.7 words and 10 blue links
- more structured results
- more satisfaction without clicking
- more interaction with web services
- much richer page structure

o The resources people search are changing
- search engines may or may not be the hub

2. Storage trends
o Storage is cheap: any company with tens of employees can store all text produced by all humans on the planet
- multimedia is another story

o Move away from scale to deep understanding

o Richer models about what's on a page
- page semantics
- user consumption patterns
- aggregate properties
- how do we search it??

3. The problem is bigger than search -- Understanding the user
o why do people lurk versus participate?

o why do people create new personas?

o why are Facebook/YouTube/etc. so successful?

o what new genres are emerging?
- for content creation?
- participation?
- what tools are appropriate?

o haven't really gotten started
- many proxy measures based on views/clicks/etc.

o too low level

o some contributions
- click prediction
- dynamics of social network analysis
- models of viral marketing

o predictions of engagement still "embryonic"
- generation of engagement remains an art form

o need new science of engagement
- this is not a substitute for creativity
- scientific basis

Stay tuned for coverage of the machine learning challenges...

