Monday, July 7

Susan Dumais SIGIR 2014 ACM Athena Award Lecture

Sue Dumais
 - Introduced by Marti Hearst
http://research.microsoft.com/en-us/um/people/sdumais/

Putting the searcher back into search

The Changing IR landscape
 - from an arcane skill for librarians and computer geeks to the fabric of everyone's lives

 - How should the evaluation metrics be enriched to characterize the diversity of modern information systems?

How far we have come....
 - The web was a nascent thing 20 years ago.  
 - In June '94, there were 2.7k websites (13.5% were .com)
 - Mosaic was one year old
 - Search in 1994: 17th SIGIR;  text categorization, relevance feedback and query expansion
 - TREC was 2.5 years old (several hundred thousand newswire and Federal Register docs)
 - TREC 2 and 3, the first work on learning to rank
 - Lycos debuted (Michael "Fuzzy" Mauldin), indexing 54k web pages (first 128 characters of each)
   --> grew to 400k pages, then tens of millions
   --> The rise of Infoseek, AltaVista
 - Behavioral logs: #queries/day: 1.5k

Today, search is everywhere
 - Trillions of webpages
 - Billions of searches and clicks per day
 - A core part of everyday life; the most popular activity on the web. 
 - We should be proud, but... search can be so much more.

Search still fails 20-25% of the time. And you often invest far more effort than you should. Once you find an item, there is little support for doing anything with it.
- Requires both great results and great experiences (understanding users and whether they are satisfied)

Where are the Searchers in Search?
 - A search box to results
 - But, queries don't fall from the sky in an IID fashion; they come from people trying to perform tasks at a particular place and time. 
 - Search is not the end goal; people search because they are trying to accomplish something. 

Evaluation
 - Cranfield style test collections
 - "A ruthless abstraction of the user .."
 - There is still tremendous variability across topics. 
 - What's missing?
  --> Characterization of queries/tasks
       -- How are they selected?  What can we generalize to?
  --> We do not tend to have searcher-centered metrics
  --> Rich models of searchers
  --> Presentation and interaction


Evaluating search systems

Kinds of behavioral data
 - Lab studies  (detailed instrumentation and interaction)
 - Panel studies (in the wild; 100s to 1000s; special search client)
 - Log studies (millions of people; in the wild, unlabeled) - provides what, not why

Observational study
 - look at how people interact with an existing system; build a model of behavior.

Experimental studies
 - compare existing systems; goal: decide if one approach is better than another

Surprises in (Early) web search logs
 - Early log analysis...
  --> Excite logs in 1997, 1999
  --> Silverstein et al. 1998, 2002
  --> web search != library search
  --> Queries are very short, 2.4 words
  --> Lots of people search about sex
  --> Searches are related (sessions)

Queries are not equally likely
 - Excite 1999; ~2.5 million queries
 - The top 250 queries account for 10% of all query volume
 - Almost a million queries occurred only once
 - top 10: sex, yahoo, chat, horoscope, pokemon, hotmail, games, mp3, weather, ebay
 - tail: acm98; win2k compliance; gold coast newspaper (a quick way to measure this skew is sketched below)
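
My own sketch, not from the talk: a few lines to measure the head/tail skew of a query log. The file name is made up; assume one normalized query per line.

```python
# My sketch, not from the talk: measure the head/tail skew of a query
# log. Assumes "queries.txt" (hypothetical) holds one query per line.
from collections import Counter

with open("queries.txt") as f:
    counts = Counter(line.strip().lower() for line in f if line.strip())

total = sum(counts.values())
top250 = sum(n for _, n in counts.most_common(250))
singletons = sum(1 for n in counts.values() if n == 1)

print(f"top 250 queries cover {top250 / total:.1%} of query volume")
print(f"{singletons:,} distinct queries occurred exactly once")
```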

Queries vary over time and task
 - Periodicities, trends, events
 - Trends (e.g., Tesla); repeated patterns (e.g., pizza on Saturday)
 - What's relevant to a query changes over time (World Cup) -- what's going on now!
 - Task/Individual: 60% of queries occur in a session (one way to cut a log into sessions is sketched below)
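
Session boundaries are usually derived from logs with a timeout heuristic. A minimal sketch; the 30-minute cutoff is a common convention, not necessarily the one used in the studies mentioned here.

```python
# Timeout heuristic for cutting one user's query stream into sessions.
# The 30-minute gap is a common convention, not necessarily the one
# used in the studies discussed in the talk.
from datetime import timedelta

SESSION_GAP = timedelta(minutes=30)

def sessionize(events):
    """events: (timestamp, query) pairs, sorted by timestamp."""
    sessions, current = [], []
    for ts, query in events:
        if current and ts - current[-1][0] > SESSION_GAP:
            sessions.append(current)  # gap too long: close the session
            current = []
        current.append((ts, query))
    if current:
        sessions.append(current)
    return sessions
```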

What can logs tell us?
 - query frequency
 - patterns
 - click behavior

Experiments are the lifeblood of web systems
 - Run for every imaginable system variation (ranking, snippets, fonts, latency)
 - If I have $100M to spend, what is important?
 - Basic questions: What do you want to evaluate? What are the metrics? (a minimal significance test is sketched below)
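
For concreteness (numbers made up by me), a two-proportion z-test on clickthrough rate is one standard way to decide whether a ranking variant beats control:

```python
# Hypothetical numbers: two-proportion z-test on clickthrough rate,
# a standard way to compare a treatment against control in an A/B test.
from math import sqrt, erf

def ab_ctr_test(clicks_a, n_a, clicks_b, n_b):
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)          # pooled rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))     # standard error
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return z, p_value

z, p = ab_ctr_test(clicks_a=10_000, n_a=200_000,
                   clicks_b=10_400, n_b=200_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```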

Uses of behavioral logs
 - Often surprising insights about how people interact with search systems
 - Suggest experiments

How do you go from 2.4 words to great results? 
 -> Lots of log data driving important features (query suggestion, autocompletion; a toy version is sketched below)
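
A toy illustration (mine, not a production system) of how log frequencies can drive autocompletion; real systems add time decay, personalization, and filtering:

```python
# Toy log-driven autocompletion: rank completions for a prefix by
# past query frequency. Illustrative only.
from collections import Counter

class Autocomplete:
    def __init__(self, query_log):
        self.counts = Counter(q.strip().lower() for q in query_log)

    def suggest(self, prefix, k=5):
        prefix = prefix.lower()
        hits = (q for q in self.counts if q.startswith(prefix))
        return sorted(hits, key=self.counts.__getitem__, reverse=True)[:k]

ac = Autocomplete(["weather", "weather seattle", "weather seattle",
                   "web search", "weather radar"])
print(ac.suggest("wea"))   # ['weather seattle', 'weather', 'weather radar']
```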

What can't they tell us?
 - Behavior can mean many things
 - Difficult to explore new systems

Web search != library search
 - Traditional "information needs" do not describe web searcher behavior
 - Broder 2002, based on AltaVista logs
 - Pop-up survey run June-November 2001

Desktop search != web search
 - desktop search, circa 2000
 - Stuff I've Seen
 - Example searches: "recent email from Fedor that contained a link to his new demo" (query: Fedor);
   "pdf of a SIGIR paper on context and ranking sent a month ago" (query: SIGIR)
 - Deployed 6 versions of the system
 -> Queries: very short; Date was by FAR the most common sort order
 -> People seldom switch from the default, but they did switch from best-match to Date; e.g., "the information from James" -- people remember a rough time, not an exact date
 -> People didn't care about the exact type of a file; they cared that it was, say, an image
 -> More re-finding than finding; more metadata-driven than best-match driven
 -> People remember attributes, seldom the details, only the general topic
 --> Rich client-side interface; every time we go into a new area, it has characteristics very different from previous generations of search

Contextual Retrieval
 - One size does not fit all
  --> SIGIR (who's asking, where are they, what have they done in the past)
 - Queries are difficult to interpret in isolation
 - SIGIR: information retrieval vs. Special Inspector General for Iraq Reconstruction
 - A single ranking severely limits the potential because different people have different notions of relevance

Potential for Personalization
http://research.microsoft.com/en-us/um/people/horvitz/teevan_dumais_horvitz_tochi_2010.pdf
 - Framework to quantify the variation of relevance for the same query across individuals (Teevan et al., ToCHI 2010)
 - Regardless of how you measure it, there is tremendous potential; it varies widely across queries
 - 46% potential increase in ranking quality
 - 70% if we take individual notions of relevance into account (the group-vs-individual gap is sketched below)
 - Need to model individuals in different ways
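
My rough sketch of the idea behind the framework, not the paper's exact method: compare the best single ranking for a group against each user's own ideal ranking, in DCG. Gain values are made up.

```python
# Sketch of the "potential for personalization" idea (after Teevan,
# Dumais & Horvitz, ToCHI 2010), not the paper's exact method.
from math import log2

def dcg(gains):
    return sum(g / log2(i + 2) for i, g in enumerate(gains))

def group_vs_individual(judgments):
    """judgments: {user: {doc: gain}} for a single query."""
    docs = {d for j in judgments.values() for d in j}
    # Best single ranking: order docs by total gain across users.
    order = sorted(docs,
                   key=lambda d: -sum(j.get(d, 0) for j in judgments.values()))
    group = sum(dcg([j.get(d, 0) for d in order]) for j in judgments.values())
    # Personalized ideal: rank separately for each user.
    ideal = sum(dcg(sorted(j.values(), reverse=True))
                for j in judgments.values())
    n = len(judgments)
    return group / n, ideal / n

group, ideal = group_vs_individual({
    "u1": {"a": 2, "b": 0, "c": 1},   # made-up relevance gains
    "u2": {"a": 1, "b": 2, "c": 0},
})
print(f"group DCG {group:.2f} vs personalized ideal {ideal:.2f}")
```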

Personal navigation
 - Teevan et al. SIGIR 2007, Tyler and Teevan WSDM 2010
 - Re-finding in web search; 33% are queries you've issued before
 - 39% of clicks are things they've visited before
 - "Personal" navigation queries
 --> 15% of queries 
 --> simple 12 line algorithm
 --> If you issued a query and clicked on only one link twice, you are 95% likely to do it again (see the sketch below)
 - Resulted in successful online A/B experiments
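
The heuristic, as I understood it, looks roughly like this (my sketch, not the production algorithm):

```python
# Sketch of the personal-navigation heuristic described above: if a
# user's last two issues of a query each ended in a single click on
# the same URL, promote that URL next time. Illustrative only.
from collections import defaultdict

history = defaultdict(list)   # (user, query) -> list of click sets

def record(user, query, clicked_urls):
    history[(user, query)].append(set(clicked_urls))

def personal_nav_target(user, query):
    clicks = history[(user, query)]
    if len(clicks) >= 2 and clicks[-1] == clicks[-2] and len(clicks[-1]) == 1:
        return next(iter(clicks[-1]))   # ~95% likely to be clicked again
    return None

record("alice", "sigir", {"http://sigir.org/"})
record("alice", "sigir", {"http://sigir.org/"})
print(personal_nav_target("alice", "sigir"))   # http://sigir.org/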

Adaptive Ranking
Bennett et al. SIGIR 2012
 - Queries don't occur in isolation
 - 60% of sessions contain multiple queries
 - 50% of the time, queries occur in sessions that last 30+ minutes (infrequent, but important)
 - 15% of tasks continue across sessions

User Model
 - specific queries, URLs, topic distributions
 - Session (short-term): +25%
 - Historic (long-term): +45%
 - Combinations: +65-75%

- By the third query in a session, just pay attention to what is happening now (a sketch of mixing the two profiles follows).
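
A minimal sketch (mine) of combining short- and long-term topic profiles into one user model; the mixing weight and topic vectors are illustrative, not the paper's actual parameterization:

```python
# Sketch of combining session (short-term) and historic (long-term)
# topic profiles; weights and topics are illustrative only.
def combine_profiles(session, historic, session_weight=0.6):
    """session, historic: dicts mapping topic -> probability."""
    topics = set(session) | set(historic)
    mixed = {t: session_weight * session.get(t, 0.0)
                + (1 - session_weight) * historic.get(t, 0.0)
             for t in topics}
    norm = sum(mixed.values()) or 1.0
    return {t: v / norm for t, v in mixed.items()}

def rerank(results, profile):
    """results: (url, topic) pairs; boost results matching the profile."""
    return sorted(results, key=lambda r: -profile.get(r[1], 0.0))

profile = combine_profiles({"autos": 0.8, "tech": 0.2},
                           {"tech": 0.5, "sports": 0.5})
print(rerank([("u1", "sports"), ("u2", "autos")], profile))
```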

Summary
 - We have complementary methods to understand and model searchers
 - Especially important in new search domains and in accommodating the variability we see across people and tasks in the real world

Future
 - Growing importance of spatio-temporal context (here, now)
 - Richer representations and dialogs
  --> e.g. knowledge graphs
 - More proactive search (especially on mobile)
 - Tighter coupling of digital and physical worlds 
 - Computational platforms that couple human and algorithmic components
 - If search doesn't work for people, it doesn't work; Let's make sure it does!!!

We need to extend our evaluation methodologies to handle the diversity of searchers, tasks, and interactivity.

Disclaimer: The views here expressed are purely mine and do not reflect those of Google in any way.
