Creating User Interfaces that Entice People to Manage Better Information
By David Karger (MIT)
History:
Haystack - Per-user Information Environments (1999)
Current State of IKM (Information and Knowledge Management)
- We take users with extremely rich landscapes of information and we give them keyboards to barely sketch their interests. Algorithms work really, really hard on that sketch.
- We work hard to make computers do IKM well
- People are better than computers at IKM
Question:
- In what ways can we give people the ability to manage more or better information?
- How do we make them want to?
1) Capture more data digitally
2) Collaborate to understand lecture notes
Capture of Information Scraps
- The state of PIM
- The desks all have computers, but we have huge piles of paper (information that never gets put into the computer)
- 27 participants, 5 organizations, 1-hour in situ interviews
#1 Using a computer is distracting / impossible
-- people instead just grab random notes to write things down
-- Interfaces for Staying in the Flow (Ben Bederson, Ubiquity 2004)
-- (Being "in the zone", in the flow)
#2 chimeras fight between apps
-- Meeting notes with TODOs and follow up meetings
#3 Diverse information forms don't fit apps
Types of information
TODOs, meeting notes, names and contact information
#4 Want it in view at the right time -- workflow integration
Costs to digital capture
- costs: effort to choose a place, imposed schema, entry time is a distraction
Fixes: no organization, plain text, in the browser, cross-computer sync offline+online
list.it (open source micro note tool for Firefox)
--> 25,000 downloads, 16,625 registered users, 920 volunteers, 116k contributed notes
Types of notes:
TODOs, Web bookmarks, Contact information
median time to write something is 7.4s
median number of lines is 4
35% - ease/speed
20% simplicity
20% direct replacement for post-its
Detour: Note Science
-- How do people keep and access information in list.it?
3 coders
first clustered, identified 4 archetypes
MISC - MIT Open Scrap Corpus (available online)
NB: Classroom Discussion
Stellar Classroom discussion tool
- 50 most active classes made 3275 posts
-- no heavily populated posts
- Nb: forum in context (happen in the margin of lecture notes)
- Highlight a section of the post, write a comment
--> Implicit context (how do I get 3 from 1)
Benefits - Discuss as you read without exiting the note view
-- Context is clear because the PDF content is there
-- annotations create a heat map of lecture notes
15 classes, 4 different universities
(Annotation required), usage of the tool doubled over the term.
--> they liked seeing that they weren't the only one that was confused.
--> rich interaction
NB specific benefits
--> "Why?"
--> The social benefits outweighed the use of paper
FEEDME
- Artificial Collaborative Filtering
Vast amounts of content, how do we get the good stuff?
machine learning recommenders - users rate what they read, content recommendation, collaborative filtering (find people with similar likes, predict what they will like)
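A minimal sketch of the collaborative filtering idea just described (find people with similar likes, predict what they will like). The readers, articles, ratings, and the choice of cosine similarity are all assumptions for illustration, not details from the talk.

```python
from math import sqrt

# Hypothetical ratings: reader -> {article: rating on a 1-5 scale}.
ratings = {
    "ann": {"a1": 5, "a2": 4, "a3": 1},
    "bob": {"a1": 4, "a2": 5, "a4": 5},
    "cat": {"a1": 1, "a3": 5, "a4": 2},
}

def cosine(u, v):
    """Cosine similarity over the articles both readers rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[a] * v[a] for a in shared)
    norm_u = sqrt(sum(u[a] ** 2 for a in shared))
    norm_v = sqrt(sum(v[a] ** 2 for a in shared))
    return dot / (norm_u * norm_v)

def predict(reader, article):
    """Similarity-weighted average of other readers' ratings for the article."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == reader or article not in theirs:
            continue
        weight = cosine(ratings[reader], theirs)
        num += weight * theirs[article]
        den += weight
    # None means nobody comparable has rated the article yet.
    return num / den if den else None

print(predict("ann", "a4"))
```

Here ann's tastes are closer to bob's than to cat's, so the prediction for a4 leans toward bob's high rating. The sketch also makes the cost visible: nothing works until readers have rated enough articles to be compared.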
Effort
- have to read lots of junk to train system
- have to spend energy now for future benefit
- many users won't ever get started
Quality
- ML algorithms imperfect
- Deliver irrelevant content to read; users worry about what is missed
Alternative: People
Email is dominant in information sharing
Median 6 - people do want more relevant links
Sharers are reluctant to spam their friends
(unsure of relevance, may have seen it already, too much effort)
Fixes
-> let them use email, reassure the sender that the content is relevant and that the recipient isn't overloaded. One-click sharing
Firefox plugin
1. recommend recipients to reduce time and effort for sharing
(uses ML to find people to recommend)
One-click thanks
Recommendation Algorithm
-- Rocchio classifier
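A rough sketch of how a Rocchio-style classifier could rank recipients for a post: each recipient gets a prototype (centroid) term vector built from posts previously shared with them, and a new post is matched against the prototypes by cosine similarity. The sharing history, the plain word counts, and the function names are assumptions for illustration. This is not FeedMe's actual implementation, and it uses only positive examples, whereas the full Rocchio formulation can also subtract a weighted centroid of negative examples.

```python
from collections import Counter
from math import sqrt

def term_vector(text):
    """Bag-of-words term counts (a real system would likely use TF-IDF)."""
    return Counter(text.lower().split())

def centroid(texts):
    """Average term vector of the example texts: the Rocchio prototype."""
    total = Counter()
    for text in texts:
        total.update(term_vector(text))
    return {term: count / len(texts) for term, count in total.items()}

def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical history: recipient -> posts previously shared with them.
history = {
    "alice": ["python type hints tutorial", "python packaging guide"],
    "bob": ["marathon training plan", "trail running shoes review"],
}
profiles = {person: centroid(posts) for person, posts in history.items()}

def recommend_recipients(post, k=1):
    """Rank recipients by similarity between the post and their prototype."""
    vec = term_vector(post)
    ranked = sorted(profiles, key=lambda p: cosine(vec, profiles[p]), reverse=True)
    return ranked[:k]

print(recommend_recipients("new python release notes"))
```

The prototype approach is cheap to update: each new share only shifts one recipient's centroid, which suits a browser plugin that learns from ordinary sharing behavior.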
Assessment
- two week study for $30
- 60 Google Reader users recruited on blogs
- Viewed 85k posts, shared 713 posts
- Significant increase in sharing
Recipients were happy - 80.4% of the posts contain novel content
Recommendations Useful
Do overload indicators help?
- 1/3 of subjects with them said they were their favorite feature
- 30 of shares resulted in a thanks
Machine filtering
- have to read stuff
Structured Data
We all know structured data is good.
it supports
-> rich visualizations, filtering, sorting, queries, merging data
Epicurious (old version)
-> filter by ingredient, cuisine, part of meal
Mere mortals just write text or HTML
Structured data takes skill
- design a data model,
Plain authors are left behind
-> less power to communicate effectively
Coping: Information Extraction
- Entity recognition, coreference, relationship extraction
Imperfect, so errors creep in.
Alternative: Give regular people tools that let them author structured data
-> to communicate well
Do we need this? Yes.
Approach
- HTML is the language of the web
- Extend it to talk about data
- Anyone authoring HTML should be able to author data and interactive visualization
- Edit data-html in web, blogs, wikis
(like spreadsheets)
Publishing data is easy, just put a spreadsheet online. rows are items, columns are properties
Data
Items (recipes)
- Each has properties: title, source magazine, publication date, etc.
- Visualization - a view of a collection of data items
-- bar chart, sortable list, map, thumbnail set
Bound to properties
- sort by property
Facets for filtering information
-> specify a property, user clicks to select
-> templates -> format per item.
- HTML with "fill in the blanks"
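The pieces above (spreadsheet rows as items, columns as properties, facets, sorted views, fill-in-the-blank templates) can be sketched in a few lines. This is only an illustration of the data model. It is not Exhibit's API, and the recipe data, function names, and template syntax are made up for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical spreadsheet: each row is an item, each column a property.
SHEET = """title,cuisine,course
Pad Thai,Thai,main
Green Curry,Thai,main
Tiramisu,Italian,dessert
"""

items = list(csv.DictReader(io.StringIO(SHEET)))

def facet(items, prop):
    """Facet: the values of one property with item counts, for users to click."""
    return Counter(item[prop] for item in items)

def select(items, prop, value):
    """Filtering: keep the items whose property matches the clicked facet value."""
    return [item for item in items if item[prop] == value]

def render(items, template, sort_by):
    """View: sort by a property, then fill in the template blanks per item."""
    ordered = sorted(items, key=lambda item: item[sort_by])
    return "\n".join(template.format(**item) for item in ordered)

# The template is plain HTML with named blanks, one copy per item.
TEMPLATE = "<li><b>{title}</b> ({cuisine}, {course})</li>"

print(facet(items, "cuisine"))
print(render(select(items, "cuisine", "Thai"), TEMPLATE, sort_by="title"))
```

The author only supplies the spreadsheet and the template. Facets, filtering, and sorting fall out of naming a property, which is what lets authors without a data-modeling background publish structured data.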
Key primitives of a data page
Data - a spreadsheet
Exhibit javascript library
1800 websites using Exhibit
hobby stores, science
(lots of strange hobbyists)
Veggie guide to Glasgow
Not very scalable (fast for < 100 items)
Side effects - the data is out there. (structured data is the side effect)
Wibbit
Datapress - data visualization inside the blog
DIDO - WYSIWYG editor
Conclusion
- People are powerful information managers
In each case, it's about giving people the tools to be information managers
Wait, There's more
--> manage structured data by making it look like a spreadsheet
--> Atomate -> help users translate incoming data into structured data
We work hard to make computers do IKM well,
Don't assume people are passive IK consumers
Give people tools that can encourage active engagement in IKM
All the links are at haystack.csail.mit.edu/blog
Questions:
The success of Exhibit came from understanding why Haystack didn't succeed. Lots of people using a tool is not the only measure of success. It's still an interesting piece of research.