Information extraction
- Goal: turn unstructured data into structured data
- How?
- site-specific
- format-specific
- domain/category-centric
- Goal: system and process for fast and easy building of domain-centric operations.
- exploits structural regularities as proxies for nested concepts
- lists, records, attributes
- create business directories for store locations
- pulling useful tidbits of info from around the web, dereferencing them, and then presenting them to the user
- scalability is important
- get rid of some complex features to gain speed
- adaptive allocation for reduced server load
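
A minimal sketch of the "lists, records, attributes" idea, using the store-locations example: each repeated list item is treated as a record, and each labeled child element as an attribute. The markup, the class names (`name`, `city`), and the helper names are made up for illustration; a real site would need its own rules.

```python
from html.parser import HTMLParser


class StoreListParser(HTMLParser):
    """Collect one record per <li>, keyed by the class of each <span> inside it."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._record = None  # record being built for the current <li>
        self._field = None   # attribute name taken from the current <span>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._record = {}
        elif tag == "span" and self._record is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Only keep text that sits inside a labeled <span> of a record.
        if self._record is not None and self._field and data.strip():
            self._record[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._record is not None:
            self.records.append(self._record)
            self._record = None


def extract_records(html):
    """Return a list of dicts, one per list item in the given HTML."""
    parser = StoreListParser()
    parser.feed(html)
    parser.close()
    return parser.records


# Hypothetical page fragment: the repeated <li> structure is the regularity.
SAMPLE = """
<ul>
  <li><span class="name">Acme Hardware</span> <span class="city">Springfield</span></li>
  <li><span class="name">Acme Hardware</span> <span class="city">Shelbyville</span></li>
</ul>
"""

records = extract_records(SAMPLE)
print(records)
```

The parser knows nothing about stores; it only relies on the repeated structure, which is what makes this kind of approach cheap to retarget to another domain.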
- tagging
- relying on these is messy
- photo albums online allow for quick searching
- image labeling
- could use the ESP Game, but that relies on users playing the game
- OR let people tag as normal and then offline...
- detect similar photos
- overlap tags
- collaborative tagging
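
One possible reading of the offline step (detect similar photos, then overlap their tags), as a rough sketch: find each photo's near neighbors and suggest tags that several neighbors agree on. The toy feature vectors, the cosine measure, and the `threshold` and `min_votes` parameters are all assumptions for illustration; the notes do not say how similarity or overlap was computed.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_tags(photos, threshold=0.9, min_votes=2):
    """photos maps photo id -> (feature_vector, set_of_tags).

    Returns photo id -> tags carried by at least `min_votes` similar
    photos that the photo itself does not have yet.
    """
    suggestions = {}
    for pid, (vec, tags) in photos.items():
        votes = Counter()
        for other, (other_vec, other_tags) in photos.items():
            if other != pid and cosine(vec, other_vec) >= threshold:
                votes.update(other_tags)
        suggestions[pid] = {
            tag for tag, n in votes.items() if n >= min_votes and tag not in tags
        }
    return suggestions


# Hypothetical data: a, b, c look alike; d is unrelated.
PHOTOS = {
    "a": ([1.0, 0.0], {"beach", "sunset"}),
    "b": ([0.9, 0.1], {"beach", "sunset", "vacation"}),
    "c": ([1.0, 0.1], {"beach"}),
    "d": ([0.0, 1.0], {"cat"}),
}

suggested = suggest_tags(PHOTOS)
print(suggested["c"])
```

Requiring agreement between several similar photos is one way to keep a single user's odd tag from spreading, which speaks to the "relying on these is messy" concern.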