Tuesday, June 21

Inside ACL: Building Watson DeepQA Keynote Address by David Ferrucci

This morning David Ferrucci gave the keynote talk at the Association for Computational Linguistics (ACL) 2011 conference. Michael Bendersky is attending the conference and generously sent me his notes on this first keynote. Be sure to read his paper, Joint Annotation of Search Queries. Here are his notes from the talk:

Building Watson: An Overview of the DeepQA Project
  • What’s the difference between playing chess and understanding human language?
    - People find chess difficult and natural language easy
    - Many non-scientists don’t realize how difficult human language understanding really is

  • Computers are good at
    - Understanding formulas
    - Understanding structured query languages

  • Computers are bad at
    - Parsing ambiguous natural language

  • The system challenges
    - Open domain
    - Complex language
    - High precision
    - Accurate confidence – only buzz in when you’re very confident
    - High speed
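
The "accurate confidence" requirement can be pictured as a simple threshold rule. The following is a minimal sketch, not something shown in the talk: the function name `should_buzz`, the 0.8 threshold, and the candidate scores are all hypothetical.

```python
from typing import Dict, Optional

def should_buzz(candidates: Dict[str, float], threshold: float = 0.8) -> Optional[str]:
    """Return the top-scoring answer only if its confidence clears the threshold."""
    if not candidates:
        return None
    best_answer, best_conf = max(candidates.items(), key=lambda kv: kv[1])
    # Staying silent (None) is better than a confident-sounding wrong answer.
    return best_answer if best_conf >= threshold else None

# Hypothetical confidence estimates for two different clues.
confident = should_buzz({"Toronto": 0.14, "Chicago": 0.91})
unsure = should_buzz({"Toronto": 0.14, "Chicago": 0.42})
```

In this toy setup the first clue produces an answer and the second produces `None`, i.e. no buzz.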

  • Core technologies
    - Deep parsing – using a proprietary IBM technology that has been developed over the last 20 years
    - Relation detection
    - Multiple parse interpretations
    - Multiple query formulations per parse
    - Co-reference resolution

  • The entire research was driven by a single end-to-end metric – how much the proposed solution improves performance in the Jeopardy game
    - Improvements to a single algorithm can turn out to be redundant or even harmful in the overall solution

  • Jeopardy is open-domain – not using ontologies that were crafted specifically for Jeopardy
    - Using general resources: WordNet, YAGO

  • Learning from Reading
    - Parsing sentences in the text
    - Generalization and statistical aggregation

  • Some questions require decomposition and synthesis
    - Using techniques to decompose questions into parts
    - Synthesis of answers from different parts
    - Helps in answering questions that involve puns/rhyming

  • Some questions require finding a missing link between concepts
    - Using spreading activation to find links
    - e.g., link between “shirt”, “tv remote”, “telephone” -> buttons
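
To make the spreading-activation idea concrete, here is a minimal sketch under stated assumptions. The toy association graph, its weights, the decay factor, and the function names `spread` and `missing_link` are all hypothetical; the notes do not describe how Watson actually builds or traverses its links.

```python
from collections import defaultdict

# Hypothetical toy association graph (directed, weighted).
# A real system would derive such links from a large corpus.
GRAPH = {
    "shirt": {"buttons": 0.6, "collar": 0.8, "sleeve": 0.7},
    "tv remote": {"buttons": 0.9, "battery": 0.7, "channel": 0.8},
    "telephone": {"buttons": 0.7, "call": 0.9, "battery": 0.3},
}

def spread(seeds, graph, decay=0.5, steps=2):
    """Propagate activation from each seed; track total energy and which seeds reached each node."""
    total = defaultdict(float)
    reached_by = defaultdict(set)
    for seed in seeds:
        frontier = {seed: 1.0}
        for _ in range(steps):
            next_frontier = defaultdict(float)
            for node, energy in frontier.items():
                for neighbor, weight in graph.get(node, {}).items():
                    next_frontier[neighbor] += energy * weight * decay
            for node, energy in next_frontier.items():
                total[node] += energy
                reached_by[node].add(seed)
            frontier = next_frontier
    return total, reached_by

def missing_link(seeds, graph):
    """Pick the most activated non-seed node that every seed reached."""
    total, reached_by = spread(seeds, graph)
    shared = [n for n in total if n not in seeds and reached_by[n] == set(seeds)]
    return max(shared, key=lambda n: total[n]) if shared else None

link = missing_link(["shirt", "tv remote", "telephone"], GRAPH)
```

In this toy graph the only concept reached from all three seeds is "buttons", which is what the sketch returns.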

  • Metrics for performance evaluation
    - Plot: x-axis – % of questions answered, y-axis – precision
    - Winners’ cloud – answered at least 50% of the questions, with precision of 80-92%
    - The goal was to get Watson into the winners’ cloud – achieved, and surpassed by the time of the Jeopardy game
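
One way to read the precision vs. % answered plot: sort questions by the system's confidence, answer only the most confident fraction, and measure precision on that subset. The sketch below illustrates this reading; the function name `precision_at_answered` and the ten (confidence, correct) pairs are hypothetical, not data from the talk.

```python
def precision_at_answered(results, fraction):
    """results: list of (confidence, is_correct) pairs.

    Answer only the most confident `fraction` of questions
    and return the precision on the answered subset.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    n = max(1, round(len(ranked) * fraction))
    answered = ranked[:n]
    return sum(1 for _, ok in answered if ok) / n

# Hypothetical (confidence, correct?) pairs for ten questions.
results = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.70, True),
    (0.60, True), (0.50, False), (0.40, False), (0.30, True), (0.20, False),
]

# Points on the curve: fraction answered -> precision.
curve = {f: precision_at_answered(results, f) for f in (0.2, 0.5, 1.0)}
```

With well-calibrated confidence, precision falls as the fraction answered grows, which is why being in the cloud requires both answering enough questions and answering them correctly.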

  • Great leaps in performance from 2007. In the beginning, breaking even in the game seemed like an accomplishment

  • Watson is self-contained. Deciding what content to use is very hard – the amount of hardware is limited.

  • Guidelines
    - Specific large hand-crafted methods won’t cut it
    - Combining intelligence from diverse methods using machine learning techniques
    - Massive parallelism is a key enabler

  • DeepQA – QA system underlying Watson
    - Many components for parsing and multiple answer generation
    - Logistic regression to weight the different features and rank the answers
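
The ranking step can be sketched as a small logistic regression over per-candidate features. This is an illustration only: the two features (passage support and answer-type match), the training data, and the candidate names are hypothetical, and plain gradient descent stands in for whatever training procedure DeepQA actually uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def rank(candidates, w, b):
    """Score each candidate's feature vector and sort by estimated confidence."""
    scored = [
        (name, sigmoid(sum(wj * xj for wj, xj in zip(w, feats)) + b))
        for name, feats in candidates.items()
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)

# Hypothetical features: [passage_support, answer_type_match]; label 1 = correct answer.
X = [[0.9, 1], [0.8, 1], [0.7, 1], [0.2, 0], [0.1, 0], [0.3, 0]]
y = [1, 1, 1, 0, 0, 0]
w, b = train(X, y)

ranked = rank({"Chicago": [0.85, 1], "Toronto": [0.25, 1], "Springfield": [0.4, 0]}, w, b)
```

The learned weights decide how much each feature counts, and the sigmoid output doubles as the confidence used to decide whether to answer at all.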

  • Search systems used: Indri & Lucene. Both were modified to reduce run-time

  • Work process
    - All group members working in the same open space room
    - NLP researchers, IR researchers, ML researchers, linguists, statisticians
    - 8,000 experiments – all documented with tools that allow analysis by question/algorithm/features

  • Run-time
    - Single CPU time for answering a question – 2 hours
    - Scaled out to 3,000 CPUs – 2-3 seconds
    - Enabled by the built-in parallelization of the algorithms
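
The run-time figures are consistent with near-linear scaling: 2 hours is 7,200 seconds, and 7,200 / 3,000 is 2.4 seconds. The sketch below does that arithmetic and shows, in miniature, how independent candidate scoring can be fanned out to workers. The `score` function and the candidate list are hypothetical placeholders, and perfect linear scaling is an idealization.

```python
from concurrent.futures import ThreadPoolExecutor

def ideal_parallel_seconds(single_cpu_hours, cpus):
    """Lower bound on wall-clock time, assuming perfect linear scaling."""
    return single_cpu_hours * 3600 / cpus

# 7,200 CPU-seconds of work spread evenly over 3,000 CPUs.
estimate = ideal_parallel_seconds(2, 3000)

def score(candidate):
    # Placeholder for an expensive scoring step that does not depend
    # on any other candidate (hypothetical).
    return candidate, len(candidate)

candidates = ["Chicago", "Toronto", "New York"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # Each candidate is scored independently, so the work parallelizes trivially.
    scores = dict(pool.map(score, candidates))
```

The key property is that no scoring call waits on another, which is what lets the same work spread across many machines.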

What I find particularly striking is the deep analysis of a contained corpus, especially the analysis to find various kinds of missing links. The hardware is limited and the corpus is tightly circumscribed so that complex, expensive algorithms can run on it, and this results in significant improvements!

  • How would you develop a system for the real-time web where what's meaningful is constantly in flux?

Ultimately, the true test of DeepQA will be how it generalizes to domains beyond Jeopardy. I hope this is just the beginning for Watson.

Thanks again to Michael for his notes. Look for more highlights from ACL coming soon!

