Tuesday, September 28

Twitter Talk at UMass: Discovery and Emergence

Today Abdur, the Chief Scientist at Twitter, gave a talk, "Discovery and Emergence," here at UMass. The talk was interactive, with lots of questions. It was similar to the one he presented at the SIGIR Social Media Workshop last year. If you read it, be sure to read Jon's comment. Here are a few of my notes from the talk, which focused heavily on the Trending Topics feature.

A few key points to remember. First, we have to keep this at the forefront:
It's not about the technology, it's about how it enriches our lives and makes them better.

The Data
160 million accounts. 90 million tweets per day. 16.7 GB of tweets. > 1,000 tps
200,000 timeline rps, 3GBs outbound data, 1 billion queries per day
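
As a quick sanity check on these figures, here is a small back-of-the-envelope sketch. It assumes that "tps" means tweets per second, that the 16.7 GB figure is a daily volume, and that 1 GB is 10^9 bytes. None of that was spelled out in my notes, so treat the per-tweet size in particular as a rough guess. The variable names are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures from the talk
tweets_per_day = 90_000_000
queries_per_day = 1_000_000_000

# Assumption: 16.7 GB is the daily tweet volume, with 1 GB = 10^9 bytes
tweet_bytes_per_day = 16.7e9

# Daily totals converted to average per-second rates
avg_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY
avg_queries_per_sec = queries_per_day / SECONDS_PER_DAY

# Average stored size of one tweet under the assumption above
avg_bytes_per_tweet = tweet_bytes_per_day / tweets_per_day

print(f"{avg_tweets_per_sec:.0f} tweets/s on average")    # 1042
print(f"{avg_queries_per_sec:.0f} queries/s on average")  # 11574
print(f"{avg_bytes_per_tweet:.0f} bytes per tweet")       # 186
```

The average of roughly 1,042 tweets per second lines up with the "> 1,000 tps" figure. These are daily averages, so peak rates would be higher.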

Tweets are searchable within seconds and the data is kept forever.

About 30% of search traffic is generated by clicks from trending topics.

In 1 ms, answer the following about a tweet:
- what language is this tweet?
- where was this tweet posted from?
- what are the entities in this tweet?

Every X min, answer the following:
- Which tweets should you ignore?
- What topics are trending and where?
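
To make the "what is trending and where" question concrete, here is a minimal sketch of one naive way to approach it. This is not Twitter's method, which the talk did not detail. It assumes each tweet has already been reduced to a (location, topic) pair, and it scores a topic by the smoothed ratio of its count in the current window to its count in the previous one. The function name, parameters, and sample data are all made up for illustration.

```python
from collections import Counter, defaultdict

def trending_by_location(current, baseline, top_n=2, smoothing=1.0):
    """Rank topics per location by growth between two time windows.

    `current` and `baseline` are lists of (location, topic) pairs.
    The score is a smoothed count ratio, so new topics do not divide by zero.
    """
    cur = Counter(current)
    base = Counter(baseline)
    scored = defaultdict(list)
    for (location, topic), count in cur.items():
        score = (count + smoothing) / (base[(location, topic)] + smoothing)
        scored[location].append((score, topic))
    # Highest score first; ties broken alphabetically by topic
    return {
        loc: [topic for _, topic in sorted(items, key=lambda st: (-st[0], st[1]))[:top_n]]
        for loc, items in scored.items()
    }

# Hypothetical data: "lunch" is popular but steady, "worldcup" is new
previous_window = [("US", "lunch")] * 50 + [("BR", "futebol")] * 5
current_window = (
    [("US", "lunch")] * 52 + [("US", "worldcup")] * 20 + [("BR", "futebol")] * 30
)

result = trending_by_location(current_window, previous_window)
print(result)  # {'US': ['worldcup', 'lunch'], 'BR': ['futebol']}
```

Note that "lunch" has the highest raw count in the US but ranks below "worldcup", because a trend is about change rather than volume. Even this toy version raises the evaluation question: nothing in the score says whether the top topic is actually a good one.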

A key problem is how to evaluate the quality of trending topics. What makes one topic 'better' than another?

One of the coolest things I saw in the talk was the visualization of the World Cup tweets, which appeared on their blog in World Cup 2010: A Global Conversation. It was created by Miguel Rios, whose work you can check out on his website.

Abdur ended with an admonition to researchers to think about the impact of their work:
Why does your research matter? Will it make the world a better place?

