Wednesday, March 24

AMPLab: Exploring Big Data With Algorithms, Machines, and People

Today Michael Franklin gave a talk on the future of big data. He framed the databases vs. NoSQL debate in WoW terms: the Alliance vs. the Horde, locked in a struggle for the future of big data. More on that later.

Michael mentioned that he is excited about a new project that is spinning up, called AMPLab. The effort explores combining distributed (machine learning) algorithms with crowdsourcing.

To explore the research area, they created a course, AMPLab, CS294. According to the first lecture, one of the main goals is to:
Enable lots of people to collaborate (knowingly or not) to collect, generate, clean, make sense of and utilize lots of data.
It will try to address one of the key problems, scalability:
  • Scale state-of-the-art ML to large datasets (building on efforts like Spark)
  • Enable data analytics frameworks to handle incomplete, heterogeneous, dirty data
  • Simplify distributed processing models
  • Use Active Learning to direct large numbers of people to improve data quality
Check out the course for details; the project is just getting started.
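
To make the Active Learning bullet a bit more concrete, here is a minimal sketch of one common strategy, uncertainty sampling: ask people to label the records the model is least sure about. This is my own illustration, not something taken from the course. The record ids, the probabilities, and the function names (`uncertainty`, `select_for_labeling`) are all hypothetical, and I am assuming a binary classifier that outputs a probability per record.

```python
def uncertainty(p):
    """Score how unsure a binary classifier is about one record.

    Returns 1.0 when the predicted probability is 0.5 (maximally unsure)
    and 0.0 when it is 0 or 1 (fully confident).
    """
    return 1.0 - 2.0 * abs(p - 0.5)


def select_for_labeling(predictions, budget):
    """Pick the `budget` record ids whose predictions are most uncertain.

    `predictions` maps a record id to the model's predicted probability.
    """
    ranked = sorted(
        predictions,
        key=lambda record_id: uncertainty(predictions[record_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for five unlabeled records.
predictions = {
    "rec-1": 0.97,
    "rec-2": 0.52,
    "rec-3": 0.10,
    "rec-4": 0.45,
    "rec-5": 0.80,
}

# With a budget of two crowd questions, send out the two least certain records.
to_label = select_for_labeling(predictions, budget=2)
print(to_label)  # ['rec-2', 'rec-4']
```

The idea is that human attention is the scarce resource, so it goes to the records where a label is likely to help the model most rather than to records it already classifies confidently.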


  1. Anonymous, 6:56 PM EDT

    DB+IR all the wayyyyyy :)


  2. Anonymous, 7:27 PM EDT

    Apparently the PostgreSQL project is caving in to recent trends and going NoSQL next year:


  3. Anonymous, 7:34 PM EDT

    lol April fools' day :)