Michael mentioned that he is excited about AMPLab, a new project that is spinning up. The effort explores combining distributed (machine learning) algorithms with crowdsourcing.
To explore this research area, they created a course, AMPLab, CS294. According to the first lecture, one of the main goals is to:
Enable lots of people to collaborate (knowingly or not) to collect, generate, clean, make sense of and utilize lots of data.
It will try to address one of the key problems, scalability:
- Scale state-of-the-art ML to large datasets (building on efforts like Spark)
- Enable data analytics frameworks to handle incomplete, heterogeneous, dirty data
- Simplify distributed processing models
- Use Active Learning to direct large numbers of people to improve data quality