Projects
Project 1
CommonCrawl is a free, publicly accessible "crawl" of the web: an archive of web content that has been downloaded and saved for future analysis. The November 2015 crawl alone is over 151 TB in size and holds more than 1.82 billion URLs. In this project, you will work in groups of 1-2 people to process a subset of the CommonCrawl dataset using MapReduce. Which portion of the data you process, and how you process it, is up to you!
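To make the MapReduce model concrete, here is a minimal sketch of one possible analysis: counting crawled URLs per host. It is a pure-Python, in-memory simulation, not the Hadoop API or any required project setup. The function names (`map_url`, `reduce_counts`, `run_job`) and the sample URLs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def map_url(url):
    """Map step: emit a (host, 1) pair for one crawled URL."""
    host = urlparse(url).netloc.lower()
    if host:  # skip strings that have no recognizable host
        yield host, 1


def reduce_counts(host, counts):
    """Reduce step: sum every count emitted for a single host."""
    return host, sum(counts)


def run_job(urls):
    """Simulate the shuffle phase in memory: group by key, then reduce."""
    grouped = defaultdict(list)
    for url in urls:
        for key, value in map_url(url):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical sample input; a real job would read URLs from crawl files.
sample_urls = [
    "http://example.com/a",
    "http://example.com/b",
    "https://example.org/",
]
host_counts = run_job(sample_urls)
```

In a real cluster job, the framework performs the grouping between the map and reduce steps across many machines; the dictionary in `run_job` only stands in for that shuffle.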
Grading
- 10% - Project Idea Document - Grading Rubric
- Due: Tuesday, Feb 9th by 11:59pm
- 30% - Project Proposal - Grading Rubric
- Due: Tuesday, Feb 23rd by 11:59pm
- 60% - Final Report, Source code, and In-class Presentation - Grading Rubric
- Due: Thursday, Mar 10th by 11:59pm
Project 2
In this project, you will work in groups of 1-2 people to write an application for Amazon's cloud platform that takes advantage of their monitoring and management features.
- Project Description
- Eclipse Setup (for Java developers)
- Visual Studio Setup (for C# developers)
Grading
- 10% - Project Idea Document - Grading Rubric
- Due: Tuesday, Mar 29th, 2016 by 11:59pm
- 15% - Initial Project Demonstration - Grading Rubric
- Due: Thursday, Apr 14th, 2016 (in class)
- 60% - Final Project Demonstration - Grading Rubric
- Due: Tuesday, May 3rd, 2016 (in class)
- 15% - Final Report and Source code - Grading Rubric
- Due: Friday, May 6th, 2016 by 11:59pm