CSE 4/587: Data Intensive Computing

Important Links

Syllabus
Piazza (for class discussion)
UBLearns (for assignments and grades)

Lectures and Assignments (subject to change)

Mon Wed Fri Readings/Assignments
1/30
Lec 01: Course Introduction
slides
2/1
Lec 02: DIC Overview
slides
2/3
Lec 03: Data Strategy
slides
Doing Data Science [Chapter 1]
2/6
Lec 04: Data Strategy (cont) + Project Overview
slides
2/8
Lec 05: Data Cleaning/EDA
slides, demo code
2/10
Lec 06: Models and Algorithms - Linear Regression
slides
Doing Data Science [Chapter 2,3]
Data Science from Scratch [Chapter 5,10]
Project Overview
2/13
Lec 07: Models and Algorithms - k-means
slides, demo code
2/15
Lec 08: Classifiers - KNN
slides
2/17
Lec 09: Classifiers - Naive Bayes
slides
Doing Data Science [Chapter 2,3,4]
Project Phase 1 assigned. Due 3/6 @ 11:59PM [description]
2/20
Lec 10: Classifiers - Naive Bayes
slides
2/22
Lec 11: Classifiers - Logistic Regression
slides
2/24
Lec 12: Classifiers - Evaluating and Demo
slides, demo code
Doing Data Science [Chapter 4,5]
Data Science from Scratch [Chapter 11]
Project Phase 2 released. Due 4/10 @ 11:59PM [description]
2/27
Lec 13: Midterm Review
slides, practice midterm, practice midterm key
3/1
Midterm Exam #1
3/3
Workshop Day
Midterm #1 [key/rubric]
3/6
Lec 14: Big Data/Hadoop
slides
3/8
Lec 15: Big Data/Hadoop
slides
3/10
Lec 16: Big Data/Hadoop
slides
Hadoop Architecture
Hadoop Paper
Project Phase 1 due 3/6 @ 11:59PM
3/13
Lec 17: MapReduce Introduction
slides
3/15
Lec 18: MapReduce Analysis
slides
3/17
Lec 19: MapReduce Demo
slides, demo code
Lin and Dyer [Chapter 1]
3/20-3/24
Spring Break
No Class
3/27
Lec 20: Intermediate Data
slides
3/29
Lec 21: Word Co-Occurrence
slides
3/31
Lec 22: NGS Case Study
slides
HW 1 assigned. Due 4/3/23 @ 11:59PM [description,key]
NGS Case Study [link]
4/3
Lec 23: Midterm Review
slides, practice midterm, practice midterm key
4/5
Midterm Exam #2
4/7
Workshop Day
Midterm #2 [key/rubric]
4/10
Lec 24: Graph Analysis and PageRank
slides
4/12
Lec 25: Graph Analysis and PageRank
slides
4/14
Lec 26: PageRank in MapReduce
slides
Project Phase 2 due 4/10 @ 11:59PM
Project Phase 3 released. Due 5/8 @ 11:59PM [description]
4/17
Lec 27: Intro to Spark
slides
4/19
Lec 28: Spark
slides
4/21
Lec 29: Spark Demo
slides, demo code
HW 2 assigned. Due 5/1/23 @ 11:59PM [description,key]
4/24
Lec 30: Spark Streaming
slides
4/26
Lec 31: Spark Streaming
slides
4/28
Lec 32: HBase/Hive
slides
5/1
Lec 33: Cloud Overview
slides
5/3
Lec 34: Bias in Data
slides
5/5
Lec 35: Workshop Day
5/8
Lec 36: Course Recap
slides
5/10
Lec 37: Final Review
slides, practice final, practice final key
5/12
Lec 38: Final Review
slides, practice pagerank problem, practice pagerank key
Project Phase 3 due 5/10 @ 11:59PM
Final Exam [key/rubric]