CSE 487: Data Intensive Computing

Important Links

Syllabus
UBLearns
Piazza

Lectures and Assignments (subject to change)

Lectures Coursework
Mon Wed Reading Assignments
8/29
Lec 1: Course Introduction
8/31
Lec 2: Data Intensive Computing
Doing Data Science [Chapter 1]
Week 1 TODOS [see Lec01]
9/5
No Lecture (Labor Day)
9/7
Lec 3: Data Strategy and EDA
Doing Data Science [Chapter 2]
Data Science from Scratch [Chapter 10]
 
9/12
Lec 4: Models and Algorithms
9/14
Lec 5: Models and Algorithms
Doing Data Science [Chapter 3]
Data Science from Scratch [Chapters 11, 12, 14, 15]
 
9/19
Lec 6: Models and Algorithms Demo
Demo Code
9/21
Lec 7: Introduction to Hadoop
  Project Phase 1 [description]: Due 10/17/22 @ 11:59PM
9/26
Lec 8: Introduction to MapReduce
9/28
Lec 9: MapReduce Demo
Demo Code
Lin and Dyer [Chapter 1]  
10/3
Lec 10: MapReduce Case Study
NGS Paper
Project Phase 2 [description]: Due 11/7/22 @ 11:59PM
10/10
Lec 12: Graphs in MapReduce
10/12
Lec 13: Word Co-Occurrence
Lin and Dyer [Chapters 3, 5]
10/17
Midterm Review
10/19
Midterm Exam
Midterm Review Questions
Midterm Key/Rubric
 
10/24
Lec 15: Midterm Discussion
10/26
Lec 16: Naive Bayes
Doing Data Science [Chapter 4]  
10/31
Lec 17: Naive Bayes
11/2
Lec 18: Logistic Regression
Doing Data Science [Chapters 4, 5]
11/7
Lec 19: Hive
11/9
Lec 20: Spark
RDD Paper
Project Phase 3 [description]: Due 11/28/22 @ 11:59PM
11/14
Lec 21: Spark RDDs
11/16
Lec 22: Spark Demos
Demo Code
RDD Paper  
11/21
Workshop Day
11/23
No Class (Fall Break)
Ungraded Spark HW  
11/28
Lec 24: Spark HW Review
11/30
Lec 25: Bias in Data
   
12/5
Lec 26: Course Summary
12/7
Lec 27: Final Exam Review
Final Review Questions
Final Rubric