CSE 462/562: Database Systems (Fall 2024)

Course Description

Welcome to CSE462/562 Database Systems of Fall 2024. Database Management Systems (DBMS) are an important piece of software widely used in various data intensive applications. No matter what data model or query language it adopts, there are principles and methodologies commonly found in virtually all DBMS, in order to support efficient and fault-tolerant data storage, query and management. In this course, we will examine the internals of a traditional Relational DataBase Management System (RDBMS) and discuss the common principles and methodologies that may be useful in many other database and data processing systems beyond RDBMS. The students will also gain practical experience of efficient data management through a semester-long project of implementing various components of a mini RDBMS.

The main topics include database storage management, query processing, query optimization, transaction processing, concurrency control and recovery in RDBMS. Please note that this is not a course on database design or database application development -- those are covered in CSE460/560 Data Models and Query Languages. We will briefly review the necessary background on the relational data model and the SQL language in the first a few lectures but students are expected to understand relational algebra and SQL before this course.

Logistics

Course Requirements

Grading Policy

Please refer to the PDF syllabus for the tentative assignment of letter grades. Grade break-down:

Course Schedule

Hint: If you find the last update date of a slide deck is later than the one you previously downloaded and you opened the slides within the browser, you may want to force a refresh using Ctrl + F5.

The following schedule is subject to adjustments throughout the semester.

Lectures

Date Topics Slides Last updated
8/27/2024 Course introduction and logistics 01 8/27/24 2:12 AM
8/29/2024 Relational model and query lanaguage 02 8/28/24 1:00 PM
9/3/2024 Physical Storage, POSIX I/O Interface and buffer management 03 8/30/24 1:00 PM
9/5/2024 Data storage layout 04 9/5/24 4:10 PM
9/10/2024 Relational Algebra and SQL 05 8/28/24 1:00 PM
9/12/2024 Relational Algebra and SQL (cont'd) See above
9/17/2024 Project and homework review and Q&A
9/19/2024 Query Processing Overview 08 9/10/24 9:35 AM
9/24/2024 Volcano execution model and cost analysis 09 9/10/24 9:35 AM
9/26/2024 Single-table query processing: aggergation and sorting 11
12
9/10/24 1:30 PM
9/10/24 1:30 PM
10/1/2024 Single-table query processing: sorting (cont'd) and hashing See above
10/3/2024 Single-table query processing: selection, projection and expression evaluation 10 10/9/24 3:00 PM
10/8/2024 Single-table query processing: deduplication and grouping
10/10/2024 Project and homework review and Q&A
10/15/2024 Fall break: no lecture today.
10/17/2024 Access methods and indexing 13 10/9/24 2:00 PM
10/22/2024 Tree indexes 14 10/9/24 2:00 PM
10/24/2024 Mid-term exam: in-class, 5:05 PM - 6:20 PM. (updated date) Covers everything before the Fall break.
10/29/2024 Tree indexes (cont'd)
10/31/2024 Hash indexes 15 10/9/24 2:00 PM
11/5/2024 Index scans 16 10/9/24 3:00 PM
11/7/2024 Concurrency control and recovery: an overview 17 10/9/24 3:00 PM
11/12/2024 Join algorithms 18 10/9/24 3:00 PM
11/14/2024 Join algorithms (cont'd)
11/19/2024 Project and homework review and Q&A
11/21/2024 Query optimization 19 10/9/24 3:00 PM
11/26/2024 Query optimization (cont'd)
11/28/2024 Thanksgiving: no lecture today
12/3/2024 Project and homework review and Q&A
12/5/2024 Advanced topics (TBD)
12/12/2024 Final exam: NSC 205, 7:30 PM - 9:00 PM. Covers everything after the mid-term exam.

Written Assignments

Written Assignment Release date Due date (23:59:59 PM Eastern Time) Links
HW1: relational model and query language 8/27/2024 9/16/2024 [Handout and Submission] [Solution]
HW2: single-table query processing 9/19/2024 10/9/2024 [Handout and Submission] [Solution]
HW3: access method and indexing 10/22/2024 11/18/2024 (extended due to late release) [Handout and Submission] [Solution]
HW4: transaction processing, join processing, query optimization 10/28/2024 (early release due to D2L visibility settings issue) 12/2/2024 [Handout and Submission] [Solution]

Projects

Assignment Release date Due date (23:59:59 Eastern Time) Link
Project 0: project setup and team sign-up 8/27/2024 9/2/2024 [Description]
Project 1: POSIX I/O 9/3/2024 9/9/2024 [Description]
Project 2: Data Layout 9/5/2024 9/16/2024 [Descrption]
Project 3: Single-table query processing 9/19/2024 10/9/2024 [Descrption]
Project 4: B-tree index 10/22/2024 11/11/2024 [Descrption]
Project 5: Join algorithms 11/14/2024 12/2/2024 [Descrption]
Bonus Project 6: Manual query optimization 11/14/2024 12/22/2024 (no grace day) See Project 5 description, Section 2
[Leader board]

Academic Integrity Policy

Academic integrity is critical to the learning process. It is your responsibility as a student to complete your work in an honest fashion, upholding the expectations your individual instructors have for you in this regard. The ultimate goal is to ensure that you learn the content in your courses in accordance with UB’s academic integrity principles, regardless of whether instruction is in-person or remote. Thank you for upholding your own personal integrity and ensuring UB’s tradition of academic excellence. You should get familiar with the departmental and the university academic integrity policies and procedures:

Academic integrity policies specifics to this course (please read carefully): you may NOT share code with from anyone about your course projects except for your teammate. We also require all students, whether enrolled, dropped or resigned from the class, to keep your course project repository inaccessible to public indefinitely, and never share it with any current students or future students who may take the course. When you look up online materials for the course project, please exercise common sense in terms of what is allowed and what is not allowed (e.g., cppreference/cplusplus.com are fine to look at, while posting questions directly related to the project on StackOverFlow would not be acceptable). Examples that we consider as academic integrity violations include but are not limited to:

  1. Copying any part of other team's code implementation or code found online in the course project, regardless of whether it is a verbatim copy or a modified copy;
  2. Collaborate with other teams in the course project; (Writing code together with students from other teams or exchanging code is not allowed. Discussion on task and system specification, algorithms and system designs, on the other hand, is allowed and encouraged);
  3. Making your course project repository or any copy of it available to any current or future students of the course;
  4. Submitting work that is not created by you or your team, including deriving your solutions based on materials from search engines, tutoring services, Q&A websites, and/or those generated by generative AI software (e.g., ChatGPT);
  5. Collaborate with anyone in the written assignments; (Developing solutions together with fellow classmates or exchanging solutions is not allowed. Discussion on homework problems, course materials, on the other hand, is allowed and encouraged).
  6. Cheating or referring to any material not permitted in the exams and written assignments.

Consistent with the CSE Academic Integrity Policy, we have zero tolerance towards academic integrity violations. Any academic integrity violation will result in an F grade for all students involved and will be referred to the Office of Academic Integrity. The only exception when the violation is accidental and does not provide any unfair advantage to any of the students involved, in which case you may receive a reduction in grade in the particular assignment/exam/project.

Accessibility Resources

If you have any disability which requires reasonable accommodations to enable you to participate in this course, please contact the Office of Accessibility Resources in 60 Capen Hall, 716-645-2608. The office will provide you with information and review appropriate arrangements for reasonable accommodations, which can be found on the web at this link. Please also notify the instructor as soon as possible to make reasonable arrangement early.