CSE 462/562: Database Systems (Spring 2024)

Course Description

Welcome to CSE462/562 Database Systems of Spring 2024. Database Management Systems (DBMS) are an important piece of software widely used in various data intensive applications. No matter what data model or query language it adopts, there are principles and methodologies commonly found in virtually all DBMS, in order to support efficient and fault-tolerant data storage, query and management. In this course, we will examine the internals of a traditional Relational DataBase Management System (RDBMS) and discuss the common principles and methodologies that may be useful in many other database and data processing systems beyond RDBMS. The students will also gain practical experience of efficient data management through a semester-long project of implementing various components of a mini RDBMS.

The main topics include database storage management, query processing, query optimization, transaction processing, concurrency control and recovery in RDBMS. Please note that this is not a course on database design or database application development -- those are covered in CSE460/560 Data Models and Query Languages. We will briefly review the necessary background on the relational data model and the SQL language in the first a few lectures but students are expected to understand relational algebra and SQL before this course.

Logistics

Course Requirements

Grading Policy

Please refer to the PDF syllabus for the tentative assignment of letter grades. Grade break-down (100% + 10% in bonus):

Course Schedule

Hint: If you find the last update date of a slide deck is later than the one you previously downloaded and you opened the slides within the browser, you may want to force a refresh using Ctrl + F5.

The following schedule is subject to adjustments throughout the semester.

Lectures

Date Topics Slides Last updated Opt. Textbook
1/29/2024 Course introduction and logistics; physical storage 01 1/20/2024 Ch. 1, Ch. 12
2/5/2024 Buffer mamagement; Data storage layout 02 2/3/2024 Ch. 13.1, 13.2, 13.3, 13.5
2/12/2024 Access methods, indexing and cost analysis 03 2/7/2024 Ch. 13.6, 14.1, 14.2
2/19/2024 Hashing techniques 04 2/20/2024 Ch. 14.5, 24.5
2/26/2024 Tree index 05 3/6/2024 Ch. 14.3, 14.4
3/4/2024 Tree index (cont'd); Advanced topics on indexing See above
Mid-term exam covers all the topics above this line
3/11/2024 Relational model and SQL 07 3/4/2024 Ch. 2, 3
Spring recess: 3/18/2024 - 3/23/2024
3/25/2024 Query processing overview; Midterm Q&A 08 3/19/2024 Ch. 15.1, 15.2
3/27/2024 Mid-term exam: Knox 104, 7:05 pm - 8:25 pm. Please arrive at least 5 minutes earlier.
4/1/2024 Single-table query processing 09 3/19/2024 Ch. 15.3, 15.6, 15.7
4/8/2024 Single-table query processing (cont'd); External Sorting
(Updates) Asynchronous lecture due to solar eclipse;
recording and live-streaming available on Panopto
10 3/19/2024 Ch. 15.3, 15.4, 15.6, 15.7
4/15/2024 Join algorithms 11 5/3/2024 Ch. 15.5
4/22/2024 Query optimization 12 3/19/2024 Ch. 16
4/29/2024 Transaction Processing; Pessimistic Concurrency Control 13 3/19/2024 Ch. 17, 18.1 - 18.4
5/6/2024 Pessimistic Concurrency Control (cont'd); Crash Recovery 14 3/19/2024 Ch. 18.1 - 18.4, Ch. 19
TBD Final review (optional)
5/15/2024 Final exam: Knox 109, 3:40 pm - 5:30 pm. Please arrive at least 10 minutes earlier.

Written Assignments

Written Assignment Release date Due date Links
HW1: database storage 2/5/2024 2/18/2024 [Handout and Submission] [Solution]
HW2: hashing 2/19/2024 3/3/2024 [Handout and Submission] [Solution]
HW3: tree indexes and cost analysis 3/4/2024 3/18/2024 [Handout and Submission] [Solution]
HW4: query processing 4/4/2024 (new) 4/25/2024 (new) [Handout and Submission] [Solution]
HW5: query optimization 4/4/2024 (new) 5/6/2024 [Handout and Submission] [Solution]
HW6: concurrency control and crash recovery 4/4/2024 (new) 5/13/2024 [Handout and Submission] [Solution]

Projects

Assignment Release date Due date Links
Project 1 - Lab 0: project and team sign-up 1/29/2024 2/1/2024 [Description] [Submission]
Project 1 - Lab 1: File I/O 1/29/2024 2/4/2024 [Description] [Submission] [Solution]
Project 2 - Buffer Manager and Data Layout 2/7/2024 2/25/2024 [Description] [Submission] [Solution]
Project 3 - B-Tree index 3/4/2024 4/1/2024 [Description] [Submission] [Solution]
Project 4 - Basic Query Processing 4/2/2024 4/25/2024 (new) [Description] [Submission] [Solution]
Project 5 - Join Algorithms and Query Optimization 4/22/2024 5/20/2024 [Description] [Submission] [Solution]

Academic Integrity Policy

Academic integrity is critical to the learning process. It is your responsibility as a student to complete your work in an honest fashion, upholding the expectations your individual instructors have for you in this regard. The ultimate goal is to ensure that you learn the content in your courses in accordance with UB’s academic integrity principles, regardless of whether instruction is in-person or remote. Thank you for upholding your own personal integrity and ensuring UB’s tradition of academic excellence. You should get familiar with the departmental and the university academic integrity policies and procedures for graudate students and for undergraudate students

Specifics to this course (please read carefully): you may NOT discuss/share code with/copy code from anyone about your course projects except for your teammate. For project write-ups, you must submit your independent write-up. We also require all students, whether enrolled, dropped or resigned from the class, to keep your course project repository inaccessible to public indefinitely, and never share it with any current students or future students who may take the course. When you look up online materials for the course project, please exercise common sense in terms of what is allowed and what is not allowed (e.g., cppreference/cplusplus.com are fine to look at, while posting questions directly related to the project on StackOverFlow would not be acceptable). Examples that we consider as academic integrity violations include but are hot limited to:

  1. Copying any part of other team's code implementation or code found online in the course project, regardless of whether it is a verbatim copy or a modified copy;
  2. Collaborate with other teams in the course project;
  3. Making your course project repository or any copy of it available to any current or future students of the course;
  4. Submitting work that is not created by you or your team, including deriving your solutions based on materials from search engines, tutoring services, Q&A websites, and/or those generated by generative AI software (e.g., ChatGPT);
  5. Cheating or referring to any material not permitted in the exams.

Consistent with the CSE policy, we have zero tolerance towards academic integrity violations. Any academic integrity violation will result in an F grade for all students involved and will be referred to the Office of Academic Integrity. The only exception when the violation is accidental and does not provide any unfair advantage to any of the students involved, in which case you may receive a reduction in grade in the particular assignment/exam/project.

Accessibility Resources

If you have any disability which requires reasonable accommodations to enable you to participate in this course, please contact the Office of Accessibility Resources in 60 Capen Hall, 716-645-2608 and also the instructor of this course during the first week of class. The office will provide you with information and review appropriate arrangements for reasonable accommodations, which can be found on the web at this link. Please also notify the instructor within the fisrt two weeks to make reasonable arrangement early.