Welcome to CSE462/562 Database Systems of Fall 2024. Database Management Systems (DBMS) are an important piece of software widely used in various data intensive applications. No matter what data model or query language it adopts, there are principles and methodologies commonly found in virtually all DBMS, in order to support efficient and fault-tolerant data storage, query and management. In this course, we will examine the internals of a traditional Relational DataBase Management System (RDBMS) and discuss the common principles and methodologies that may be useful in many other database and data processing systems beyond RDBMS. The students will also gain practical experience of efficient data management through a semester-long project of implementing various components of a mini RDBMS.
The main topics include database storage management, query processing, query optimization, transaction processing, concurrency control and recovery in RDBMS. Please note that this is not a course on database design or database application development -- those are covered in CSE460/560 Data Models and Query Languages. We will briefly review the necessary background on the relational data model and the SQL language in the first a few lectures but students are expected to understand relational algebra and SQL before this course.
minsky.cse.buffalo.edu for project development, debugging, testing and grading. See Project 0 for details.
                    
Taco-DB in C++, divided into six subprojects. You may work
                in a team of up to two students. The tasks remain the same
                regardless of your team size.  Please refer to the project 1
                web page (link from the navigation bar at the top of this page)
                for details.
                
Please refer to the PDF syllabus for the tentative assignment of letter grades. Grade break-down:
                Hint: If you find the last update date of a slide deck is later
                than the one you previously downloaded and you opened the
                slides within the browser, you may want to force a refresh
                using Ctrl + F5.
                
The following schedule is subject to adjustments throughout the semester.
| Date | Topics | Slides | Last updated | 
|---|---|---|---|
| 8/27/2024 | Course introduction and logistics | 01 | 8/27/24 2:12 AM | 
| 8/29/2024 | Relational model and query lanaguage | 02 | 8/28/24 1:00 PM | 
| 9/3/2024 | Physical Storage, POSIX I/O Interface and buffer management | 03 | 8/30/24 1:00 PM | 
| 9/5/2024 | Data storage layout | 04 | 9/5/24 4:10 PM | 
| 9/10/2024 | Relational Algebra and SQL | 05 | 8/28/24 1:00 PM | 
| 9/12/2024 | Relational Algebra and SQL (cont'd) | See above | |
| 9/17/2024 | Project and homework review and Q&A | ||
| 9/19/2024 | Query Processing Overview | 08 | 9/10/24 9:35 AM | 
| 9/24/2024 | Volcano execution model and cost analysis | 09 | 9/10/24 9:35 AM | 
| 9/26/2024 | Single-table query processing: aggergation and sorting | 11 12 | 9/10/24 1:30 PM 9/10/24 1:30 PM | 
| 10/1/2024 | Single-table query processing: sorting (cont'd) and hashing | See above | |
| 10/3/2024 | Single-table query processing: selection, projection and expression evaluation | 10 | 10/9/24 3:00 PM | 
| 10/8/2024 | Single-table query processing: deduplication and grouping | ||
| 10/10/2024 | Project and homework review and Q&A | ||
| 10/15/2024 | Fall break: no lecture today. | ||
| 10/17/2024 | Access methods and indexing | 13 | 10/9/24 2:00 PM | 
| 10/22/2024 | Tree indexes | 14 | 10/9/24 2:00 PM | 
| 10/24/2024 | Mid-term exam: in-class, 5:05 PM - 6:20 PM. (updated date) Covers everything before the Fall break. | ||
| 10/29/2024 | Tree indexes (cont'd) | ||
| 10/31/2024 | Hash indexes | 15 | 10/9/24 2:00 PM | 
| 11/5/2024 | Index scans | 16 | 10/9/24 3:00 PM | 
| 11/7/2024 | Concurrency control and recovery: an overview | 17 | 10/9/24 3:00 PM | 
| 11/12/2024 | Join algorithms | 18 | 10/9/24 3:00 PM | 
| 11/14/2024 | Join algorithms (cont'd) | ||
| 11/19/2024 | Project and homework review and Q&A | ||
| 11/21/2024 | Query optimization | 19 | 10/9/24 3:00 PM | 
| 11/26/2024 | Query optimization (cont'd) | ||
| 11/28/2024 | Thanksgiving: no lecture today | ||
| 12/3/2024 | Project and homework review and Q&A | ||
| 12/5/2024 | Advanced topics (TBD) | ||
| 12/12/2024 | Final exam: NSC 205, 7:30 PM - 9:00 PM. Covers everything after the mid-term exam. | 
| Written Assignment | Release date | Due date (23:59:59 PM Eastern Time) | Links | 
|---|---|---|---|
| HW1: relational model and query language | 8/27/2024 | 9/16/2024 | [Handout and Submission] [Solution] | 
| HW2: single-table query processing | 9/19/2024 | 10/9/2024 | [Handout and Submission] [Solution] | 
| HW3: access method and indexing | 10/22/2024 | 11/18/2024 (extended due to late release) | [Handout and Submission] [Solution] | 
| HW4: transaction processing, join processing, query optimization | 10/28/2024 (early release due to D2L visibility settings issue) | 12/2/2024 | [Handout and Submission] [Solution] | 
| Assignment | Release date | Due date (23:59:59 Eastern Time) | Link | 
|---|---|---|---|
| Project 0: project setup and team sign-up | 8/27/2024 | 9/2/2024 | [Description] | 
| Project 1: POSIX I/O | 9/3/2024 | 9/9/2024 | [Description] | 
| Project 2: Data Layout | 9/5/2024 | 9/16/2024 | [Descrption] | 
| Project 3: Single-table query processing | 9/19/2024 | 10/9/2024 | [Descrption] | 
| Project 4: B-tree index | 10/22/2024 | 11/11/2024 | [Descrption] | 
| Project 5: Join algorithms | 11/14/2024 | 12/2/2024 | [Descrption] | 
| Bonus Project 6: Manual query optimization | 11/14/2024 | 12/22/2024 (no grace day) | See Project 5 description, Section 2 [Leader board] | 
Academic integrity is critical to the learning process. It is your responsibility as a student to complete your work in an honest fashion, upholding the expectations your individual instructors have for you in this regard. The ultimate goal is to ensure that you learn the content in your courses in accordance with UB’s academic integrity principles, regardless of whether instruction is in-person or remote. Thank you for upholding your own personal integrity and ensuring UB’s tradition of academic excellence. You should get familiar with the departmental and the university academic integrity policies and procedures:
Academic integrity policies specifics to this course (please read carefully): you may NOT share code with from anyone about your course projects except for your teammate. We also require all students, whether enrolled, dropped or resigned from the class, to keep your course project repository inaccessible to public indefinitely, and never share it with any current students or future students who may take the course. When you look up online materials for the course project, please exercise common sense in terms of what is allowed and what is not allowed (e.g., cppreference/cplusplus.com are fine to look at, while posting questions directly related to the project on StackOverFlow would not be acceptable). Examples that we consider as academic integrity violations include but are not limited to:
Consistent with the CSE Academic Integrity Policy, we have zero tolerance towards academic integrity violations. Any academic integrity violation will result in an F grade for all students involved and will be referred to the Office of Academic Integrity. The only exception when the violation is accidental and does not provide any unfair advantage to any of the students involved, in which case you may receive a reduction in grade in the particular assignment/exam/project.
If you have any disability which requires reasonable accommodations to enable you to participate in this course, please contact the Office of Accessibility Resources in 60 Capen Hall, 716-645-2608. The office will provide you with information and review appropriate arrangements for reasonable accommodations, which can be found on the web at this link. Please also notify the instructor as soon as possible to make reasonable arrangement early.