CSE462/562: Database Systems (Spring 2022)

CSE 462/562: Database Systems (Spring 2022)

Project 4: B-Tree Index

Release date: Tuesday, 2/28/2023
Project due: Monday, 4/3/2023, 01:00 am EDT (extended)
Write-up due:, Wednesday, 4/5/2023, 01:00 pm EDT (extended)
Last updated: 3/27/2023

Updates (3/24/23) we will run the large system tests at 1 am every day starting 3/24/2023 until the project deadline, which include the unavailable test on Autolab. Please double check the testing results to find out whether your code fully passes the project.

0. Getting Started

This project will be built upon your code for previous projects. If you need the solution code for projects 1, 2, and/or 3, please download it from UBBox (link released on Piazza, under Homework Solutions) To get started, extract /local/Spring_2023/cse562/lab4.tar.xz on the student servers and import the supplemental files as your did in previous projects. The code should build without compilation errors once the supplemental files are imported, but most of the additional tests are likely to fail. Note: some of past test cases may also fail because we enabled new features that you haven't implemented (e.g., BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed). Once you complete the tasks below, these past test cases should pass automatically. If not, it is highly likely that your code has undiscovered bugs and may not pass the hidden system tests. You may list all the tests in your build directory using the ctest -N command. As before, you may attempt any number of code submissions at Autolab, project 4 (the same hourly submission rate limit applies as before). To do so, please create a new tag of the commit you want to submit, and provide the tag name in Autolab.

Bonus: the additional tests for the bonus project is at /local/Spring_2023/cse562/lab4_bonus.tar.xz. You may only extract and import these supplemental files after the lab4.tar.xz files are imported. However, you may skip this if you do not want to complete the bonus project. Skipping the bonus project will have no impact on your later projects.

A few useful hints:

The project requires you to understand the B-Tree data structure in general and understand the specific design and requirement of the indexing layer in Taco-DB, so you might want to start very early. We strongly encourage you to read through the entire document before getting started with coding and at least before the Spring recess (because we won't have any office hours during Spring recess). Note that some of the tasks involve the source files distributed to you in the previous projects. You might also find it useful to review the lectures slides for indexing and tree-based index. Please also consider joining the office hours if you are not sure about how to proceed or have any questions about the tasks.
This lab will heavily rely on your implementation from the previous projects. In particular, we provide preloaded B-Tree indexes in some of the test cases, where the B-Tree pages are encoded using the VarlenDataPage. Some test cases also depend on a correct implementation Table for storing test data. Hence, you need to make sure your implementation of them conforms to the specification. However, we do not recommend overwriting VarlenDataPage with the solution code, because you have new tasks to complete in VarlenDataPage. It might be harder for you to debug compared to using your own.
The description below is meant to be a brief overview of the tasks. Always refer to the special document blocks in the headers and source files for a more detailed specification of how the functions should behave. While we try to provide as many hints as possible to alleviate your burden of reading code in the provided code, there might always be some declarations or definitions you need to look for in the code docs (in the navbar of the course page), or in the source code.
While you may add any definitions/declarations to the classes or functions you're implementing, it is unlikely you'll need to do that in project 3. Your design is probably incorrect if you do, but in case you need to add any definitions/declarations, you should not change the signatures of any public and private fucntions. Also do not edit those source files not listed in the source files to modify below, because they will be replaced during tests.
You do not have to consider concurrency control for multi-threading in your implementation.
We provide a list of tasks throughout this document, which is a suggestion of the order of implementation such that your code may pass a few more tests after you complete a task. It, however, is not the only possible order and you may also find alternative order that makes more sense to you. It is also possible that your code still could not pass certain test that is supposed to because of some unexpected dependencies on later tasks or undiscovered bugs in your implementation for previous projects, in which case you might want to refer to the test case implementation in the tests/ directory.
If you're working in a team of two, you might want to first go through the list of tasks, agree on a big picture of design, and divide the implementation tasks into indepedent pieces. One strategy is to implement tasks that may be tested independently. Alternatively, you could also design the algorithm and data structures together, and identify independent components to complete. You may also come up with better ways of collaboration that makes more sense to you. In any case, the grade for the code implementation will be based the test result of your group's final submission.

1. Project Overview

In this project, you will build a B-Tree index, one of the index access methods in Taco-DB. More specifically, your goal is to understand the structure of the B-Tree and implement the core pieces of insert/delete/search/scan functions. This B-Tree supports variable-length keys, thus does not have a stable fanout. Splitting and merging are based on the actual load of internal and leaf pages rather than the number of entries. To simplify our implementation and maximize code reuse, TacoDB’s B-Tree utilizes VarlenDataPage to organize all its nodes. In the lab handout, you will find the basic tests for B-trees. During online and offline grading, there will be larger-scale system tests (which are hidden to you) to further evaluate the correctness and efficiency of your implementation. Note that, some of the system tests are not available on Autolab and will be only tested during the offline tests, because they are too large to run on the Autolab VMs. Your grade on those test cases will be posted on UBLearns after the final submission deadline once we finish the offline tests. The following table lists a break down of the tests:

Test suite	Max score	Hidden?	Available on Autolab?	Available in offline test?
BTreeBasicTests `BasicTestSortedDataPage` `BasicTestBTreeUtils` `BasicTestBTreeInsert` `BasicTestBTreeSearch` `BasicTestBTreeDelete` `BasicTestBTreeSplitPage` `BasicTestBTreeMergePages`	7.0	No	Yes	Yes
BTreeSystestSmall	0.6	Yes	Yes	Yes
BTreeSystestLarge	2.4	Yes	No	Yes
BTreeBonusTests `BonusTestSortedDataPage` `BonusTestBTreeRebalancePages`	3.4	No	Yes	Yes

1.1 Index Interface

Source files to read:

include/index/Index.h
include/index/IndexKey.h
include/index/tuple_compare.h
src/index/Index.cpp
src/index/tuple_compare.cpp
include/index/volatiletree/VolatileTree.h
src/index/volatiletree/VolatileTree.cpp

Source files to modify: none

As a general and important interface in an RDBMS, Index is the base class of all different types of indexes. Its public interface is defined in Index.h. The metadata, including its base table ID (the table over which the index is built), index type, virtual file FID (the vfile that contains the index data), and etc., are stored in the catalog. These metadata are passed around in the system as IndexDesc when they are loaded into memory.

When a new index is created for a table, a new virtual file is allocated for it and all the metadata including the virtual file ID are stored into the catalog. The system will then create an IndexDesc filled with the metadata and call a static function Index::Initialize() to properly initialize the index, which in turn dispatches the call to the specified index's initialization function. The initialize function should perform essential checks and perform the necessary initialization that the index is a valid empty index. At this point, the system should be able to create any number of instances of index objects that reference this index. As the final step, the system calls Index::BulkLoad() to load the existing data in the base table into the newly created index. As a default and often a less efficient implementation of bulk loading, Index::BulkLoad() will scan the base table and insert data entries one by one using the Index::InsertKey() interface. (Note that you do not need to implement the more efficient B-tree bulk loading algorithm in this project).

Question 1: Why does the system need to perform bulk loading as the final step of creating an index? Hint: what should happen to an index when records are inserted into or deleted from its base table? Answer the latter question and briefly explain why for the first question.

To implement an index, one should implement the the static Initialize and Create functions and override public interfaces for inserting, deleting, searching for, and scanning through data entries. The data entries in all the indexes in TacoDB are organized in Alternative 2, where each index entry is a (key(s), rid) pair. Note that rid is unique for each record. When we compare data entries, rid is included as the last key field and thus serves as a tie-breaker in case there are duplicate index keys. In other words, all data entries are unique even though keys in the records may have duplicates.

Question 2: Can you store any index record or any data for locating index records in an instance of a subclass of Index? Why? Hint: you may draw parallels between it and VarlenDataPage/Table in project 3.

Data entries or index entries are collectively called index records in Taco-DB. An index record is usually encoded as a record with a fixed-size header followed by a record payload that stores the actual index key values. Index::StartScan() will initialize an iterator that may be used to scan through all data entries falling into a particular search range. When scanning through the iterator, you need to ensure that each data entry will be touched EXACTLY ONCE. Besides, since all indexes of the current version of TacoDB are range indexes, your index iterators should scan through data entries in the ascending order.

There is a flag in the catalog information for indexes to indicate whether this index is a unique index. If an index is non-unique, you should reject an insertion only if you are inserting a duplicate data entry with the same key and same record ID. However, if an index is unique, you should reject an insertion if there exists some data entry in the index with the same key but different record id.

Question 3: Suppose we have an non-unique index built on table A over its x: INT4 column. A record ID is represented as a pair of integers (pid, sid). Let's say there is a record in A with x == 100 and record ID being (10, 1). Now, we insert a new record into A with x == 100, can its corresponding data entry be inserted into the non-unique index? Why?

Question 4: The index in Question 3 is a unique index instead of a non-unique index. How does that change your answer to the previous question?

We provided an volatile (in-memory) tree index implemented using std::multimap in index/volatiletree as a reference implementation. If you are unsure about what the specification of a specific interface means, you may always refer to the volatile tree implementation. Any sequence of insert/delete/update/scan in your B-Tree implementation should produce excatly the same result in the eye of the user (i.e., the same set of data entries are in the index in the same order, they will generate the same error if there's any, an index scan will produce the same set of index key and record ID pairs). However, you MUST NOT use any STL container, Abseil container, or any other in-memory container to implement your B-Tree.

We want to make a quick note about the difference between IndexKey and the key payload of an index record. IndexKey is an in-memory structure that stores an array of references to in-memory Datum objects, which are either primitive C/C++ values or pointers to variable-length data. You may think of an IndexKey as an std::vector<NullableDatumRef> with a more compact memory layout and a trivial destructor. We use that to pass search keys or keys constructed from heap records. A key payload in an index record is the keys of this record stored in a serialized record format (as defined by the Schema class). To use a value in the record, you must use Schema::GetField()/Schema::DissemblePayload() or other related functions to extract the field into a Datum. In this project, you don't have to worry about extracting individual fields from a record to form an index key, as the code of that is provided. However, it is important that you understand the distinctions. Never copy IndexKey into an index record. Instead, you need to use the key schema to serialize it into a record payload.

Question 5: Given an IndexKey *key with the keys extracted from a heap record, a buffer maxaligned_char_buf buf, and a Schema *sch that describes the keys, please write a single C++ expression to serialize the keys into the buffer. (You may assume the buffer is cleared and there is no index record header needed for this question).

One last thing to note is the ordering of data entries. For any two data entries with m key fields, say d = (k1, k2, .., km, rid) and d' = (k1', k2', ..., km', rid), assuming the type of any key field has a total order among all its non-NULL values, then we order the two data entries in lexicographical order using the following rule:

For any field, NULL is always smaller than any non-NULL.
For any field, two NULL values compare equal (which is different from the semantics of SQL expressions).
If there is some i, such that for all j < i, kj == kj', and ki < ki', then d < d'.
If there is some i, such that for all j < i, kj == kj', and ki > ki', then d > d'.
If there all key fields compare equal, then the two data entries are ordered by the record IDs (lexicographically by pid and then sid, see include/storage/Record.h). However, an invalid record ID should always be treated as smaller than any other record ID (which is not handled by the comparison operators of RecordId).

Because the majority of comparisons in indexes are between a search key and a serialized key payload, we provide two routines for compare an IndexKey with a key payload buffer in include/index/tuple_compare.h and src/index/tuple_compare.c. Note that, if there are missing keys in the search key, we ignore the extra key fields in the key payload. Because of that, you'll need to handle partial key search and the comparison of the tie-breaker (record ID) on top of these in later tasks for B-tree.

Question 6: Suppose we have an index on a table A over two columns (x: INT4, y: INT4). We have two records, t1 = (1, 100), t2 = (100, 1) and the record ID of t1is greater than that of t2. Which one of the data entries of the two records is smaller in the total order?

Question 7: What if t1 = (100, NULL) and t2 = (100, 1) in Question 6?

Question 8: What if t1 = (100, NULL) and t2 = (100, NULL) in Question 6?

1.2 B-Tree Structure

Source file to READ:

include/index/btree/BTree.h
include/index/btree/BTreeInternal.h

Source files to modify:

src/index/btree/BTree.cpp
src/index/btree/BtreeUtils.cpp
src/storage/VarlenDataPage.h
src/storage/VarlenDataPage.cpp

The physical structure of the B-Tree in TacoDB is as explained during our lectures, which contains internal nodes and leaf nodes. All the structures of a B-Tree live in a virtual File managed by FileManager, whose first page is a meta page BTreeMetaPageData containing a page number pointing to the root. Note the root node of a B-Tree can be either a leaf or internal node, depending on whether there is enough space for a leaf page to fit all the data entries.

One key deviation from the B-tree described in the lecture is that every level in this tree is doubly-linked, rather than just the leaf level. Your code should maintain these horizontal links when performing structural modification (split/merge of pages).

In this project, you will need to rely on your VarlenDataPage implementation in project 2 to construct both leaf and internal nodes. A B-Tree page is encoded as a VarlenDataPage. The figure above shows the physical layout of the page representing a B-Tree node. You may find its first 24 bytes are exactly the same as those of a regular heap file page. The main differences are:

B-Tree nodes store their specialized meta information BTreePageHeaderData in the user-defined data region. You should reserve enough space for that region through passing usr_data_sz when calling VarlenDataPage::Initialize. Specifically, it includes flags indicating if this is the root node and if this is a leaf node, the pointers to its two neighbor siblings on the same level, and the total number of bytes used by index records in the page.
The records stored in B-tree pages are index records (internal records on internal nodes, or leaf records on leaf nodes) rather than heap records. In contrast to a heap record which does not have a header in Taco-DB, an index record has a header preceeding its key payload.
There may not be unused slots on the data page. Index records in the slots are always sorted in ascending order, with the smallest record in the lowest numbered slot.

Task 1: Implement functions for laying out B-Tree pages and some other basic utilities:

BTree::CreateNewBTreePage()
BTree::GetBTreeMetaPage()
BTree::CreateLeafRecord()
BTree::CreateInternalRecord()
BTree::Initialize()
BTree::IsEmpty()

Your code should allocate a new page and pin it after BTree::CreateNewBTreePage() is called, then return the location of pinned buffer frame. BTree::GetBTreeMetaPage() should pin the meta page of this B-Tree and return the pinned frame. BTree::CreateLeafRecord() and BTree::CreateInternalRecord() are used for creating index records in B-Tree pages. In particular, index entries in internal nodes should be contiguous bytes containing BTreeInternalRecordHeaderData and the index key (strictly in this order with alignment). You can see the m_heap_rec_id is also the part of BTreeInternalRecordHeaderData to serve as a tie breaker when there are duplicate key values. The first record in any internal node should only contain BTreeInternalRecordHeaderData, without any key payload. The key of all records stored in the subtree between two neighboring keys k1 and k2 of an internal node falls in a left-closed right-open range [k1, k2). The leaf node is much simpler, where each data entry is contiguous bytes containing BTreeLeafRecordHeaderData (which is effectively just a RID) and the index key (in this order with alignment).

Question 9: Why doesn't the first record in an internal node need to have the key payload? Hint: find out the key range for subtree pointed by the page number contained in its header.

Question 10: There are two page numbers stored in BTreeInternalRecordHeader. One of them is the child page number m_child_pid but there's also one in heap_recid. Is the second the same as the child page number? Why? What's the purpose of the second one?

A B-Tree page, as a VarlenDataPage, must not have any unoccupied valid slot. In other words, the valid SID on a B-Tree page must be in a consecutive range [GetMinSlotId(), GetMaxSlotId()] (if there's any slot). In addition, the records on a B-Tree page must be in a (ascending) sorted order. If you need to insert or remove an index record from an internal or a leaf B-tree node, you should do so by specifying the correct insert/delete slot ID in the VarlenDataPage::InsertRecordAt() and VarlenDataPage::RemoveSlot functions so that the keys are still sorted after the insertion/deletion. You should not use the VarlenDataPage::InsertRecord() and VarlenDataPage::EraseRecord() functions you implemented in project 2 on any B-Tree page. Updating an index record with VarlenDataPage::UpdateRecord() is only safe when you are sure the updated index record will fit into the page. In addition, you should also implement a utility function VarlenDataPage::ComputeFreeSpace() to compute the page usage given the total number of slots and the total length of record payloads so that you can estimate whether records fit into a page when implementing B-Tree page split, merge, rebalance in the later tasks.

Question 11: Can a B-tree internal page ever become empty? If yes, when? If no, why?

Question 12: Can a B-tree leaf page ever become empty? If yes, when? If no, why?

Task 2: Implement the following additional utility functions for variable-length data page. These utility functions are useful for maintaing a data page with sorted (index) records, especially in handling B-Tree insertion and deletion. You may find the specification for these functions in the header file include/storage/VarlenDataPage.h.

VarlenDataPage::ComputeFreeSpace()
VarlenDataPage::InsertRecordAt()
VarlenDataPage::RemoveSlot()

Your code is likely to pass the following tests at this point:

BasicTestSortedDataPage.TestComputeFreeSpace
BasicTestSortedDataPage.TestInsertRecordDeterministic
BasicTestSortedDataPage.TestInsertRecordRandom
BasicTestSortedDataPage.TestInsertRecordOutOfBound
BasicTestSortedDataPage.TestRemoveAllRecordDeterministic
BasicTestSortedDataPage.TestRemoveAllRecordRandomly
BasicTestSortedDataPage.TestRemoveAllRecordRandomlyOutOfBound
BasicTestSortedDataPage.TestInsertionWithCompationExistingSlot

1.3 B-Tree Search

Source files to modify:

src/index/btree/BTreeSearch.cpp
src/index/btree/BTreeIterator.cpp

To compare two data entries or index entries, you should compare not only the index key but also the record ID. You can compare the index key using TupleCompare() and then compare them on the record IDs further if the index key comparison turns out to be equal. Note that an invalid record id should be treated as being smaller than any record id (including any invalid record id and there are many different possible invalid record ids). We will never try to insert a data entry with invalid record id. Different from SQL, NULL values are considered to be smaller than any non-NULL values in index and NULL values compare equal in index, because we need to index NULL values as well.

Further, we will enforce lexicographical order for partial key matches in B-Tree. This means a key like (1, 2) is always greater than its prefix (1, phi) (where phi means unspecified). You can compare a full key with its prefix using TupleCompare, but it will turn out to be equal. So you have to resolve that in BTree::BTreeTupleCompare() as well.

You should write a simple binary search to look up the key in a B-Tree Node. The second overload of BTree::FindLeafPage() is the core recursion logic of traversing the tree. Please be careful about pinning and unpinning the pages during the traversal. At any time during a search, there should be ONLY ONE B-Tree node pinned in the buffer.

Question 13: You might have noticed that the entry point of a B-tree search BTree::FindLeafPage(key, recid, p_search_path) stores the buffer ID in a ScopedBufferId object. Please read its documentation in include/storage/BufferManager.h and briefly explain what it does in a few words.

Question 14: What is the advantage of using ScopedBufferId in terms of error handling in the code? Hint: suppose your implentation of FindLeafPage may throw errors (C++ exceptions) while your code holds a buffer pin stored as a plain BufferId. What will go wrong when that happens?

Question 15: Suppose you store the buffer ID of a pinned B-tree page in a ScopedBufferId bufid while descending in the B-tree during search, and you need to pass it to the next recursive call of FindLeafPage. How should you pass the buffer ID (as the first argument to the function)? Please write down a single C++ expression for this question.

Task 3: Implement the utility function to compare two keys and complete the recursive B-Tree point lookup logic:

BTree::BTreeTupleCompare()
BTree::BinarySearchOnPage()
BTree::FindLeafPage(bufid, key, recid, p_search_apth) – the second overload

Your code is likely to pass the following tests at this point:

BasicTestBtreeSearch.TestBTreeTupleCompareFullKey
BasicTestBtreeSearch.TestBTreeTupleComparePrefixKey
BasicTestBtreeSearch.TestBinarySearchOnLeafPageFixedlen
BasicTestBtreeSearch.TestBinarySearchOnInternalPageFixedlen
BasicTestBtreeSearch.TestBinarySearchOnLeafPageVarlen
BasicTestBtreeSearch.TestBinarySearchOnInternalPageVarlen

When a B-tree search reaches a correct leaf page, you may use the sibling pointers on the leaf node to iterate through data entries in the sorting order (see Index::StartScan() and BTree::StartScan()). You should also check if the scan is completed (either there are no more data entries or the next data entry falls out of query range).

Question 16: Suppose there is a B-tree index over a table A over two columns (x: INT4, y: INT4). We invoke StartScan() with both the lower key and upper key being a partial key (1, phi) (phi means the field is unspecified, i.e., the number of fields in the search key is 1) and both are non-strict bounds. Which data entries will be returned by the iterator? Write down a single relational algebra to answer this question (you may skip projections).

Question 17: What if the non-strict lower bound is (1, phi) and the non-strict upper bound is (1, NULL) in Question 16? Write down a single relational algebra (you may skip projections), and indicate whether the resulting relation may be non-empty for any possible instance.

Question 18: What if the non-strict lower bound is (1, phi) and the non-strict upper bound is (1, NULL) in Question 16? Write down a single relational algebra (you may skip projections), and indicate whether the resulting relation may be non-empty for any possible instance.

Question 19: What if the non-strict lower bound is (1, NULL) and the non-strict upper bound is (1, phi) in Question 16? Write down a single relational algebra (you may skip projections), and indicate whether the resulting relation may be non-empty for any possible instance.

Question 20: Consider the table and index in Question 16, and relational algebra \sigma_{x >= 1 AND y < 3}. Can you use the BTree::StartScan interface to create an iterator that returns exactly the set of data entries that correspond to the records in the resulting relation of this relational algebra expression in any database instance? If so, please specify the four arguments. If not, please briefly explain why.

Question 21: What if both x and y are real numbers (e.g., REAL) in Question 20? If your answer is yes, please specify the four arguments. If not, please briefly explain why.

Task 4: Finish the B-Tree iterator scan logic:

BTree::Iterator::Next()

Your code is likely to pass the following tests at this point:

BasicTestBtreeSearch1.TestBTreeFullScanFixedlen
BasicTestBtreeSearch2.TestBTreeRangeScanFixedlen
BasicTestBtreeSearch3.TestBTreeFullScanVarlen
BasicTestBtreeSearch4.TestBTreeRangeScanVarlen

1.4 B-Tree Insertion

Source files to modify:

src/index/btree/BTreeInsert.cpp

You may reuse the B-Tree search code to locate the correct leaf page to perform the insertion. In addition, you should remember the path down by keeping track of which B-Tree nodes (a.k.a B-Tree page ID) you have touched and which slot number you took during the traversal. This information is useful in case of page splits.

The uniqueness check of the data entry for a non-unique index or key for a unique index happens in BTree::FindInsertionSlotIdOnLeaf(). You should return an appropriate slot ID if a duplicate is found. A small nuance to note here is that we allow duplicate NULL values to appear in a unique index, which is SQL-compliant but differs from the practice of serveral major DBMS. (Think about why? Hint: unique indexes are used to enforce UNIQUE constraints.)

Question 22: When you search for a leaf page for insert (key, recid) into a unique index, which key and record ID pair should you use? Hint: how do you check for the existence of a data entry with duplicate key but different record ID? Do you ever need to move to a different leaf page to perform this check?

Question 23: What about non-unique index? Answer the questions in Question 22 again for a non-unique index.

Task 5: Implement the functions to find the correct slot on a leaf node for insertion, and insert an index record on a B-Tree page. You may just implement these functions without the logic for recursively splitting the page initially and do that in task 6 instead.

BTree::FindInsertionSlotIdOnLeaf()
BTree::InsertRecordOnPage()

Your code is likely to pass the following tests at this point:

BasicTestBTreeInsert.TestBTreeInsertFindSlotOnLeaf
BasicTestBTreeInsert.TestBTreeInsertOnPageNoSplit
BasicTestBTreeInsert.TestBTreeInsertNoSplit

You should also add the logic for splitting a page in BTree:InsertRecordOnPage() when there is no space left to insert a new index record. Consider a local case of splitting a node into two. There will be two nodes originally in the tree being affected: the node you are splitting and its parent. During this procedure, BTree::SplitPage() will trigger another BTree::InsertRecordOnPage() at the parent node. This procedure can trigger another split on the parent node, thus can be recursive and go all the way to the root. You will need to handle this special case of root node splitting with BTree::CreateNewRoot().

Again, during this recursive procedure, you should be very careful about pinning and unpinning B-Tree pages. At any time during insertion, there should be AT MOST FOUR pages pinned at the same time.

Question 24: Which four pages might be pinned during page split? What for?

Question 25: BTree::SplitPage returns a buffer that contains an index record. Is this a leaf record or an internal record? What are stored in this record?

Question 26: Since we support variable-length keys in the B-Tree index, we try to minimize the absolute difference of the page usages between the two sibling pages involved in a page split, instead of minizing the difference between the number of records. Suppose the page to be split has n records and we must also put the new record to insert on either the left sibling or the right sibling after split. Up to how many different possible ways of split are there?

Question 27: Suppose we compute the difference bewteen the page usages of the left sibling and right sibling (which may be a negative number), for each different way of splitting these two pages. Is the difference always a monotonic function of the number of records on the right sibling for a leaf page split? Please briefly explain why.

Question 28: What if it is an internal page split for Question 27? Is the function monotonic? Plesae briefly explain why.

Task 6: Implement node splitting and complete BTree::InsertRecordOnPage() by adding triggers for recursive splitting:

BTree::CreateNewRoot()
BTree::SplitPage()
BTree::InsertRecordOnPage()

1.5 B-Tree Deletion

Source files to modify:

src/index/btree/BTreeDelete.cpp
src/storage/VarlenDataPage.h (bonus)
src/storage/VarlenDataPage.cpp (bonus)

When we locate the record for deletion in the leaf level, we should also remember the traversal path for possible page merging and rebalancing. Please note that BTree::DeleteKey may be called with a key payload without a valid record ID, in which case, it is expected to delete an arbitrary one of the data entries with matching key. When it is called with a valid record ID, there should be at most one matching data entry which should be deleted.

When we search for a leaf page to locate a key to delete without specifying a valid record ID, it might end up on a leaf page that is to the left of any matching keys, because an invalid record id is treated as smaller than any valid record id. In this case, you may have to move to the right page (m_next_page) to find a matching key for deletion (if there is any, it must be in the first slot of the right page). One tricky issue is that the parent of the original leaf page may be different from the leaf page on the right. If that happens, we ask you to simply add one to the slot ID of the last PathItem of the search path in the BTree::FindDeletionSlotIdOnLeaf(), because BTree::HandleMinPageUsage() will handle the logic of moving to the right to find the correct parent.

Task 7: Implement the function to find the location of the data entry to delete and the function for erasing an index record on a B-Tree page:

BTree::FindDeletionSlotIdOnLeaf()
BTree::DeleteSlotOnPage()

Your code is likely to pass the following tests at this point:

BasicTestBTreeDelete.TestBTreeDeleteFindKey
BasicTestBTreeDelete.TestBTreeDeleteFindKeyOverflow
BasicTestBTreeDelete.TestBTreeDeleteSlot
BasicTestBTreeDelete.TestBTreeDeleteSlotRoot

Merging should happen when a B-Tree node is under-utilized (it has free space of more than 60% of the page size), which is triggered by BTree::HandleMinPageUsage() called in BTree::DeleteKey(). Since node rebalancing is assigned as a bonus, you don’t have to implement it to get the full score. The test that is not part of the bonus project will use a flag to disable rebalancing so that your code on the rebalancing will not be called there.

Node merging involves two sibling pages next to each other that share the same parent node, and you should also delete an index record on the parent. This may recursively trigger another BTree::MergePages() on the parent level, which is already handled recursively in BTree::HandleMinPageUsage(). The merge may go all the way up to the root, and eventually change the root node PID stored in B-Tree meta page. You don’t have to deal with this either, since we have already handled it for you in BTree::HandleMinPageUsage(). During this recursive deletion and merging process, there should be AT MOST FOUR pages pinned at the same time.

Question 29: Which four pages might be pinned during page merge? What for?

Question 30: Why don't we allow merging two sibling pages that do not share the same parent page? Hint: think about what problematic might happen?

Task 8: Implement node merging and complete BTree::DeleteSlotOnPage():

BTree::MergePages()

Your code is likely to pass the following tests at this point:

BasicTestBTreeMergePages.TestMergeLeafPageWithUniformReclenSmallKey
BasicTestBTreeMergePages.TestMergeLeafPageWithUniformReclenLargeKey
BasicTestBTreeMergePages.TestMergeInternalPageWithUniformReclenSmallKey
BasicTestBTreeMergePages.TestMergeInternalPageWithUniformReclenLargeKey
BasicTestBTreeMergePages.TestMergeLeafPageWithExactFill
BasicTestBTreeMergePages.TestMergeInternalPageWithExactFill
BasicTestBTreeDelete1.TestBTreeDeleteSingleMerge
BasicTestBTreeDelete1.TestBTreeDeleteRecursiveMerge
BasicTestBTreeDelete2.TestBTreeDeleteVarlenKey

RebalancePages() is called when we have an under-utilized page, but it can’t be merged with a sibling page because its page usage will exceed the page size if they are merged. RebalancePages() should try to move records from the page with a higher page usage to the one with a lower page usage, and minimizes the absolute difference of page usages. Different from SplitPage() which can always succeed, RebalancePages() may fail because the new separator key after the rebalancing may not fit into the parent page (as it might be longer than the old separator key). You should not make a choice such that the new separator key cannot fit into the parent page. Out of all valid choices, you should choose the one that minimizes the absolute page usage difference between the two pages. If there’s no valid choice, you should not change either of the pages and should return false from RebalancePages().

During this rebalancing process, there should be AT MOST THREE pages pinned at the same time.

You should also implement VarlenDataPage::ShiftSlots() to help you preallocate invalid slots at the start of the slot array or remove a batch of slots at the start of the slot array. Note if ShiftSlots() is called on an empty page with no slots with n and truncate == false, it is expected to reserve n unoccupied slots on the page although that will temporarily violate the constraint that the left-most slot on the page must be occupied. This is only allowed in this very specific corner case. It is expected that the caller will immediately fill in these slots by calling InsertRecordAt() (and possibly GetMaxSlotId()) before calling any other functions on the page. That said, this corner case never appears in our test cases and should not appear in your implementation of B-Tree rebalancing (think: why?). As a result, whether your code passes the bonus tests does not depend on the correct handling of the corner case.

Task 9: (Bonus, optional): Implement node rebalancing and optimize BTree::DeleteSlotOnPage() with rebalancing logic

VarlenDataPage::ShiftSlots()
BTree::RebalancePages()
BTree::DeleteSlotOnPage()

Your code is likely to pass the following tests at this point:

*/BonusTestSortedDataPage.TestShiftSlotsToRight/*
*/BonusTestSortedDataPage.TestShiftSlotsToRightWithInvalidSid/*
*/BonusTestSortedDataPage.TestShiftSlotsToRightThenToLeft/*
*/BonusTestSortedDataPage.TestShiftSlotsToLeftNoSpace/*
BonusTestBTreeRebalancePages.TestRebalanceLeafPagesWithUniformReclenLargeKey
BonusTestBTreeRebalancePages.TestRebalanceInternalPagesWithUniformReclenLargeKey
BonusTestBTreeRebalancePages.TestRebalanceLeafPagesBestSepWontFitL2R
BonusTestBTreeRebalancePages.TestRebalanceLeafPagesBestSepWontFitR2L
BonusTestBTreeRebalancePages.TestRebalanceLeafPagesNoFitL2R
BonusTestBTreeRebalancePages.TestRebalanceLeafPagesNoFitR2L
BonusTestBTreeRebalancePages.TestRebalanceInternalPagesBestSepWontFitL2R
BonusTestBTreeRebalancePages.TestRebalanceInternalPagesBestSepWontFitR2L
BonusTestBTreeRebalancePages.TestRebalanceInternalPagesNoFitL2R
BonusTestBTreeRebalancePages.TestRebalanceInternalPagesNoFitR2L

2. Write-up requirement

Each individual student needs to complete a write-up independently and submit it to UBLearns with answers to Questions 1 - 30. Please find the due date of the write-up at the beginning of this web page and submit a PDF document to UBLearns by the deadline. Please do not submit any other formats such as .docx, .doc, .txt -- we will not grade any non-PDF submission.