Release date: Tuesday, 10/22/2024
Project due: Monday, 11/11/2024, 23:59:59 PM EST
Last updated: 11/9/2024
Note 1 (11/9/2024): To clarify, we do not run any test
on Autolab this semester. All tests, including the hidden
tests, should be automatically executed when you run
submit
.
This project will be built upon your code for previous
projects. If you need the solution code for them, plesae
extract /ro-data/labs/lab<i>_sol.tar.xz
(where <i>
is the project number) into your
repository root and run
./import_supplemental_files.sh
. If you are
importing more than the solution code for an earlier project,
you may want to import them in the project number order as the
latter one may be overwritting some files. If run into any
errors, plesae read the paragraph above "a few hints" in bold
font.
To get started with project 4, extract
/ro-data/labs/lab4.tar.xz
when your current
working directory is your local repository root, and import the
supplemental files as your did
in previous projects.
The code should
build without compilation errors once the supplemental files
are imported, but most of the additional tests are likely to
fail. Note: some of the past test cases may also fail because we
enabled new features that you haven't implemented (e.g.,
BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed
).
Once you complete the tasks below,
these past test cases should pass automatically. If not, it is highly
likely that your code has undiscovered bugs and may not pass
the hidden system tests. You may list
all the tests in your build directory using the ctest
-N
command.
A few useful hints:
VarlenDataPage
. Some
test cases also depend on a correct implementation Table
for storing test data.
Hence, you need to make sure your implementation of
them conforms to the specification. However, we do not
recommend overwriting VarlenDataPage with the solution
code, because you have new tasks to complete in
VarlenDataPage
. It might be harder for you to
implement and debug new functionalities based on the
solution code than your own code.
tests/
directory.
In this project, you will build a B-Tree index, one of the
index access methods in Taco-DB. More specifically, your
goal is to understand the structure of the B-Tree and
implement the core pieces of insert/delete/search/scan
functions. This B-Tree supports variable-length keys, thus
does not have a stable fanout. Splitting and merging are based
on the actual load of internal and leaf pages rather than the
number of entries. To simplify our implementation and maximize
code reuse, TacoDB’s B-Tree utilizes
VarlenDataPage
to organize all its nodes.
In the lab handout, you will find the basic tests for B-trees.
During online and offline grading, there will be larger-scale
system tests (which are hidden to you) to further evaluate the
correctness and efficiency of your implementation. Your grade
on those test cases will be posted on UBLearns after the final
submission deadline once we finish the offline tests. The
following table lists a break down of the tests. As before, you
may also find the
more detailed breakdown of points in
/ro-data/testconfs/test_lab4.conf.sh
.
Test suite | Max score | Hidden? | Tested during submission? |
---|---|---|---|
BTreeBasicTests | 10.5 | No | Yes |
BTreeSystestSmall | 0.4 | Yes | Yes |
BTreeSystestLarge | 2.4 | Yes | Yes |
IndexScan | 2 | No | Yes |
SystemTestTPCHQ1Small | 0.96 | Yes | Yes |
SystemTestTPCHQ1 | 2.04 | Yes | Yes |
Source files to read:
include/index/Index.h
include/index/IndexKey.h
include/index/tuple_compare.h
src/index/Index.cpp
src/index/tuple_compare.cpp
include/index/volatiletree/VolatileTree.h
src/index/volatiletree/VolatileTree.cpp
Source files to modify: none
As a general and important interface in an RDBMS,
Index
is the base class of all different types of
indexes. Its public interface is defined in
Index.h
. The metadata, including its base table ID
(the table over which the index is built), index type, virtual
file FID (the vfile that contains the index data), and etc.,
are stored in the catalog. These metadata are passed around in
the system as IndexDesc
when they are loaded into
memory.
When a new index is created for a table, a new virtual file is
allocated for it and all the metadata including the virtual
file ID are stored into the catalog. The system will then
create an IndexDesc
filled with the metadata and
call a static function Index::Initialize()
to
properly initialize the index, which in turn dispatches the
call to the specified index's initialization function. The
initialize function should perform essential checks and perform
the necessary initialization such that the vfile becomes
a valid empty index. At this point, the system should
be able to create any number of instances of index objects that
reference this index vfile. As the final step, the system
calls Index::BulkLoad()
to load the existing data
in the base table into the newly created index. As a default
and often a less efficient implementation of bulk loading,
Index::BulkLoad()
will scan the base table and
insert data entries one by one using the
Index::InsertKey()
interface. (Note that you do
not need to implement the more efficient B-tree bulk loading
algorithm in this project).
To implement an index, one should implement the the static
Initialize
and Create
functions and
override public interfaces for inserting, deleting, searching
for, and scanning through data entries.
The data entries in all the indexes in TacoDB are organized in
Alternative 2, where each index entry is a
(key(s), rid)
pair. Note that rid
is
unique for each record. When we compare data entries,
rid
is included as the last key field and thus
serves as a tie-breaker in case there are duplicate index keys.
In other words, all data entries are unique even though keys in
the records may have duplicates.
Data entries or index entries are collectively called index
records in Taco-DB. An index record is usually encoded as a
record with a fixed-size header followed by a record payload
that stores the actual index key values.
Index::StartScan()
will initialize an iterator
that may be used to scan through all data entries falling into
a particular search range. When scanning through the iterator,
you need to ensure that each data entry will be touched
EXACTLY ONCE. Besides, since all indexes of
the current version of TacoDB are range indexes, your index
iterators should scan through data entries in the
ascending order.
There is a flag in the catalog information for indexes to indicate whether this index is a unique index. If an index is non-unique, you should reject an insertion only if you are inserting a duplicate data entry with the same key and same record ID. However, if an index is unique, you should reject an insertion if there exists some data entry in the index with the same key but different record id.
We provided a volatile (in-memory) tree index implemented
using std::multimap in index/volatiletree
as a
reference implementation. If you are unsure about what the
specification of a specific interface means, you may always
refer to the volatile tree implementation. Any sequence of
insert/delete/update/scan in your B-Tree implementation should
produce excatly the same result in the eyes of the user (i.e.,
the same set of data entries are in the index in the same
order, they will generate the same error if there's any, an
index scan will produce the same set of index key and record ID
pairs). However, you MUST NOT use any STL container,
Abseil container, or any other in-memory container to
implement your B-Tree.
We want to make a quick note about the difference between
IndexKey
and the key payload of an index record.
IndexKey
is an in-memory structure that stores an
array of references to in-memory Datum
objects, which are either primitive C/C++ values or pointers to
variable-length data. You may think of an
IndexKey
as an
std::vector<NullableDatumRef>
with a more
compact memory layout and a trivial destructor. We use that to
pass search keys or keys constructed from heap records.
A key payload in an index
record is the keys of this record stored in a serialized record
format (as defined by the Schema
class, see also
Project 2, Section
1.3). To use a value in the record, you must use
Schema::GetField()
/Schema::DissemblePayload()
or other related functions to extract the field into a
Datum
. In this project, you don't have to worry
about extracting individual fields from a record to form an
index key, as the code of that is provided. However, it is
important that you understand the distinctions. Never copy
IndexKey into an index record. Instead, you need to use the key
schema to serialize it into a record payload.
One last thing to note is the ordering of data entries. For any
two data entries with m
key fields, say d = (k1,
k2, .., km, rid)
and d' = (k1', k2', ..., km',
rid)
, assuming the type of any key field has a total order
among all its non-NULL
values,
then we order the two data entries in lexicographical order using the following rule:
NULL
is always smaller than
any non-NULL
. NULL
values compare
equal (which is different from the semantics of SQL
expressions).i
, such that for all j < i
,
kj == kj'
, and ki < ki'
, then
d < d'
.
i
, such that for all j < i
,
kj == kj'
, and ki > ki'
, then
d > d'
.
include/storage/Record.h
). However, an
invalid record ID should always be treated as smaller
than any other record ID (which is not handled by the
comparison operators of RecordId
).
Because the majority of comparisons in indexes are between a
search key and a serialized key payload, we provide two
routines for compare an IndexKey
with a key
payload buffer in include/index/tuple_compare.h
and src/index/tuple_compare.c
. Note that, if there
are missing keys in the search key, we ignore the extra key
fields in the key payload. Because of that, you'll need to
handle partial key search and the
comparison of the tie-breaker (record ID) on top of these in
later tasks for B-tree.
Source file to read:
include/index/btree/BTree.h
include/index/btree/BTreeInternal.h
Source files to modify:
src/index/btree/BTree.cpp
src/index/btree/BtreeUtils.cpp
src/storage/VarlenDataPage.h
src/storage/VarlenDataPage.cpp
The physical structure of the B-Tree in TacoDB is very similar
to the one explained during our lectures, which contains
internal nodes and leaf nodes. All the structures of a B-Tree
live in a virtual File
managed by
FileManager
, whose first page is a meta page
BTreeMetaPageData
containing a page number
pointing to the root. Note the root node of a B-Tree can be
either a leaf or internal node, depending on whether there is
enough space for a leaf page to fit all the data entries.
One key deviation from the B-tree described in the lecture is that every level in this tree is doubly-linked, rather than just the leaf level. Your code should maintain these horizontal links when performing structural modification (split/merge of pages).
In this project, you will need to rely on your
VarlenDataPage
implementation in project 2 to
construct both leaf and internal nodes. A B-Tree page is
encoded as a VarlenDataPage
. The figure above
shows the physical layout of the page representing a B-Tree
node. You may find its first 24 bytes are exactly the same as
those of a regular heap file page. The main differences are:
BTreePageHeaderData
in the user-defined data
region. You should reserve enough space for that region
through passing usr_data_sz
when calling
VarlenDataPage::Initialize
.
Specifically, it includes flags indicating if this is the
root node and if this is a leaf node, the pointers to its
two neighbor siblings on the same level, and the total
number of bytes used by index records in the page.
Task 1: Implement functions for laying out B-Tree pages and some other basic utilities:
BTree::CreateNewBTreePage()
BTree::GetBTreeMetaPage()
BTree::CreateLeafRecord()
BTree::CreateInternalRecord()
BTree::Initialize()
BTree::IsEmpty()
Your code should allocate a new page and pin it after
BTree::CreateNewBTreePage()
is called, then return
the location of pinned buffer frame.
BTree::GetBTreeMetaPage()
should pin the meta page
of this B-Tree and return the pinned frame.
BTree::CreateLeafRecord()
and
BTree::CreateInternalRecord()
are used for
creating index records in B-Tree pages. In particular, index
entries in internal nodes should be contiguous bytes containing
BTreeInternalRecordHeaderData
and the index key
(strictly in this order with alignment). You can see the
m_heap_rec_id
is also the part of
BTreeInternalRecordHeaderData
to serve as a tie breaker
when there are duplicate key values. The first record
in any internal node should only contain
BTreeInternalRecordHeaderData
, without any key
payload. The key of all
records stored in the subtree between two neighboring keys
k1
and k2
of an internal node falls in a
left-closed right-open range [k1, k2)
. The leaf node
is much simpler, where each data entry is contiguous bytes
containing BTreeLeafRecordHeaderData
(which is
effectively just a RID) and the index key (in this order with
alignment).
A B-Tree page, as a VarlenDataPage
, must not have
any unoccupied valid slot. In other words, the valid SID on a
B-Tree page must be in a consecutive range
[GetMinSlotId(), GetMaxSlotId()]
(if there's any
slot). In addition, the records on a B-Tree page must be in
ascending order. If you need to insert or remove an
index record from an internal or a leaf B-tree node, you should
do so by specifying the correct insert/delete slot ID in the
VarlenDataPage::InsertRecordAt()
and
VarlenDataPage::RemoveSlot()
functions so that the
keys are still sorted after the insertion/deletion. You
should not use the
VarlenDataPage::InsertRecord()
and
VarlenDataPage::EraseRecord()
functions you
implemented in project 2 on any B-Tree page. Updating an index
record with VarlenDataPage::UpdateRecord()
is only
safe when you are sure the updated index record will fit into
the page. In addition, you should also implement a utility
function VarlenDataPage::ComputeFreeSpace()
to
compute the page usage given the total number of slots and the
total length of record payloads so that you can accurately
determine whether records fit into a page for the B-Tree split,
merge, rebalance operations in the later tasks.
Task 2: Implement the following additional utility
functions for variable-length data page. These utility
functions are useful for maintaing a data page with sorted
(index) records, especially in handling B-Tree insertion and
deletion. You may find the specification for these functions in
the header file include/storage/VarlenDataPage.h
.
VarlenDataPage::ComputeFreeSpace()
VarlenDataPage::RemoveSlot()
Your code is likely to pass the following tests at this point:
BasicTestSortedDataPage.TestComputeFreeSpace
BasicTestSortedDataPage.TestRemoveAllRecordDeterministic
BasicTestSortedDataPage.TestRemoveAllRecordRandomly
BasicTestSortedDataPage.TestRemoveAllRecordRandomlyOutOfBound
Source files to modify:
src/index/btree/BTreeSearch.cpp
src/index/btree/BTreeIterator.cpp
To compare two data entries or index entries, you should
compare not only the index key but also the record ID. You can
compare the index key using TupleCompare()
and
then compare them on the record IDs further if the index key
comparison turns out to be equal. Note that an invalid record
id should be treated as being smaller than any record id (including any
invalid record id and there are many different possible invalid
record ids). We will never try to insert a data entry with
invalid record id. Different from the three-valued logic
in SQL queries, NULL
values are considered to be
smaller than any non-NULL
values in index and
NULL
values compare equal in index, because we
need to index NULL
values as well.
Further, we will enforce lexicographical order for partial key
matches in B-Tree. This means a key like (1, 2)
is
always greater than its prefix (1, phi)
(where
phi
means unspecified). If you compare a full key
with its prefix using TupleCompare
, it will
indicate they are equal, which is not what we want. Therefore,
you have to handle such special cases in
BTree::BTreeTupleCompare()
after you call
TupleCompare
.
You should write a simple binary search to look up the key in a
B-Tree Node. The second overload of
BTree::FindLeafPage()
is the core recursion logic
of traversing the tree. Please be careful about pinning and
unpinning the pages during the traversal. At any time during a
search, there should be ONLY ONE B-Tree node
pinned in the buffer.
Task 3: Implement the utility function to compare two keys and complete the recursive B-Tree point lookup logic:
BTree::BTreeTupleCompare()
BTree::BinarySearchOnPage()
BTree::FindLeafPage(bufid, key, recid, p_search_apth)
–
the second overloadYour code is likely to pass the following tests at this point:
BasicTestBtreeSearch.TestBTreeTupleCompareFullKey
BasicTestBtreeSearch.TestBTreeTupleComparePrefixKey
BasicTestBtreeSearch.TestBinarySearchOnLeafPageFixedlen
BasicTestBtreeSearch.TestBinarySearchOnInternalPageFixedlen
BasicTestBtreeSearch.TestBinarySearchOnLeafPageVarlen
BasicTestBtreeSearch.TestBinarySearchOnInternalPageVarlen
When a B-tree search reaches a correct leaf page, you may
use the sibling pointers on the leaf node to iterate through
data entries in the sorting order (see
Index::StartScan()
and
BTree::StartScan()
). You should also check if the
scan is completed (either there are no more data entries
or the next data entry falls out of query range).
Task 4: Finish the B-Tree iterator scan logic:
BTree::Iterator::Next()
Your code is likely to pass the following tests at this point:
BasicTestBtreeSearch1.TestBTreeFullScanFixedlen
BasicTestBtreeSearch2.TestBTreeRangeScanFixedlen
BasicTestBtreeSearch3.TestBTreeFullScanVarlen
BasicTestBtreeSearch4.TestBTreeRangeScanVarlen
Source files to modify:
src/index/btree/BTreeInsert.cpp
You may reuse the B-Tree search code to locate the correct leaf page to perform the insertion. In addition, you should remember the path down by keeping track of which B-Tree nodes (a.k.a B-Tree page ID) you have touched and which slot number you took during the traversal. This information is useful in case of page splits.
The uniqueness check of the data entry for a non-unique index
or key for a unique index happens in
BTree::FindInsertionSlotIdOnLeaf()
. You should
return an appropriate slot ID if a duplicate is found. A small
nuance to note here is that we allow duplicate
NULL
values to appear in a unique index, which is
SQL-compliant but differs from the practice of serveral major
DBMS. (Think about why? Hint: unique indexes are used to
enforce UNIQUE constraints.)
Task 5: Implement the functions to find the correct slot on a leaf node for insertion, and insert an index record on a B-Tree page. You may just implement these functions without the logic for recursively splitting the page initially and do that in task 6 instead.
BTree::FindInsertionSlotIdOnLeaf()
BTree::InsertRecordOnPage()
Your code is likely to pass the following tests at this point:
BasicTestBTreeInsert.TestBTreeInsertFindSlotOnLeaf
BasicTestBTreeInsert.TestBTreeInsertOnPageNoSplit
BasicTestBTreeInsert.TestBTreeInsertNoSplit
You should also add the logic for splitting a page in
BTree:InsertRecordOnPage()
when there is no space
left to insert a new index record. Consider a local case of
splitting a node into two. There will be two nodes originally
in the tree being affected: the node you are splitting
and its parent. During this procedure,
BTree::SplitPage()
will trigger another
BTree::InsertRecordOnPage()
at the parent node.
This procedure can trigger another split on the parent node,
thus can be recursive and go all the way to the root. You will
need to handle this special case of root node splitting with
BTree::CreateNewRoot()
.
Again, during this recursive procedure, you should be very careful about pinning and unpinning B-Tree pages. At any time during insertion, there should be AT MOST FOUR pages pinned at the same time.
Task 6: Implement node splitting and
complete BTree::InsertRecordOnPage()
by adding
triggers for recursive splitting:
BTree::CreateNewRoot()
BTree::SplitPage()
BTree::InsertRecordOnPage()
Source files to modify:
src/index/btree/BTreeDelete.cpp
When we locate the record for deletion in the leaf level, we
should also remember the traversal path for possible page
merging and rebalancing. Please note that BTree::DeleteKey
may be called with a key payload without a valid record ID, in which case,
it is expected to delete an arbitrary one of the data entries
with matching key. When it is called with a valid record ID, there
should be at most one matching data entry to be deleted.
When we search for a
leaf page to locate a key to delete without specifying a valid
record ID, it might end up on a leaf page that is to the left
of any matching keys, because an invalid record id is treated
as smaller than any valid record id. In this case, you may have
to move to the right page (m_next_page
) to find a
matching key for deletion (if there is any, it must be in the
first slot of the right page). One tricky issue is that the
parent of the original leaf page may be different from the leaf
page on the right. If that happens, we ask you to simply add
one to the slot ID of the last PathItem
of the
search path in the
BTree::FindDeletionSlotIdOnLeaf()
, because
BTree::HandleMinPageUsage()
will handle the logic
of moving to the right to find the correct parent.
Task 7: Implement the function to find the location of the data entry to delete and the function for erasing an index record on a B-Tree page:
BTree::FindDeletionSlotIdOnLeaf()
BTree::DeleteSlotOnPage()
Your code is likely to pass the following tests at this point:
BasicTestBTreeDelete.TestBTreeDeleteFindKey
BasicTestBTreeDelete.TestBTreeDeleteFindKeyOverflow
BasicTestBTreeDelete.TestBTreeDeleteSlot
BasicTestBTreeDelete.TestBTreeDeleteSlotRoot
Merging should happen when a B-Tree node is under-utilized (it
has free space of more than 60% of the page size), which is
triggered by
BTree::HandleMinPageUsage()
called in
BTree::DeleteKey()
. Since node rebalancing is
assigned as a bonus, you don’t have to implement it to get
the full score. The tests that are not part of the bonus
project will use a flag to disable rebalancing so that your
code on the rebalancing will not be called there.
Node merging involves two sibling pages next to each other
that share the same parent node, and you should also delete an
index record on the parent. This may recursively trigger
another BTree::MergePages()
on the parent level,
which is already handled recursively in
BTree::HandleMinPageUsage()
. The merge may go all
the way up to the root, and eventually change the root node PID
stored in B-Tree meta page. You don’t have to deal with this
either, since we have already handled it for you in
BTree::HandleMinPageUsage()
. During this
recursive deletion and merging process, there should be
AT MOST FOUR pages pinned at the same
time.
Task 8: Implement node merging and complete
BTree::DeleteSlotOnPage()
:
BTree::MergePages()
Your code is likely to pass the following tests at this point:
BasicTestBTreeMergePages.TestMergeLeafPageWithUniformReclenSmallKey
BasicTestBTreeMergePages.TestMergeLeafPageWithUniformReclenLargeKey
BasicTestBTreeMergePages.TestMergeInternalPageWithUniformReclenSmallKey
BasicTestBTreeMergePages.TestMergeInternalPageWithUniformReclenLargeKey
BasicTestBTreeMergePages.TestMergeLeafPageWithExactFill
BasicTestBTreeMergePages.TestMergeInternalPageWithExactFill
BasicTestBTreeDelete1.TestBTreeDeleteSingleMerge
BasicTestBTreeDelete1.TestBTreeDeleteRecursiveMerge
BasicTestBTreeDelete2.TestBTreeDeleteVarlenKey
Source files to modify:
include/plan/IndexScan.h
src/plan/IndexScan.cpp
include/execution/IndexScanState.h
src/execution/IndexScanState.cpp
Now that we have indexes available in Taco-DB, we will be able
to introduce a new operator, IndexScan
, which
leverages an alternative-2 index (B-tree or volatile tree in
our code base), to perform selection scans over a range
condition. More specifically, this operator will allow you to
draw heap records from a table R
using an index on
a set of columns A
that satisfy a range condition
l < A < r
, where the comparison is done in
lexicographical order, and the lower and upper bound may be
missing or made non-strict (i.e., allowing equality). For
instance, given a B-tree index on (major,
adm_year)
over the student table we use as example
in class S(sid, name, login, major, adm_year)
,
the following shows a few examples of how we can use the operator.
(Recall that phi
denotes unspecified and is different from NULL
in our codebase).
('CS', phi) <= (major, adm_year) <= ('CS', phi)
('C', phi) <= (major, adm_year) < ('D', NULL)
('CS', 2020) <= (major, adm_year) <= ('CS', 2024)
('CS', 2019) < (major, adm_year) <= ('CS', phi)
Your task in this bonus project is to implement the relevant functions in these files and pass all of the following tests.
IndexScan.TestIndexScanEmpty
IndexScan.TestIndexScanAll
IndexScan.TestScanSome
SystemTestTPCHQ1Small.*
SystemTestTPCHQ1.*