Release date: Tuesday, 8/27/2024
Due date: Monday, 9/2/2024, 23:59:59 EDT (no grace period allowed for project 0)
Last updated: 8/29/2024
In this course, you will work in teams of up to 2 people to build a mini database system Taco-DB in C++17. There are 6 subprojects throughout the semester, covering the various layers in Taco-DB from database storage to query processing layers, including storage (I/O, buffer manager, data layout and heap file), indexing (B-tree), query processing (relational operators, join, external sorting), and query optimization (manually optimize a query plan). These cover most of the topics we will discuss in the lectures, with the exception of transaction processing, concurrency control and crash recovery due to time constraints. All projects are due at 23:59:59 Eastern Time (before Nov 3: EDT, and after Nov 3, EST).
Late submission policy: you have up to 3 grace days in total for projects and written assignments, and you may use at most 1 grace day for each project or written assignment. There is no penalty for late submissions that fall within the allowed grace days. No credit will be given for a late submission that uses more than the allowed grace days. If you have a project teammate and either of you makes a late submission, any grace days used will count towards the used grace days of both team members.
Working in a two-person team is only meant to reduce your coding workload, not to release you from working on the designs. In other words, you will first need to come up with a design and implementation plan together with your teammate; clearly and fairly divide and/or coordinate your coding responsibilities; and finish your share of the work responsibly. Remember, it is your team's code submission that gets graded, not each individual's. If you do not complete your share of the tasks, your teammate will also lose the points for that.
You will need to set up a private repository on Github, shared with your teammate if any, and grant us access to your repository to make submissions. When you make submissions through the submit command, available only in your dev container on minsky.cse.buffalo.edu (see below), we will pull your code from your repository and verify whether it remains private and only accessible to us (ub-cse562), your teammate (if any), and yourself. Please refer to lab 0 below for details of how to set it up and how the grading works. Note that you may not make any of your project code publicly available during or after this semester, or make it available privately to any current or future students who may take the course. Please take some time to review the academic integrity policy on the course homepage for more details.
[Policy for dissolving a team] In the rare case that you want to dissolve your team, or your teammate has dropped or resigned from the course, you are allowed to do so only if
Upon dissolving a team, you will be working in a single-person team for the rest of the semester and will not be allowed to form a new team.
In this project, your task is to set up your development docker container on minsky.cse.buffalo.edu, where you may write, debug, test and submit your code; set up your private repository and initial code base; and sign up for the project. This project is only worth 0.1 point out of the final grade, but you will not be able to make submissions for later projects unless you make a valid submission to project 0.
In this section, please follow the steps below to set up your dev docker container on minsky.cse.buffalo.edu. You will have SSH access to your personal container and be able to manage your dev container in case you need to restart it.
Note: While other students' containers are isolated from yours, you are still sharing the same physical resources with the rest of the class. To prevent hogging the server, you are limited to up to 8 GB of memory, and you only have access to CPU 0 (the 40 even numbered cores/hyperthreads). When you are building code, we recommend setting the job parallelism to no more than 8 to avoid overloading the server or having the container killed due to OOM.
Step 1: Connecting to UB VPN. If you are connecting from off campus, you have to connect to UB VPN in order to connect to the student servers. See here for how to install and set up UB VPN. Once you're connected, you may continue to step 2.
You may ignore this step if you are connected to eduroam or UB_Secure.
Step 2: Install SSH Client locally. If you already know how to install and use SSH, you may skip the instructions below. Otherwise, the following are the commonly used SSH clients on typical systems.
For Linux/Mac OS users, the openssh client is usually pre-installed. You should be able to run ssh from a terminal. If not, please install the openssh package available through the package manager of your system.
For Windows 10/11 users, you may enable the OpenSSH client feature in Settings -> Apps -> Optional Features. Once it is enabled, you may use the ssh command from PowerShell. If that does not work for you, or you're using an older Windows release, you may also use PuTTY, a standalone SSH client for Windows.
Step 3: Each student should set up a development docker container on the student server minsky.cse.buffalo.edu, which will be used for developing, debugging and submitting your code throughout the semester. To set up and manage the container, you need to go through a centrally authenticated CSE student server, cerf.cse.buffalo.edu. If you know how to access, and/or have already accessed, any of the centrally authenticated CSE student servers before, you may skip the instructions below. Otherwise, you may follow the steps below.
First, you may generate a local SSH key pair and upload it so that you may access cerf without a password in the future. Before you continue, please check whether you already have an SSH key and/or have installed it on any of the CSE student servers. To do so:
Check 1: Look for an existing public key file (id_rsa.pub, id_ecdsa.pub or id_ed25519.pub) at the default location (for Windows: open file explorer and enter %USERPROFILE%\.ssh in the address box; for Linux/Mac: enter `ls -al ~/.ssh` in terminal). If the default .ssh directory does not exist or there is no such key file, you probably have never generated a key before, in which case please continue to Step 3a. Otherwise, continue with Check 2 below.
Check 2: Enter ssh <your-ubit-name>@minsky.cse.buffalo.edu (replace <your-ubit-name> with your UBITName, i.e., the user name of your UBMail address). If you can log into the server without having to enter a password, you probably have already installed your public key on the student servers. In that case, please continue to Step 7. Otherwise, please continue to Step 3b. (Enter exit or press Ctrl-D (on Windows/Linux) to exit the remote server if you have successfully logged in.)
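On Linux/Mac, the key-file check described above can also be scripted. The following is a minimal sketch, assuming a bash-compatible shell; the function name list_ssh_pubkeys is hypothetical and not part of any course tooling. It only looks for the three default public key file names mentioned above.

```shell
# list_ssh_pubkeys is a hypothetical helper, for illustration only.
# Prints each default public key file found in the given directory
# (default: ~/.ssh) and returns non-zero if none of them exists.
list_ssh_pubkeys() {
    local dir="${1:-$HOME/.ssh}"
    local found=1
    local name
    for name in id_rsa.pub id_ecdsa.pub id_ed25519.pub; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            found=0
        fi
    done
    return "$found"
}

# Example:
#   list_ssh_pubkeys || echo "no public key found; continue to Step 3a"
```

If the function prints at least one file, you already have a key pair at the default location and may move on to the login check.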
Step 3a: Generating local SSH key. In your terminal, enter ssh-keygen. Follow the prompts and press Enter a few times to create an SSH key at the default location with no passphrase. Never share your private key file (id_rsa) with anyone.
Step 3b: Uploading SSH key to cerf.
For Linux/Mac users, please open a terminal. Then copy and paste the following line, with <your-ubit-name> replaced with your actual UBITName (i.e., the user name of your UBMail address). Enter your UBIT password when prompted.
ssh-copy-id <your-ubit-name>@cerf.cse.buffalo.edu
For Windows users, please open PowerShell. Then copy and paste the following line, with <your-ubit-name> replaced with your actual UBITName (i.e., the user name of your UBMail address). Enter your UBIT password when prompted.
cat "${env:USERPROFILE}\.ssh\id_rsa.pub" | ssh <your-ubit-name>@cerf.cse.buffalo.edu "[ ! -d ~/.ssh ] && mkdir -m 700 ~/.ssh; [ ! -e ~/.ssh/authorized_keys ] && touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys; cat - | tr -d '\r' >> ~/.ssh/authorized_keys"
At this point, you should be able to log in using ssh <your-ubit-name>@cerf.cse.buffalo.edu without a password.
Step 4: Configure search PATH on cerf.
Once you're logged into cerf, you should be using tcsh by default. You'll need to perform the following one-time setup to add the CSE562 dev container management executables to your PATH environment variable:
Open ~/.cshrc using your favorite text editor (e.g., nano ~/.cshrc).
Add the following line to the file: setenv PATH /shared/projects/CSE562/bin:${PATH}
Save the file and exit the editor. In nano, enter Ctrl + X, and then follow the screen prompt.
Log out of cerf (enter exit or hit Ctrl + D).
Log into cerf again. Enter which status_dev_container. This should print the following line: /shared/projects/CSE562/bin/status_dev_container. If not, please redo the previous steps.
Step 5: Generating SSH key on cerf.
You typically need to log into your dev container from cerf using another SSH key. In this step, you will need to verify whether you have a valid SSH key pair on cerf. To do so, look for ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub. If they do not exist (or you have a type of key other than RSA), you must generate a new one using ssh-keygen -t rsa.
Step 6: Start, stop or obtain status of your dev container.
To start your dev container, please enter the following:
start_dev_container
If your container is started successfully, you should see the following message:
You can now log into your dev container using the command printed on the screen. The argument after -p is the forwarded SSH port number on minsky.
The dev container is based on Ubuntu 22.04, with all required dependencies pre-installed. You may start working on your project directly without setting up or installing any environment required by the course projects.
Again, you should only access your own container throughout the semester. Accessing or interfering with a container that does not belong to you may be considered a violation of our academic integrity policy.
You may sometimes lose connection to the container, or slow it down with a few runaway jobs, during the semester. To check whether your container is still up and running, you may enter the following on cerf:
status_dev_container
If your container is up and running, you should see the following:
You may also use this command if you forget the SSH port number, which is listed under PORTS.
If there is no container running, you should see:
Please note, you should not develop, build, or test your code on cerf.cse.buffalo.edu. Instead, to develop, build, test, or submit your code, please log into your dev container on minsky.cse.buffalo.edu.
To stop a container, e.g., when it is running but not responding and you'd like to restart it, you may enter the following on cerf:
stop_dev_container
If you're unable to restart or stop your dev container, please reach out to the TA for help through a Piazza private message.
Caution: While you are not required to shut down your dev container, it is your responsibility to make sure it is not hogging the server. If the server becomes too overloaded, we may first identify and ask the owner of a dev container that is hogging the server to kill some processes in it, or to restart the container. Then we may forcefully stop your container if the system is unresponsive and we do not hear from you in a reasonable amount of time. In addition, the storage devices on minsky are not backed up, and data loss could happen in extremely rare cases. You should not leave unsaved/uncommitted changes in your dev container for too long (certainly not overnight).
Troubleshooting steps if you cannot start the container successfully (if nothing below works, please reach out to the TA through a private message on Piazza):
Check whether you have an SSH key on cerf. In your terminal, enter cd ~/.ssh/ and then ls. If you do not see a file called id_rsa.pub, then you may not be able to start the dev container successfully. Please refer to Step 5.
You see the error Permission denied. Failed to fetch ~./usrrand. In this case, you may not be able to obtain the status of or stop the dev container. You need to copy ~/.usrrand in your dev container to cerf using the following command:
scp -P <Your-Container-Port> <YOUR-UBIT>@minsky.cse.buffalo.edu:~/.usrrand ~/.usrrand
You get a permission denied error when logging into the dev container. Typically, that means the private key in ~/.ssh/id_rsa does not exist or does not match the public key installed in your dev container. This could happen if you are trying to SSH into the dev container from a system that is not a CSE centrally authenticated student server (i.e., not cerf or another CSE server). You may follow the steps below to allow yourself to SSH into the dev container from another system (e.g., your personal laptop):
Copy the content of ~/.ssh/id_rsa.pub (Linux/Mac), or ${env:USERPROFILE}\.ssh\id_rsa.pub (Windows), on that system.
Log into cerf, and from there log into your dev container.
In the dev container, append the key to your authorized_keys file (by default ~/.ssh/authorized_keys): echo "<paste your pubkey here>" >> authorized_keys
Replace <paste your pubkey here> with the copied line from the system you'd like to SSH from; do not omit the double quotes (""); and make sure to append to the file (>>).
If you have changed your SSH key on cerf, please restart your dev container to allow it to re-install the public key.
In this section, your task is to set up a working repository and your build environment.
Special instruction: Each team only has to set up a single repository and import the code once (Steps 1, 2).
Prerequisites: You should get familiar with basic bash shell and git operations, command line text editors (vim, emacs or nano), the gdb debugger, the basic Github workflow, as well as C++11, 14 and 17, if you are not already. Optionally, you might also want to learn either screen or tmux to allow a job to run detached in case of connection loss.
Here are a few good resources. (You may go over them as needed. There is no need to go over all of them at once.)
POSIX shell, git, Github, C++11, 14, 17: ISOCPP FAQ on C++11 and C++14, and cppreference wiki on C++11, C++14, and C++17.
Note: you should perform the following while logged into your dev container on minsky, not cerf!
Step 1: Create a private repository on Github using your personal account. Follow the guides below.
Step 2: Add ub-cse562 and your teammate's Github user (if any) as collaborators in settings -> Manage access -> Add people. Do not add anyone else as a collaborator. The grading script will reject the submission if you have more than your group size + 1 collaborators (including yourself). Your invitation to ub-cse562 will be accepted automatically within 5 to 10 minutes.
Step 3: (Both members) Ensure that you have generated an SSH key pair in the dev container on minsky (not the one on your local machine or cerf) and uploaded the public key to Github. You should check if you have ~/.ssh/id_rsa in a terminal in the container. If not, run ssh-keygen to generate the keys. Then enter cat ~/.ssh/id_rsa.pub to print the public key. Finally, copy it and upload it in Github -> SSH and GPG keys -> New SSH Key.
For the repository owner: enter the following with <github-username> and <repo-name> replaced with your actual Github user name and repository name (skip the comment lines that start with #):
# extract the tarball
tar xf /ro-data/labs/lab0.tar.xz
# rename the extracted directory to the same as your repository
mv lab0 <repo-name>
# change into the directory
cd <repo-name>
# setting up the git repo
./setup_repo.sh git@github.com:<github-username>/<repo-name>
If everything goes well, the script will print Repo setup is finished. Here are a few post-setup steps to follow:... Please follow the post-setup steps.
A common reason the repo setup fails to complete is you have not set your name and email for Git. The post-setup steps will let you know in that case and provide hints for how to continue.
For the teammate of the repository owner: At this point, you should also be able to clone the repository with the imported code into your own home directory on the student server and follow the remaining steps. You may do so by entering git clone git@github.com:<github-username>/<repo-name> in your home directory in a terminal, with <github-username> replaced with the repository owner's Github user name and <repo-name> with the Github repository name.
Important Note: Each student should work inside your own dev container even if you have a teammate and you are sharing the Github repository.
Step 4: Build the code.
We use cmake as the build system. You do not need to modify any of the CMakeLists.txt files in most cases, but you should generally be aware of how cmake works.
Here's how to create a Debug build (unoptimized build where you can debug using gdb) in the build.Debug directory:
cd <dir-to-local-repository>
cmake -B build.Debug . # don't omit the dot at the end or cmake will report errors
cd build.Debug
make -j 8 # -j 8 enables parallel build with up to 8 processes
# Please be considerate for all CSE students who are
# sharing these servers and refrain from using -j with
# too many processes.
And here's how to create a Release build (optimized build that does not have the debugging symbols that allow you to use gdb) in the build.Release directory. When you're finished with developing and debugging your code, you should run it in a release build again to make sure it still works, and runs within the time and memory limits.
cd <dir-to-local-repository>
cmake -B build.Release -DCMAKE_BUILD_TYPE=Release . # don't omit the dot at the end or cmake will report errors
cd build.Release
make -j 8 # -j 8 enables parallel build with up to 8 processes
# Please be considerate for all CSE students who are
# sharing these servers and refrain from using -j with
# too many processes.
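The value passed to make -j in the two builds above could also be derived instead of typed by hand. The following is a minimal sketch, assuming bash in the dev container and the nproc utility from GNU coreutils; the function name capped_jobs is hypothetical and not part of the course tooling.

```shell
# capped_jobs is a hypothetical helper, for illustration only.
# Prints the number of parallel build jobs to use: the detected CPU count,
# but never more than the cap (default 8, the recommended maximum).
capped_jobs() {
    local cap="${1:-8}"
    local cpus
    # nproc is part of GNU coreutils; fall back to 1 if it is unavailable.
    cpus="$(nproc 2>/dev/null || echo 1)"
    if [ "$cpus" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$cpus"
    fi
}

# Example, from inside build.Debug or build.Release:
#   make -j "$(capped_jobs)"
```

Capping the value keeps the build within the recommended parallelism even on a machine that reports many more cores.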
Step 5: Testing your code with ctest. (Note: the tests are implemented using the GoogleTest framework. Going through the GoogleTest Primer section in its user's guide will help you understand the test cases and allow you to write your own test cases in later projects.)
We recommend that you build and run the code through the command line. The following assumes that you have changed the working directory into either build.Debug or build.Release.
To run all the tests: ctest (add -V to see verbose outputs).
To list all the tests without running them: ctest -N
To run an individual test (e.g., BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed): ctest -R "BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed", or ./tests/BasicTestRepoCompilesAndRuns --gtest_filter="BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed"
Pass --help to ctest or individual test programs to find other useful flags.
There is only one test in lab 0: BasicTestRepoCompilesAndRuns.TestShouldAlwaysSucceed, which, as its name suggests, should always succeed without any source code modified.
In this section, we will show you how to make your first code submission, and sign up as a team. For Project 0, you only need to submit the code base as imported in Section 2, since there is no coding needed to pass the test.
(On-time/late submissions and which one counts as your final submission?) In this project and all later projects, we only count the last submission from either member of your team before the project deadline (or within your allowed grace days in later projects). Both team members must have a grace day available for a late submission to be counted.
For example, (a) if you make a submission one day before the deadline, and your teammate makes another submission one hour before the deadline, the latter submission will be counted as the submission by your team. (b) However, if your teammate makes a late submission that is, say, one hour late, and both of you still have allowed grace days left, the latter submission will be counted as your final submission, and a used grace day will be deducted from your allowance. (c) In the third example, let's say your teammate makes a late submission that is one hour late, and your teammate still has one allowable grace day. However, you have used up all your grace days prior to the project deadline. Then, this late submission will not be counted as your last submission, and no grace day will be deducted from your allowance.
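The three cases above can be summarized as a small decision rule. The sketch below is only an illustration under stated assumptions: it treats one grace day as 24 hours (86400 seconds) past the deadline, which the policy text does not spell out; it ignores the fact that project 0 has no grace period; and the function name late_submission_counts is hypothetical. The authoritative decision is always made by the grading system.

```shell
# Hypothetical helper, for illustration only (not part of the submit tool).
# Usage: late_submission_counts <seconds-late> <your-used-days> [<teammate-used-days>]
# Succeeds (exit status 0) if the submission would count under the stated policy.
late_submission_counts() {
    local seconds_late="$1"
    local used_self="$2"
    local used_mate="${3:-0}"   # single-person team: no teammate to check

    # On-time submissions always count.
    if [ "$seconds_late" -le 0 ]; then
        return 0
    fi
    # At most one grace day per project (ASSUMPTION: one day = 86400 seconds).
    if [ "$seconds_late" -gt 86400 ]; then
        return 1
    fi
    # Both team members must still have grace days left out of the 3 in total.
    [ "$used_self" -lt 3 ] && [ "$used_mate" -lt 3 ]
}

# Case (c): one hour late, you have used all 3 days, your teammate has used 2.
if late_submission_counts 3600 3 2; then
    echo "counted"
else
    echo "not counted"
fi
# prints "not counted"
```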
The total number of grace days used will be posted to UBLearns, as four late submission penalty grading items. Each 0.01 grade in the Grace Day items is counted as one used grace day. If you exceed three grace days, the remaining late submissions will incur a 100% late penalty, added to the Late Penalty column. However, it is your own responsibility to keep track of your remaining grace days, as we will only update them when we post the grades to UBLearns, which can be after the project deadlines.
(Signing up and submitting project 0) First, please find the commit hash of the commit you'd like to submit. In project 0, you should only have one commit in your main branch; you may find its commit hash using git log -n 1 --pretty=oneline. It is the 40-digit hexadecimal hash code shown as the first field of the output. You may then enter the following command:
submit sign-up <git-repo-link> <commit-hash> <team-partner-ubitname>
For example, if your repository link is git@github.com:userA/reponame.git (please use the complete SSH git link with the .git suffix), the commit hash is XXXXXX (must be the full 40-digit commit hash), and you'd like to sign up as a single-person team, you may enter:
submit sign-up git@github.com:userA/reponame.git XXXXXX
If you have a teammate whose UBITName is userB, you may enter:
submit sign-up git@github.com:userA/reponame.git XXXXXX userB
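Before invoking submit, the two arguments can be sanity-checked locally. The following is a minimal sketch, assuming a POSIX-compatible shell; the function names is_full_commit_hash and is_github_ssh_link are hypothetical and are not part of the submit tool, which performs its own validation.

```shell
# Hypothetical helpers, for illustration only (not part of the submit tool).

# Succeeds only if $1 is exactly 40 hexadecimal digits.
is_full_commit_hash() {
    [ "${#1}" -eq 40 ] || return 1
    case "$1" in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    return 0
}

# Succeeds only if $1 looks like a complete Github SSH link ending in .git.
is_github_ssh_link() {
    case "$1" in
        git@github.com:?*/?*.git) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, run inside your repository:
#   hash="$(git rev-parse HEAD)"
#   is_full_commit_hash "$hash" && echo "looks like a full commit hash"
```

An abbreviated hash such as the 7-character form fails the first check, which is the most common mistake this guards against.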
(Creating a git tag to help differentiate between versions of code) You may also create a tag to give a particular commit a permanent nickname and submit the tag name instead in the submit command. This can be handy if you'd like to resubmit a past submission as the latest one in later projects. To create a tag over the current commit, you may enter the following:
git tag some-tag-name #change some-tag-name to a name that can help you locate a particular commit
git push --tags
submit sign-up git@github.com:userA/reponame.git some-tag-name # if you do not have a teammate
submit sign-up git@github.com:userA/reponame.git some-tag-name userB # if you have a teammate userB
(Submission command output) You will be prompted to accept this course's academic integrity policy, which you must accept by entering y to continue. Then you will be asked to verify the information you are submitting. Once your team sign-up information is accepted, you will not be able to change it without a reasonable justification, in which case you should reach out to the TA for help through a Piazza private message, copying your teammate.
However, you may invoke submit sign-up any number of times, subject to a rate limit of up to 10 submissions per hour per team, to update your repository link or commit hash, or to fix any settings other than teammate information.
If you make a successful submission, you will see the following output:
The second to last line shows the aggregated score for each part of the project. You may find the total score for each part of each project in UBLearns (the Grades tab). For later projects, to list the score of each test case, you may find the extended result (extres) from the submit list-subs command:
(Listing submission history) To list all the past submissions for a project in your team, you may enter the following, where 0 denotes the project number. For later projects, you should replace 0 with the corresponding project number.
submit list-subs 0
The output will look like the following, in ascending submission timestamp order.
(Obtaining submission and testing result details) Each submission is also associated with a sequence number. You may obtain more detailed testing results using the following command:
submit list-sub-details <labno> <seqno>
At the end of the output, you'll also find a URL linking to the original testing log (UB VPN or campus network required).
(List project deadlines) You may enter the following command to print a summary of the project deadlines, and whether submission to a project is still accepted. (Note: late policy takes precedence for grading purpose even if we accept a submission after the project deadline and/or your allowable grace days).
submit list-labs
(Non-command-line setups) We do not recommend using VSCode or any other IDE/text editor over SSH connections. However, if you do insist on using one, please see the following for rules, suggestions and instructions: VSCode Setup.
(Alternative instructions for setting up the environment locally) If you really want to set up the dev environment anywhere other than minsky, please refer to the following instructions. However, we cannot guarantee that the testing results will be the same as in your local environment due to differences in machine configuration. Make sure to fully test your code in your dev container on minsky for each project.
If you prefer setting up the build environment locally, your system must have an x86_64 CPU and a reasonably recent Linux installed. Here's the list of required tools and external libraries (note: our CMakeLists.txt relies on the availability of pc files on your PKG_CONFIG_PATH).
libjemalloc-dev
on Ubuntu)
Here's a script for building Abseil and GoogleTest. If you do not want to install them into the default location at /usr/local, please replace the installation prefix on line 3. You may pass -DCMAKE_PREFIX_PATH=the-install-prefix-path to cmake to allow it to find the libraries.
Note: please do not install libgtest-dev via apt on Ubuntu. It is not the shared-library build that we need.