Homework 0

Due by 8:00pm, Tuesday, February 4, 2025.

Make sure you follow all the homework policies.

All submissions should be done via Autolab

The support page for matrix vector multiplication could be very useful for this homework.

Submitting HW 0 is optional. However, we do encourage you to submit to get familiar with Autolab and to get some feedback.

You may assume that your container is such that the RapidGrowers can continue replicating every second until you come back to Buffalo.
In other words, we will cast the given problem into a problem for which we know the solution and then directly appeal to the previously known solution. In the present case the reduction is just a matter of changing the language: in later problems in the course, you will have to do more work in the reduction.
Technically, you should prove this claim by induction too but here you can make this claim without a formal proof. In general, you will have to make such decisions at many points while writing your HW solutions. If you are not sure, take the conservative approach. That is, in the present case, go ahead and present the formal proof by induction.

What is a proof?

The goal of this question is to present a gentle start to proofs. In particular, the idea is to highlight a common mistake students make while writing proofs.

The Problem

Consider the following "proof":

Brad Pitt has a beard:

Every goat has a beard:

Hence, Brad Pitt is a goat.

The Question

State precisely where the ``proof" above fails logically.

Follow-up question

Can you prove that Brad Pitt is not a goat?

Submission

You will NOT submit this question. This is for you to get into the grove of thinking about proofs.

Sample Problem 1

Note

Until the mid-term exams, every homework will have a sample solved problem, which will give you more examples of how to break-up your solutions into the proof/algorithm idea and proof/algorithm details. There is another sample problem at the end of the homework.

The Problem

For this and the remaining problems, we will be working with $n\times n$ matrices (or two-dimensional arrays). So for example the following is a $3\times 3$ matrix \[ \mathbf{M}=\begin{pmatrix} 1 & 2 & -3\\ 2 & 9 &0\\ 6 & -1 & -2 \end{pmatrix}. \] Given an $n\times n$ matrix $\mathbf{A}$, we will refer to the the entry in the $i$th row and $j$th column (for $0\le i,j<n$) as $\mathbf{A}[i][j]$. So for example, for the matrix $\mathbf{M}$ above, $\mathbf{M}[1][2]=0$. (We will use the convention that the top left element is the $[0][0]$ element.) We will also work with vectors of length $n$, which are just arrays of size $n$. For example the following is a vector of length $3$ \[\mathbf{z}= \begin{pmatrix} 2\\ 3\\ -1 \end{pmatrix}. \] As usual we will use $x[i]$ to denote the $i$th entry in the array/vector. For example, in the vector/array $\mathbf{z}$ above, we have $\mathbf{z}[2]=-1$.

We are finally ready to define the matrix-vector multiplication problem. Given an $n\times n$ matrix $\mathbf{A}$ and a vector $\mathbf{x}$ of length $n$, their product is denoted by \[\mathbf{y} =\mathbf{A}\cdot \mathbf{x},\] where $\mathbf{y}$ is also a vector of length $n$ and its $i$th entry for $0\le i< n$ is defined as follows: \[\mathbf{y}[i] =\sum_{j=0}^{n-1} \mathbf{A}[i][j]\cdot \mathbf{x}[j].\] For example, here is the worked out example for $\mathbf{M}$ and $\mathbf{z}$ above: \[ \begin{pmatrix} 1 & 2 & -3\\ 2 & 9 &0\\ 6 & -1 & -2 \end{pmatrix} \cdot \begin{pmatrix} 2\\ 3\\ -1 \end{pmatrix} = \begin{pmatrix} 1\times 2 + 2\times 3 + (-3)\times(-1)\\ 2\times 2+ 9\times 3+0\times (-1)\\ 6\times 2+(-1)\times 3+ (-2)\times(-1) \end{pmatrix} = \begin{pmatrix} 11\\ 31\\ 11 \end{pmatrix}. \]

Finally, we come to the actual question:

Design an $O(n^2)$ algorithm, which given any $n\times n$ matrix $\mathbf{A}$ and vector $\mathbf{x}$ of length $n$ correctly computes $\mathbf{A}\cdot \mathbf{x}$.

Note

This is exactly Exercise 1 in the support page on matrix vector multiplication.

Hint

The obvious algorithm works for this problem.
Argue that your algorithm in part above runs in time $\Omega(n^2)$. (Hence, conclude that your algorithm runs in time $\Theta(n^2)$.)
(Only think about this if you have time to spare.) Argue that any algorithm that solves the matrix vector multiplication problem (for arbitrary $\mathbf{A}$ and $\mathbf{x}$) needs to take $\Omega(n^2)$ time.

Note

This is the same as Exercise 2 in the support page on matrix vector multiplication. Further, note that this part implies the second part but if you want to use third part to solve the second part, then you will have to present the solution to third part too. I would recommend that you solve part three only if you want to stretch your algorithmic thinking.

Click here for the Solution

Algorithm Idea

The idea is to basically implement the definition of the matrix vector multiplication. In particular, for each $0\le i <n$, we compute the sum $\sum_{j=0}^{n-1} \mathbf{A}[i][j]\cdot \mathbf{x}[j]$ in the obvious way (i.e. loop over $j$ and maintain the partial sum). So the algorithm will basically be a double loop where each iteration will take $\Theta(1)$ time.

Algorithm Details

                        Allocate an array of length n and call it y

                        for(i=0;i<n;i++)
                            y[i]=0


                        for{(i=0;i<n;i++)
                            for(j=0;j<n;j++)
                                y[i]+= A[i][j]* x[j]

                        return y

Runtime Analysis

We start with the upper bound. Allocation of y takes $O(n)$ time (as we saw in class). The first loop runs for $n$ iterations and each iteration is just an assignment, which takes $O(1)$ time. Thus, the first for loop takes $O(n)$ time overall. We next consider the two nested loops. Each loop runs at most $n$ times for a total of at most $n^2$ executions of the step y[i]+= A[i][j]* x[j]. Note that this step is an addition and an assignment and hence takes $O(1)$ time. Thus, the entire nested for loops take $O(n^2)$ time. Finally the return statement is returning $n$ values and hence takes $O(n)$ time. Thus, the overall runtime is $O(n)+O(n)+O(n^2)+O(n)$, which is $O(n^2)$, as desired.

For the lower bound argument we just focus on the nested loops. We note that for any input the outer loops runs for $n$ values ($i=0,\dots,n-1$) and for each such value of $i$, the inner for loop is also runs $n$ times. Thus, we have $n^2$ executions of the line y[i]+= A[i][j]* x[j], which will take at least one step. Thus, we have that the algorithm runs in time at least $n^2$, which is $\Omega(n^2)$, as desired.

Submission

You will NOT submit this question. This is for you to get into thinking more about matrix vector multiplication.

Question 1 (Number of perfect matchings) [50 points]

The Problem

Let $n\ge 1$ be an integer. Given $n$ men and $n$ women, a perfect matching is a way to assign every man to a unique woman and vice-versa. For example for $n=2$, where Brad Pitt and Billy Bob Thornton are the two men and Jennifer Aniston and Angelina Jolie are the two women, one possible perfect matching consist of the two pairs (Brad Pitt, Jennifer Aniston) and ( Billy Bob Thornton, Angelina Jolie).

Here are the two parts:

Part (a): Argue that the number of perfect matching just depends on $n$ and is independent of the identities of the men and women.

Note

Since HW 0 tests your prior knowledge, part (a) will not be explicitly discussed in the recitations. Starting with HW 1, part (a) will be discussed in recitations.
Part (b): Prove that the total number of perfect matchings when you have $n$ men and $n$ women is $n!$. (Recall that $1!=1$ and for $n\ge 2$, we have $n!=n\times (n-1)\times (n-2)\times\cdots \times 2\times 1$.)

Hint

Use proof by induction.

Common Mistake

A common mistake for this problem is to present a "proof" by induction that does not use anything specific to the problem at hand.

Submission

Submit part (a) and (b) separately

You need to submit two (2) PDF files to Autolab: one for part (a) and one for part (b). While you can assume part (a) as a given for part (b), to get credit for part (a) you have to submit your solution for part (a) separately from part (b).

Make sure you submit the correct PDF to the correct submission link on Autolab. If you do not (e.g. if you submit Q1(a) PDF to Q1(b)), then you will lose ALL points.

We recommend that you typeset your solution but we will accept scans of handwritten solution-- you have to make sure that the scan is legible.

PDF only please

If Autolab cannot display your file, (irrespective of the reason) then you will get a zero (0) on the entire question.

Autolab might not be able to display files in formats other than PDF (e.g. Word cannot be displayed). Note that Autolab will "accept" your submission even if you submit non-PDF file, so it is YOUR responsibility to make sure you submit in the correct format.

Also the file size has to be at most 3MB.

Grading Guidelines

We will follow the usual grading guidelines for non-programming questions. Here is a high level grading rubric specific to part (a) of this problem:

Proof idea: 10 points.

and here is the high level grading rubric for part (b):

Proof idea: 20 points.
Proof details: 20 points.

Note

If you do not have separated out proof idea and proof details, you will get a zero(0) irrespective of the technical correctness of your solution..

Templates

Download LaTeX template. Download Microsoft Word template.

Note

Your must explicitly list your sources and collaborators in your submission file before uploading it to Autolab. Note that you can only used one of the five allowed sources. If you have used a source that is not allowed, please do not submit your homework. If you did not consult any source and/or did not collaborate with anyone just say None.

Question 2 (Asymptotic Notation) [25 points]

The Problem

Order the following running times in ascending order so that if one (let's call it $f(n)$) comes before another (let's call it $g(n)$), then $f(n)$ is $O(g(n))$: \[2^{n^{1.00001}},~n^{3^{10}},~2^{(\log_{96}{n})^{\sqrt{2}-0.4}},~9^{\log_{10.1}{n}},~n^n,~10^{1000}\cdot n!.\] Briefly argue why your order is correct (formal proof is not needed).

(For this problem, it might help to know Stirling's approximation: \[n!=\sqrt{2\pi n}\left(\frac{n}{e}\right)^ne^{\lambda_n},\] where \[\frac{1}{12n+1}<\lambda_n<\frac{1}{12n}.\] Also recall that $x=\log_a{b}$ implies that $a^x=b$.)

Hint

It might be useful to first rank all the functions other than $10^{1000}\cdot n!$ and $2^{(\log_{96}{n})^{\sqrt{2}-0.4}}$. Once you are done with rest of the four functions, put $2^{(\log_{96}{n})^{\sqrt{2}-0.4}}$ in its correct place and then put the $10^{1000}\cdot n!$ function in its correct position once you are done with the rest. The support page on logs might be useful for this problem.

Submission

Do NOT submit this question. This question is for you to get started thinking about asymptotic notation.

Note

You will be submitting Question 2 from homework 1 onwards. For HW 0, the primary aim is for y'all to get comfortable with Autolab: hence, we are asking you to not submit your solution for Question 2.

Question 3 (Programming Assignment) [25 points]

Note

This assignment can be solved in either Java, Python or C++ (you should pick the language you are most comfortable with). Please make sure to look at the supporting documentation and files for the language of your choosing.

The Problem

In this problem we will consider matrix and vector multiplication, but with the caveat that we already know something about the matrix. In particular for any $0\le i, j\le n-1$, we have \[\mathbf{U}_n[i][j]=\begin{cases} 0 & \text{if } j< i\\ 1 & \text{otherwise}.\end{cases}\] If you "draw" this matrix you will note that all of the elements below the diagonal is zero: such matrices are called upper triangular matrices. Further, all non-zero values (in the "upper triangular" part) are all $1$. Note that we only need to know the value of $n$ to completely determine $\mathbf{U}_n$.

You have to write code to solve the following computational problem: given any integer $n\ge 1$ and a vector $\mathbf{x}$ of length $n$, return the vector $\mathbf{y}=\mathbf{U}_n\cdot \mathbf{x}$. Read on for more details on the format etc.

Sometimes it is easier to visualize a specific $\mathbf{U}_n$. Below is $\mathbf{U}_5$ written down in a 2D-array form:

                 1   1   1   1   1   
                 0   1   1   1   1  
                 0   0   1   1   1 
                 0   0   0   1   1
                 0   0   0   0   1

Input

You will not receive the matrix as the input.

Note

You could build the matrix $\mathbf{U}_n$ yourself but this is HIGHLY DISCOURAGED. For the biggest input, you will need about 1 TB to store this matrix explicitly!

What you will receive is the vector itself. Each input file will contain one vector. The length of the vector will be on the first line of the file. The vector itself will take up the remaining lines. Once again, you will not have to handle the file reading and the creation of data structures. The packaged driver will handle that for you.

The input file can be summed up in the following format:

where each x_k is a value of the vector, in vertical order.

This is an example of what a file may look like, for size $n=5$:

Output

The output will be the vector $\mathbf{y}$ in vertical order. I.e.

The correct result for the earlier example of $\mathbf{x}$ would be:

Hint

The best possible algorithm for this problem runs in time $O(n)$.

Note

Both the input and output parsers in each of the three languages are already written for you.
Note that you have to work with the input data structures provided (which will come pre-loaded with the data from an input file).

Addition is the only change you should make

Irrespective of what language you use, you will have to submit just one file. That file will come pre-populated with some stuff in it. You should not change any of those things because if you do you might break what the grader expects and end up with a zero on the question. You should of course add stuff to it (including helper functions and data structures as you see fit).

Download Java Skeleton Code

Directory Structure

                                    ├── src
                                    │   ├── ub
                                    │       ├── cse
                                    │           ├── algo
                                    │               ├── Driver.java
                                    │               ├── HW0Utility.java
                                    │               ├── Solution.java
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    ├── outputs/
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given two coding files: Driver.java and Solution.java. Driver.java takes the input file, parses it and creates an instance of the class Solution and prints the result to the command line. You only need to update the Solution.java file. You may write your own helper methods and data structures in it.

The testcases folder has 3 input files and their corresponding output files for your reference. We will use these three input files (and seven others) in our autograding.

Method you need to write:

                                        /**
                                         * This method must be filled in by you. You may add other methods and subclasses as you see fit,
                                         * but they must remain within the Solution class.
                                         * @return Your resulting vector.
                                         */
                                        public ArrayList<Integer> outputVector() {
                                            return new ArrayList<Integer>(); //Change this to return your output vector
                                        }

The Solution class has 1 instance variable.

in_vector which is an array of ints, and stores the input vector given by the driver.

Compiling and executing from command line:

Assuming you're in the same directory level as src/. Run javac src/ub/cse/algo/*.java to compile.

To execute your code on input1.txt, run java -cp "src" ub.cse.algo.Driver testcases/input1.txt. You can compare your output to output1.txt.

Submission

You only need to submit Solution.java to Autolab.

Download Python Skeleton Code

Directory Structure

                                    ├── Driver.py
                                    ├── Solution.py
                                    ├── Utility.py
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    ├── outputs/
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given two coding files: Driver.py and Solution.py. Driver.py takes the input file, parses it and calls your Solution.py class' output_vector() method on data structures it creates. It then prints the computed output vector. You only need to update the Solution.py file. You may write your own helper methods and data structures in it.

The testcases folder has 3 input files and their corresponding output files for your reference. We will use these three input files (and seven others) in our autograding.

Method you need to write:

                                    def output_vector(self):
                                        """
                                        This method must be filled in by you. You may add other methods and subclasses as you see fit,
                                        but they must remain within the Solution class.
                                        :return: a correct output vector, as a Python list
                                        """

                                        return None

The Solution class contains one data structure.

in_vector which is the vector supplied by the driver.

Your method is expected to return a list of integers representing the correct output vector. Make sure that the order is correct.

Executing from command line:

Assuming you're in the same directory level as Driver.py and you want to run your code on the input1.txt. Run python Driver.py testcases/input1.txt.

Submission

You only need to submit Solution.py to Autolab.

Download C++ Skeleton Code

Directory Structure

                                    ├── Driver.cpp
                                    ├── Solution.cpp
                                    ├── HW0Utility.h
                                    ├── Utility.h
                                    ├── testcases/
                                    │   ├── input1.txt
                                    │   ├── input2.txt
                                    │   ├── input5.txt
                                    ├── outputs/
                                    │   ├── output1.txt
                                    │   ├── output2.txt
                                    │   └── output5.txt

You are given two coding files: Driver.cpp and Solution.cpp. Driver.cpp takes the input file, parses it and creates an instance of the class Solution and calls the outputVector() method on it. It then prints the output vector. You only need to update the Solution.cpp file. You may write your own helper methods and data structures in it.

The testcases folder has 3 input files and their corresponding output files for your reference. We will use these three input files (and seven others) in our autograding.

Method you need to write:

                                        vector<int> Solution::outputVector() {
                                            /*
                                            * Implement the solution in this function
                                            * Return the variable "m_outputVector" after you fill it with the correct result
                                            */

                                            return m_outputVector;
                                        }

Note: The output for this function must output a vector of <int>s. Make sure that the order is correct.

The Solution class has two member variables.

m_inputVector which is a vector of ints, and stores the input given by the driver.
m_outputVector which should be your correct output vector.

Compiling and executing from command line:

Assuming you're in the same directory level as Driver.cpp. Run g++ -std=c++11 Driver.cpp to compile.

To execute your code on input1.txt, run ./a.out testcases/input1.txt.

Submission

You only need to submit Solution.cpp to Autolab.

Grading Guidelines

We will follow the usual grading guidelines for programming questions.

Sample Problem 2

The Problem

Deep in the Pacific ocean you discover the RapidGrowers bacteria. These are single cell organism that divide themselves into two (identical) organisms/cells in one second. When you start going back to the surface you store one organism in a special container that exactly replicates their habitat. From the moment you put the organism in, it takes you $s$ seconds to come back to Buffalo to find your container bursting at its seams.¹

Prove or disprove the following:

There are $2^s$ RapidGrowers in your container.

Hint

You can prove this by induction or by "reducing" this problem to fact about binary trees that you should have seen in CSE 191/250.

Click here for the Solution

Note

Notice how the solution below is divided into proof idea and proof details part. THIS IS IMPORTANT: IF YOU DO NOT PRESENT A PROOF IDEA, YOU WILL NOT GET ANY CREDIT EVEN IF YOUR PROOF DETAILS ARE CORRECT.

Proof Idea

As the hint suggests there are two ways of solving this problem. (I'm presenting both the solutions but of course you only need to present one.)

We begin with the approach of reducing the given problem to a problem you have seen earlier.² Build the following complete binary tree: every internal node in the tree represents a "parent" RapidGrower while its two children are the two RapidGrowers it divides itself into. After $s$ seconds this tree will have height $s$ and the number of RapidGrowers in the container after $s$ seconds is the number of leaf nodes these complete binary tree has, which we know is $2^s$. Hence, the claim is correct.

The proof by induction might be somewhat simpler for this problem if you are not comfortable with reduction. In this case let $R(s)$ be the number of RapidGrowers after $s$ seconds. Then we use induction to prove that $R(s)=2^s$ while using the fact that $2\cdot 2^s=2^{s+1}$.

Proof Details

We first present the reduction based proof. Consider the complete binary tree with height $s$ and call it $T(s)$. Further, note that one can construct $T(s+1)$ from $T(s)$ by attaching two children nodes to all the leaves in $T(s)$. Notice that the newly added children are the leaves of $T(s+1)$. Now assign the root of $T(0)$ as the original RapidGrower in the container. Further, for any internal node in $T(s)$ ($s\ge 0$), assign its two children to the two RapidGrowers it divides itself into. Then note that there is a one to one correspondence between the RapidGrowers after $s$ seconds and the leaves of $T(s)$.³ Then we use the well-known fact (cite your 191/250 book here with the exact place where one can find this fact): $T(s)$ has $2^s$ leaves, which means that the number of RapidGrowers in the container after $s$ seconds is $2^s$, which means that the claim is correct.

We now present the proof by induction. Let $R(s)$ be the number of RapidGrowers in the container after $s$ seconds. We will prove by induction that \[R(s)=2^s,\] which would mean that the claim is correct. First note that for the case case $s=0$, we have one RapidGrower, i.e. $R(0)=1=2^0$, which proves the base case. For the inductive hypothesis, assume that for some $s\ge 0$, $R(s)=2^s$. Next we will argue that $R(s+1)=2^{s+1}$, which would complete the induction. Indeed, consider each of the $R(s)$ RapidGrower. After $s+1$ seconds, each such RapidGrower will be replaced by two RapidGrowers. This implies that \[R(s+1)=2\cdot R(s)=2\cdot 2^s=2^{s+1},\] where the second equality follows from the inductive hypothesis. This completes the proof.