Recitation 4
Recitation notes for the week of Feb 12, 2025.
Homework reminders
Here are some gentle reminders about HW policies:
- Make sure you submit your HW as a PDF only. Otherwise your submission will not be graded.
- Remember to put collaborators and sources during your submission. Use
None
if appropriate. - Can only collaborate with 2 other people and must be for both Q1 and Q2.
- NO collaboration is allowed on Q3. Discussion about Q3 (outside of piazza) is not allowed.
- We will be running MOSS to check for cheating on Q3.
- Proof Idea and Proof Details must be clearly separated and labeled. Not doing so will result in a ZERO for the problem.
Question 1 on Homework 2
For this question, we are still using the Gale-Shapley algorithm (GS) and the stable marriage problem for matching couples, but we are adding a twist to it. One thing that the stable marriage problem does not do is handle divorces, as it assumes that the preferences do not change. What would happen if someone did change their preference list?
Problem Set-Up
- In this problem, we assume that men propose to women (as done in the book and unlike what we did in the lecture)
- $m_i$, $w_i$ $\in$ $M$, $W$ where $0 <i ≤n$
- Each $m$, $w$ have preference lists $L_{m_i}$ or $L_{w_i}$ ranking the other group from most preferred to least preferred
- The output of GS on this input is $S_{orig}$
- After making $S_{orig}$, we change one preference list from $L_w$ to $L_w’$
Now after changing the preference list of one person, we run GS again. If it outputs a set $S_{new}$ that is disjoint with $S_{orig}$, then we call the person whose list changed a reformer.
Disjoint
Two sets are disjoint if they have no element in common. In this problem, the element is a pair $(m, w)$, so for the sets to be disjoint, then no pair $(m, w)$ can appear in both sets.
Part (a)
Problem Statement
Given that an instance of the stable marriage problem has a reformer, show that $S_{orig} \neq S_{new}$.
Proof Idea
Let's look at an example input.
Example 1
Men:
- $m_1 : w_1 > w_2 > w_3$
- $m_2 : w_1 > w_3 > w_2$
- $m_3 : w_3 > w_2 > w_1$
Women:
- $w_1 : m_1 > m_2 > m_3$
- $w_2 : m_2 > m_3 > m_1$
- $w_3 : m_3 > m_1 > m_2$
If we run GS on this example, with the following sequence and final stable matching:
- $m_1$ proposes to $w_1$. She accepts.
- $m_2$ proposes to $w_1$. She rejects since she prefers $m_1$ to $m_2$.
- $m_2$ proposes to $w_3$. She accepts.
- $m_3$ proposes to $w_3$. She accepts since he prefers $m_3$ to $m_2$. $m_2$ is now free.
- $m_2$ proposes to $w_2$. She accepts.
Since all men are now matched, the algorithm ends, producing the stable matching $S_{orig} = \{(m_1, w_1), (m_2, w_2), (m_3, w_3)\}$.
What would happen if we changed $w_3$’s preference list to:
- $w_3 : m_2 > m_3 > m_1$
With this new change, running GS gives:
- $m_1$ proposes to $w_1$. She accepts.
- $m_2$ proposes to $w_1$. She rejects since she prefers $m_1$ to $m_2$.
- $m_2$ proposes to $w_3$. She accepts.
- $m_3$ proposes to $w_3$. She rejects since now she prefers $m_2$ to $m_3$. $m_3$ remains free.
- $m_3$ proposes to $w_2$. She accepts.
Now GS produces $S_{new} = \{(m_1, w_1), (m_2, w_3), (m_3, w_2)\}$. This is not the same as $S_{orig}$.
Now we try and extend this to all $n$. Our family will be defined as follows:
For inputs where $n > 3$ (there are more than $3$ men and women), the preference lists are the same as the input above, with:
- $m_4 > m_5 > ... > m_n$, and
- $w_4 > w_5 > ... > w_n$
appended to each list.
For the remaining men and women, where $i>3$, the lists are:
- $m_i : w_i > w_{i+1} > ... > w_n > w_1 > ... > w_{i-1}$
- $w_i : m_i > m_{i+1} > ... > m_n > m_1 > ... > m_{i-1}$
With the original preference lists, the sequence of proposals is:
- The first five proposals are exactly the same as in the first input above.
- After the first 5 steps, $m_i$ proposes to $w_i$ for $i>3$ and $w_i$ accepts.
This produces $S_{orig} = \{(m_1, w_1), (m_2, w_2), …, (m_n, w_n)\}$
The change to the list of $w_3$ is as above, simply replacing the first three rankings as $m_2> m_3> m_1$ followed by $m_4 > m_5 > ... > m_n$.
Exercise
Argue that with this change $S_{new} = \{(m_1, w_1), (m_2, w_3), (m_3, w_2), (m_4, w_4), …, (m_n, w_n)\}$. Then argue why this is enough to prove part (a).
One thing to remember about GS when doing these problems and examples is that the sequence of proposals does not affect the output of the algorithm. Rather, the algorithm will only ever output one specific stable matching for a given input. For a proof of this property, read about Gale-Shapley in your textbook $:)$.
Part (b)
Problem Statement
Given that an instance of the stable marriage problem has a reformer, show that $S_{orig}$ and $S_{new}$ are disjoint.
Now we want to show a family that still upholds $S_{orig} \ne S_{new}$, but also take it a step further, and show that $S_{orig}$ and $S_{new}$ are disjoint (no person is paired to the same person in both sets). Remember, a stable matching is a set of pairs, not the pair itself.
Looking at the previous example, we see that that specific instance does not satisfy this ($(m_1, w_1)$ appears in both $S_{orig}$ and $S_{new}$).
Question 2 on Homework 2
Problem Statement
The Big G
company is in town and has set up $m$ slots for $n$ engineers
to meet and interview $n$ students
to recruit these students to Big G
(you can and should assume $m > n$). They want to create a schedule such that:
- Each
student
talks to eachengineer
exactly once - No two
students
are scheduled to meet the sameengineer
in the same time slot - No two
engineers
meet the samestudent
in the same slot
Additionally, Big G
has decided that they are going to accept all $n$ students
, so they want to match each student
$S$ to an engineer
$E$ such that after $S$ meets $E$ during their schedule, all of $S$’s and $E$’s subsequent meetings are cancelled.
In other words, the goal for each engineer
$E$ and student
$S$ is to truncate their schedule (cancel every time slot in both schedules after they meet each other) such that no student gets stood up. A student
is stood up if they are scheduled to meet with an engineer
, but that engineer
truncated their schedule before that time slot.
Part (a)
Your first instinct is most likely to design a greedy algorithm (an algorithm that makes decisions without needing the entire input, so in this case, using just student
schedules, or just engineer
schedules) to solve the problem quickly. The problem with this is that these greedy algorithms either violate or outright ignore one of the stated properties or rules for this problem.
Here are two examples:
Algorithm Idea
What if we matched each engineer
to the last student
they were scheduled to meet with, such that the engineers
schedule is truncated at their last meeting, essentially removing the remaining empty slots in their schedule?
Algorithm Details
//input: n is the number of students and engineers //input: m is the number of time slots //input: student schedules are S[i] where 0 < i ≤ n and is of length m //input: engineer schedules are E[i] where 0 < i ≤ n and is of length m //input: each index of the arrays will be 0 if the slot is free, or will // contain a number corresponding to a scheduled engineer or student //return will be a list of pairs of engineers and students ret = [] for i from 1 to n: for j from m-1 to 0: if E[i][j] != 0: ret.append((E[i][j],i)) break return ret
(Dis)Proof by Counter-example
The issue here is each engineer
is not guaranteed to be matched to a unique student
:
Here we see that both engineers
have $S_2$ as their last meeting of the day, but this would mean $S_2$ is matched with $2$ engineers
, which is not allowed.
The main issue here is we are only considering the engineers'
schedules, and not paying any regard to how those schedules work with the schedules of the students
.
Algorithm Idea
What if instead of scheduling the engineers
to their last student
, we matched each student
with their last engineer
?
Algorithm Details
//input: n is the number of students and engineers //input: m is the number of time slots //input: student schedules are S[i] where 0 < i ≤ n and is of length m //input: engineer schedules are E[i] where 0 < i ≤ n and is of length m //input: each index of the arrays will be 0 if the slot is free, or will // contain a number corresponding to a scheduled engineer or student //return will be a list of pairs of engineers and students ret = [] for i from 1 to n: for j from m-1 to 0: if S[i][j] != 0: ret.append((i,S[i][j])) break return ret
Exercise
Explain why the above greedy algorithm does not work.
Part (b)
Problem Statement
Now we want to find an algorithm that actually works for this problem. To do this, you should consider both the student
schedules and the engineer
schedules. Again, the end goal is to be able to match students
to engineers
so the schedules of both can be truncated once they meet with their match.
Hint
As a hint, and to highlight what will be a recurring theme of this course, try and reduce this problem to one we have already seen in class. This will help reduce the original work you need to do, and will help you to see how a lot of the problems and algorithms in this course relate to each other.
Question 3 on Homework 2
Understanding the problem statement
Input to the problem
$m$ hospitals, each with a preference list of all $n$ students.
$n$ students, each with a preference list of all $m$ hospitals.
Difference from Stable Matching Problem in class
- The number of hospitals and the number of students are different.
- One hospital can have more than one open slot, but each student is only assigned to one hospital.
- There are at least as many students as there are total number of available slots for all $m$ hospitals.
Overall Goal
Given $n$ hospitals and $m$ students, output one stable matching assignment of hospitals to students.
As in the Stable Matching problem, a stable assignment is one that does not have any instability. However, unlike the earlier case there can be two kinds of instability, which we talk about next.
Instability (kind 1)
There are students $s$ and $s'$, and a hospital $h$, so that:
- $s$ is assigned to $h$, and
- $s'$ is assigned to no hospital, and
- $h$ prefers $s'$ to $s$.
For example, consider the following input: \[h_1 (2 \text{ spots}): s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_3), (h_1, s_4), (h_2, s_2)\}.\]
The above assignment is unstable because $s_1$ was not assigned to anyone, but is highest on $h_1$'s preference list.
Instability (kind 2)
There are students $s$ and $s'$, and hospitals $h$ and $h'$, so that:
- $s$ is assigned to $h$ and
- $s'$ is assigned to $h'$, and
- $h$ prefers $s'$ to $s$, and
- $s'$ prefers $h$ to $h'$.
For example, consider the following input: \[h_1 (2 \text{ spots}):~~~~ s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~~~~~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_1), (h_1, s_2), (h_2, s_4).\]
The above assignment is unstable because $h_2$ prefers $s_2$ to current ($s_4$) and $s_2$ prefers $h_2$ to current ($h_1$).
An example (input and output and why this is stable)
We will now go over the example given in HW 2.
Input:
3 <- Number of hospitals 5 <- Number of students 1 2 3 5 1 4 <- Preference of the 1st hospital (most preferred first with first index is available slot) 1 5 1 2 4 3 <- Preference of the 2nd hospital (most preferred first with first index is available slot) 2 5 2 3 1 4 <- Preference of the 3rd hospital (most preferred first with first index is available slot) 2 1 3 <- Preference of the 1st student (most preferred first) 3 2 1 <- Preference of the 2nd student (most preferred first) 3 1 2 <- Preference of the 3rd student (most preferred first) 1 2 3 <- Preference of the 4th student (most preferred first) 1 2 3 <- Preference of the 5th student (most preferred first)
We claim that the following is a stable assignment:
(1, 5) <- Pairing of the form (h,s) (2, 1) (3, 2) (3, 3)
Let’s take a look at why this output is stable:
- $(1,5)$: Take a look at $\{2,3\}$ who rank higher in $h_1$'s preference list. If $2$ or $3$ prefer $1$ over their current partner or are unmatched, then there would be an instability.
- $(2,1)$: $5$ is higher on $h_2$'s preference list. If $s_5$ is unmatched or prefers $h_2$ to their current partner, then there would be an instability.
- Repeat for $(3,2)$ and $(3,3)$.
Asymptotic Notation
Overview of "linear" bounds on runtime
- $O(n)$ - upper bound for worst case run time
- an upper bound, not the exact growth rate of the function
- $\Omega(n)$ - lower bound for worst case run time (NOT THE BEST CASE RUNTIME)
- a lower bound, not the exact growth rate
- $\Theta(n)$ - exact run time - if you can show something is $O(n)$ and $\Omega(n)$.
Simple example with search
The search problem
Given a string s
, search for a character, c, in the string.
Consider the most basic way solve the above problem: i.e. iterate over list:
for a in s:
if (a == e):
return a
return -1
$O(n)$ worst case we will need to iterate over every element in the list. As our list grows, our worst case scenario grows linearly.
Another example
Calling the len
function
Store the length in a function before even storing the string. For example if we store $s$ = cat
, then len
= $3$.
What are the asymptotic bounds on calling len
?
Calling len
will take the same amount of time no matter how big the string is, therefore it takes constant time:
- $O(1)$ since worst case we find the length of the string in constant time.
- $\Omega(1)$ since we cannot take less than constant time to find the length of the string.
- Since big $O$ and big $\Omega$ are the same, this solution runs in $\Theta(1)$.
Common runtimes of algorithms
$O(n)$ - linear
Examples: computing the maximum in a list, merging two sorted lists.
$O(n^2)$ - quadratic
Arises naturally from two nested for loops, why?
Examples: brute force seeing two numbers in a list add up to another, brute force substring.
$O(\log n)$ - logarithmic
Example in class with binary search. Recap of why this is true:
[1,2,3,4] → takes 2 steps with a list of size 4 [1,2][3,4] [1][2][3][4]
[1,2,3,4,5,6,7,8] → takes 3 steps with a list of size 8 (double the size) [1,2,3,4][5,6,7,8] [1,2][3,4][5,6][7,8] [1][2][3][4][5][6][7][8]
[1,2,3,4,5] best case: searching for 3, worst case: searching for 5 [3,4,5] [4,5] [5]
So binary search is $O(\log n)$ and $\Omega(1)$ ($\Omega(\log{n})$ lower bound can also be showed-- left as an exercise!).
$O(n \log n)$
Common with any algorithm that splits the input into two and solves each piece recursively and combines the two pieces in linear time.
For example: Mergesort, so also common with any algorithm that sorts first then performs a linear operation on the input since $O(n \log n) + O(n) = O(n \log n)$.
Further Reading
- Make sure you carefully read the policy documents: Syllabus and Homework policies.
- Support page on Asymptotic Analysis.
- Support page on lower bounds.