Recitation 4

Recitation notes for the week of Feb 12, 2025.

Homework reminders

Here are some gentle reminders about HW policies:

Make sure you submit your HW as a PDF only. Otherwise your submission will not be graded.
Remember to put collaborators and sources during your submission. Use None if appropriate.
Can only collaborate with 2 other people and must be for both Q1 and Q2.
NO collaboration is allowed on Q3. Discussion about Q3 (outside of piazza) is not allowed.
We will be running MOSS to check for cheating on Q3.
Proof Idea and Proof Details must be clearly separated and labeled. Not doing so will result in a ZERO for the problem.

Question 1 on Homework 2

For this question, we are still using the Gale-Shapley algorithm (GS) and the stable marriage problem for matching couples, but we are adding a twist to it. One thing that the stable marriage problem does not do is handle divorces, as it assumes that the preferences do not change. What would happen if someone did change their preference list?

Problem Set-Up

In this problem, we assume that men propose to women (as done in the book and unlike what we did in the lecture)
$m_i$, $w_i$ $\in$ $M$, $W$ where $0 <i ≤n$
Each $m$, $w$ have preference lists $L_{m_i}$ or $L_{w_i}$ ranking the other group from most preferred to least preferred
The output of GS on this input is $S_{orig}$
After making $S_{orig}$, we change one preference list from $L_w$ to $L_w’$

Now after changing the preference list of one person, we run GS again. If it outputs a set $S_{new}$ that is disjoint with $S_{orig}$, then we call the person whose list changed a reformer.

Disjoint

Two sets are disjoint if they have no element in common. In this problem, the element is a pair $(m, w)$, so for the sets to be disjoint, then no pair $(m, w)$ can appear in both sets.

Part (a)

Problem Statement

Given that an instance of the stable marriage problem has a reformer, show that $S_{orig} \neq S_{new}$.

Proof Idea

Let's look at an example input.

Example 1

Men:

$m_1 : w_1 > w_2 > w_3$
$m_2 : w_1 > w_3 > w_2$
$m_3 : w_3 > w_2 > w_1$

Women:

$w_1 : m_1 > m_2 > m_3$
$w_2 : m_2 > m_3 > m_1$
$w_3 : m_3 > m_1 > m_2$

If we run GS on this example, with the following sequence and final stable matching:

$m_1$ proposes to $w_1$. She accepts.
$m_2$ proposes to $w_1$. She rejects since she prefers $m_1$ to $m_2$.
$m_2$ proposes to $w_3$. She accepts.
$m_3$ proposes to $w_3$. She accepts since he prefers $m_3$ to $m_2$. $m_2$ is now free.
$m_2$ proposes to $w_2$. She accepts.

Since all men are now matched, the algorithm ends, producing the stable matching $S_{orig} = \{(m_1, w_1), (m_2, w_2), (m_3, w_3)\}$.

What would happen if we changed $w_3$’s preference list to:

$w_3 : m_2 > m_3 > m_1$

With this new change, running GS gives:

$m_1$ proposes to $w_1$. She accepts.
$m_2$ proposes to $w_1$. She rejects since she prefers $m_1$ to $m_2$.
$m_2$ proposes to $w_3$. She accepts.
$m_3$ proposes to $w_3$. She rejects since now she prefers $m_2$ to $m_3$. $m_3$ remains free.
$m_3$ proposes to $w_2$. She accepts.

Now GS produces $S_{new} = \{(m_1, w_1), (m_2, w_3), (m_3, w_2)\}$. This is not the same as $S_{orig}$.

Now we try and extend this to all $n$. Our family will be defined as follows:

For inputs where $n > 3$ (there are more than $3$ men and women), the preference lists are the same as the input above, with:

$m_4 > m_5 > ... > m_n$, and
$w_4 > w_5 > ... > w_n$

appended to each list.

For the remaining men and women, where $i>3$, the lists are:

$m_i : w_i > w_{i+1} > ... > w_n > w_1 > ... > w_{i-1}$
$w_i : m_i > m_{i+1} > ... > m_n > m_1 > ... > m_{i-1}$

With the original preference lists, the sequence of proposals is:

The first five proposals are exactly the same as in the first input above.
After the first 5 steps, $m_i$ proposes to $w_i$ for $i>3$ and $w_i$ accepts.

This produces $S_{orig} = \{(m_1, w_1), (m_2, w_2), …, (m_n, w_n)\}$

The change to the list of $w_3$ is as above, simply replacing the first three rankings as $m_2> m_3> m_1$ followed by $m_4 > m_5 > ... > m_n$.

Exercise

Argue that with this change $S_{new} = \{(m_1, w_1), (m_2, w_3), (m_3, w_2), (m_4, w_4), …, (m_n, w_n)\}$. Then argue why this is enough to prove part (a).

One thing to remember about GS when doing these problems and examples is that the sequence of proposals does not affect the output of the algorithm. Rather, the algorithm will only ever output one specific stable matching for a given input. For a proof of this property, read about Gale-Shapley in your textbook $:)$.

Part (b)

Problem Statement

Given that an instance of the stable marriage problem has a reformer, show that $S_{orig}$ and $S_{new}$ are disjoint.

Now we want to show a family that still upholds $S_{orig} \ne S_{new}$, but also take it a step further, and show that $S_{orig}$ and $S_{new}$ are disjoint (no person is paired to the same person in both sets). Remember, a stable matching is a set of pairs, not the pair itself.

Looking at the previous example, we see that that specific instance does not satisfy this ($(m_1, w_1)$ appears in both $S_{orig}$ and $S_{new}$).

Question 2 on Homework 2

Problem Statement

The Big G company is in town and has set up $m$ slots for $n$ engineers to meet and interview $n$ students to recruit these students to Big G (you can and should assume $m > n$). They want to create a schedule such that:

Each student talks to each engineer exactly once
No two students are scheduled to meet the same engineer in the same time slot
No two engineers meet the same student in the same slot

Additionally, Big G has decided that they are going to accept all $n$ students, so they want to match each student $S$ to an engineer $E$ such that after $S$ meets $E$ during their schedule, all of $S$’s and $E$’s subsequent meetings are cancelled.

In other words, the goal for each engineer $E$ and student $S$ is to truncate their schedule (cancel every time slot in both schedules after they meet each other) such that no student gets stood up. A student is stood up if they are scheduled to meet with an engineer, but that engineer truncated their schedule before that time slot.

Part (a)

Your first instinct is most likely to design a greedy algorithm (an algorithm that makes decisions without needing the entire input, so in this case, using just student schedules, or just engineer schedules) to solve the problem quickly. The problem with this is that these greedy algorithms either violate or outright ignore one of the stated properties or rules for this problem.

Here are two examples:

Algorithm Idea

What if we matched each engineer to the last student they were scheduled to meet with, such that the engineers schedule is truncated at their last meeting, essentially removing the remaining empty slots in their schedule?

Algorithm Details

        //input: n is the number of students and engineers
        //input: m is the number of time slots
        //input: student schedules are S[i] where 0 < i ≤ n and is of length m
        //input: engineer schedules are E[i] where 0 < i ≤ n and is of length m
        //input: each index of the arrays will be 0 if the slot is free, or will 
        //       contain a number corresponding to a scheduled engineer or student
        //return will be a list of pairs of engineers and students
        
        ret = []
        
        for i from 1 to n:
            for j from m-1 to 0:
                if E[i][j] != 0:
                    ret.append((E[i][j],i))
                    break
        
        return ret

(Dis)Proof by Counter-example

The issue here is each engineer is not guaranteed to be matched to a unique student:

Here we see that both engineers have $S_2$ as their last meeting of the day, but this would mean $S_2$ is matched with $2$ engineers, which is not allowed.

The main issue here is we are only considering the engineers' schedules, and not paying any regard to how those schedules work with the schedules of the students.

Algorithm Idea

What if instead of scheduling the engineers to their last student, we matched each student with their last engineer?

Algorithm Details

        //input: n is the number of students and engineers
        //input: m is the number of time slots
        //input: student schedules are S[i] where 0 < i ≤ n and is of length m
        //input: engineer schedules are E[i] where 0 < i ≤ n and is of length m
        //input: each index of the arrays will be 0 if the slot is free, or will 
        //       contain a number corresponding to a scheduled engineer or student
        //return will be a list of pairs of engineers and students
        
        ret = []
        
        for i from 1 to n:
            for j from m-1 to 0:
                if S[i][j] != 0:
                    ret.append((i,S[i][j]))
                    break
        
        return ret

Exercise

Explain why the above greedy algorithm does not work.

Part (b)

Problem Statement

Now we want to find an algorithm that actually works for this problem. To do this, you should consider both the student schedules and the engineer schedules. Again, the end goal is to be able to match students to engineers so the schedules of both can be truncated once they meet with their match.

Hint

As a hint, and to highlight what will be a recurring theme of this course, try and reduce this problem to one we have already seen in class. This will help reduce the original work you need to do, and will help you to see how a lot of the problems and algorithms in this course relate to each other.

Question 3 on Homework 2

Understanding the problem statement

Input to the problem

$m$ hospitals, each with a preference list of all $n$ students.

$n$ students, each with a preference list of all $m$ hospitals.

Difference from Stable Matching Problem in class

The number of hospitals and the number of students are different.
One hospital can have more than one open slot, but each student is only assigned to one hospital.
There are at least as many students as there are total number of available slots for all $m$ hospitals.

Overall Goal

Given $n$ hospitals and $m$ students, output one stable matching assignment of hospitals to students.

As in the Stable Matching problem, a stable assignment is one that does not have any instability. However, unlike the earlier case there can be two kinds of instability, which we talk about next.

Instability (kind 1)

There are students $s$ and $s'$, and a hospital $h$, so that:

$s$ is assigned to $h$, and
$s'$ is assigned to no hospital, and
$h$ prefers $s'$ to $s$.

For example, consider the following input: \[h_1 (2 \text{ spots}): s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_3), (h_1, s_4), (h_2, s_2)\}.\]

The above assignment is unstable because $s_1$ was not assigned to anyone, but is highest on $h_1$'s preference list.

Instability (kind 2)

There are students $s$ and $s'$, and hospitals $h$ and $h'$, so that:

$s$ is assigned to $h$ and
$s'$ is assigned to $h'$, and
$h$ prefers $s'$ to $s$, and
$s'$ prefers $h$ to $h'$.

For example, consider the following input: \[h_1 (2 \text{ spots}):~~~~ s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~~~~~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_1), (h_1, s_2), (h_2, s_4).\]

The above assignment is unstable because $h_2$ prefers $s_2$ to current ($s_4$) and $s_2$ prefers $h_2$ to current ($h_1$).

An example (input and output and why this is stable)

We will now go over the example given in HW 2.

Input:

                3                               <- Number of hospitals
                5                               <- Number of students
                1   2   3   5   1   4           <- Preference of the 1st hospital (most preferred first with first index is available slot)
                1   5   1   2   4   3           <- Preference of the 2nd hospital (most preferred first with first index is available slot)
                2   5   2   3   1   4           <- Preference of the 3rd hospital (most preferred first with first index is available slot)
                2   1   3                       <- Preference of the 1st student (most preferred first)
                3   2   1                       <- Preference of the 2nd student (most preferred first)
                3   1   2                       <- Preference of the 3rd student (most preferred first)
                1   2   3                       <- Preference of the 4th student (most preferred first)
                1   2   3                       <- Preference of the 5th student (most preferred first)

We claim that the following is a stable assignment:

                (1, 5)                          <- Pairing of the form (h,s)
                (2, 1)
                (3, 2)
                (3, 3)

Let’s take a look at why this output is stable:

$(1,5)$: Take a look at $\{2,3\}$ who rank higher in $h_1$'s preference list. If $2$ or $3$ prefer $1$ over their current partner or are unmatched, then there would be an instability.
$(2,1)$: $5$ is higher on $h_2$'s preference list. If $s_5$ is unmatched or prefers $h_2$ to their current partner, then there would be an instability.
Repeat for $(3,2)$ and $(3,3)$.

Asymptotic Notation

Overview of "linear" bounds on runtime

$O(n)$ - upper bound for worst case run time
- an upper bound, not the exact growth rate of the function
$\Omega(n)$ - lower bound for worst case run time (NOT THE BEST CASE RUNTIME)
- a lower bound, not the exact growth rate
$\Theta(n)$ - exact run time - if you can show something is $O(n)$ and $\Omega(n)$.

Simple example with search

The search problem

Given a string s, search for a character, c, in the string.

Consider the most basic way solve the above problem: i.e. iterate over list:

                        for a in s:
                            if (a == e):
                                return a
                            return -1

$O(n)$ worst case we will need to iterate over every element in the list. As our list grows, our worst case scenario grows linearly.

Another example

Calling the `len` function

Store the length in a function before even storing the string. For example if we store $s$ = cat, then len = $3$.

What are the asymptotic bounds on calling len?

Calling len will take the same amount of time no matter how big the string is, therefore it takes constant time:

$O(1)$ since worst case we find the length of the string in constant time.
$\Omega(1)$ since we cannot take less than constant time to find the length of the string.
Since big $O$ and big $\Omega$ are the same, this solution runs in $\Theta(1)$.

Common runtimes of algorithms

$O(n)$ - linear

Examples: computing the maximum in a list, merging two sorted lists.

$O(n^2)$ - quadratic

Arises naturally from two nested for loops, why?

Examples: brute force seeing two numbers in a list add up to another, brute force substring.

$O(\log n)$ - logarithmic

Example in class with binary search. Recap of why this is true:

                [1,2,3,4]     → takes 2 steps with a list of size 4
                [1,2][3,4]       
                [1][2][3][4]

                [1,2,3,4,5,6,7,8]    → takes 3 steps with a list of size 8 (double the size)
                [1,2,3,4][5,6,7,8]        
                [1,2][3,4][5,6][7,8]
                [1][2][3][4][5][6][7][8]

                [1,2,3,4,5] best case: searching for 3, worst case: searching for 5 
                [3,4,5]
                [4,5]
                [5]

So binary search is $O(\log n)$ and $\Omega(1)$ ($\Omega(\log{n})$ lower bound can also be showed-- left as an exercise!).

$O(n \log n)$

Common with any algorithm that splits the input into two and solves each piece recursively and combines the two pieces in linear time.

For example: Mergesort, so also common with any algorithm that sorts first then performs a linear operation on the input since $O(n \log n) + O(n) = O(n \log n)$.

Recitation 4

Homework reminders

Question 1 on Homework 2

Problem Set-Up

Disjoint

Part (a)

Problem Statement

Proof Idea

Example 1

Exercise

Part (b)

Problem Statement

Question 2 on Homework 2

Problem Statement

Part (a)

Algorithm Idea

Algorithm Details

(Dis)Proof by Counter-example

Algorithm Idea

Algorithm Details

Exercise

Part (b)

Problem Statement

Hint

Question 3 on Homework 2

Understanding the problem statement

Input to the problem

Difference from Stable Matching Problem in class

Overall Goal

Instability (kind 1)

Instability (kind 2)

An example (input and output and why this is stable)

Asymptotic Notation

Overview of "linear" bounds on runtime

Simple example with search

The search problem

Another example

Calling the len function

Common runtimes of algorithms

$O(n)$ - linear

$O(n^2)$ - quadratic

$O(\log n)$ - logarithmic

$O(n \log n)$

Further Reading

Calling the `len` function