Adversarial Lower Bounds
Some notes on proving an $\Omega$ lower bound on the runtime of every algorithm that solves a given problem.
The setup
We have seen earlier how we can argue an $\Omega$ lower bound on the runtime of a specific algorithm. On this page, we will aim higher.
The main aim
Given a problem, prove an $\Omega$ lower bound on the runtime of any (correct) algorithm that solves the problem.
If the above sounds a bit lofty, it is because proving non-trivial lower bounds for all algorithms seems like a hopeless task: e.g., how can you make sure that you have reasoned about all the crazy things an algorithm can do? In general, proving non-trivial lower bounds is something we as a species struggle with (or at least have so far). In this note, we will aim a bit lower and try to prove an "obvious" lower bound. We will start off with the main idea, see how two obvious implementations of it fail, and then see one way to make the argument work (which is known as the adversarial argument).
The first observation
One natural idea (that always works) is to do the following:
Lower Bound based on Output Size
Any algorithm that for inputs of size $N$ has a worst-case output size of $f(N)$ needs to have a runtime of $\Omega(f(N))$ (since it has to output all the $f(N)$ elements of the output in the worst-case).
The problem with the above approach is that it does not help when the output is small: e.g., if the output is always a single bit, we only get an $\Omega(1)$ lower bound, which is not that interesting since any algorithm that executes even one step has an $\Omega(1)$ runtime.
Given the above it is natural to try and use the following strategy:
Lower Bound argument based on Input Size
Since the algorithm "has" to read its entire input, any algorithm runs in $\Omega(N)$ time on inputs of size $N$.
The problem with the argument above is that it is false. Consider the following problem:
Searching a sorted list
Given $n$ numbers $a_0,\dots,a_{n-1}$ in sorted order and a number $v$, check if there exists an $i$ such that $a_i=v$.
In the above, $N=n+1$, and via the well-known binary search algorithm the problem can be solved in $O(\log{n})$ time, which is not $\Omega(n)$.
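For reference, here is a minimal Python sketch of binary search (the function name and the convention of returning `None` on failure are our own choices):

```python
def binary_search(A, v):
    """Return an index i with A[i] == v in the sorted list A, or None."""
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == v:
            return mid
        elif A[mid] < v:
            lo = mid + 1  # v can only be in the right half
        else:
            hi = mid - 1  # v can only be in the left half
    return None
```

Each iteration halves the interval $[lo,hi]$, so the loop runs $O(\log{n})$ times.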
In the rest of the note, we will prove the following result:
Theorem
Every correct algorithm that solves the problem of searching in a sorted list needs $\Omega(\log{n})$ time.
Another attempt
A natural idea to try and prove the Theorem above is to
Come up with a bad input
Try and come up with one input on which every algorithm takes $\Omega(\log{n})$ time.
As one might imagine, the above does not work: given any fixed input, it is not too hard to come up with an algorithm that solves that particular input correctly in $O(1)$ time. Note that this implies that there cannot be a "universal" bad input. Next, we make this claim precise.
Algorithms that work well on specific inputs
Let us fix any input $a_0,\dots,a_{n-1};v$. First consider the case that $v$ is not equal to any of the $a_i$'s. Then here is an algorithm that solves this particular instance in $O(1)$ time:
Searching a non-existent value
// The inputs are in A[0], ..., A[n-1]; v
if (v > A[n-1] or v < A[0])
    Output "v not found"
else
    Return the output of binary search
A couple of remarks are in order:
- First note that if we did not run binary search in line 5, then the above algorithm would not be correct. Of course, we are only interested in proving lower bounds for correct algorithms.
- Note that the above algorithm is not a counterexample to the Theorem: binary search takes $\Omega(\log{n})$ time in the worst case, so the above algorithm also runs in $\Theta(\log{n})$ time in the worst case.
So what happens if indeed $v=a_i$ for some $i$? One can come up with another algorithm that runs in $O(1)$ time on that specific input (and $O(\log{n})$ time in general):
Searching a present value
// The inputs are in A[0], ..., A[n-1]; v and v = A[i]
if (v == A[i])
    Output i
else
    Return the output of binary search
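To drive the point home, here is a hedged Python rendering of these two input-specific algorithms, using the `binary_search` sketch from above (the names are our own). Each is correct on every input but takes $O(1)$ time on the one input it was designed around:

```python
def search_knowing_absent(A, v):
    # Tailored to a specific input where v < A[0] or v > A[n-1]:
    # O(1) on that input, but still correct on all inputs.
    if v < A[0] or v > A[-1]:
        return None  # "v not found"
    return binary_search(A, v)  # needed for correctness on other inputs

def make_search_knowing_position(i):
    # Tailored to a specific input where v sits at index i.
    def search(A, v):
        if A[i] == v:
            return i  # O(1) on the planted input
        return binary_search(A, v)  # correctness on all other inputs
    return search
```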
What the above algorithms make clear is that we cannot pick a "bad input" that is independent of the algorithm. In other words, we should pick the hard input based on the algorithm. One nice way to do this is via the so-called adversarial argument.
The adversarial argument
Instead of me trying to explain the general structure of an adversarial argument, I'll just refer you to this video:
The adversarial argument in the video above is for a lower bound on finding the median of $n$ numbers. While the median problem is not exactly what we are after, the video ends on a cliffhanger; if you want to see the rest of the median lower bound argument, here it is:
An adversarial argument for the searching problem
We finally return to proving the Theorem. As you can surmise from the section heading, we are going to use the adversarial argument to do this. However, we will use it in a slightly different way than how it was used in the videos above. In particular, the general idea is as follows (which can be applied to other problems too):
Adversarial Argument to Bound Number of Input Accesses
Whenever the algorithm accesses a location in the input, the adversary fixes the value at that location (and potentially some other values too). Any future access by the algorithm to a fixed value is given for "free." The goal of the adversary is to fix values in such a way that there are always two ways to fix the remaining unfixed values so that the algorithm has to output different answers on the two resulting inputs. Note that this implies that the algorithm cannot terminate yet: whatever it outputs is wrong on one of the two inputs. If the adversary can keep this game going long enough, then we have our lower bound.
Next we implement the above overall scheme to prove the Theorem.
Proof Idea
We will set $v=1$ and assume that all $a_i\in \{0,1,2\}$. The main idea is the following: whenever the algorithm queries position $j$ in the numbers $a_0,\dots,a_{n-1}$, the adversary sets either $a_j=0$ or $a_j=2$, choosing whichever option leaves more values unfixed. (Note that since the numbers are sorted, this might fix some other values too.) In particular, we make sure that after each access the number of unfixed values is at least half of the previous number of unfixed values. Since the initial number of unfixed values is $n$, this implies that the algorithm has to make $\Omega(\log{n})$ accesses to the input. This is because if there are fewer accesses, there is at least one unfixed value, which could be either $0$ or $2$ (in which case the algorithm has to output "not found") or $1$ (in which case the algorithm has to report that $v=1$ is present).
Proof Details
After the $i$th access by the algorithm to the values $a_0,\dots,a_{n-1}$, we define the interval $[\ell_i,r_i]$ to be the positions $j$ such that the adversary has not fixed $a_j$ yet. Before the first access we have $\ell_0=0$ and $r_0=n-1$. Further, we will maintain the following invariants:
- $a_j=0$ for every $j\in [0,\ell_i)$ and $a_j=2$ for every $j\in (r_i,n-1]$.
- $a_j$ for $j\in [\ell_i,r_i]$ has not been fixed.
- $r_{i+1}-\ell_{i+1}+1\ge \frac{r_i-\ell_i+1}{2}$ for every $i\ge 0$.
Before we show how the adversary maintains the above invariants, let us assume that it can do so and finish the proof. Since $r_0-\ell_0+1=n$, as long as the number of accesses $i$ is at most $\log_2{n}-1$, the last invariant above implies that $r_i-\ell_i+1\ge \frac{n}{2^{\log_2{n}-1}}=2$. Then by the second invariant at least one $a_j$ has not been fixed, and by the first invariant it can be set to any value in $\{0,1,2\}$ without violating sortedness, which means the algorithm cannot terminate (since it cannot know for sure whether $v=a_j$ for some $j$). This implies that the algorithm has to make $\Omega(\log{n})$ accesses, as desired.
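For completeness, unrolling the third invariant makes the counting explicit:
$$r_i-\ell_i+1 \;\ge\; \frac{r_{i-1}-\ell_{i-1}+1}{2} \;\ge\; \cdots \;\ge\; \frac{r_0-\ell_0+1}{2^i} \;=\; \frac{n}{2^i},$$
which is at least $2$ whenever $i\le \log_2{n}-1$.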
Finally, we show how the adversary can maintain the invariants. Assume by induction that the adversary has been able to maintain the invariants for up to $i$ accesses. (In the base case of $i=0$ we have $\ell_0=0$ and $r_0=n-1$.) Now consider the $(i+1)$th access, say to index $j$. If $j\not\in [\ell_i,r_i]$, then we set $\ell_{i+1}=\ell_i$ and $r_{i+1}=r_i$ (and do not fix any new values; recall that such an access is free). Otherwise, let $m=\frac{\ell_i+r_i}{2}$ be the "mid-point." If $j\ge m$, then set $\ell_{i+1}=\ell_i$ and $r_{i+1}=j-1$, and set $a_j=2$ (note that by sortedness this fixes $a_k=2$ for every $k\ge j$). Otherwise, set $\ell_{i+1}=j+1$ and $r_{i+1}=r_i$, and set $a_j=0$ (which fixes $a_k=0$ for every $k\le j$). It can be verified that this maintains all three invariants, as desired.
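The adversary described above is concrete enough to code up. Below is a hedged Python sketch (the class and method names are our own); it answers an algorithm's queries with $v=1$ in mind, while maintaining the interval $[\ell,r]$ of unfixed positions exactly as in the proof:

```python
class SortedSearchAdversary:
    """Adaptively fixes a_0, ..., a_{n-1} in {0,1,2} against queries, with v = 1."""

    def __init__(self, n):
        self.lo, self.hi = 0, n - 1  # [lo, hi] = positions still unfixed

    def query(self, j):
        """Return the value the adversary commits to at position j."""
        if j < self.lo:
            return 0  # already fixed to 0: a "free" access
        if j > self.hi:
            return 2  # already fixed to 2: a "free" access
        m = (self.lo + self.hi) // 2  # midpoint of the unfixed interval
        if j >= m:
            self.hi = j - 1  # fix a_j (and everything after it) to 2
            return 2
        else:
            self.lo = j + 1  # fix a_j (and everything before it) to 0
            return 0

    def undecided(self):
        # While some a_j is unfixed, it could still be 0/2 ("not found")
        # or 1 ("found"), so no correct algorithm may stop yet.
        return self.hi >= self.lo
```

Even the best possible strategy against this adversary, always querying the midpoint of the remaining interval, needs about $\log_2{n}$ queries before the adversary runs out of unfixed positions:

```python
adv = SortedSearchAdversary(1024)
queries = 0
while adv.undecided():
    adv.query((adv.lo + adv.hi) // 2)  # query the midpoint
    queries += 1
print(queries)  # 10, i.e., log2(1024)
```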