{\huge A Matchless Result}

A famous algorithm, a new paper, a full correctness proof

\vspace{.5in}

Vijay Vazirani is one of the world's experts on algorithmic theory, and is especially known for his masterful book on approximation algorithms. Among his many results in computational complexity, two have been Wikified: his theorem with Les Valiant on unique SAT, and the subsequent generalized Isolation Lemma with Ketan Mulmuley and his brother Umesh Vazirani. Lately his focus has been more on computational aspects of auctions, economic systems, and game theory.

Today Ken and I wish to discuss a recent paper by Vijay on matching.

His paper is quite unique---is ``quite'' redundant?---well, it is an isolated case of a top researcher taking the time and energy to explain the correctness of one of his famous algorithms, which came originally from joint work with Silvio Micali. As our field matures we should see more of this.

Given an undirected graph $latex {G=(V,E)}&fg=000000$, a matching, of course, is a set of pairwise non-adjacent edges: no two edges share a common vertex. A maximum matching is a matching that contains the largest possible number of edges. If $latex {|V|}&fg=000000$ is even then a matching with $latex {|V|/2}&fg=000000$ edges is perfect. The notorious Petersen graph

\includegraphics[width=1in]{match.png}

has six different perfect matchings: the five ``spokes'' together form one, and each of the other five uses exactly one spoke. Although matchings are often associated with bipartite graphs, it is important to note that the Petersen graph is not bipartite. This matters because the matching algorithm we will discuss works for general graphs, not just bipartite ones.
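For the skeptical reader, here is a throwaway brute-force check that the Petersen graph has exactly six perfect matchings. The vertex numbering (outer cycle $latex {0}&fg=000000$--$latex {4}&fg=000000$, inner pentagram $latex {5}&fg=000000$--$latex {9}&fg=000000$) is the standard one, but the code itself is just our sketch:

```python
from itertools import combinations

# Petersen graph: outer 5-cycle, the five spokes, and the inner pentagram.
outer = [(i, (i + 1) % 5) for i in range(5)]
spokes = [(i, i + 5) for i in range(5)]
inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
edges = outer + spokes + inner  # 15 edges in all

def is_perfect_matching(chosen):
    """Five edges form a perfect matching iff together they cover all ten vertices."""
    covered = [v for e in chosen for v in e]
    return len(set(covered)) == 10

# Only C(15, 5) = 3003 candidate sets, so brute force is instant.
matchings = [m for m in combinations(edges, 5) if is_perfect_matching(m)]
print(len(matchings))               # 6
print(tuple(spokes) in matchings)   # True: the all-spokes matching is one of them
```

Each of the other five matchings indeed uses exactly one spoke, which can be confirmed by counting spokes in each member of `matchings`.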

The Importance Of Matching

Finding a maximum matching is one of the foundational problems at the intersection of graph theory and the theory of algorithms. One reason is the wealth of its applications.

Here is a picture of some of the applications of matching algorithms.

\includegraphics[width=4in]{apps.png}

Let's turn to look at Vijay's new paper.

The Paper Trail

Actually we need first to look at the original paper, called An $latex {O(\sqrt{|V|}|E|)}&fg=000000$ algorithm for finding maximum matching in general graphs, by Silvio Micali and Vijay Vazirani. It was published in the proceedings of the 21st Annual Symposium on Foundations of Computer Science (FOCS) in 1980.

We will call this paper the MV paper, and its algorithm the MV algorithm. Note the running time is $latex {O(\sqrt{n}m)}&fg=000000$ using ``modern'' internationally agreed upon notation: we use $latex {n}&fg=000000$ always to denote the number of vertices and $latex {m}&fg=000000$ the number of edges. Paul Halmos was always careful about notation, and once said that his worst nightmare was that someone would type

$latex \displaystyle \lim_{\epsilon \rightarrow \infty} n_{\epsilon} = 0. &fg=000000$

The title of this great paper was created before these agreements, so it is understandable that they used the ``old'' style notation in the title.

This paper presented the MV algorithm with the given running time. The major advance was that it efficiently found a maximum matching in general graphs. There is an interesting cliff that happens in the theory of algorithms for finding maximum matchings: bipartite graphs are just much easier for matching algorithms. Of course they are easier for many other problems too: it is trivial to find a 2-coloring of a bipartite graph, but NP-hard to find a 3-coloring of a general graph. One of the achievements of the MV paper is that everything works for general graphs.
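To illustrate the easy side of the cliff, here is a minimal sketch of the trivial 2-coloring via breadth-first search. The adjacency-list representation is our own choice for the example:

```python
from collections import deque

def two_color(adj):
    """BFS 2-coloring: returns a color dict, or None if an odd cycle
    makes the graph non-bipartite."""
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]   # alternate colors level by level
                    queue.append(w)
                elif color[w] == color[u]:    # odd cycle detected
                    return None
    return color

# A 6-cycle is bipartite; a triangle is not.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(two_color(cycle6) is not None)  # True
print(two_color(triangle))            # None
```

No comparably simple procedure is believed to exist for 3-coloring general graphs, which is the point of the cliff.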

The new paper is solely authored by Vijay. It has several goals, but the main one is to give a clear but full proof of the correctness of the MV algorithm. The algorithm remains the same, but the correctness proof is new. The previous proof had a flawed case analysis. The new proof avails itself of information that the previous proof bypassed, to make the analysis tighter and more manageable.

We will not give the whole proof, but we will give an algorithmic idea highlighted also by Vijay, which has independent interest.

Duelling Depth-First Searches

Let $latex {G = (V,E)}&fg=000000$ be a directed acyclic graph whose vertices are in layers numbered $latex {0}&fg=000000$ (for sinks) through $latex {\ell}&fg=000000$ (sources). Edges from any non-sink node go to nodes in lower layers, not necessarily the next layer, and every node has a path to a sink in layer $latex {0}&fg=000000$. Let $latex {g}&fg=000000$ and $latex {r}&fg=000000$ be any two non-sink nodes. By Menger's Theorem, either

  1. there are vertex-disjoint paths from $latex {r}&fg=000000$ and $latex {g}&fg=000000$ to sinks $latex {s_r}&fg=000000$ and $latex {s_g}&fg=000000$, or
  2. the set $latex {B_{g,r} = \{v:}&fg=000000$ every path from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to a sink goes through $latex {v\}}&fg=000000$ is nonempty.

In the latter case, let $latex {b}&fg=000000$ be the element of $latex {B_{g,r}}&fg=000000$ in the highest layer. The algorithm must find this highest bottleneck node $latex {b}&fg=000000$ within time proportional to the total number of edges in paths from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to $latex {b}&fg=000000$. In the former case the algorithm must output paths to $latex {s_r}&fg=000000$ and $latex {s_g}&fg=000000$ within time proportional to the total number of edges from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to either of $latex {s_r}&fg=000000$ or $latex {s_g}&fg=000000$.
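To make the specification concrete, here is a naive sketch on a toy layered DAG of our own devising. It finds the highest bottleneck by brute-force path enumeration, with none of the efficiency the algorithm is required to achieve:

```python
# Toy layered DAG: g, r at layer 3; a, c at layer 2; b at layer 1; sinks at 0.
# All names here are illustrative, not from the paper.
succ = {
    'g': ['a', 'c'], 'r': ['a', 'c'],
    'a': ['b'], 'c': ['b'],
    'b': ['s1', 's2'],
    's1': [], 's2': [],
}
level = {'g': 3, 'r': 3, 'a': 2, 'c': 2, 'b': 1, 's1': 0, 's2': 0}

def all_paths_to_sinks(v):
    """Enumerate every path from v down to a sink (exponential: for checking only)."""
    if not succ[v]:
        return [[v]]
    return [[v] + p for w in succ[v] for p in all_paths_to_sinks(w)]

def highest_bottleneck(g, r):
    """Return the highest-layer node lying on every g-path and every r-path
    to a sink, or None if, by Menger, vertex-disjoint paths exist instead."""
    must_g = set.intersection(*(set(p) for p in all_paths_to_sinks(g)))
    must_r = set.intersection(*(set(p) for p in all_paths_to_sinks(r)))
    common = must_g & must_r
    return max(common, key=level.get) if common else None

print(highest_bottleneck('g', 'r'))  # b
```

In this toy graph every route from either `g` or `r` funnels through `b`, so $latex {B_{g,r} = \{b\}}&fg=000000$ and `b` is the answer; the challenge discussed next is getting it within the stated time bound.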

The main puzzle is, how can we avoid searching past $latex {b}&fg=000000$ and taking more than the time allowed to the latter case, if we don't even know that it holds? For a single local search, this does indeed seem to be impossible. However, what Vijay called ``Double Depth-First Search'' (DDFS) solves it. Two search posses led by rangers Green and Red start respectively at $latex {g}&fg=000000$ and $latex {r}&fg=000000$ and mark out ``green'' and ``red'' nodes and edges both. Neither may tread on the other's ground. The rules are:

\includegraphics[width=1.5in]{GreenHornetLoneRanger.jpg}

The duel follows gentlemen's rules. When both posses converge on the same node $latex {v}&fg=000000$, Red shoots in the air. Green takes the hint and backtracks looking for another node $latex {w}&fg=000000$ at the same level as $latex {v}&fg=000000$ or lower. If Green succeeds, he sends a Western Union telegram to Red, who puts his marker on $latex {v}&fg=000000$ and moves on. If Green gets stuck, then Red has to keep the same bargain: Red backtracks until he either finds another $latex {w}&fg=000000$ at the same level as $latex {v}&fg=000000$ or lower, whereupon he telegrams Green to come back and claim $latex {v}&fg=000000$ as his, or Red gets stuck too. If they are both stuck, they both come back to $latex {v}&fg=000000$. Each claims just the respective (necessarily different) edges they followed into $latex {v}&fg=000000$, but they identify $latex {v}&fg=000000$ itself as the highest bottleneck $latex {b}&fg=000000$ and share a bottle of hooch to celebrate.

They can celebrate because they have met the time guarantee: Neither searched below $latex {v}&fg=000000$, and all the edges they traversed in their wanderings had to be on paths to $latex {v}&fg=000000$, since they did not find any open trails going elsewhere. And in the case where they reach separate destinations $latex {s_g}&fg=000000$ and $latex {s_r}&fg=000000$, any backtracking done by one posse because of hitting the other's marked node is inductively on a path to the other's goal, so at most the various routes between $latex {\{g,r\}}&fg=000000$ and $latex {\{s_g,s_r\}}&fg=000000$ get searched.
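Here is a toy rendition of the posses' protocol in code. To be clear, the tie-breaking, the choice of contested node, and the bookkeeping below are our simplifications, not the MV paper's actual procedure, and the sketch carries no complexity guarantee; it only illustrates the contested-node exchange on two small examples:

```python
def ddfs(succ, level, g, r):
    """Simplified sketch of double depth-first search on a layered DAG.
    Returns ('disjoint', path_g, path_r) if Green and Red reach sinks by
    vertex-disjoint paths, or ('bottleneck', v) for the contested node
    on which both get stuck.  Toy illustration only."""
    owner = {g: 'G', r: 'R'}       # nodes claimed by each posse
    stack = {'G': [g], 'R': [r]}   # current DFS paths
    untried = {}                   # (side, node) -> successors not yet tried

    def head(side):
        return stack[side][-1]

    def advance(side):
        # Push one unclaimed successor of side's head; True on success.
        todo = untried.setdefault((side, head(side)), list(succ[head(side)]))
        while todo:
            w = todo.pop()
            if w not in owner:
                owner[w] = side
                stack[side].append(w)
                return True
        return False

    def seek_alternative(side, maxlevel, avoid):
        # Backtrack (never searching below `avoid`) until the head is some
        # other node at level <= maxlevel; True on success.
        while stack[side]:
            v = head(side)
            if v == avoid:
                stack[side].pop()
            elif level[v] <= maxlevel:
                return True
            elif not advance(side):
                stack[side].pop()
        return False

    while True:
        hg, hr = head('G'), head('R')
        if level[hg] == 0 and level[hr] == 0:            # both at sinks
            return ('disjoint', stack['G'][:], stack['R'][:])
        active = 'G' if level[hg] >= level[hr] else 'R'  # higher posse moves
        other = 'R' if active == 'G' else 'G'
        if advance(active):
            continue
        blocked = [w for w in succ[head(active)] if owner.get(w) == other]
        if not blocked:
            stack[active].pop()                          # plain dead end
            continue
        v = max(blocked, key=level.get)                  # contested node
        saved = stack[other][:]
        if seek_alternative(other, level[v], v):
            owner[v] = active                            # other cedes v
            stack[active].append(v)
            continue
        stack[other] = saved                             # other failed; restore
        if seek_alternative(active, level[v], v):
            continue
        return ('bottleneck', v)                         # both posses stuck

# Funnel graph: every route passes through b.
succ = {'g': ['a', 'c'], 'r': ['a', 'c'], 'a': ['b'], 'c': ['b'],
        'b': ['s1', 's2'], 's1': [], 's2': []}
level = {'g': 3, 'r': 3, 'a': 2, 'c': 2, 'b': 1, 's1': 0, 's2': 0}
print(ddfs(succ, level, 'g', 'r'))      # ('bottleneck', 'b')

# Two separate corridors: disjoint paths exist.
succ2 = {'g': ['a'], 'r': ['c'], 'a': ['s1'], 'c': ['s2'], 's1': [], 's2': []}
level2 = {'g': 2, 'r': 2, 'a': 1, 'c': 1, 's1': 0, 's2': 0}
print(ddfs(succ2, level2, 'g', 'r')[0])  # disjoint
```

Note that in the funnel graph neither search ever claims a node below `b`, which is the heart of the time guarantee discussed above.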

Open Problems

Read Vijay's new paper. It is one of the clearest expositions of a graph algorithm---a model for others. Also the idea that Vijay went back to an ancient paper---over thirty years old---solely to write the definitive proof of its correctness is something we all should applaud.

Of course the central problem in matching theory is still: what is the best time for an algorithm that finds a maximum matching? Given the interest today in huge graphs that arise from social networks, we would like a linear-time algorithm. Is this possible? See this for some fast approximate algorithms.