{\huge A Matchless Result}

A famous algorithm, a new paper, a full correctness proof

\vspace{.5in}

Vijay Vazirani is one of the world's experts on algorithmic theory, and is especially known for his masterful book on approximation algorithms. Among his many results in computational complexity, two have been Wikified: his theorem with Les Valiant on unique SAT, and the subsequent generalized Isolation Lemma with Ketan Mulmuley and his brother Umesh Vazirani. Lately his focus has been more on computational aspects of auctions, economic systems, and game theory.

Today Ken and I wish to discuss a recent paper by Vijay on matching.

His paper is quite unique---is ``quite'' redundant?---well, it is an isolated case of a top researcher taking the time and energy to explain the correctness of one of his famous algorithms, which came originally from joint work with Silvio Micali. As our field matures we should see more of this.

Given an undirected graph $latex {G=(V,E)}&fg=000000$, a matching, of course, is a set of pairwise non-adjacent edges: no two edges share a common vertex. A maximum matching is a matching that contains the largest possible number of edges. If $latex {|V|}&fg=000000$ is even then a matching with $latex {|V|/2}&fg=000000$ edges is perfect. The notorious Petersen graph

\includegraphics[width=1in]{match.png}

has six different perfect matchings: the five ``spokes'' together form one, and each of the other five uses exactly one spoke. Although matchings are often associated with bipartite graphs, it is important to note that the Petersen graph is not bipartite. This matters because the matching algorithm we will discuss works for general graphs, not just bipartite ones.
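For the skeptical reader, here is a throwaway brute-force check that the Petersen graph has exactly six perfect matchings. The vertex numbering (outer cycle $latex {0}&fg=000000$--$latex {4}&fg=000000$, inner pentagram $latex {5}&fg=000000$--$latex {9}&fg=000000$) is the standard one, but the code itself is just our sketch:

```python
from itertools import combinations

# Petersen graph: outer 5-cycle, the five spokes, and the inner pentagram.
outer = [(i, (i + 1) % 5) for i in range(5)]
spokes = [(i, i + 5) for i in range(5)]
inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
edges = outer + spokes + inner  # 15 edges in all

def is_perfect_matching(chosen):
    """Five edges form a perfect matching iff together they cover all ten vertices."""
    covered = [v for e in chosen for v in e]
    return len(set(covered)) == 10

# Only C(15, 5) = 3003 candidate sets, so brute force is instant.
matchings = [m for m in combinations(edges, 5) if is_perfect_matching(m)]
print(len(matchings))               # 6
print(tuple(spokes) in matchings)   # True: the all-spokes matching is one of them
```

Each of the other five matchings indeed uses exactly one spoke, which can be confirmed by counting spokes in each member of `matchings`.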

The Importance Of Matching

Finding a maximum matching is one of the foundational problems at the intersection of graph theory and the theory of algorithms. One reason is the wealth of its applications.

Here is a picture of some of the applications of matching algorithms.

\includegraphics[width=4in]{apps.png}

Let's turn to look at Vijay's new paper.

The Paper Trail

Actually we need first to look at the original paper, called An $latex {O(\sqrt{|V|}|E|)}&fg=000000$ algorithm for finding maximum matching in general graphs, by Silvio Micali and Vijay Vazirani. It was published in the proceedings of the 21st Annual Symposium on Foundations of Computer Science (FOCS) in 1980.

We will call this paper the MV paper, and its algorithm the MV algorithm. Note the running time is $latex {O(\sqrt{n}m)}&fg=000000$ using ``modern'' internationally agreed upon notation: we use $latex {n}&fg=000000$ always to denote the number of vertices and $latex {m}&fg=000000$ the number of edges. Paul Halmos was always careful about notation, and once said that his worst nightmare was that someone would type

$latex \displaystyle \lim_{\epsilon \rightarrow \infty} n_{\epsilon} = 0. &fg=000000$

The title of this great paper was created before these agreements, so it is understandable that they used the ``old'' style notation in the title.

This paper presented the MV algorithm with the given running time. The major advance was that it efficiently found a maximum matching in general graphs. There is an interesting cliff that happens in the theory of algorithms for finding maximum matchings: bipartite graphs are just much easier for matching algorithms. Of course they are easier for many other problems too: it is trivial to find a 2-coloring of a bipartite graph, but NP-hard to find a 3-coloring of a general graph. One of the achievements of the MV paper is that everything works for general graphs.
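To illustrate the easy side of the cliff, here is a minimal sketch of the trivial 2-coloring via breadth-first search. The adjacency-list representation is our own choice for the example:

```python
from collections import deque

def two_color(adj):
    """BFS 2-coloring: returns a color dict, or None if an odd cycle
    makes the graph non-bipartite."""
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]   # alternate colors level by level
                    queue.append(w)
                elif color[w] == color[u]:    # odd cycle detected
                    return None
    return color

# A 6-cycle is bipartite; a triangle is not.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(two_color(cycle6) is not None)  # True
print(two_color(triangle))            # None
```

No comparably simple procedure is believed to exist for 3-coloring general graphs, which is the point of the cliff.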

The new paper is solely authored by Vijay. It has several goals, but the main one is to give a clear but full proof of the correctness of the MV algorithm. The algorithm remains the same, but the correctness proof is new. The previous proof had a flawed case analysis. The new proof avails itself of information that the previous proof bypassed, to make the analysis tighter and more manageable.

We will not give the whole proof, but we will give an algorithmic idea highlighted also by Vijay, which has independent interest.

Duelling Depth-First Searches

Let $latex {G = (V,E)}&fg=000000$ be a directed acyclic graph whose vertices are in layers numbered $latex {0}&fg=000000$ (for sinks) through $latex {\ell}&fg=000000$ (sources). Edges from any non-sink node go to nodes in lower layers, not necessarily the next layer, and every node has a path to a sink in layer $latex {0}&fg=000000$. Let $latex {g}&fg=000000$ and $latex {r}&fg=000000$ be any two non-sink nodes. By Menger's Theorem, either

  1. there are vertex-disjoint paths from $latex {r}&fg=000000$ and $latex {g}&fg=000000$ to sinks $latex {s_r}&fg=000000$ and $latex {s_g}&fg=000000$, or
  2. the set $latex {B_{g,r} = \{v:}&fg=000000$ every path from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to a sink goes through $latex {v\}}&fg=000000$ is nonempty.

In the latter case, let $latex {b}&fg=000000$ be the element of $latex {B_{g,r}}&fg=000000$ in the highest layer. The algorithm must find this highest bottleneck node $latex {b}&fg=000000$ within time proportional to the total number of edges in paths from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to $latex {b}&fg=000000$. In the former case the algorithm must output paths to $latex {s_r}&fg=000000$ and $latex {s_g}&fg=000000$ within time proportional to the total number of edges from $latex {g}&fg=000000$ or $latex {r}&fg=000000$ to either of $latex {s_r}&fg=000000$ or $latex {s_g}&fg=000000$.
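To make the specification concrete, here is a naive sketch on a toy layered DAG of our own devising. It finds the highest bottleneck by brute-force path enumeration, with none of the efficiency the algorithm is required to achieve:

```python
# Toy layered DAG: g, r at layer 3; a, c at layer 2; b at layer 1; sinks at 0.
# All names here are illustrative, not from the paper.
succ = {
    'g': ['a', 'c'], 'r': ['a', 'c'],
    'a': ['b'], 'c': ['b'],
    'b': ['s1', 's2'],
    's1': [], 's2': [],
}
level = {'g': 3, 'r': 3, 'a': 2, 'c': 2, 'b': 1, 's1': 0, 's2': 0}

def all_paths_to_sinks(v):
    """Enumerate every path from v down to a sink (exponential: for checking only)."""
    if not succ[v]:
        return [[v]]
    return [[v] + p for w in succ[v] for p in all_paths_to_sinks(w)]

def highest_bottleneck(g, r):
    """Return the highest-layer node lying on every g-path and every r-path
    to a sink, or None if, by Menger, vertex-disjoint paths exist instead."""
    must_g = set.intersection(*(set(p) for p in all_paths_to_sinks(g)))
    must_r = set.intersection(*(set(p) for p in all_paths_to_sinks(r)))
    common = must_g & must_r
    return max(common, key=level.get) if common else None

print(highest_bottleneck('g', 'r'))  # b
```

In this toy graph every route from either `g` or `r` funnels through `b`, so $latex {B_{g,r} = \{b\}}&fg=000000$ and `b` is the answer; the challenge discussed next is getting it within the stated time bound.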

The main puzzle is, how can we avoid searching past $latex {b}&fg=000000$ and taking more than the time allowed to the latter case, if we don't even know that it holds? For a single local search, this does indeed seem to be impossible. However, what Vijay called ``Double Depth-First Search'' (DDFS) solves it. Two search posses led by rangers Green and Red start respectively at $latex {g}&fg=000000$ and $latex {r}&fg=000000$ and mark out ``green'' and ``red'' nodes and edges both. Neither may tread on the other's ground. The rules are:

\includegraphics[width=1.5in]{GreenHornetLoneRanger.jpg}

The duel follows gentlemen's rules. When both posses converge on the same node $latex {v}&fg=000000$, Red shoots in the air. Green takes the hint and backtracks looking for another node $latex {w}&fg=000000$ at the same level as $latex {v}&fg=000000$ or lower. If Green succeeds, he sends a Western Union telegram to Red, who puts his marker on $latex {v}&fg=000000$ and moves on. If Green gets stuck, then Red has to keep the same bargain: Red backtracks until he either finds another $latex {w}&fg=000000$ at the same level as $latex {v}&fg=000000$ or lower, whereupon he telegrams Green to come back and claim $latex {v}&fg=000000$ as his, or Red gets stuck too. If they are both stuck, they both come back to $latex {v}&fg=000000$. Each claims just the respective (necessarily different) edges they followed into $latex {v}&fg=000000$, but they identify $latex {v}&fg=000000$ itself as the highest bottleneck $latex {b}&fg=000000$ and share a bottle of hooch to celebrate.

They can celebrate because they have met the time guarantee: Neither searched below $latex {v}&fg=000000$, and all the edges they traversed in their wanderings had to be on paths to $latex {v}&fg=000000$, since they did not find any open trails going elsewhere. And in the case where they reach separate destinations $latex {s_g}&fg=000000$ and $latex {s_r}&fg=000000$, any backtracking done by one posse because of hitting the other's marked node is inductively on a path to the other's goal, so at most the various routes between $latex {\{g,r\}}&fg=000000$ and $latex {\{s_g,s_r\}}&fg=000000$ get searched.
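Here is a toy rendition of the posses' protocol in code. To be clear, the tie-breaking, the choice of contested node, and the bookkeeping below are our simplifications, not the MV paper's actual procedure, and the sketch carries no complexity guarantee; it only illustrates the contested-node exchange on two small examples:

```python
def ddfs(succ, level, g, r):
    """Simplified sketch of double depth-first search on a layered DAG.
    Returns ('disjoint', path_g, path_r) if Green and Red reach sinks by
    vertex-disjoint paths, or ('bottleneck', v) for the contested node
    on which both get stuck.  Toy illustration only."""
    owner = {g: 'G', r: 'R'}       # nodes claimed by each posse
    stack = {'G': [g], 'R': [r]}   # current DFS paths
    untried = {}                   # (side, node) -> successors not yet tried

    def head(side):
        return stack[side][-1]

    def advance(side):
        # Push one unclaimed successor of side's head; True on success.
        todo = untried.setdefault((side, head(side)), list(succ[head(side)]))
        while todo:
            w = todo.pop()
            if w not in owner:
                owner[w] = side
                stack[side].append(w)
                return True
        return False

    def seek_alternative(side, maxlevel, avoid):
        # Backtrack (never searching below `avoid`) until the head is some
        # other node at level <= maxlevel; True on success.
        while stack[side]:
            v = head(side)
            if v == avoid:
                stack[side].pop()
            elif level[v] <= maxlevel:
                return True
            elif not advance(side):
                stack[side].pop()
        return False

    while True:
        hg, hr = head('G'), head('R')
        if level[hg] == 0 and level[hr] == 0:            # both at sinks
            return ('disjoint', stack['G'][:], stack['R'][:])
        active = 'G' if level[hg] >= level[hr] else 'R'  # higher posse moves
        other = 'R' if active == 'G' else 'G'
        if advance(active):
            continue
        blocked = [w for w in succ[head(active)] if owner.get(w) == other]
        if not blocked:
            stack[active].pop()                          # plain dead end
            continue
        v = max(blocked, key=level.get)                  # contested node
        saved = stack[other][:]
        if seek_alternative(other, level[v], v):
            owner[v] = active                            # other cedes v
            stack[active].append(v)
            continue
        stack[other] = saved                             # other failed; restore
        if seek_alternative(active, level[v], v):
            continue
        return ('bottleneck', v)                         # both posses stuck

# Funnel graph: every route passes through b.
succ = {'g': ['a', 'c'], 'r': ['a', 'c'], 'a': ['b'], 'c': ['b'],
        'b': ['s1', 's2'], 's1': [], 's2': []}
level = {'g': 3, 'r': 3, 'a': 2, 'c': 2, 'b': 1, 's1': 0, 's2': 0}
print(ddfs(succ, level, 'g', 'r'))      # ('bottleneck', 'b')

# Two separate corridors: disjoint paths exist.
succ2 = {'g': ['a'], 'r': ['c'], 'a': ['s1'], 'c': ['s2'], 's1': [], 's2': []}
level2 = {'g': 2, 'r': 2, 'a': 1, 'c': 1, 's1': 0, 's2': 0}
print(ddfs(succ2, level2, 'g', 'r')[0])  # disjoint
```

Note that in the funnel graph neither search ever claims a node below `b`, which is the heart of the time guarantee discussed above.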

Open Problems

Read Vijay's new paper. It is one of the clearest expositions of a graph algorithm---a model for others. Also the idea that Vijay went back to an ancient paper---over thirty years old---solely to write the definitive proof of its correctness is something we all should applaud.

Of course the central problem in matching theory is still: what is the best time for an algorithm that finds a maximum matching? Given the interest today in huge graphs that arise from social networks, we would like a linear-time algorithm. Is this possible? See this for some fast approximate algorithms.