De-Mystifying the Integer Multiplication Algorithm
In class, we saw an $O\left(n^{\log_2{3}}\right)$ time algorithm to multiply two $n$-bit numbers that used an identity that seemed to be plucked out of thin air. In this note, we will try to de-mystify how one might come up with this identity in the first place.
[1] With apologies to *Enter the Dragon*.
The setup
We first recall the problem that we are trying to solve:
Multiplying Integers
Given two $n$-bit numbers $a=(a_{n-1},\dots,a_0)$ and $b=(b_{n-1},\dots,b_0)$, output their product $c=a\times b$.
Next, recall the following notation that we used: \[a^0 = \left(a_{\lceil\frac{n}{2}\rceil-1},\dots, a_0\right),\] \[a^1 = \left(a_{n-1},\dots, a_{\lceil\frac{n}{2}\rceil}\right),\] \[b^0 = \left(b_{\lceil\frac{n}{2}\rceil-1},\dots, b_0\right),\] and \[b^1 = \left(b_{n-1},\dots, b_{\lceil\frac{n}{2}\rceil}\right).\] Next, we try to reason about the $O\left(n^{\log_2{3}}\right)$ algorithm using polynomials (in one variable).
Enter the Polynomials [1]
We begin by defining two polynomials \[A(X) = a^1\cdot X + a^0,\] and \[B(X) = b^1\cdot X + b^0.\] To connect these polynomials to the original numbers, we make the following observations (make sure you are comfortable with these):
Exercise 1
Show that \[a= A\left(2^{\lceil\frac{n}{2}\rceil}\right)\text{ and } b= B\left(2^{\lceil\frac{n}{2}\rceil}\right).\]
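As a quick numeric sanity check of the claim in Exercise 1 (not a proof), here is a sketch in Python; the helper name `split` and the specific 8-bit values are our own choices, not from the note:

```python
# Splitting an n-bit number a at bit position m = ceil(n/2) gives
# a = a1 * 2**m + a0, i.e. exactly a = A(2**m) for A(X) = a1*X + a0.

def split(x, m):
    """Return (high, low) so that x == high * 2**m + low."""
    return x >> m, x & ((1 << m) - 1)

n = 8
a, b = 0b10110101, 0b01101110  # arbitrary 8-bit examples
m = -(-n // 2)                 # ceil(n/2)

a1, a0 = split(a, m)
b1, b0 = split(b, m)

assert a == a1 * 2**m + a0     # a = A(2^m)
assert b == b1 * 2**m + b0     # b = B(2^m)
```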
Now let \[C(X)= A(X)\cdot B(X)\] be the product of the two polynomials. Then note that by Exercise 1, our final answer can be read off as \[c = C\left(2^{\lceil\frac{n}{2}\rceil}\right).\] Next, note that $C(X)$ is a polynomial of degree two and hence can also be represented as \[C(X) = c^2\cdot X^2+c^1\cdot X+c^0.\] Since $C(X)=A(X)\cdot B(X)$, we also have \[C(X) = \left(a^1\cdot X+a^0\right)\cdot \left(b^1\cdot X+b^0\right) = \left(a^1\cdot b^1\right)\cdot X^2 + \left(a^1\cdot b^0+a^0\cdot b^1\right)\cdot X+\left(a^0\cdot b^0\right).\] By comparing the coefficients of the powers of $X$, we get that \[c^2 = a^1\cdot b^1,\] \[c^1 = a^1\cdot b^0+a^0\cdot b^1\] and \[c^0 = a^0\cdot b^0.\]
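The coefficient computation above can be sketched directly in Python; note that this naive version still uses four half-size multiplications, so it does not yet save anything (the function name is ours):

```python
# Compute a*b via the polynomial C(X) = A(X)*B(X), then read off C(2^m).
def multiply_via_poly(a, b, n):
    m = -(-n // 2)                      # ceil(n/2)
    a1, a0 = a >> m, a & ((1 << m) - 1)
    b1, b0 = b >> m, b & ((1 << m) - 1)
    c2 = a1 * b1                        # coefficient of X^2
    c1 = a1 * b0 + a0 * b1              # coefficient of X
    c0 = a0 * b0                        # constant coefficient
    return c2 * 2**(2*m) + c1 * 2**m + c0   # c = C(2^m)

assert multiply_via_poly(181, 110, 8) == 181 * 110
```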
Now going back to the algorithm
The important observation is that in our divide and conquer algorithm, the three recursive calls compute $c^0$, $c^1$ and $c^2$. For $c^0$ and $c^2$, the recursive calls are obvious. It is for computing $c^1$ that we needed the supposedly "clever" identity:
The Identity
\[ a^1\cdot b^0+a^0\cdot b^1 = \left(a^1+a^0\right)\cdot \left(b^1+b^0\right) - a^1\cdot b^1 - a^0\cdot b^0.\]
Now of course one can verify that the identity is correct and that it works for our case. But here is another way to arrive at the identity (one that uses all the polynomial machinery we have been building up). In particular, consider $C(1)$. Note that we have \[C(1) = c^2+c^1+c^0.\] In other words, we have \[c^1 = C(1) - c^2 - c^0.\] Finally, note that $C(1) = A(1)\cdot B(1) = (a^1+a^0)(b^1+b^0)$, and using this in the above (along with the values of $c^2$ and $c^0$) proves the identity.
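Putting the identity to work, here is a recursive sketch of the resulting three-multiplication algorithm (a sketch under our own naming; the base-case threshold is a simplification):

```python
# Recursive integer multiplication with three half-size multiplications,
# using c1 = (a1+a0)(b1+b0) - c2 - c0 to avoid the fourth.
def karatsuba(a, b):
    if a < 2 or b < 2:                  # base case: single-bit operand
        return a * b
    n = max(a.bit_length(), b.bit_length())
    m = -(-n // 2)                      # ceil(n/2)
    a1, a0 = a >> m, a & ((1 << m) - 1)
    b1, b0 = b >> m, b & ((1 << m) - 1)
    c2 = karatsuba(a1, b1)              # recursive call 1
    c0 = karatsuba(a0, b0)              # recursive call 2
    c1 = karatsuba(a1 + a0, b1 + b0) - c2 - c0   # call 3 + the identity
    return c2 * 2**(2*m) + c1 * 2**m + c0

assert karatsuba(12345, 67890) == 12345 * 67890
```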
Um, why is this any better?
Finally, we give an argument to justify why the approach using polynomials is more "natural." Note that
- We needed to compute the coefficients of the polynomial $C(X)$.
- For any constant value $\alpha$ (e.g. $\alpha=1$), both $A(\alpha)$ and $B(\alpha)$ are numbers with roughly $n/2$ bits, and hence $C(\alpha)$ can be computed with one multiplication of two numbers with roughly $n/2$ bits.
To complete the last piece of the puzzle, we invoke polynomial interpolation. In particular, to compute the coefficients of the degree-two polynomial $C(X)$, we only need the evaluations of $C(X)$ at three distinct points (each of which corresponds to a multiplication of two roughly $n/2$-bit numbers). Indeed, this is what has been happening above:
- First note that $C(0) = A(0)\cdot B(0)=c^0$.
- Recall we used $C(1) = A(1)\cdot B(1) = c^2+c^1+c^0$.
- Finally, we can interpret $c^2$ as $\lim_{X\to\infty} \frac{C(X)}{X^2}=\lim_{X\to\infty} \frac{A(X)}{X}\cdot \lim_{X\to\infty}\frac{B(X)}{X}$, and in a hand-wavy way we can think of recovering $c^2$ by "evaluating" $C(X)$ at infinity.
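The three "evaluations" in the list above can be made concrete with a small sketch (the function name and example values are ours): $c^0$ comes from $C(0)$, $c^2$ from the leading coefficients (the "evaluation at infinity"), and $c^1$ from $C(1)$ by subtracting the other two.

```python
# Recover the coefficients of C(X) from the three "evaluation points"
# 0, 1, and infinity discussed above.
def coeffs_by_evaluation(a1, a0, b1, b0):
    c0 = a0 * b0                            # C(0) = A(0)*B(0)
    c2 = a1 * b1                            # "evaluation at infinity"
    c1 = (a1 + a0) * (b1 + b0) - c2 - c0    # from C(1) = A(1)*B(1)
    return c2, c1, c0

# Matches the coefficient formulas c2 = a1*b1, c1 = a1*b0 + a0*b1, c0 = a0*b0:
assert coeffs_by_evaluation(3, 5, 2, 7) == (3*2, 3*7 + 5*2, 5*7)
```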
The above in fact suggests an alternate algorithm to the one we saw in class:
Exercise 2
Evaluate $C(0), C(1)$ and $C(2)$, and from these values recover $c^2, c^1$ and $c^0$. Then use this to design an $O\left(n^{\log_2{3}}\right)$ time algorithm to multiply two $n$-bit numbers.