{\huge Black vs White Boxes}

What is better than how \vspace{.3in}

Wilhelm Cauer was a German mathematician and engineer who worked in Göttingen and the US between the two world wars. He is associated with the term ``black box,'' although he apparently did not use it in his published papers, and others are said to have used it before. What Cauer did do was conceive a computing device based on electrical principles. According to this essay by Hartmut Petzold, Cauer's device was markedly more advanced and mathematically general than other `analog devices' of the same decades.

Today I want to talk about black boxes and white boxes, no matter who invented them, and their relation to computing.

\includegraphics[width=2in]{blackbox.png}

In computing a black box is a device that given an input returns an output, but we do not get to see inside the box. We only get the output, we do not get any idea how the box computed the value. In a white box, we get to see all of the steps taken in computing the output. Probably better names would be an opaque box and a transparent box. But black and white are the most popular names.
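The contrast is easy to render in code. Here is a minimal sketch in Python (the function names and the toy computation are ours, purely for illustration): both boxes compute the same value, but only the white box exposes the steps it took.

```python
def black_box(x):
    """Returns only the answer; the method stays hidden."""
    return x * x + 1

def white_box(x):
    """Returns the answer together with a trace of every step taken."""
    trace = []
    y = x * x
    trace.append(f"square {x} to get {y}")
    trace.append(f"add 1 to get {y + 1}")
    return y + 1, trace

print(black_box(3))          # the what only
value, steps = white_box(3)
print(value, steps)          # the what and the how
```

A caller who only needs the value cannot tell the two apart, which is exactly the point.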

\includegraphics[width=3in]{bwbox.png}

The devices built by Germany's great computing pioneer Konrad Zuse can be viewed as ``white boxes,'' because he devised an entire programming formalism to control them. He even wrote chess routines in his Plankalkül in the 1940s, before the published work of Alan Turing and Claude Shannon and others on computer chess, but chess was hardly a priority in those years.

What and How

Another way to think about black and white boxes is to notice the parallel with ``what'' and ``how.'' In English we use ``what'' to mean the goal, and ``how'' to mean the method for reaching the goal. A quote from a site on selling products is interesting: ``\textbf{What} is the end goal, it's directional. \textbf{How} is the journey\dots'' I have edited this slightly, changing caps to bold.

A Measure Of Complexity

I realized that we can use the what vs. how as a measure of complexity. Problems that force us to look inside a box are much harder, if not impossible, than those where we can avoid this.

$\bullet$ Physics:

We see the what/how divide all the time in physics. The famous Newtonian law of attraction between two masses

\[
\frac{Gm_{1}m_{2}}{r^{2}},
\]

is a perfect example of what. Isaac Newton had no idea why this works, that is, how it works. Albert Einstein, in his discovery of General Relativity, showed that the Newtonian formula is only an approximation. His field equations replace the above to explain more accurately the what of gravity. The theory successfully predicts many effects that show it is a refinement of Newton's, among them gravitational redshift, the bending of light by gravity, and the anomalous perihelion shift of Mercury. I would argue that we still have no idea how gravity works. Yes, general relativity says that masses cause spacetime to curve, and the curvature directs their motion; in particular, this helps explain how gravity acts at a distance. But does this really answer the how question?
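Using the formula is a pure what: plug in masses and a distance, get a force, with no commitment to how gravity acts. A quick sketch in Python (constants and masses are standard rounded textbook values):

```python
# Newton's law of attraction, used as a black box.
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
m_earth = 5.972e24   # mass of the Earth, kg
m_moon = 7.348e22    # mass of the Moon, kg
r = 3.844e8          # mean Earth-Moon distance, m

F = G * m_earth * m_moon / r**2
print(f"{F:.3e} N")  # on the order of 2e20 newtons
```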

The greatest example in the hundred years since general relativity has been quantum mechanics. No one knows how it works, but it yields incredibly accurate agreement between measurement and prediction at the smallest scales. There are many competing white-box attempts to explain it, but this black-box slogan still holds sway:

\includegraphics[width=2in]{ShutUpAndCalculateMug.png}

$\bullet$ Cryptography:

We see the what/how divide all the time in cryptography. Indeed, many crypto results view protocols as black boxes and use this routinely. There are cases where the how is important, where white boxes are needed in crypto; but these are the exception, and they are usually the harder cases.
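A familiar instance of black-box reasoning is a hash-based commitment: the protocol cares only about what the hash function returns, never about how it computes it. A minimal sketch in Python, with SHA-256 standing in for the black box (the helper names are ours):

```python
import hashlib
import os

def commit(message: bytes):
    """Publish the digest; keep the nonce secret until reveal time."""
    nonce = os.urandom(16)
    digest = hashlib.sha256(nonce + message).hexdigest()
    return digest, nonce

def verify(digest: str, nonce: bytes, message: bytes) -> bool:
    """Anyone can re-run the black box and check the commitment."""
    return hashlib.sha256(nonce + message).hexdigest() == digest

c, n = commit(b"attack at dawn")
print(verify(c, n, b"attack at dawn"))   # True
print(verify(c, n, b"attack at dusk"))   # False
```

The security argument treats SHA-256 abstractly; swapping in another hash changes the how but not the protocol.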

The physical idea of devices as black boxes enters into quantum protocols. Even there, some theoretical arguments can be defeated when it comes to implementations, if light can be shined inside the box to make it white. This is quite literally the basis of an attack with a ``blinding laser'' that we wrote about here.

$\bullet$ Programming:

We see the what/how divide all the time in programming. The whole point of abstraction, of modules, of object-oriented programming is to hide the how from us. Programmers are happy to use code written by other programmers, the what, without any understanding of the how. It is vital that a change in the how not disturb many other components in a software system that are relying on the what.
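A toy illustration in Python (the function names are ours): two implementations of the same what, so callers are undisturbed when the how changes.

```python
def sort_v1(xs):
    """Insertion sort: one possible how."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

def sort_v2(xs):
    """A different how: delegate to the library sort."""
    return sorted(xs)

# Callers depend only on the what, so either implementation will do.
for impl in (sort_v1, sort_v2):
    assert impl([3, 1, 2]) == [1, 2, 3]
```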

$\bullet$ Mathematics:

We also see the divide in mathematics. We can use the famous classification of finite simple groups, the what, without worrying at all about the how; we need not, and probably cannot. Few experts in group theory understand all of the how.

Yet applications of this theorem abound. Right now there is great difficulty in understanding the voluminous development of inter-universal Teichmüller theory on which Shinichi Mochizuki's claimed proof of the ABC Conjecture and several other conjectures is based. Some hope that further experience with the what of its implications will help to verify the how of Mochizuki's proofs.

Lower Bounds and How

We see the what/how divide all the time in lower bounds in complexity theory. The difficulty is that in many lower bounds we must treat the computation as a white box. This forces us into the how situation, which is much harder than when we can view computations as black boxes, the what case.

This shows up already with the simplest kind of circuit black box we can imagine, a comparator gate. It has two inputs $x$ and $y$, and two outputs that just give $\max(x,y)$ and $\min(x,y)$. The famous lower-bound theorem that circuits of these gates require $\Omega(n\log n)$ size to do sorting, which is taught in many undergraduate algorithms courses, is often mis-recalled as applying to all kinds of circuits, Boolean or numerical, and to general machine models. However, in some such models it is false, as shown by the $O(n\log n/\log\log n)$ upper bound of Michael Fredman and Dan Willard.
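The comparator model is easy to simulate. Here is a minimal sketch in Python (the representation and names are ours): a network is a list of wire pairs, each gate sorts its two wires, and the classic 5-gate network sorts any 4 inputs.

```python
from itertools import permutations

# Each pair (i, j) is a comparator gate: it replaces (x_i, x_j)
# with (min, max). This 5-gate network sorts all inputs of length 4.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def run_network(network, xs):
    xs = list(xs)
    for i, j in network:
        if xs[i] > xs[j]:
            xs[i], xs[j] = xs[j], xs[i]
    return xs

# Exhaustive check over all orderings of four distinct values.
assert all(run_network(NETWORK_4, p) == [0, 1, 2, 3]
           for p in permutations(range(4)))
```

Note the lower bound applies to the number of gates in any such network, not to the time of a general program.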

Consider a truly general circuit lower bound. All the results---all?---require a white-box view of the circuit. The whole reason that circuit lower bounds are so difficult is that we must argue, it seems, that the circuit cannot do something. And to do that we must get into the details of how a circuit can work. Sometimes the arguments that make full allowance for how do succeed: for example, we can get lower bounds on monotone circuits and on constant-depth circuits. All of these proofs require white-box arguments.

Consider a Turing Machine lower bound. These results sometimes can treat Turing Machines as black boxes. Turing's original proof of the unsolvability of the halting problem did not depend on details of the machines. Okay this is a bit tricky. He needed that any such machines could be simulated, so he did get involved with the how. However, the details of how were not relevant to the core argument, just that there was some way to simulate the machine.

The same weak dependence on how is used in many other complexity results since Turing's work. The usual diagonalization proofs use only weak information about the how. The myriad relativization results, typified by the existence of oracle sets $A$ and $B$ such that $\mathsf{P}^A = \mathsf{NP}^A$ but $\mathsf{P}^B \neq \mathsf{NP}^B$, attest to the need for white-box understanding of computational efficacy.
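Diagonalization in miniature can be sketched in Python (the list of functions is an arbitrary example of ours): given any finite table of functions, build one that differs from the $k$-th function at input $k$, using only black-box calls and never inspecting how any function works.

```python
# An arbitrary finite "enumeration" of functions on the integers.
fs = [lambda n: n, lambda n: n * n, lambda n: 0]

def diagonal(k):
    """Differs from fs[k] at input k; needs only black-box access."""
    return fs[k](k) + 1

# The diagonal function disagrees with every listed function somewhere.
assert all(diagonal(k) != fs[k](k) for k in range(len(fs)))
```

Turing's argument is this construction carried out over an effective enumeration of all machines.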

Open Problems

Perhaps there is a general way to morph black boxes into white, giving a theory that can provide ``fifty shades of grey'' in computation?