{\huge A Chess Firewall at Zero?}
Humans halve their blunder rate when infinitesimally ahead, but why? \vspace{.25in}
Lucius Seneca was a Roman playwright, philosopher, and statesman of the first century. He is called ``the Younger'' because his father Marcus Seneca was also a famous writer. His elder brother Lucius Gallio appears as a judge in the Book of Acts. Besides many quotations from his work, Seneca is famous for one he is not known to have said:
``To err is human.''
Lest we cluck at human error in pinning down ancient quotations, the source for the following updated version is also unknown---even with our legions of computers by which to track it:
``To err is human, but to really screw things up requires a computer.''
Today I report a phenomenon about human error that is magnified by today's computers' deeper search, and that I believe arises from their interaction with complexity properties of chess.
I have previously reported some phenomena that my student Tamal Biswas and I believe owe primarily to human psychology. This one I believe is different---but I noticed it only a week ago, so who knows. It is this: the proportion of large errors by human players in positions where computers judge them to be a tiny fraction of a pawn ahead is under half the rate in positions where the player is judged ever so slightly behind.
The full version of what Seneca actually wrote---or rather didn't write---is even more interesting in the original Latin:
``Errare humanum est, perseverare autem diabolicum, et tertia non datur.''
This means: ``To err is human; to persevere in error is of the devil; and no third possibility is granted.'' The phrase tertium non datur is used for the Law of Excluded Middle in logic. In logic the law says that either Seneca wrote the line or he didn't, with no third possibility. We will say more about this law shortly. Amid disputes about whether human behavior and its measurement follow primarily ``rational'' or ``psychological'' lines, we open a third possibility: ``complexitarian.''
State of Computer Chess
Here are some important things to know about chess and computer chess programs (called ``engines'').
One upshot is that depth of cogitation is solidly quantifiable in the chess setting. We have previously posted about our papers giving evidence of its connection to human thinking and error. The new phenomenon leans on this connection, but we will argue that it has a different explanation.
The Phenomenon
My training sets include all recorded games in the years 2010--2014 between players rated within 10 points of the same century or half-century mark of the Elo rating system. For example, at the Elo 2200 level commonly regarded as the threshold for ``master,'' the set has 1,083 games with both players rated between 2190 and 2210. Skipping the first eight moves in any game---that is, 16 plies---the set has 67,545 positions (counting multiple occurrences in games), each of which was reproducibly analyzed in single-line mode by the newly released version 7 of Stockfish to depth at least 20 and by last month's version 9.3 of Komodo to depth at least 19 (the depth my tests so far indicate is closest in strength to Stockfish's depth 20).
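For concreteness, here is a minimal sketch of how such a training set could be extracted from a PGN file, assuming the python-chess library; the file path, function name, and band parameter are illustrative, not the scripts actually used.

\begin{verbatim}
# Sketch: keep games whose players are both within 10 Elo of a given
# century/half-century mark, and collect the positions reached after
# the first 16 plies.  Requires the python-chess package.
import chess.pgn

def positions_at_level(pgn_path, level=2200, band=10, skip_plies=16):
    positions = []
    with open(pgn_path, encoding="utf-8") as f:
        while True:
            game = chess.pgn.read_game(f)
            if game is None:
                break
            try:
                w = int(game.headers.get("WhiteElo", 0))
                b = int(game.headers.get("BlackElo", 0))
            except ValueError:
                continue                      # missing or malformed rating
            # both players within the band, e.g. 2190--2210 for level 2200
            if abs(w - level) > band or abs(b - level) > band:
                continue
            board = game.board()
            for ply, move in enumerate(game.mainline_moves()):
                if ply >= skip_plies:
                    positions.append(board.fen())   # position before the move
                board.push(move)
    return positions
\end{verbatim}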
The value of a position is the same as the value of the best move(s) in the position. In multi-line mode we can get the value of the played move directly, while in single-line mode we can if needed take the value of the next position. A value of +150 centipawns or more is commonly labeled a ``decisive advantage'' by chess software, though there are many exceptions; similarly a value of -150cp or worse is considered ``losing,'' or at least desperate. Dividing by 100 to get so-called ``pawn values,'' we can group positions by value into tenth-of-a-pawn ranges: +0.01 to +0.10, +0.11 to +0.20, and +0.21 to +0.30, all of which are usually labeled ``equal'' by chess software, and likewise -0.30 to -0.21, -0.20 to -0.11, and -0.10 to -0.01, where the player to move is judged slightly behind by the machine but considered basically equal in human terms. The ranges continue upward past +0.71 to +0.80, where the advantage is considered serious, to +1.51 to +1.60 and beyond, and downward into the dumps of -1.60 to -1.51 and worse.
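To make the grouping concrete, here is one hypothetical way to map a centipawn evaluation to these range labels; the function name and formatting are my own, not taken from the actual analysis code.

\begin{verbatim}
def value_bin(cp):
    """Map a centipawn evaluation (mover's point of view) to its range label."""
    if cp == 0:
        return "0.00 exactly"
    if cp > 0:
        lo = ((cp - 1) // 10) * 10 + 1          # e.g.  7 -> 1,  23 -> 21
        return f"+{lo / 100:.2f} to +{(lo + 9) / 100:.2f}"
    hi = -(((-cp - 1) // 10) * 10 + 1)          # e.g. -7 -> -1, -23 -> -21
    return f"{(hi - 9) / 100:.2f} to {hi / 100:.2f}"

# value_bin(7)   -> "+0.01 to +0.10"
# value_bin(-23) -> "-0.30 to -0.21"
\end{verbatim}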
Let us, arbitrarily but accordingly, classify any move that drops the value by 1.50 pawns or more as a ``blunder.'' If the player starts ahead by (say) 3.00 or more this might not cost anything, but we are going to graph only positions in the value range +1.00 to -1.00 anyway. Value exactly 0.00 gets a range to itself. The final wrinkle is that we will use the engines' highest-depth values to group the positions, but distinguish cases where the human player's move was regarded as a blunder at depth 5, 10, 15, and 20 (or 19 for Komodo in the relatively few cases where it did not reach depth 20). Doing so distinguishes immediate blunders like hanging your queen from subtle mistakes that require a lot of depth to expose. This yields tables like the following, first for the 2200 level with Komodo (a sketch of how such a table might be tallied appears after it):
\begin{tabular}{|l|r|r|r|r|r|}
\hline
Value range & \#pos & d05 & d10 & d15 & d20 \\
\hline
$-1.00$ to $-0.91$ & 782 & 12 & 13 & 17 & 19 \\
\multicolumn{6}{|c|}{\dots} \\
$-0.30$ to $-0.21$ & 2,510 & 13 & 17 & 17 & 24 \\
$-0.20$ to $-0.11$ & 2,712 & 14 & 17 & 13 & 20 \\
$-0.10$ to $-0.01$ & 2,585 & 16 & 13 & 18 & 24 \\
\hline
0.00 exactly & 4,968 & 71 & 64 & 82 & 100 \\
\hline
$+0.01$ to $+0.10$ & 2,378 & 9 & 11 & 11 & 12 \\
$+0.11$ to $+0.20$ & 2,985 & 13 & 15 & 14 & 18 \\
$+0.21$ to $+0.30$ & 2,817 & 19 & 23 & 17 & 16 \\
\multicolumn{6}{|c|}{\dots} \\
$+0.91$ to $+1.00$ & 953 & 6 & 5 & 7 & 7 \\
\hline
\multicolumn{6}{|l|}{Totals} \\
\hline
$-1.00$ to $-0.01$ & 17,076 & 149 & 145 & 199 & 259 \\
$+0.01$ to $+1.00$ & 19,067 & 100 & 107 & 97 & 105 \\
\hline
\end{tabular}
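Here is a hedged sketch of how such a table could be tallied, assuming each record already carries the engine's highest-depth evaluation of the position and the value drop of the played move as judged at depths 5, 10, 15, and 20; the record format and helper names (including value_bin from the earlier sketch) are hypothetical.

\begin{verbatim}
# Hypothetical tally of the table above.  Each record is a pair
# (high_depth_eval_cp, drops), where drops maps a depth in {5,10,15,20}
# to the centipawn value lost by the move the human actually played,
# as judged at that depth.  A drop of 150cp or more counts as a blunder.
from collections import defaultdict

DEPTHS = (5, 10, 15, 20)
BLUNDER_CP = 150

def blunder_table(records):
    counts = defaultdict(lambda: {"pos": 0, **{d: 0 for d in DEPTHS}})
    for eval_cp, drops in records:
        if not -100 <= eval_cp <= 100:      # graph only -1.00 .. +1.00
            continue
        row = counts[value_bin(eval_cp)]    # group by highest-depth value
        row["pos"] += 1
        for d in DEPTHS:
            if drops.get(d, 0) >= BLUNDER_CP:
                row[d] += 1
    return counts
\end{verbatim}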
What Can Explain It?
Open Problems