Cheating Accusations at Corus 2007, and Before

Primary Sources and Discussion Forums (data is below)

Sueddeutsche Zeitung article (in German) by M. Breutigam detailing suspicious behavior by Silvio Danailov during Topalov's games against van Wely in round 2 and Karjakin in round 3.

ChessBase translation of Breutigam's article.

Jan 27 ChessNinja thread, "Foul Play In Chess"

Feb 3 ChessNinja thread, "Recrimination du Jour". Has links to other relevant articles (I'm trying not to clutter this page)...

New 2/9/07: Kommersant article on a possible official investigation into cheating claims about several tournaments. Video by an amateur Dutch videographer from Corus 2006. Kommersant article's link to the video (no longer works?).

Main Methodology

I have extended the testing window up to the 17- or 18-ply round, for three reasons: the allegations are clearly about confederate cheating; current engines in single-line mode seem to get to high search depth faster than their forebears(?); and quad-core machines are now common. These tests are on single-core Intel and AMD machines; tests on multi-core machines will come later.

It is still necessary to get a readout of (at least) 10 choices per move at some high search depth in order to assess the significance of a match or non-match, and for similarity metrics under development. It may be important to have multi-line verdicts for every search depth from 11 ply on in order to support a notion of the "swing" or "criticality" of a move, along the lines of "complexity" in the June 2006 Guid-Bratko study (PDF of paper). My Elista testing did 10-line mode only for the 11-, 12-, 13-, and 14-ply rounds. However, here we mainly try a reasonable "shortcut" methodology:

(a) As before, start on (at least) the previous move to the desired range of moves, to fill up the hash table.

(b) Single-line mode until getting a PV at depth 17 (or 18). Clip analysis then.

(c) While at that depth, increase the # of lines computed fully to (I recommend) 10. Clip analysis at 18/18, which marks the end of the 17-ply round (or at 19/19 if you went to depth 18 in single-line mode). The calculation will re-start from the beginning, but it will have the benefit that evals of many lines found in single-line mode will already be stored in hash.

(d) Step ahead to the next move while in 10-line mode. Wait until 11/11 shows. Clip that 11/11 if you can. (This may help fill some hash for later use by step (c), and might also reflect a reality of people getting a quick peek at options before going into single-line mode for optimum depth.) Then click the "-" button to reduce down to single-line mode and go to (b).

Until this kind of procedure can be scripted, steps (b) and (c) give ample time for being away from the machine, a considerable factor for busy people since step (c) especially may run over an hour depending on your hardware. In cases where all but some number k < 10 of lines are catastrophic, perhaps putting the engine into slow "mate-find" mode, you can reduce the # of lines in step (c) from 10 to k. Or if the move played does not show in the top 10, you can raise the # of lines until it shows.
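A minimal sketch of how steps (a)-(d) might be scripted, assuming the engine speaks the UCI protocol and honors the optional MultiPV setting (not every engine tested here necessarily does). The function names, the move-list format, and the choice to warm up the hash at full depth in step (a) are assumptions made for illustration; this is not the procedure as actually run in the GUI. The sketch only builds the command sequence. A driver would still have to send each command to the engine and wait for "bestmove" after every "go" before sending the next one.

```python
def position_cmd(moves, n):
    """UCI command for the position reached after the first n half-moves."""
    if n == 0:
        return "position startpos"
    return "position startpos moves " + " ".join(moves[:n])


def shortcut_plan(moves, first, last, depth=17, lines=10, peek_depth=11):
    """Build a UCI command list mimicking steps (a)-(d).

    moves: game moves in UCI coordinate notation (e.g. "e2e4"), hypothetical input.
    first, last: number of half-moves already played in the first and last
                 positions to be analyzed (inclusive).
    """
    if not 0 <= first <= last <= len(moves):
        raise ValueError("need 0 <= first <= last <= len(moves)")
    plan = []
    # (a) Start one position early to fill the hash table.
    #     Searching it to full depth is an assumption of this sketch.
    if first > 0:
        plan += ["setoption name MultiPV value 1",
                 position_cmd(moves, first - 1),
                 f"go depth {depth}"]
    for n in range(first, last + 1):
        pos = position_cmd(moves, n)
        # (b) Single-line mode to the target depth.
        plan += ["setoption name MultiPV value 1", pos, f"go depth {depth}"]
        # (c) Same position, same depth, now with `lines` lines; hash is reused.
        plan += [f"setoption name MultiPV value {lines}", pos, f"go depth {depth}"]
        # (d) Quick multi-line peek at the next position before returning to (b).
        if n < last:
            plan += [position_cmd(moves, n + 1), f"go depth {peek_depth}"]
    return plan


if __name__ == "__main__":
    demo_moves = ["e2e4", "e7e5", "g1f3", "b8c6"]  # placeholder moves, not game data
    for command in shortcut_plan(demo_moves, first=2, last=3):
        print(command)
```

Reducing the number of lines to k in step (c), as suggested above, corresponds to passing lines=k.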

It is also interesting to ask whether doing only single-line mode (to 17 or 18 ply) at every move, then re-starting from the beginning and doing every move in 10-line mode, will give notably different results. The shortcutted version of this involves clipping the multi-line analysis only at 17/17 (or 18/18 or 19/19 if you can; 16/16 if you can't wait). This "two-stage serial shortcut" method is in fact used in Buczyna's data.
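The multi-line readout that this methodology depends on could be extracted from engine output along the following lines. This is a sketch under assumptions: it reads UCI-style "info" lines (with the standard depth, multipv, score cp, and pv fields) rather than the GUI logs actually clipped for this page, and the helper names are hypothetical. It collects the choices reported at a given depth and says where the move played ranks among them and how far, in centipawns, it falls below the first choice. Mate scores are skipped for simplicity.

```python
def parse_info(line):
    """Parse one UCI 'info' line; return a dict, or None if it is not a PV report."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    try:
        depth = int(tokens[tokens.index("depth") + 1])
        # Engines in single-line mode may omit the multipv field.
        rank = int(tokens[tokens.index("multipv") + 1]) if "multipv" in tokens else 1
        s = tokens.index("score")
        kind, value = tokens[s + 1], int(tokens[s + 2])
        move = tokens[tokens.index("pv") + 1]
    except (ValueError, IndexError):
        return None
    return {"depth": depth, "rank": rank, "kind": kind, "value": value, "move": move}


def readout(lines, depth):
    """Map rank -> (first move of line, centipawn score) for reports at `depth`.

    Later reports at the same depth and rank overwrite earlier ones.
    """
    table = {}
    for line in lines:
        info = parse_info(line)
        if info and info["depth"] == depth and info["kind"] == "cp":
            table[info["rank"]] = (info["move"], info["value"])
    return table


def played_move_status(table, played):
    """Return (rank, centipawn gap below the top choice), or None if not listed."""
    if 1 not in table:
        return None
    best_score = table[1][1]
    for rank in sorted(table):
        move, score = table[rank]
        if move == played:
            return rank, best_score - score
    return None
```

A result of None from played_move_status corresponds to the case where the move played does not show among the lines computed, which is when the number of lines would need to be raised.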

Data and Reports

Note in Rd 2 by KWR in Jan 27 thread. (Now I wish I'd done this more formally, and I am doing so with Fritz 9---the full test on one engine takes hours, however! NEW 2/14: see note below.)

Long test file by Jason Buczyna of Junior 10 (and other engines) on the 2nd-round game Topalov-van Wely, from where theory was left at move 17. Tabulated results in this file. This expanded results file shows that Fritz 10 matches all but 2 moves left unmatched by Junior 10; tests on Fritz 10 alone are forthcoming...

Rybka 2.2n2 test file (long) by KWR of the 3rd-round game, with results in this file (not yet summarized).

Log of Fritz 9 on moves 28--36 by KWR, with results in this file. Note that moves 33, 35, and 36 are significantly inferior according to Fritz 9, though they were much closer for Rybka 2.2n2.

New 2/7/07: Based on the above, I gave my initial summary, and maintained it until 2/7: "The results of these runs do not confirm the accusations in the article. Of course they do not prove the absence of cheating---and possibly an engine not among those tested here was used---but in my opinion these results must be considered a counter to the article." However, the following run by Jason Buczyna on Fritz 10 makes a totally striking contrast to his results on Junior 10...

Long test file by Jason Buczyna of Fritz 10 (contrasted with his Rybka 2.2; my "2.2n2" is said to be significantly different) on the 2nd-round game Topalov-van Wely, from where theory was left at move 17. Tabulated results in this file. This expanded results file shows that Fritz 10 matches all but 2 moves. The 15/17 raw match rate will also score high in our metrics because more than half of the matches were in close situations. JB has begun a test of Game 3 with Fritz 10, to compare to my runs which use Fritz 9 and the contemporary Rybka 2.2n2 (note that the slightly-older Rybka 2.2 in his expanded file matches far less often than Fritz 10).

Long Fritz 9 test file by KWR of the same Rd. 2 game, with results and comments in this file and tabulated results. This largely confirms my informal observations in my initial Jan 27 comments at ChessNinja, and contrasts with the Fritz 10 results, as compared here.

Long test file of Round 3, Karjakin-Topalov, by Jason Buczyna. Moves and comments, and tabulated results. Results from this game are highly inconclusive, even excluding moves before move 20, and moves 23 and 27-28 (or 26-27?) when Breutigam reports Danailov as having been "interrupted". Moreover, the moves "shortly before the time control...had become hectic"; all 4 of moves 37--40 are matches, but were seemingly unsignalled.

Guessing at what the formal statistical methods will show, this seems a higher match rate than what one could have called an exoneration, but lower than what one would (together with Round 2) have called a "smoking gun". The run also highlights methodological difficulties caused by differing verdicts between "single-line" and "multi-line" modes.
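The raw match rates quoted in the reports above (such as 15/17), and the remark that matches in close situations should weigh more, can be illustrated with a simple tally. This is only a sketch: the record layout, the exclusion set, and especially the 20-centipawn threshold for calling a situation "close" are assumptions for illustration, not the formal metrics under development. The sample records in the usage part are placeholders, not data from any game discussed here.

```python
from dataclasses import dataclass


@dataclass
class MoveRecord:
    move_number: int
    played: str          # move actually played
    engine_first: str    # engine's first choice at the test depth
    second_gap: int      # centipawns between engine's 1st and 2nd choices


def match_summary(records, excluded=(), close_gap=20):
    """Count tested moves, matches, and matches made in close situations.

    excluded: move numbers to leave out (e.g. theory, or reported interruptions).
    close_gap: assumed threshold; a match counts as "close" when the engine's
               second choice was within this many centipawns of its first.
    """
    kept = [r for r in records if r.move_number not in excluded]
    matches = [r for r in kept if r.played == r.engine_first]
    close = [r for r in matches if r.second_gap <= close_gap]
    return {"tested": len(kept), "matches": len(matches), "close_matches": len(close)}


if __name__ == "__main__":
    # Placeholder records for illustration only.
    sample = [
        MoveRecord(20, "Nf3", "Nf3", 10),
        MoveRecord(21, "Bd2", "Bd2", 150),
        MoveRecord(22, "a4", "h3", 5),
        MoveRecord(23, "Qe2", "Qe2", 15),
    ]
    print(match_summary(sample, excluded={23}))
```

A match where the second choice was far behind (a forced recapture, say) says little, which is why the tally separates close matches from the raw count.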

Conclusions, Opinions, and Further Discussion

This topic and further allegations by Nigel Short (as reported 1/30/07 in this DNA India article and 2/1/07 in this Leonard Barden column) and others continue to reverberate. I have tried to furnish concrete suggestions and analysis at the following places---search for "KWRegan" for my comments:

Susan Polgar's Blog, Mon. 1/29/07 9:31am, "The Kramnik-Topalov Debate". Appeal for data-gathering help, partial rebuttal to Breutigam's article, and demonstration of how my hard data (from Elista) can counter allegations made by other commenters.

Susan Polgar's Blog, Tue. 1/30/07 10am, "The BIG cheating debate". Proposed definition of requirements for a cheating accusation.

Susan Polgar's Blog, Thu. 2/1/07 7:49pm, "A difficult question about right or wrong". Same proposal as last item, plus noting a distinction made by Leonard Barden which speaks to Susan's original main question in the item.

BCM Blog, Sun. 1/28/07, "Anyone for Cheating?" (2/25/10: Link now shorts out to one's Yahoo profile. The item is still listed in archives at http://www.bcmchess.co.uk/news/arch2007.html, but links there do the same thing.) My comment amplifies wavelengths I detect in John Saunders' musings, and then makes a more-detailed appeal for engine-testing help than I put in Susan's blog.

Mig Greengard's Feb. 3 "Recrimination du Jour" item has many concrete comments and evaluations, by Mark Crowther and John Lee Shaw who were at Corus 2007, and others. Search "kwregan" for mine, many of which relate to data on this page and other factual matters.

Place in this thread where Kommersant article starts being discussed on Feb. 8. My comments stick to giving factual information and posing questions to ask---I do not try to evaluate the video, IMHO a job for professionals.

Feb. 10 ChessNinja item, "Video Killed the Chess Star" on the video in the Kommersant article.

For hopefully lighter humor: what if Tonio Dungailov coached the Indy Colts? This was in response to Susan's blogging the Super Bowl as she does chess games!

Overdub of the video, posted by "dublinty" at YouTube and HREFed by Mig at ChessNinja.com.

If you have research-relevant ideas which may shed more light, by all means please e-mail them to me! My home page has full contact info, plus I am listed.