{\huge Checking the Higgs Boson}
Reliability and scientific discovery
\vspace{.3in}
Dennis Overbye is Deputy Science Editor of the New York Times. He has just written the lead article for Tuesday's ``Science'' section of the Times, which is entirely devoted to the recent discovery of the Higgs boson.
Today Ken and I want to talk about the large-scale human-reliability and software-reliability side of the equation.
As Overbye reports, the corks have popped on the bubbly, the press releases are out, people are buying their tuxedos, the Nobel and other prizes are coming, all is set---the elusive Higgs particle has been discovered. Or has it? How can we know, and when can we know it?
Overbye is a terrific reporter and writer who has written two books, \emph{Lonely Hearts of the Cosmos} and \emph{Einstein in Love}. His article, like his books, details the personal stories of the major scientists involved in these quests for discovery, and how they coped with issues along the way. We suggest you read the article yourself, and Overbye's companion piece, ``All Signs Point to Higgs, but Scientific Certainty Is a Waiting Game.''
Checkpoints
Two things have struck us about Overbye's articles and some surrounding commentary from the web. One is the involvement of over 3,000 people on each of the two major teams, ATLAS and CMS, working on the Higgs detection at CERN's Large Hadron Collider. Only a few people below the team leaders are mentioned in the article---of course only a few can be---including several graduate students named individually. But surely many more must have had critical and interconnecting roles vital to the integrity of the results---consider the seven million lines of code needed to run ATLAS, for one.
The second is the fundamental skepticism in force at various points in the process, especially regarding the December 2011 pre-announcement of ``evidence,'' which we discussed (see also this). This included the imposition by both teams of ``blind'' procedures to reduce human bias from January through June 2012, until the data were deemed sufficient for the results to be ``opened.'' Even now there are two discordant notes in the results: the ATLAS team has two different computations of the Higgs mass, 124.3 GeV versus 126.8 GeV, which lie outside each other's error bars (while CMS says 125.8 GeV), and the data have not yet pinned down that the particle has the spin value of 0 needed for it to be the Higgs, rather than 2. Of course scientific skepticism is necessary, and the articles show its good side. But all of this still has us wondering:
Have the human and software components been checked as thoroughly as the physics side has?
We are not doubting CERN here---CERN itself is famous as a bastion of the open-source movement and its benefits for reliability. Okay, we are doubting CERN here. There---we said it. Let's first talk about some doubts and then discuss explicitly a criterion that may help.
Who Checks The Checkers?
CERN's collider costs a lot. Here is one way to define ``a lot'': Google could build only about 25 such colliders with its current capital valuation of 275 billion dollars. The main goal of the CERN collider is---was?---to discover the Higgs particle. The curious situation is that nearly all physicists believed the particle existed before it was discovered. The physics community ``knew'' that there must be such a particle, but they needed experimental proof. Very good.
To make the experiments more reliable, they cleverly set up two independent teams to search for the Higgs. But were they really independent? We claim they were not---rather, they were symbiotic. Imagine the following scenario: one team, say ATLAS, discovers the particle while the other does not. Who would get the credit, the acclaim, the prizes? Of course they were also not independent in another sense: they used the same collider. The huge cost of the collider means that right now there is no other place on earth that can run the same experiments.
We are not physicists, although Ken knows a fair bit of physics, but we think there are fundamental issues that Overbye does not address. How do we really know that the Higgs boson has been found?
A Relative Matter
Arthur Eddington famously tested Albert Einstein's theory of general relativity. Einstein's theory predicted that light passing near the Sun would bend twice as much as Newtonian mechanics predicted. Eddington used the 1919 solar eclipse to confirm Einstein. For years there has been discussion that the confirmation could have been tainted by Eddington's strong belief that general relativity had to be right.
We now know via many other experiments, including our daily use of GPS, that general relativity is certainly correct in predicting the almost-double effect of light bending. But in 1919, when the experiments were performed, and for years afterward, there was discussion of how reliable they were. A recent paper by Daniel Kennefick argues that there was no problem with the experiments. The title says it all: ``Not Only Because of Theory: Dyson, Eddington and the Competing Myths of the 1919 Eclipse Expedition.''
So preconceived beliefs in science are certainly nothing new. What can we do about them?
Self-Checking in Experiments
The idea of self-checking in programs has been studied in theory for over two decades. One way you can be confident in results obtained via computation is when there is a quicker way to verify the answers after you have them, or when the answers themselves come with a proof of their correctness.
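To make the idea concrete, here is a minimal sketch of a checker in Python---ours for illustration, not anything from CERN's codebase. The untrusted step is a sort; the check confirms in linear time that the output is ordered and is a permutation of the input, which is cheaper than redoing the computation.
\begin{verbatim}
from collections import Counter

def untrusted_sort(xs):
    # Stand-in for a complex, possibly buggy computation.
    return sorted(xs)

def check_sort(original, result):
    # Verification is cheaper than recomputation: one linear pass
    # for order, one multiset comparison for content.
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    same_items = Counter(original) == Counter(result)
    return in_order and same_items

data = [5, 3, 8, 1, 3]
answer = untrusted_sort(data)
assert check_sort(data, answer), "self-check failed: reject the answer"
\end{verbatim}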
We imagine that most if not all of the LHC's machinery is self-checking. LHC physicist and software manager David Rousseau wrote a column on the ATLAS software for the Sept.--Oct. 2012 issue of IEEE Software, and noted:
``Meanwhile, the reconstructed [particle-collision] events are distributed worldwide, and thousands of histograms are filled with well-known quantities and monitored semiautomatically for a significant departure that would indicate a problem with the data. The process lets physicists analyze an event with confidence in less than one week after it’s recorded.''
Further down he mentions ``a wealth of self-calibrating algorithms'' which ensure that one detector is recording data in a manner consistent with other detectors. He then goes on to discuss how the raw data is refined by a human-guided process that has many heuristics to try to isolate subsets that will have interesting events.
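As a toy version of the monitoring Rousseau describes, one can compare each incoming histogram of a well-known quantity against a reference shape and flag any significant departure for human follow-up. The bin contents and threshold below are invented for illustration; the real ATLAS machinery is of course far more elaborate.
\begin{verbatim}
def chi_square(observed, expected):
    # Pearson chi-square statistic over bins with nonzero expectation.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

reference = [120, 480, 950, 470, 115]  # hypothetical well-understood shape
incoming  = [118, 495, 940, 462, 300]  # last bin departs sharply

THRESHOLD = 11.07  # 95% point of chi-square with 5 degrees of freedom
stat = chi_square(incoming, reference)
if stat > THRESHOLD:
    print("flag for inspection: chi2 = %.1f exceeds %.2f" % (stat, THRESHOLD))
\end{verbatim}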
Still, we wonder whether necessary features of the experiments bring their own vulnerabilities. We are prompted by the memory of the OPERA experiment's mistaken report, with high confidence, of faster-than-light speeds for neutrinos. The results were apparently faulty owing to relatively simple unverified hardware. In fact the cause of the faulty data is still on record as being
``a loose fibre optic cable connecting a GPS receiver to an electronic card in a computer.''
We speculate that the OPERA experiment had less innate self-checkability in general owing to its very fine tolerances on speed measurements and intrinsic difficulties in working with neutrinos over long distances on Earth. Two distant stations had to be synchronized after subtracting out relativistic effects, in a way that seems not to have come with its own verifiability check.
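For illustration only, here is the kind of redundancy that might have caught the fault sooner: derive the inter-station clock offset by two independent routes and refuse to report a time-of-flight when they disagree beyond tolerance. All numbers are invented; we are sketching a principle, not reconstructing OPERA's actual setup.
\begin{verbatim}
def cross_checked_offset(gps_offset_ns, twoway_offset_ns, tolerance_ns=10.0):
    # Two independent synchronization channels must agree before
    # either value is trusted.
    if abs(gps_offset_ns - twoway_offset_ns) > tolerance_ns:
        raise RuntimeError("sync channels disagree: suspect the hardware")
    return (gps_offset_ns + twoway_offset_ns) / 2.0

# A cable fault biasing one channel by tens of nanoseconds trips the check:
try:
    cross_checked_offset(gps_offset_ns=2310.0, twoway_offset_ns=2385.0)
except RuntimeError as err:
    print(err)
\end{verbatim}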
Happily, others were able to carry out similar experiments relatively cheaply, and the other $n-1$ votes had already upheld Albert Einstein's light limit by the time OPERA corrected their equipment-data analysis. Outside checkers for CERN are less available, however, and we are not the only ones to worry about possible weak links in the human-data chains.
Open Problems
Is there a sensible ``self-checkability metric'' for computer-dependent physics experiments?
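As a strawman, such a metric might score a pipeline by the fraction of its stages that carry an independent verification, weighted by how independent each check is. The stages and weights below are invented placeholders, not an accounting of any real experiment.
\begin{verbatim}
PIPELINE = {
    "detector readout":      1.0,  # duplicated sensors cross-checked
    "event reconstruction":  0.5,  # semi-automatic histogram monitoring
    "calibration":           0.5,  # self-calibrating against sibling detectors
    "statistical analysis":  1.0,  # two blinded analyses compared at unblinding
    "clock synchronization": 0.0,  # single channel, no redundancy
}

def self_checkability(pipeline):
    # Average check-independence across stages: 0 = unchecked, 1 = fully
    # redundant. A weakest-link (min) variant may be more honest.
    return sum(pipeline.values()) / len(pipeline)

print("toy score: %.2f out of 1.00" % self_checkability(PIPELINE))
\end{verbatim}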
We also note, via Peter Woit's physics blog, two articles posted on the Simons Foundation website. The first, by Natalie Wolchover, is about computer-aided mathematical proofs, while the second is an essay by Barry Mazur on the even murkier subject of what kind of ``evidence'' can lead you to believe in mathematical results while finding directions for trying to prove them.