Reliability, reproducibility and the Reinhart-Rogoff error

Harvard faculty Carmen Reinhart and Kenneth Rogoff are two of the most respected and influential academic economists active today.

On April 16, 2013, doctoral student Thomas Herndon and professors Michael Ash and Robert Pollin, at the Political Economy Research Institute at the University of Massachusetts Amherst, released the results of their analysis of two 2010 papers by Reinhart and Rogoff, papers that also provided much of the grist for their 2011 best seller This Time Is Different. The Reinhart-Rogoff papers had analyzed economic growth rates spanning nearly two centuries, in numerous nations, and concluded that when the ratio of public debt to gross domestic product (GDP) exceeds 90%, average real growth falls to -0.1% (i.e., a 0.1% decline).

In their analysis of the Reinhart-Rogoff paper, Herndon, Ash and Pollin first attempted to replicate the results for just the 20 advanced economies over the period 1946-2009, since these data are the most relevant to present-day U.S. and European policy debates. After being unable to replicate these results, they obtained the actual spreadsheet that Reinhart and Rogoff had used for this calculation. After analyzing the spreadsheet, they identified three errors.

The most serious error was that in their Excel spreadsheet, Reinhart and Rogoff did not select the entire row when averaging growth figures; they omitted data from Australia, Austria, Belgium, Canada and Denmark. When this error was corrected, the 0.1% decline became a 2.2% increase. Additional details are in the Herndon-Ash-Pollin paper and also in a nice summary provided by the Economist. The Reinhart-Rogoff paper also employed some relatively bizarre weightings in aggregating data from different countries, and arguably is guilty of various other methodological errors as well.
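The range-selection error described above can be illustrated with a minimal Python sketch. The growth figures and country labels below are invented for illustration and are not the Reinhart-Rogoff data, and the helper `checked_mean` is a hypothetical name. The sketch only shows how an average taken over an incomplete range can change sign without any warning, and how a simple row-count check would flag the problem.

```python
from statistics import mean

# Invented growth figures (percent per year), for illustration only.
# These are NOT the Reinhart-Rogoff data.
growth = {
    "Country A": 3.0,
    "Country B": 2.5,
    "Country C": 2.0,
    "Country D": 1.5,
    "Country E": 1.0,
    "Country F": -1.0,
    "Country G": -2.0,
}


def checked_mean(values, expected_count):
    """Average the values, refusing to run if any rows are missing."""
    values = list(values)
    if len(values) != expected_count:
        raise ValueError(f"expected {expected_count} rows, got {len(values)}")
    return mean(values)


all_rows = list(growth.values())
partial_rows = all_rows[5:]  # mimics a selected range that skips the first five rows

full_average = checked_mean(all_rows, expected_count=len(growth))
partial_average = mean(partial_rows)  # silent error: nothing checks the row count
```

With these made-up numbers, `full_average` is 1.0 while `partial_average` is -1.5, and calling `checked_mean(partial_rows, expected_count=len(growth))` raises a `ValueError` instead of returning a misleading figure.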

In other words, the key conclusion of a seminal paper, which has been widely quoted in political debates in North America, Europe, Australia and elsewhere, is invalid. For example, the paper was cited by Paul Ryan in his proposed 2013 budget “The Path to Prosperity: A Blueprint for American Renewal.” Undoubtedly, absent Reinhart and Rogoff, Ryan would have found some other data to support his conservative point of view; but he must have been delighted that he had heavyweight economists such as Reinhart and Rogoff apparently in his corner. Mind you, Reinhart and Rogoff have not tried to distance themselves from this view of their work.

This is not the first time that a data- and/or math-related mistake resulted in major embarrassment and expense. Here are a few other large blunders, as recently summarized by Matthew Zeitlin:

  1. Mariner 1. Only about 5 minutes after its launch in 1962, Mariner’s navigation code malfunctioned, making it necessary to destroy the craft. The problem was a missing hyphen in its computer code for transmitting navigation instructions.
  2. Intel’s Pentium bug. In 1994, a mathematician found a hardware error in the new Intel Pentium processor (for certain arguments, its division was only accurate to limited precision). Intel at first downplayed the error, but ultimately was forced to replace many of the processors, at a cost of approximately $500 million.
  3. USS Yorktown. In 1997, the Yorktown, a large missile cruiser, had to be towed back to port after its propulsion system failed, because its database attempted to divide by zero, and no default path was provided.
  4. Mars Climate Orbiter. In September 1999, NASA’s $125M Mars Orbiter craft crashed onto the surface, because engineers who had designed its landing system used English units, but then forgot to convert to metric units as required by NASA.
  5. Barclays PLC and Lehman Brothers. When Lehman Brothers was going through bankruptcy, Barclays bought a large segment of the now-defunct bank. However, due to a spreadsheet error, the purchase mistakenly included 179 contracts that Barclays had not intended to buy.
  6. European flight ban. In 2010, in the wake of the eruption of the Eyjafjallajökull volcano in Iceland, European transportation officials canceled thousands of flights, at enormous cost to both airlines and passengers. However, an EU transportation official later acknowledged that “many of the assumptions in the computer models were not backed by scientific evidence.”
  7. J.P. Morgan Chase. In 2012, one contributing factor to the “London Whale” fiasco, which resulted in a multibillion-dollar loss for the company, was a spreadsheet error: a calculation divided by the sum, rather than the average, of two hazard rates, which understated the potential volatility by a factor of two.
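The factor of two in the last item follows from simple arithmetic, sketched below in Python. The sketch assumes the error took the form of dividing a change in hazard rates by their sum instead of their average; the function names and the two rates are hypothetical and are not taken from the actual spreadsheet.

```python
def relative_change_intended(old_rate, new_rate):
    """Change in the rate, divided by the AVERAGE of the two rates."""
    return (new_rate - old_rate) / ((old_rate + new_rate) / 2)


def relative_change_buggy(old_rate, new_rate):
    """Change in the rate, divided by the SUM of the two rates (the assumed slip)."""
    return (new_rate - old_rate) / (old_rate + new_rate)


# Hypothetical hazard rates, chosen only to make the arithmetic easy to follow.
old_rate, new_rate = 0.02, 0.03

intended = relative_change_intended(old_rate, new_rate)
buggy = relative_change_buggy(old_rate, new_rate)
ratio = buggy / intended  # the sum is twice the average, so this is one half
```

Because the sum of two numbers is always twice their average, the buggy version is half the intended value whatever rates are supplied, which is consistent with volatility being understated by a factor of two.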

While many different types of errors were involved in these calamities, the fact that the errors in the Reinhart-Rogoff paper were not identified earlier can be ascribed to the pervasive failure of scientific and other researchers to make all data and computer code publicly available at an early stage, preferably when the research paper documenting the study is submitted for review.

We have discussed this general topic in a previous Math Drudge blog and a related Huffington Post article. There we emphasized that the culture of computing has not kept pace with its rapidly ascending pre-eminence in modern scientific and social science research.

Most certainly the issue is not just one for political economists, although the situation seems worst in the social sciences. In a private letter now making the rounds, behavioural psychologist Daniel Kahneman (a Nobel laureate in economics) has implored social psychologists to clean up their act to avoid a “train wreck.” He specifically discusses the importance of replicating experiments and studies on priming effects.

In traditional experimental research work, researchers have been taught to record every detail of their work, including experimental design, procedures, equipment, raw results, data processing, statistical methods and other tools used to analyze the results.

In contrast, relatively few researchers who employ computing in modern science, whether for large-scale, highly parallel climate simulations or for simple processing of social science data, take such care in their work. In most cases, there is no record of the workflow or of the hardware and software configuration, and often even the source code is no longer available (or has been revised numerous times since the study was conducted).
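As a rough illustration of what a minimal record of workflow and configuration might look like, the Python sketch below captures the software environment, a fingerprint of the input data, and the analysis parameters in a small manifest. The function names, field names, and sample inputs are all hypothetical; this is a sketch using only the standard library, not a description of any particular reproducibility tool.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def sha256_of_bytes(data: bytes) -> str:
    """Return a hex fingerprint identifying exactly which input data was used."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(input_bytes: bytes, parameters: dict) -> dict:
    """Collect basic facts about a run so that it can be re-examined later."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
        "input_sha256": sha256_of_bytes(input_bytes),
        "parameters": parameters,
    }


# Hypothetical input data and parameters, for illustration only.
sample_input = b"country,growth\nCountry A,3.0\n"
record = provenance_record(sample_input, {"years": "1946-2009"})

# A JSON manifest like this could be stored alongside the results.
manifest = json.dumps(record, indent=2, sort_keys=True)
```

Saving such a manifest with every set of results costs a few lines of code, and it lets a later reader confirm which data, parameters and software versions produced a given figure.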

As emphasized in our earlier blog, the result is a seriously lax environment where deliberate fraud and genuine error can proliferate.

These concerns were addressed in a recent ICERM workshop on Reproducibility in Computational and Experimental Mathematics. It recommended that a major cultural change be enacted in the field, including, for example, new and significantly stricter standards required of papers by journal editors and conference chairs, together with software tools to facilitate the storage of files relating to the computational workflow.

The conference report is available here, and summaries are available in a Simons Foundation article, in Wired and in an earlier blog of ours.

There is plenty of blame to spread around. Science journalists need to do a better job of reporting such critical issues and must not be blinded by seductive numbers. This is not the first time that impressive-looking data, later retracted, have been trumpeted in the media. And the stakes can be enormous.

If Reinhart and Rogoff (the latter a chess grandmaster) had made any attempt to allow access to their data at the conclusion of their study, the Excel error would have been caught and their other arguments and conclusions could have been tightened. They might still be the most dangerous economists in the world, but they would not now be in the position of saving face in the light of damning critiques in the Atlantic and elsewhere.

For an economist, the five most terrifying words in the English language are: I can’t replicate your results. But for economists Carmen Reinhart and Ken Rogoff of Harvard, there are seven even more terrifying ones: I think you made an Excel error.

Listen, mistakes happen. Especially with Excel. But hopefully they don’t happen in papers that provide the intellectual edifice for an economic experiment — austerity — that has kept millions out of work. Well, too late. (Matthew O’Brien)

[This also appeared, slightly edited, in the Huffington Post.]

[Added 26 Apr 2013: Economist Robert J. Samuelson commented on the Reinhart-Rogoff error in this Washington Post column. Also, Velichka Dimitrova noted in a New Scientist article that "open data" might have spared us the pain. Finally, Reinhart and Rogoff have responded to the current discussion in their own New York Times Op-Ed.]

[Added 26 May 2013: Economist and New York Times columnist Paul Krugman comments on an open letter to Krugman by Reinhart and Rogoff.]

[Added 31 May 2013: University of Michigan economist Miles Kimball and a colleague, in their analysis of the Reinhart-Rogoff data, found "not a shred of evidence" that high debt levels lead to slower economic growth. Similarly, University of Massachusetts economist Arindrajit Dube's analysis of this data found little or no correlation between growth rates and debt levels beyond a certain moderate level. These results are summarized in a Huffington Post article by Mark Gongloff.]
