Quick tests for checking whether a new math result is plausible

The proliferation of the Internet and the pressure to make headlines have led to a number of recent self-announcements of impressive-looking new mathematical results, often noted in press reports and blogs. This phenomenon is neither entirely new nor always without merit. Some genuine breakthroughs have been announced this way; one example is the August 2002 discovery of what is now known as the Agrawal–Kayal–Saxena primality test, by the three researchers of those names at the Indian Institute of Technology in Kanpur, India.

However, there are many other examples of mathematical results touted in press announcements that have not panned out. In August 2010, for example, Vinay Deolalikar, a mathematician at Hewlett-Packard Labs in Palo Alto, California, announced a proof that “P is not NP,” a long-standing and famous conjecture, and one of the unsolved problems for which the Clay Mathematics Institute has offered a US$1,000,000 prize. His proof was taken seriously, in part because he is quite knowledgeable in the field, but alas, after careful scrutiny by a number of leading researchers, fundamental flaws were identified and the proof did not stand. Deolalikar has since been revising his proof, but has not yet released a new draft. Additional details on P vs NP are available in our previous blog and in a recent New Scientist article.

Very recently, on 3 June 2011, a press report and paper surfaced in which a German mathematician named Gerhard Opfer claimed a proof of the “Collatz conjecture,” namely the assertion that the iteration “if n is even, divide it by 2; if n is odd, multiply it by 3 and add 1 to get 3n + 1” always eventually reaches 1, no matter which positive integer one starts with. See our previous blog. Alas, this proof already appears not to be sound.
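The Collatz iteration itself is simple enough to state in a few lines of code. Here is a minimal sketch (the function name is our own, chosen for illustration) that counts how many steps a starting value takes to reach 1, assuming the conjecture holds for that value:

```python
def collatz_steps(n):
    """Count iterations of the Collatz map needed for n to reach 1.

    Each step: n -> n/2 if n is even, otherwise n -> 3n + 1.
    Loops forever if the conjecture were false for this n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# Starting from 6: 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 (8 steps).
print(collatz_steps(6))   # prints 8
# 27 is a famously long trajectory: it takes 111 steps to reach 1.
print(collatz_steps(27))  # prints 111
```

Such computations have verified the conjecture for enormous ranges of starting values, but of course no amount of computation constitutes a proof.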

These and other instances have led various mathematicians to offer some “tests” that can be used at least for an initial screen. Scott Aaronson, for instance, offers the following rules (condensed from Aaronson blog):

  1. The authors don’t use TeX.
  2. The authors don’t understand the question.
  3. The approach seems to yield something much stronger and maybe even false (but the authors never discuss that).
  4. The approach conflicts with a known impossibility result (which the authors never mention).
  5. The authors themselves switch to weasel words by the end. The abstract says “we show the problem is in P,” but the conclusion contains phrases like “seems to work” and “in all cases we have tried.”
  6. The paper jumps into technicalities without presenting a new idea. If a famous problem could be solved only by manipulating formulas and applying standard reductions, then it’s overwhelmingly likely someone would’ve solved it already.
  7. The paper doesn’t build on (or in some cases even refer to) any previous work. Math is cumulative. Even Wiles and Perelman had to stand on the lemma-encrusted shoulders of giants.
  8. The paper wastes lots of space on standard material.
  9. The paper waxes poetic about practical consequences, deep philosophical implications, etc.
  10. The techniques just seem too wimpy for the problem at hand.

To this list, we add a few more:

  1. The paper wastes a lot of space on preliminaries. An experienced author with a major result usually will not spend much time introducing the topic.
  2. The list of references seems skimpy — it does not include one or more centrally important and well-known papers in the field.
  3. The paper includes lots of tables and figures of at best minor relevance. There is a place for heuristic and/or experimental results, but not as a dominant feature of a paper claiming to present a final proof of an important result.
  4. The punchline requires checking that results from earlier works are correct and apply, but no details are given, only a citation, even when the prior papers are neither short nor easily accessible.
  5. No genuine expert is thanked for comments on an earlier draft.

We could go on. Opfer’s offer sets off alarm bells for many of the reasons just adduced.

Let us start with the presumption that, sight unseen, a claimed proof of a big result from an unlikely source has at best a 5% chance of holding up. Add some of the above symptoms, and it is best to assume any such result is wrong until proven right. Most such proofs come from individuals who have been seized by a psychological certainty that they have done something great. Their colleagues need to help restrain them, and so avoid later embarrassment.
