[Written under the influence of QCSD]
Not so long ago, I happened to watch the totally-accurate-in-every-possible-way film, “Travelling Salesman”. tl;dw: a squad of government-hired mathematicians are finally able to prove that P = NP, giving the state the power to answer every important computational question they can imagine (aka rob banks and fight terrorism). But when requested to hand over their findings, the mathematicians begin to doubt that they are doing the right thing (as opposed to, say, publishing the results and letting everyone enjoy the spoils). A difficult question indeed – unleashing a polynomial algorithm for an NP-complete problem does imply breaking almost all cryptosystems in use today, in addition to creating better flight plans for cross-country salesmen. Definitely, if we ever prove P = NP, the history of mankind will be divided into distinct “before” and “after” phases.
But what if the mathematicians aren’t interested in breaking ciphers, but just want their well deserved fame and glory for solving a very hard problem? It is here that I wonder if the computer science and mathematics communities are ready for non analytical proofs as part of science-doing, and not just as part of complexity classes.
A brief word of explanation first. The rigour with which results are demonstrated differs in various academic fields. In sociology and medicine, statistical significance is basically all there is, so saying “there is a less than 2% chance of our hypothesis being wrong given the clinical tests” is pretty darn good. In experimental physics, too, we can never be more sure than our measurement error. However, in theoretical physics, mathematics and computer science, an analytical proof, filled with lemmas, propositions and an abundance of silly symbols, is needed. Indeed, even computer-assisted proofs, ala the Four Color Theorem, are frowned upon and looked at with a wary eye.
One reason for the wariness of computers, of course, is that computer programs and hardware naturally have bugs, and who’s to say that the calculation was correct? Before stating that something is an irrevocable mathematical truth, one must be sure! Of course, humans make mistakes when writing and reading proofs as well; a small bug hiding in a 300 page manuscript might be able to evade all the reviewers and find itself in publication, impostering as a profound truth.
If printed mathematical texts can be wrong (god forbid!), why not accept that fact and welcome probabilistic proofs – proofs that are probably correct (why probably? because there is randomness involved), up to an error as small as we choose. Computer science already has this well defined – the BPP, MA, SZK, and PCP complexity classes, for example – but what about real-life science? (gee, now I know why Scott complains about naming conventions in complexity theory…)
I propose the following scheme: say our mathematician Bob wants to show the world that he has proved P = NP (and therefore, also P = co-NP), and in fact has discovered an algorithm for solving NP complete problems. However, he does not want to give away the algorithm, just show that he knows it. He thus set up a private server which handles incoming requests, and starts playing a sort of zero knowledge interactive proof protocol with the mathematical community. The community sends him boolean expressions of increasingly larger size n. For each expression s: if s is satisfiable, the server replies with a satisfying assignment. If not, the server replies with a proof that verifies that there does not exist a satisfying assignment. In other words, the server replies whether s is in SAT or in SAT-complement, and either way provides a proof for it. These are NP and co-NP complete problems, so solving them in polynomial time shows that P = NP.
Now, all currently known algorithms take super-polynomial time to run. So if Bob only had access to known algorithms, he would not be able to solve the problems as n becomes larger. However, assuming that the polynomial in Bob’s algorithm isn’t of very high degree, and that its leading coefficients aren’t gigantic, he can answer the questions even for large expressions. Knowing Bob’s computational power, the community could very well know if he runs in exponential or polynomial time, proving that he does not lie.
Where does probability come into play? Well, it may be that Bob is guessing. Bob might be a fraud, and whenever he gets a boolean expression, he randomly generates polynomial-sized certificates, and sees if they prove satisfaction / unsatisfaction of the expression. Of course, his chances of succeeding are exponentially small as n gets larger, but for every finite number of tests the community presses upon him, there is always a finite chance that he gets it right. Also, even if Bob runs a deterministic super-polynomial algorithm, for some lucky expressions he might be able to get the right answers quickly** (see problem below), so the community itself needs to randomize the expressions they send.
As it is, will this kind of “proof” catch on among fellow mathematicians? I doubt it. Even if you exclude the whole “his algorithm runs fast so it must be polynomial” problem, I think we are still far off from a more ideal world: one that has, in addition to science journals, an international array of servers dedicated to interactive, zero knowledge, and other probabilistic proofs (actually, under plausible assumptions, most those can be de-interactivated, no?), representing those theorems and propositions which we almost know are true; we’re just missing an ε.
Notes on note:
Note and future thoughts: The solution I have given is a bit awkward, in that it requires Bob to solve a lot of NP and co-NP problems, instead of just giving a normal zero knowledge proof to the expression “P = NP”. Suppose Bob only knows an algorithm for SAT. Can he encode it in such a way so that he can prove that it is indeed a polynomial algorithm for SAT, but that no one else will be able to use it? This is different from the zero-knowledge that I know of, in the sense that we aren’t just asking whether a specific string is in some language (aka an expression is in SAT) – we want to show that a string (program / proof) has some properties, without showing that actual string. If there are any real complexity theorists reading this, please do leave a comment.
Note to note to note: I recall hearing somewhere about encrypted zero knowledge computation – that is, Bob could send his algorithm to the community in an encrypted form, so that the community will do the calculations but have no idea what they are doing (and the computation will be different for each input, so they will not be able to reproduce the algorithm on different inputs than what they already asked). First of all, this is cool. Second, this will help base the fact that the algorithm is polynomial – but still won’t get around the lucky guesses that Bob could make if he frauds up a super-polynomial algorithm.
Too cool not to note: On the subject of “errors in human-reviewed manuscripts”: if rigour and no-chance-for-error is what you are looking for, why not submit formal proofs to the leading journals? This would constitute a sequence of lines; the first few would be the axioms, and the rest would be logically irrefutable inferences from the previous lines. The last would be the theorem you want to prove, stemming irrevocably from the axioms. Any monkey / mathematician / computer could go over these proofs and verify that they are true, no chance of error involved. Of course, I’m not the one to think of this idea, many have done so before me; for example, the QED manifesto (now, that’s a name I could live by…).
**Problem with solution: I assume here that expressions the mathematical community sends are, probabilistically, hard enough to solve with modern SAT solvers. I know that some of these solvers can work very well in some cases, failing miserably only, for example, on expressions with just one satisfying assignment. In this specific case, the assumption is that enough of the expressions only have one satisfying assignment (I seem to recall that this applies to a lot of boolean expressions).