## Program Verification

J Strother Moore described program verification in his talk entitled “A Mechanized Program Verifier” at the 2005 Verification Grand Challenge Workshop.

The verification community seeks to build a system that can follow the reasoning of an average human engaged in a creative, computational, problem solving task lasting months or years and spanning an arbitrary set of application domains.

Now step back and reconsider what was just said: we seek to build a system that follows the reasoning of an average human in an arbitrary, creative, computational, problem-solving task.

I liked his quote, because it summarizes well the type of software I am trying to build, among which is a verification tool. I also find this interesting because the name of the conference identifies verification as the *Grand Challenge*, implying that I am just an inexperienced outsider, without the academic credentials, foolishly attacking an extremely hard, maybe impossible problem, that have already eluded the top minds.

Moore then compares the 40–year old field to a dark continent still to be explored and that we are still explorers in a sparsely inhabited coast.

If the augmentation of the human mind were a dark continent to be explored, then such applications are on the coast.

I say let us explore. I say let us mount an expedition. Let us go deeply into the heart of the continent. We will not map the whole thing -- only a narrow path. But let's get off the coast. Let's build a track into the interior.

Let's build a machine that reasons. What do I mean by ``reason''? I mean to deduce the consequences of more or less arbitrary assumptions.

Here's my suggestion. Let's build a system that can follow -- effortlessly -- the reasoning of a single talented human being engaged in some deep problem solving task. So ... the machine must be much more powerful than today's mechanical expert assistants.

I don’t think this 40–year old field has been really studied. There was a substantial amount of research done in the 1970s. (Many of the typewritten research papers back then strike me as being of higher quality than those of modern times.) While there has been a recent revival in software verification in the past decade, some of the new work has been mostly pragmatic. Between those two times, an entire generation from the 1980s to 1990s of research was lost. The one major paper, I found from the 1980s was a PhD thesis, “Techniques of Program Verification,” written by Greg Nelson. This turns out to be the same guy, 25 years later, working on the Simplify theorem prover, which is being used today as the basis for a number of verification tools including SLAM, Spec#, and ESC/Java. It seems that just a few people are driving the field of software verification forward. (Much of the field seems to have moved onto hardware verification, which may have had a more critical need for the technology.) An Internet search confirmed my suspicions as one Stanford researcher responded in his email to a colleague on “Why has verification died?”:

From the late 1960's to the early 1980's software verification was a thriving field: new inference methods were developed, tools (like the Stanford Pascal Verifier) were developed, PhD's granted, and so forth. Since then it seems that little has happened: verifiers aren't part of industry's tool kit, computer science students rarely use verification (and even less often, verifiers), etc. What happened?OK. I'll take a wild shot at this. Here's my extreme position:

Verification is hard. But when there is an adequate incentive to be sure about correctness, some combination of verification and testing is worthwhile. There were and still are significant language and semantics issues. The early work was almost all on first-order languages -- Pascal does not allow procedure parameters that have procedure parameters. When you try to go beyond that, you get into lots of questions about semantic models. That has taken a long time to explore. The lofty goal of complete verification is too ambitious. Set your sights lower, and you'll have a better chance of success.

(I don’t actually think any of the three points are true, but I’ll address those in a future post. The issue of higher-order functions, in particular, can be handled in a satisfactory way.)

The late 1990s on, according to Infoworld, saw new source code tools and accelerated research, both of which bode well for the future of automated source code analysis. However, the article is somewhat disheartening. Source code analysis is seen as necessarily consuming vast quantities of computing resources, particularly time, and being primarily driven forward by Moore’s law rather than better technology. “With too much precision, an analysis can take years,” the article writes. Consequently, commercial vendors have given up and shifted focus towards issues like usability.

… there has been a fundamental philosophical shift in how we approach the issue of source code analysis. Early researchers were interested in program correctness, he says. The goal was to prove that “my program will, under all circumstances, compute what I intend it to compute.” Now, he says, the emphasis has switched to a more tractable form of proof: that “there are specific properties my program does not have.” Buffer overflows and deadlocks are examples of such properties.

The problem lies in the naive use of the traditional theorem prover, which, through elementary steps of deduction, leads to exponential analysis times. Many researchers have resigned themselves to notion that analysis must be intractable, because of a theoretical result that states that first-order predicate logic and many other logical theories are not decidable. However, the same theoretical results necessarily apply to human beings, slogging through proofs with pen and paper, yet mathematicians, for some reason, are far more effective.

The analogous approach of the brute-force theorem prover in the human forum is work of formalists, Betrand Russell and David Hilbert, in the early 1900s. As De Millo writes in “Social Processes of Theorems and Programs,”

They saw mathematics as proceeding in principle from axioms or hypotheses to theorems by steps, each step easily justifiable from its predecessors by a strict rule of transformation, the rules of transformation being few and fixed. The Principia Mathematica was the crowning achievement of the formalists. It was also the deathblow for the formalist view. There is no contradiction here: Russell did succeed in showing that ordinary working proofs can be reduced to formal, symbolic deductions. But he failed, in three enormous, taxing volumes, to get beyond the elementary facts of arithmetic. He showed what can be done in principle and what cannot be done in practice. If the mathematical process were really one of strict, logical progression, we would still be counting on our fingers.

…or more succinctly, with a quote from Henri Poincare, “if it requires twenty-seven equations to establish that 1 is a number, how many will it require to demonstrate a real theorem.”

It should be possible for the computer to analyze programs in a more human-like manner; however much overhead we incur in maintaining fidelity to how the human mind works, it can’t be as bad as existing exponential approaches, which are leading researchers to spend considerable time devising a multitude of exotic duct-tape solutions on top of bad infrastructure.

In my talk on February 5th, I’ll describe a novel, simple and stunningly elegant approach, which is partially realized in my NStatic product and could be a better solution to the verification problem.