Surprisingly, all of these models are exactly equivalent: anything computable in the lambda calculus is computable by a Turing machine, and similarly for any other pair of the above computational systems. After this was proved, Church expressed the belief that the intuitive notion of "computable in principle" is identical to the above precise notions. This belief, now called the "Church-Turing Thesis", is universally accepted by mathematicians.
Part of the impetus for the drive to codify what is computable came from the mathematician David Hilbert. Hilbert believed that all of mathematics could be precisely axiomatized. He felt that once this was done, there would be an "effective procedure", i.e., an algorithm that would take as input any precise mathematical statement, and, after a finite number of steps, decide whether the statement was true or false. Hilbert was asking for what would now be called a decision procedure for all of mathematics.
As a special case of this decision problem, Hilbert considered the validity problem for first-order logic. First-order logic is a mathematical language in which most mathematical statements can be formulated. Every statement in first-order logic has a precise meaning in every appropriate logical structure, i.e., it is true or false in each such structure. Those statements that are true in every appropriate structure are called valid. Those statements that are true in some structure are called satisfiable. Notice that a formula, ϕ, is valid iff its negation, ¬ϕ, is not satisfiable.
Hilbert called the validity problem for first-order logic the Entscheidungsproblem. In their textbook, Principles of Mathematical Logic, Hilbert and Ackermann wrote, "The Entscheidungsproblem is solved when we know a procedure that allows for any given logical expression to decide by finitely many operations its validity or satisfiability. … The Entscheidungsproblem must be considered the main problem of mathematical logic."
In his 1930 Ph.D. thesis, Gödel presented a complete axiomatization of first-order logic, based on the Principia Mathematica by Whitehead and Russell. Gödel proved his Completeness Theorem, namely that a formula is provable from the axioms if and only if it is valid. Gödel's Completeness Theorem was a step towards the resolution of Hilbert's Entscheidungsproblem.
In particular, since the axioms are easily recognizable and the rules of inference are very simple, there is a mechanical procedure that can list out all proofs. Note that each line in a proof is either an axiom or follows from previous lines by one of the simple rules. For any given string of characters, we can tell whether it is a proof. Thus we can systematically list all strings of characters of length 1, 2, 3, and so on, and check whether each of these is a proof. If so, then we can add the proof's last line to our list of theorems. In this way all theorems, i.e., exactly the valid formulas of first-order logic, can be listed out by a simple mechanical procedure. More precisely, the set of valid formulas is the range of a computable function. In modern terminology we say that the set of valid formulas of first-order logic is recursively enumerable (r.e.). Gödel's Completeness Theorem was not sufficient, however, to give a positive solution to the Entscheidungsproblem. Given a formula, ϕ, if ϕ is valid then the above procedure would eventually list it out and thus could answer, "Yes, ϕ is valid." However, if ϕ were not valid then we might never find this out. What was missing was a procedure to list out all the non-valid formulas, or equivalently, to list out all satisfiable formulas. A year later, in 1931, Gödel shocked the mathematical world by proving his Incompleteness Theorem: there is no complete and computable axiomatization of the first-order theory of the natural numbers. That is, there is no reasonable list of axioms from which we can prove exactly the true statements of number theory.
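This enumeration procedure is easy to express in modern programming terms. Here is a minimal Python sketch, where is_proof is a hypothetical predicate standing in for the mechanical proof checker of the axiom system, and the alphabet shown is an assumed stand-in for the real one:

```python
from itertools import count, product

ALPHABET = "()∧¬∀x,=S0\n"   # assumed finite alphabet for writing proofs

def theorems(is_proof):
    """Given a decidable proof checker, yield every theorem
    (the last line of each proof), possibly with repetitions."""
    for n in count(1):                    # strings of length 1, 2, 3, ...
        for chars in product(ALPHABET, repeat=n):
            candidate = "".join(chars)
            if is_proof(candidate):       # a finite check on each string
                yield candidate.splitlines()[-1]
```

Any set listed out this way is r.e.: the generator may run forever, but every theorem eventually appears.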
A few years later, Church and Turing independently proved that the Entscheidungsproblem is unsolvable. Church did this by using the methods of Gödel's Incompleteness Theorem to show that the set of satisfiable formulas of first-order logic is not r.e., i.e., it cannot be systematically listed out by a function computable in the lambda calculus. Turing introduced his machines and proved many interesting theorems, some of which we will discuss in the next section. In particular, he proved the unsolvability of the halting problem, and he obtained the unsolvability of the Entscheidungsproblem as a corollary.
Hilbert was very disappointed because his program towards a decision procedure for all of mathematics was proved impossible. However, as we will see in more detail in the rest of this article, a vast amount was learned about the fundamental nature of computation.
Turing showed that he could build a Turing machine, U, that was universal, in the sense that it could run the program of any other Turing machine. More explicitly, for any i and any input w, U on inputs i and w would do exactly what Mi would do on input w, in symbols, U(i, w) = Mi(w).
Turing's construction of a universal machine gives the most fundamental insight into computation: one machine can run any program whatsoever. No matter what computational tasks we may need to perform in the future, a single machine can perform them all. This is the insight that makes it feasible to build and sell computers. One computer can run any program. We don't need to buy a new computer every time we have a new problem to solve. Of course, in the age of personal computers, this fact is such a basic assumption that it may be difficult to step back and appreciate it.
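Here is a minimal sketch of such a universal interpreter in Python; the encoding of programs as transition tables and all the names are illustrative assumptions, not Turing's own construction:

```python
from collections import defaultdict

def run(program, tape, state="start", blank="_", max_steps=10_000):
    """Simulate `program`, a dict mapping (state, symbol) to
    (new_state, new_symbol, move), on the given input tape."""
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    for _ in range(max_steps):            # step bound for safety only
        if state == "halt":
            break
        state, cells[head], move = program[(state, cells[head])]
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in range(min(cells), max(cells) + 1))

# An example program: flip every bit, halting at the first blank.
flip = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "R"),
}
print(run(flip, "0110"))                  # prints 1001_
```

The single function run plays the role of U: it executes any machine whatsoever, given that machine's program and input.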
Let M be a Turing machine and n a natural number. We say that M's tape contains the number n, if M's tape begins with a binary representation of the number n (with no unnecessary leading 0's) followed by just blank symbols from there on.
If we start the Turing machine M on a tape containing n and it eventually halts with its tape containing m, then we say that M on input n, computes m: M(n) = m. If, when we start M on input n, it either never halts, or when it halts, its tape does not contain a natural number, e.g., because it has leading 0's, or digits interspersed with blank symbols, then we say that M(n) is undefined, in symbols: M(n) = ↗. We can thus associate with each Turing machine, M, a partial function, M: N → N ∪ {↗}. We say that the function M is total if for all n∈N, M(n) ∈ N, i.e., M(n) is always defined.
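As a small illustration of this convention (a sketch, with "_" as an assumed blank symbol):

```python
def tape_contents(tape, blank="_"):
    """Return the number the tape contains, or None (i.e., ↗)
    when the tape is not a well-formed binary numeral."""
    digits = tape.rstrip(blank)
    well_formed = (
        digits != ""
        and set(digits) <= {"0", "1"}            # no interspersed blanks
        and (digits == "0" or digits[0] != "0")  # no unnecessary leading 0's
    )
    return int(digits, 2) if well_formed else None

assert tape_contents("1010___") == 10
assert tape_contents("0010___") is None          # leading 0's: undefined
```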
Now we can formally define what it means for a set to be recursively enumerable (r.e.), which we earlier described informally. Let S ⊆ N. Then S is r.e. if and only if there is some Turing machine, M, such that S is the image of the function computed by M, in symbols,

S = { M(n) | n ∈ N, M(n) ∈ N }.
Thus, S is r.e. just if it can be listed out by some Turing machine. Suppose that S is r.e. and its elements are enumerated by Turing machine M as above. We can then describe another Turing machine, P, which, on input n, runs M in a round-robin fashion on all its possible inputs until eventually M outputs n. If this happens then P halts and outputs "1", i.e., P(n) = 1. If n ∉ S, then M will never output n, so P on input n will never halt, i.e., P(n) = ↗. Let the notation P(n)↓ mean that Turing machine P on input n eventually halts. For a Turing machine, P, define L(P), the set accepted by P, to be those numbers n such that P on input n eventually halts,

L(P) = { n ∈ N | P(n)↓ }.
The above argument shows that if a set S is r.e. then it is accepted by some Turing machine, P, i.e., S = L(P). The converse of this statement holds as well. That is, S is r.e. if and only if it is accepted by some Turing machine, P.
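The round-robin machine P can be modelled in Python by treating the enumerator M as a generator (this hides the step-bounded, round-robin simulation that a real Turing-machine construction would use):

```python
def accepts(M, n):
    """P from the text: halt with output 1 iff M ever enumerates n."""
    for m in M():            # in a real machine, a round-robin over all inputs
        if m == n:
            return 1         # n ∈ S, so P halts
    # if n ∉ S, the loop above never ends: P(n) = ↗

def squares():               # an example r.e. set: the perfect squares
    i = 0
    while True:
        yield i * i
        i += 1

print(accepts(squares, 49))  # prints 1; accepts(squares, 50) would never halt
```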
We say that a set, S, is decidable if and only if there is a total Turing machine, M, that decides for all n ∈ N whether or not n ∈ S. Think of "1" as "yes" and "0" as "no". For all n ∈ N, if n ∈ S, then M(n) = 1, i.e., M on input n eventually halts and outputs "yes", whereas if n ∉ S, then M(n) = 0, i.e., M on input n eventually halts and outputs "no". Synonyms for decidable are: computable, solvable, and recursive.
For S ⊆ N, the complement of S is N - S, i.e., the set of all natural numbers not in S. We say that the set S is co-r.e. if and only if its complement is r.e. If a set, S, is r.e. and co-r.e. then we can list out all of its elements in one column and we can list out all of its non-elements in a second column. In this way we can decide whether or not a given element, n, is in S: just scan the two columns and wait for n to show up. If it shows up in the first column then n ∈ S. Otherwise it will show up in the second column and n ∉ S. In fact, a set is recursive iff it is r.e. and co-r.e.
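A Python sketch of the two-column argument, again modelling the two enumerators as generators:

```python
from itertools import count

def decide(n, enum_S, enum_coS):
    """Scan the two columns until n shows up in one of them."""
    col1, col2 = enum_S(), enum_coS()
    while True:
        if next(col1) == n:
            return 1         # n appeared in the first column: n ∈ S
        if next(col2) == n:
            return 0         # n appeared in the second column: n ∉ S

def evens():                 # an example set that is both r.e. and co-r.e.
    for i in count():
        yield 2 * i

def odds():                  # its complement
    for i in count():
        yield 2 * i + 1

print(decide(7, evens, odds))   # prints 0
```

Since every n eventually shows up in one of the two columns, decide always halts: the set is decidable.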
Turing actually constructed a non-decidable set. As we will see, he did this using a diagonal argument. The diagonal argument goes back to Georg Cantor who used it to show that the real numbers are uncountable. Gödel used a similar diagonal argument in his proof of the Incompleteness Theorem in which he constructed a sentence, J, in number theory whose meaning could be understood to be, "J is not a theorem."
Turing constructed a diagonal halting set, K, as follows:

K = { n ∈ N | Mn(n)↓ }.
That is, K consists of those Turing machines that eventually halt when given their own programs as input.
It is not hard to see that K is r.e. Suppose, for the sake of a contradiction, that K is also co-r.e., and let d be the number of a Turing machine that accepts the complement of K. That is, for any n,

Md(n)↓ ⇔ n ∉ K.
But consider what happens when we substitute d for n in the above equation:

Md(d)↓ ⇔ d ∉ K.
However, the definition of K tells us that:

d ∈ K ⇔ Md(d)↓.
Thus we have that

d ∈ K ⇔ d ∉ K,
which is a contradiction. Thus our assumption that K is co-r.e. is false, and hence K is not recursive. It follows that it is not a computable problem to be given a Turing machine and its input and to decide whether or not the Turing machine will eventually halt on that input, i.e., the halting problem is unsolvable.
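The same diagonal argument has a familiar rendering in programming terms. In this Python sketch, halts is the assumed total decider whose existence the argument refutes:

```python
def halts(prog, arg):
    """Assumed total decider: True iff prog(arg) would eventually halt.
    The construction below shows that no such function can exist."""
    raise NotImplementedError

def diag(prog):
    if halts(prog, prog):    # ask the oracle about prog run on itself...
        while True:          # ...and do the opposite: loop forever
            pass
    return 0                 # ...or halt immediately

# diag(diag) halts  iff  halts(diag, diag) is False
#                   iff  diag(diag) does not halt:
# the contradiction d ∈ K ⇔ d ∉ K in programming clothes.
```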
We begin with the three initial primitive recursive functions:

ζ(x) = 0 (zero); σ(x) = x + 1 (successor); η(x) = x (identity).
Now consider the following two operations:
Composition: f(x1, …, xk) = g(h1(w1), …, hr(wr)), where each wi is a list of ri arguments, perhaps with repetition, from x1, …, xk; and,
Primitive Recursion: h(0, x2, …, xk) = f(x2, …, xk); h(n+1, x2, …, xk) = g(h(n, x2, …, xk), n, x2, …, xk). Here composition is the natural way to combine functions, and primitive recursion is a restricted kind of recursion in which h with first argument n+1 is defined in terms of h with first argument n, with all the other arguments unchanged. (This is equivalent in power to bounded iteration, cf. the language Bloop from [Hofstadter, 1979].)
Define the primitive recursive functions to be the smallest class of functions that contains the Initial functions and is closed under Composition and Primitive Recursion.
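These definitions translate directly into Python. In the following minimal sketch (the names zeta, sigma, eta, compose, and prim_rec are illustrative), composition is simplified by handing each hi the whole argument list and letting it do its own selecting:

```python
def zeta(x): return 0          # ζ: zero
def sigma(x): return x + 1     # σ: successor
def eta(x): return x           # η: identity

def compose(g, *hs):
    """Composition: f(x̄) = g(h1(...), ..., hr(...))."""
    return lambda *xs: g(*(h(*xs) for h in hs))

def prim_rec(f, g):
    """Primitive recursion: h(0, x̄) = f(x̄);
    h(n+1, x̄) = g(h(n, x̄), n, x̄). Implemented as a loop."""
    def h(n, *xs):
        acc = f(*xs)
        for i in range(n):
            acc = g(acc, i, *xs)
        return acc
    return h

# Addition, previewing the next examples: g(prev, n, y) = σ(prev).
add = prim_rec(eta, compose(sigma, lambda prev, n, y: prev))
assert add(3, 4) == 7
```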
The primitive recursive functions have a very simple definition and yet they are extremely powerful. Gödel proved inductively that every primitive recursive function can be simply represented in first-order number theory. He then used the primitive recursive functions to encode formulas and even sequences of formulas by numbers. He finally used the primitive recursive functions to compute properties of the represented formulas including that a formula was well formed, a sequence of formulas was a proof, and that a formula was a theorem.
It takes a long series of lemmas to show how powerful the primitive recursive functions are. The following are a few examples showing that addition, multiplication, and exponentiation are primitive recursive.
Define the addition function, P(x,y), as follows:

P(0, y) = η(y)
P(x+1, y) = η(σ(P(x, y)))
(Note that this fits into the definition of primitive recursion because the function g(x1,x2,x3) = η(σ(x1)) is definable from the initial functions η and σ by composition.)
Next, define the multiplication function, T(x,y), as follows:

T(0, y) = ζ(y)
T(x+1, y) = P(T(x, y), y)
Next, we define the exponential function, E(x,y) = x^y. (Usually 0^0 is considered undefined, but since primitive recursive functions must be total, we define E(0,0) to be 1.) Since primitive recursion only allows us to recurse on the first argument, we use two steps to define the exponential function:

R(0, y) = σ(ζ(y))
R(x+1, y) = T(R(x, y), y)
Finally we can define E(x,y) = R(η(y),η(x)) by composition. (Recall that η is the identity function so this could be more simply written as E(x,y) = R(y,x).)
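To see these schemas in action, here is a small self-contained Python transcription (written with loops rather than recursion so that large first arguments don't overflow the call stack):

```python
def P(x, y):        # addition: P(0,y) = η(y); P(x+1,y) = σ(P(x,y))
    for _ in range(x):
        y = y + 1
    return y

def T(x, y):        # multiplication: T(0,y) = ζ(y); T(x+1,y) = P(T(x,y), y)
    acc = 0
    for _ in range(x):
        acc = P(acc, y)
    return acc

def R(x, y):        # R(0,y) = σ(ζ(y)); R(x+1,y) = T(R(x,y), y)
    acc = 1
    for _ in range(x):
        acc = T(acc, y)
    return acc

def E(x, y):        # E(x,y) = R(y,x), i.e., x to the power y
    return R(y, x)

assert E(2, 10) == 1024 and E(3, 4) == 81
```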
The exponential function, E, grows very rapidly: for example, E(10,10) is ten billion, and E(50,50) is over 10^84 (and thus significantly more than the estimated number of atoms in the universe). However, there are much faster-growing primitive recursive functions. As we saw, E was defined from the slowly growing function, σ, using three applications of primitive recursion: one for addition, one for multiplication, and one more for exponentiation. We can continue to apply primitive recursion to build a series of unimaginably fast growing functions. Let's do just one more step in the series, defining the hyper-exponential function, H(n,m), as 2 to the 2 to the 2 to the … to the m, with a tower of n 2's. H is primitive recursive because it can be defined from E using one more application of primitive recursion:

H(0, m) = m
H(n+1, m) = E(2, H(n, m))
Thus H(2,2) = 2^4 = 16, and H(3,3) = 2^256, which is more than 10^77 and comparable to the number of atoms in the universe. If that's not big enough for you, then consider H(4,4). To write this number in decimal notation we would need a one followed by more zeros than the number of particles in the universe.
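A direct transcription of H (using Python's built-in exponentiation in place of E(2, ·)):

```python
def H(n, m):        # H(0,m) = m; H(n+1,m) = E(2, H(n,m)) = 2**H(n,m)
    return m if n == 0 else 2 ** H(n - 1, m)

assert H(2, 2) == 16
assert H(3, 3) == 2 ** 256      # about 1.2 × 10^77
```

(Do not try H(4, 4): as noted above, even writing it down is hopeless.)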
However, the primitive recursive functions do not include all functions computable in principle. To see this, we can again use diagonalization. We can systematically encode all definitions of primitive recursive functions of arity 1, calling them p1, p2, p3, and so on.
We can then build a Turing machine to compute the value of the following diagonal function, D(n) = pn(n) + 1.
Notice that D is a total, computable function from N to N, but it is not primitive recursive. Why? Suppose, for the sake of a contradiction, that D were primitive recursive. Then D would be equal to pd for some d ∈ N. But it would then follow that

D(d) = pd(d) + 1 = D(d) + 1,
which is a contradiction. Therefore, D is not primitive recursive.
Alas, the above diagonal argument works on any class of total functions that could be considered a candidate for the class of all computable functions. The only way around this, if we want all functions computable in principle, not just in practice, is to add some kind of unbounded search operation. This is what Gödel did to extend the primitive recursive functions to the recursive functions.
Define the unbounded minimization operator, μ, as follows. Let f be a perhaps partial function of arity k+1. Then μ[f] is defined as the following function of arity k. On input x1, …, xk do the following:
For i = 0 to ∞ do {
     if f(i, x1, …, xk) = 1, then output i
}
Thus if f(i, x1, …, xk) = 1, and for all j < i, f(j, x1, …, xk) is defined, but not equal to 1, then μ[f](x1, …, xk) = i. Otherwise μ[f](x1, …, xk) is undefined.
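In Python, μ can be sketched as a combinator whose result is genuinely partial: the search below is unbounded and may never return:

```python
from itertools import count

def mu(f):
    """μ[f]: return the least i with f(i, x̄) = 1, if any; otherwise loop."""
    def g(*xs):
        for i in count():
            if f(i, *xs) == 1:
                return i
    return g

# Example: integer square root as a μ-search, testing (i+1)^2 > x.
isqrt = mu(lambda i, x: 1 if (i + 1) ** 2 > x else 0)
print(isqrt(10))    # prints 3
```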
Gödel defined the set of recursive functions to be the closure of the initial primitive recursive functions under composition, primitive recursion, and μ. With this definition, the recursive functions are exactly the same as the set of partial functions computable by the lambda calculus, by Kleene's formal systems, by Markov algorithms, by Post machines, and by Turing machines.
In this section we are dealing with complexity instead of computability, and all the Turing machines that we consider will halt on all their inputs. Rather than accepting by halting, we will assume that a Turing machine accepts by outputting "1" and rejects by outputting "0". Thus we redefine the set accepted by a total machine, M,

L(M) = { n ∈ N | M(n) = 1 }.
The time that an algorithm takes depends on the input and the machine on which it is run. The first important insight in complexity theory is that a good measure of the complexity of an algorithm is its asymptotic worst-case complexity as a function of the size of the input, n. For an input, w, let n = |w| be the length of w. Following [Hartmanis, 1989] we say that a Turing machine M runs in time T(n) if for all w of length n, M(w) takes at most T(n) steps and then halts. This is called worst-case complexity because T(n) must be as large as the time taken by any input of length n. For any function T : N → N, let

TIME[T(n)] = { L(M) | M runs in time O(T(n)) }.
Alan Cobham and Jack Edmonds identified the complexity class, P, of problems recognizable in some polynomial amount of time, as an excellent mathematical wrapper of the class of feasible problems: those problems all of whose moderately-sized instances can be feasibly recognized,

P = ∪k∈N TIME[n^k].
Any problem not in P is certainly not feasible. On the other hand, natural problems that have algorithms in P tend to eventually have algorithms discovered for them that are actually feasible.
Many important complexity classes besides P have been defined and studied; a few of these are NP, PSPACE, and EXPTIME. PSPACE consists of those problems solvable using some polynomial amount of memory space. EXPTIME is the set of problems solvable in time 2^p(n) for some polynomial, p.
Perhaps the most interesting of the above classes is NP: nondeterministic polynomial time. The definition comes not from a real machine, but rather a mathematical abstraction. A nondeterministic Turing machine, N, makes a choice (guess) of one of two possible actions at each step. If, on input w, some sequence of these choices leads to acceptance, then we say that N accepts w, and we say the nondeterministic time taken by N on input w is just the number of steps taken in the sequence that leads to acceptance. A nondeterministic machine is not charged for all the other possible choices it might have made, just the single sequence of correct choices.
NP is sometimes described as the set of problems, S, that have short proofs of membership. For example, suppose we are given a list of m large natural numbers: a1, …, am, and a target number, t. This is an instance of the Subset Sum problem: is there a subset of the m numbers whose sum is exactly t? This problem is easy to solve in nondeterministic linear time: for each i, we guess whether or not to take ai. Next we add up all the numbers we decided to take and if the sum is equal to t then accept. Thus the nondeterministic time is linear, i.e., some constant times the length of the input, n. However, there is no known (deterministic) way to solve this problem in time less than exponential in n.
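The following Python sketch shows both halves of this discussion: checking a guessed subset (the certificate) takes linear time, while the obvious deterministic algorithm tries all 2^m subsets:

```python
from itertools import combinations

def check(numbers, take, t):
    """Verify a certificate quickly: does the chosen subset sum to t?"""
    return sum(numbers[i] for i in take) == t

def subset_sum(numbers, t):
    """Deterministic search over all 2^m certificates: exponential time."""
    m = len(numbers)
    for r in range(m + 1):
        for take in combinations(range(m), r):
            if check(numbers, take, t):
                return take
    return None

print(subset_sum([3, 9, 8, 4, 5, 7], 15))   # prints (2, 5): 8 + 7 == 15
```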
There has been a large study of algorithms, and the complexity of many important problems is well understood. In particular, reductions between problems have been defined and used to compare the relative difficulty of two problems. Intuitively, we say that A is reducible to B (A ≤ B) if there is a simple transformation, τ, that maps instances of A to instances of B in a way that preserves membership, i.e., τ(w) ∈ B ⇔ w ∈ A.
Remarkably, a high percentage of naturally occurring computational problems turn out to be complete for one of the above classes. (A problem, A, is complete for a complexity class C if A is a member of C and all other problems B in C are no harder than A, i.e., B ≤ A. Two complete problems for the same class have equivalent complexity.)
The reason for this completeness phenomenon has not been adequately explained. One plausible explanation is that natural computational problems tend to be universal in the sense of Turing's universal machine. A universal problem in a certain complexity class can simulate any other problem in that class. The reason that the class NP is so well studied is that a large number of important practical problems are NP complete, including Subset Sum. None of these problems is known to have an algorithm that is faster than exponential time, although some NP-complete problems admit feasible approximations to their solutions.
A great deal remains open about computational complexity. We know that strictly more of a particular computational resource lets us solve strictly harder problems, e.g., TIME[n] is strictly contained in TIME[n^1.01], and similarly for SPACE and other measures. However, the trade-offs between different computational resources are still quite poorly understood. It is obvious that P is contained in NP. Furthermore, NP is contained in PSPACE because in PSPACE we can systematically try every single branch of an NP computation, reusing space for the successive branches, and accepting if any of these branches leads to acceptance. PSPACE is contained in EXPTIME because if a PSPACE machine takes more than exponential time, then it has exactly repeated some configuration, so it must be in an infinite loop. The following are the known relationships between the above classes:

P ⊆ NP ⊆ PSPACE ⊆ EXPTIME.
However, while it seems clear that P is strictly contained in NP, that NP is strictly contained in PSPACE, and that PSPACE is strictly contained in EXPTIME, none of these inequalities has been proved. In fact, it is not even known that P is different from PSPACE, nor that NP is different from EXPTIME. The only known proper inclusion from the above is that P is strictly contained in EXPTIME. The remaining questions concerning the relative power of different computational resources are fundamental unsolved problems in the theory of computation.