# P-complete

In computational complexity theory, a decision problem is **P-complete** (complete for the complexity class **P**) if it is in **P** and every problem in **P** can be reduced to it by an appropriate reduction.

The notion of **P-complete** decision problems is useful in the analysis of the following questions, specifically when stronger notions of reducibility than polynomial-time reducibility are considered:

- which problems are difficult to parallelize effectively,
- which problems are difficult to solve in limited space.

The specific type of reduction used varies and may affect the exact set of problems. Generally, reductions stronger than polynomial-time reductions are used, since all languages in **P** (except the empty language and the language of all strings) are **P**-complete under polynomial-time reductions. If we use **NC** reductions, that is, reductions which can operate in polylogarithmic time on a parallel computer with a polynomial number of processors, then all **P**-complete problems lie outside **NC** and so cannot be effectively parallelized, under the unproven assumption that **NC** ≠ **P**. If we use the stronger log-space reduction, this remains true, but additionally we learn that all **P**-complete problems lie outside **L** under the weaker unproven assumption that **L** ≠ **P**. In this latter case the set of **P**-complete problems may be smaller.

## Motivation

The class **P**, typically taken to consist of all the "tractable" problems for a sequential computer, contains the class **NC**, which consists of those problems which can be efficiently solved on a parallel computer. This is because parallel computers can be simulated on a sequential machine.
It is not known whether **NC** = **P**. In other words, it is not known whether there are any tractable problems that are inherently sequential. Just as it is widely suspected that **P** does not equal **NP**, so it is widely suspected that **NC** does not equal **P**.

Similarly, the class **L** contains all problems that can be solved by a sequential computer in logarithmic space. Such machines run in polynomial time because they can have a polynomial number of configurations. It is suspected that **L** ≠ **P**; that is, that some problems that can be solved in polynomial time also require more than logarithmic space.

Just as **NP**-complete problems are used to analyze the **P** = **NP** question, the **P**-complete problems, viewed as the "probably not parallelizable" or "probably inherently sequential" problems, serve to study the **NC** = **P** question. Finding an efficient way to parallelize the solution to some **P**-complete problem would show that **NC** = **P**. The **P**-complete problems can also be thought of as the "problems requiring superlogarithmic space"; a log-space solution to a **P**-complete problem (using the definition based on log-space reductions) would imply **L** = **P**.

The logic behind this is analogous to the logic that a polynomial-time solution to an **NP**-complete problem would prove **P** = **NP**: if we have an **NC** reduction from every problem in **P** to a problem A, and an **NC** solution for A, then **NC** = **P**. Similarly, if we have a log-space reduction from every problem in **P** to a problem A, and a log-space solution for A, then **L** = **P**.

## P-complete problems

The most basic **P**-complete problem under logspace many-one reductions is the following: given the encoding ⟨*M*⟩ of a Turing machine, an input *x* for that machine, and a number *T* (written in unary), does that machine halt on that input within the first *T* steps? To reduce a language *L* in **P** to this problem, map each input *x* to the triple ⟨*M*, *x*, *p*(|*x*|)⟩: the encoding of a Turing machine *M* that accepts *L* in polynomial time, the encoding of *x* itself, and a number of steps *T* = *p*(|*x*|), where *p* is the polynomial bound on the running time of *M*. The machine *M* halts on *x* within *p*(|*x*|) steps if and only if *x* is in *L* (taking *M* to halt only when it accepts). Clearly, if we can parallelize a general simulation of a sequential computer (i.e., the Turing machine simulation of a Turing machine), then we will be able to parallelize any program that runs on that computer. If this problem is in **NC**, then so is every other problem in **P**. If the number of steps is written in binary, the problem is EXPTIME-complete.

This problem illustrates a common trick in the theory of **P**-completeness. We aren't really interested in whether a problem can be solved quickly on a parallel machine. We're just interested in whether a parallel machine solves it *much more* quickly than a sequential machine. Therefore, we have to reword the problem so that the sequential version is in **P**. That is why this problem requires *T* to be written in unary. If *T* is written as a binary number (a string of *n* ones and zeros, where *n* = log *T*), then the obvious sequential algorithm can take time 2^*n*. On the other hand, if *T* is written as a unary number (a string of *n* ones, where *n* = *T*), then it only takes time *n*. By writing *T* in unary rather than binary, we have reduced the running time of the obvious sequential algorithm from exponential to linear in the input length. That puts the sequential problem in **P**. Then, it will be in **NC** if and only if it is parallelizable.
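
The "obvious sequential algorithm" for the bounded halting problem can be sketched as a step-by-step simulation. The following is a minimal illustration, not a canonical encoding: the function name `halts_within`, the dictionary representation of the transition function, the blank symbol `_`, and the example machine `SCAN_DELTA` are all assumptions made for this example. The step bound is passed as a unary string, so the loop runs at most as many times as the input is long.

```python
def halts_within(delta, start, halt_states, tape_input, unary_t, blank="_"):
    """Simulate a one-tape Turing machine for at most len(unary_t) steps.

    delta maps (state, symbol) -> (new_state, written_symbol, move),
    with move in {-1, 0, +1}. This encoding is an assumption of the sketch.
    Returns True if a halting state is reached within the step bound.
    """
    steps = len(unary_t)               # T is given in unary: one symbol per step
    tape = dict(enumerate(tape_input)) # sparse tape, missing cells are blank
    state, head = start, 0
    if state in halt_states:
        return True
    for _ in range(steps):             # each step depends on the previous one
        symbol = tape.get(head, blank)
        state, written, move = delta[(state, symbol)]
        tape[head] = written
        head += move
        if state in halt_states:
            return True
    return False


# Hypothetical example machine: scan right over 1s, halt at the first blank.
SCAN_DELTA = {
    ("scan", "1"): ("scan", "1", +1),
    ("scan", "_"): ("halt", "_", 0),
}

if __name__ == "__main__":
    print(halts_within(SCAN_DELTA, "scan", {"halt"}, "111", "1" * 4))
    print(halts_within(SCAN_DELTA, "scan", {"halt"}, "111", "1" * 3))
```

On the input `111` the example machine needs three moves to reach the first blank and a fourth step to enter the halting state, so a bound of four steps suffices and a bound of three does not. The loop body uses the state and tape left by the previous iteration, which is the sequential dependency that makes the problem a natural candidate for **P**-completeness.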

Many other problems have been proved to be **P**-complete, and therefore are widely believed to be inherently sequential. These include the following problems, which are **P**-complete under at least logspace reductions, either as given or in a decision-problem form:

- Circuit Value Problem (CVP) – Given a circuit, the inputs to the circuit, and one gate in the circuit, calculate the output of that gate.
- Restricted Case of CVP – Like CVP, except each gate has two inputs and two outputs (F and Not F), every other layer is just AND gates, the rest are OR gates (or, equivalently, all gates are NAND gates, or all gates are NOR gates), the inputs of a gate come from the immediately preceding layer
- Linear programming – Maximize a linear function subject to linear inequality constraints
- Lexicographically First Depth First Search Ordering – Given a graph with fixed ordered adjacency lists, and nodes *u* and *v*, is vertex *u* visited before vertex *v* in a depth-first search induced by the order of the adjacency lists?
- Context Free Grammar Membership – Given a context-free grammar and a string, can that string be generated by that grammar?
- Horn-satisfiability – Given a set of Horn clauses, is there a variable assignment which satisfies them? This is **P**'s version of the boolean satisfiability problem.
- Game of Life – Given an initial configuration of Conway's Game of Life, a particular cell, and a time *T* (in unary), is that cell alive after *T* steps?
- LZW (1978 paradigm) Data Compression – Given strings *s* and *t*, will compressing *s* with an LZ78 method add *t* to the dictionary? (Note that for LZ77 compression such as gzip, this is much easier, as the problem reduces to "Is *t* in *s*?".)
- Type inference for partial types – Given an untyped term from the lambda calculus, determine whether this term has a partial type.
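
To make one entry of the list concrete, Horn-satisfiability can be decided in polynomial time by forward chaining (unit propagation). The sketch below is a simple quadratic-time illustration rather than an optimized algorithm. The clause representation is an assumption of this example: each clause is a pair `(body, head)` meaning "if every variable in `body` is true then `head` is true", and `head` is `None` for a clause with no positive literal.

```python
def horn_sat(clauses):
    """Decide satisfiability of a set of Horn clauses by forward chaining.

    clauses: iterable of (body, head) pairs, where body is a collection of
    variable names and head is a variable name or None (assumed encoding).
    Returns the set of variables forced to be true in the minimal model,
    or None if the clauses are unsatisfiable.
    """
    clauses = [(frozenset(body), head) for body, head in clauses]
    true_vars = set()
    changed = True
    while changed:                      # repeat until no new variable is forced
        changed = False
        for body, head in clauses:
            if body <= true_vars:       # every premise is already forced true
                if head is None:
                    return None         # a purely negative clause is violated
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    return true_vars


if __name__ == "__main__":
    rules = [((), "a"), (("a",), "b"), (("a", "b"), "c")]
    print(horn_sat(rules))
    print(horn_sat(rules + [(("c",), None)]))
```

Each newly derived variable may enable further clauses, so derivations can form long chains in which every step depends on the one before.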

Most of the languages above are **P**-complete under even stronger notions of reduction, such as uniform many-one reductions, DLOGTIME reductions, or polylogarithmic projections.

In order to prove that a given problem in **P** is **P**-complete, one typically tries to reduce a known **P**-complete problem to the given one.

In 1999, Jin-Yi Cai and D. Sivakumar, building on work by Ogihara, showed that if there exists a sparse language that is **P**-complete, then **L** = **P**.^{[1]}

**P**-complete problems may be solvable with different time complexities. For instance, the Circuit Value Problem can be solved in linear time by a topological sort. Of course, because the reductions to a **P**-complete problem may have different time complexities, this fact does not imply that all the problems in **P** can also be solved in linear time.
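
The linear-time evaluation mentioned above can be sketched as follows: visit the gates in a topological order so that every gate is evaluated after its inputs. This is a minimal illustration under assumed conventions: the circuit is a dictionary mapping a gate name to `(operation, input_names)`, the supported operations are `AND`, `OR`, and `NOT`, and the function name `circuit_value` is hypothetical. The ordering comes from `graphlib.TopologicalSorter` in the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def circuit_value(gates, inputs, target):
    """Evaluate one gate of a Boolean circuit.

    gates:  dict mapping gate name -> (op, list of input names) (assumed encoding)
    inputs: dict mapping primary input name -> bool
    target: name of the gate whose output is requested
    """
    # Each gate depends on its inputs; the sorter yields dependencies first.
    dependencies = {name: args for name, (_, args) in gates.items()}
    values = dict(inputs)
    for node in TopologicalSorter(dependencies).static_order():
        if node in values:
            continue                    # primary input, value already known
        op, args = gates[node]
        bits = [values[a] for a in args]
        if op == "AND":
            values[node] = all(bits)
        elif op == "OR":
            values[node] = any(bits)
        elif op == "NOT":
            values[node] = not bits[0]
        else:
            raise ValueError(f"unknown gate type: {op}")
    return values[target]


if __name__ == "__main__":
    circuit = {
        "g1": ("AND", ["x", "y"]),
        "g2": ("NOT", ["g1"]),
        "g3": ("OR", ["g2", "x"]),
    }
    print(circuit_value(circuit, {"x": True, "y": False}, "g3"))
```

Every gate and every wire is visited once, so the running time is linear in the size of the circuit, but the evaluation order follows the depth of the circuit, which is what a parallel algorithm would need to shortcut.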

## Problems not known to be P-complete

Some problems in **NP** are not known to be either **NP**-complete or in **P**. These problems (e.g. factoring, graph isomorphism, parity games) are suspected to be difficult. Similarly, there are problems in **P** that are not known to be either **P**-complete or in **NC**, but are thought to be difficult to parallelize. Examples include the decision problem forms of finding the greatest common divisor of two numbers, determining what answer the extended Euclidean algorithm would return when given two numbers, and computing the maximum weight matching of a graph with large integer weights.
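
For reference, the extended Euclidean algorithm mentioned above can be written as the following short loop. This is the standard textbook formulation, shown only to illustrate its sequential structure; the function name `extended_gcd` is a choice made for this example, and the sketch says nothing about whether a fast parallel algorithm exists.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each new remainder and coefficient pair uses the previous two values.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


if __name__ == "__main__":
    g, s, t = extended_gcd(240, 46)
    print(g, s, t, 240 * s + 46 * t)
```

Every iteration needs the quotient and remainder computed in the previous one, so the loop as written offers no obvious opportunity for parallel speedup.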

## Notes

1. Cai, Jin-Yi; Sivakumar, D. (1999), "Sparse hard sets for P: resolution of a conjecture of Hartmanis", *Journal of Computer and System Sciences*, **58** (2): 280–296, doi:10.1006/jcss.1998.1615

## References

- Greenlaw, Raymond, James Hoover, and Walter Ruzzo. 1995. *Limits to Parallel Computation: P-Completeness Theory*. ISBN 0-19-508591-4. Develops the theory, then catalogs 96 P-complete problems.
- Satoru Miyano, Shuji Shiraishi, and Takayoshi Shoudai. *A List of P-Complete Problems*. Kyushu University, RIFIS-TR-CS-17. December 1990.