Logic for Computer Science/First-Order Logic

First-Order Logic
In propositional logic, we considered formulas made about atomic objects, which could only be either true or false. First-order logic, the topic of this chapter, builds upon propositional logic and allows you to look inside the objects discussed in formulas. We can provide this more refined level of granularity by discussing objects as elements of sets that can be larger than just the set $$\{0, 1\}$$, and also include arbitrarily complex relationships with each other.

We begin first by defining the syntax of first-order (FO) logic, and then give these structures meaning.

Syntax
The domain of discourse for first order logic is first-order structures or models. A first-order structure contains
 * 1) Relations,
 * 2) Functions, and
 * 3) Constants (functions of arity 0).

The vocabulary of first-order logic is
 * 1) a set of relation symbols with associated arities, and
 * 2) a set of function symbols with associated arities.

Here are some example first-order logic vocabularies:
 * 1) A graph
 * 2) * Relation Set = { $$V$$ : unary, $$E$$ : binary }
 * 3) * Function Set = { $$\mathit{next}$$ : unary }
 * 4) Arithmetic
 * 5) * Relation Set = { $$+$$ : ternary, $$\times$$ : ternary, $$\uparrow$$ : ternary, $$=$$ : binary, $$<$$ : binary}
 * 6) *Function Set = { $$\mathit{next}$$ : unary, $$0$$ : const }

Here, a graph is a set of vertices $$V$$ and a set of edges $$E$$. For the arithmetic set note that the use of $$+$$ and $$\times$$ is purely syntactic and we could have used the symbols "plus" and "times" instead. We haven't provided the symbols with any meaning yet.

A term denotes an element in the first-order structure. A term is used to refer to the elements in our domain of discourse. Here are the rules that describe what a term is: For example, if $$f$$ is a binary function and $$g$$ is a ternary function then the following are all terms: $$x, a, f(x, y), g(x, f(x, y), a)$$.
 * every variable is a term, where a variable is simply another set of symbols
 * every constant is a term
 * If $$t_{1}, t_{2}, \ldots, t_{k}$$ are terms, and $$f$$ is a $$k$$-ary function then $$f(t_{1}, t_{2}, \ldots, t_{k})$$ is a term.

An atomic formula is
 * $$R(t_{1}, t_{2}, \ldots, t_{k})$$,

where $$R$$ is a $$k$$-ary relation and $$t_{1}, t_{2}, \ldots, t_{k}$$ are terms. When viewing $$R$$ as a set, then this is just another way of saying that the tuple
 * $$(t_{1}, t_{2}, \ldots, t_{k})\in R$$.

A special relation $$=$$ means "equal" and cannot be interpreted otherwise. For example, $$t_{1} = t_{2}$$ stands for
 * $$=(t_{1}, t_{2})$$.

A first-order formula is an expression built using a given first-order vocabulary and variables and the symbols $$, \neg, \vee, \wedge, \rightarrow, \exists, \forall$$. The set of first-order formulas is the minimum set satisfying the following:
 * atomic formulas are first order formulas
 * If $$\phi$$ and $$\psi$$ are formulas and $$x$$ is a variable, then the following are also formulas:
 * 1) $$(\phi \vee \psi)$$
 * 2) $$(\phi \wedge \psi)$$
 * 3) $$(\neg \phi)$$
 * 4) $$(\phi \rightarrow \psi)$$
 * 5) $$(\exists x \phi)$$
 * 6) $$(\forall x \phi)$$

Semantics
A first-order structure over a given vocabulary consists of These components give meaning to the symbols.
 * 1) A domain, which is a set of elements (also known as a universe)
 * 2) A mapping associating to every $$k$$-ary relation symbol in the vocabulary a $$k$$-ary relation over the domain, and to every $$k$$-ary function symbol a $$k$$-ary function over the domain.

Example:
 * Relation Set = { $$+$$ : ternary, $$\times$$ : ternary, $$\uparrow$$ : ternary, $$=$$ : binary, $$<$$ : binary }
 * Function Set = { $$next$$ : unary, $$0$$ : const }

A first-order structure over this vocabulary is:
 * 1) Domain: the set of integers
 * 2) Mapping : $$+ \rightarrow$$ addition, $$\times \rightarrow$$ multiplication, $$\uparrow \rightarrow$$ exponentiation, $$< \rightarrow$$ ordering, $$\mathit{next} \rightarrow i+1$$

In this structure, the formula
 * $$\exists x \forall y \forall z[\times (y, z, x) \rightarrow ((y = 1) \vee (z = 1))]$$

expresses the statement "there exists a prime number" (the number 1 also satisfies this statement).

Note here that $$\times (y, z, x)$$ is equivalent to $$(x = y \times z)$$.

Scope of Quantifiers
In the formula $$(\forall x \phi)$$ or $$(\exists x \phi)$$, $$\phi$$ is referred to as the scope of quantification of the variable $$x$$. An occurrence of a variable in a formula is bound if it is in the scope of quantification of that variable, otherwise the occurrence is said to be free. You can consider a quantifier use as a variable declaration.

A sentence is a formula with no free variables (i.e., all variables are bound). A sentence is either true or false.

A formula with free variable(s) can be considered as describing the properties of the free variable(s). For example, $$\phi(x)$$ denotes a formula with $$x$$ occurring free and describes the properties of $$x$$.

If a sentence $$\phi$$ evaluates to true over a structure $$I$$, we say $$I$$ satisfies $$\phi$$ and denote this by $$I \models\phi$$.


 * $$I \models R(t_{1}, t_{2}, \ldots, t_{k})$$ iff $$(t_{1}, t_{2}, \ldots, t_{k}) \in I(R)$$, where $$I(R)$$ is the interpretation of $$R$$ by $$I$$.
 * $$I \models \phi \wedge \psi$$ iff $$I \models \phi$$ and $$I \models \psi$$. (Similar for $$\vee$$.)
 * $$I \models \neg \varphi$$ iff $$I \not\models \varphi$$.
 * $$I \models \exists x \phi(x)$$ if $$I \models \phi(x \leftarrow c)$$ for some $$c$$ in the domain.
 * $$I \models \forall x \phi(x)$$ if $$I \models \phi(x \leftarrow c)$$ for every $$c$$ in the domain.

Note: $$\phi(x \leftarrow c)$$ denotes the result of substituting $$c$$ for every free occurrence of $$x$$ in $$\phi$$. Substitution is covered further later in this chapter.

Axiomatic approach
The axioms are a set of sentences meant to distinguish "desired" models from others. But typically, also have "undesired" models, which are called non-standard models.

Examples: Consider the vocabulary $$(\mathit{next}, +, \times, \uparrow, 0, =, <)$$, where the symbols have the usual meaning (defined by FO sentences over this vocabulary). The standard interpretation for this set of axioms (see the end of this chapter) is the integers, but these are also satisfied by the set of integers modulo $$p$$. This possibility is ruled out by adding the following sentence to the axioms: $$\forall x (x < \mathit{next}(x))$$

Question: Can all non-standard models be axiomatized away? The answer is no. Consider the model shown in [TODO: import figure] for the vocabulary containing only $$0,<,\sigma$$ (all elements in the upper line are larger than those on the lower line).

There is no set of first-order sentences that can distinguish this model from the natural numbers. Intuitively, the reason is that we cannot "backtrack" arbitrarily (towards $$0$$) in a first-order sentence. A similar non-standard model can be obtained for the full vocabulary above.

Evaluating FO Sentences
Given a first-order structure $$I$$ and a FO sentence $$\phi$$, can we tell if $$I \models \phi$$?

This problem is decidable for finite structures. For example, suppose $$E$$ is a binary relation representing the edges of a graph, and the given sentence is
 * $$\exists x_{1} \exists x_{2} \exists x_{3}(E(x_{1}, x_{2}) \wedge E(x_{2}, x_{3}) \wedge E(x_{3}, x_{1}) \wedge x_{1} \neq x_{2} \wedge x_{2} \neq x_{3} \wedge x_{3} \neq x_{1} )$$.

We can evaluate the truth of this sentence by trying all possible values for $$x_{1}, x_{2}$$ and $$x_{3}$$. (Naive evaluation: "nested loop".) This is polynomial time (in domain size) but exponential in the size of the formula.

On infinite domains, we may or may not be able to evaluate the truth of a sentence. With the vocabulary $$(N, 0, \sigma, <, +)$$, where $$N$$ is the set of natural numbers, it is decidable if a sentence is true. (This is known as Presburger arithmetic.) However if we include multiplication, the truth of a sentence becomes undecidable.

A Deductive System for FO
A set $$\Delta$$ of FO sentences is
 * Satisfiable if there is some structure $$I$$ such that $$I \models \Delta$$
 * Unsatisfiable if it is not satisfiable
 * Valid if $$I \models \Delta$$ for all structure $$I$$

Given a set of FO sentences $$\Sigma$$ and a sentence $$\phi$$
 * $$\Sigma$$ entails $$\phi$$ (denoted $$\Sigma \models \phi$$), if every structure that satisfies $$\Sigma$$ also satisfies $$\phi$$.
 * $$\Sigma \models \phi$$ iff $$\Sigma \cup \{\neg \phi\}$$ is unsatisfiable.
 * If $$\Sigma$$ is finite then, $$\Sigma \models \phi$$ iff $$\neg \Sigma \lor \phi$$ is valid.
 * If a formula $$\phi$$ has free variables $$\vec{X} = (x_{1}, \ldots, x_{k})$$, $$\phi$$ is valid iff $$\forall \vec{X} \phi$$ is valid.

Axioms for Validity
The axioms for validity are from three categories:
 * 1) Axioms for boolean validity. These are inherited from propositional logic.
 * 2) Axioms for equality.
 * 3) Axioms for quantification.

Boolean Validity
Given a FO formula $$\varphi$$, a boolean form of $$\varphi$$ a propositional formula $$\psi$$ such that $$\varphi$$ is obtained from $$\psi$$ by replacing each propositional variable in $$\psi$$ by a subformula of $$\varphi$$.

The set of Boolean forms of $$\varphi$$ is denoted $$BF(\varphi)$$.

Examples:
 * For the FO formula $$\forall x P(x) \lor \neg \forall x P(x)$$, the boolean form is $$p \lor \neg p$$.
 * If $$\varphi \equiv \forall x G(x,y) \land \exists x G(x,y) \land (G(z,x) \lor \forall x G(x,y)) $$, then $$BF(\varphi) \equiv x_{1} \land x_{3} \land (x_{2} \lor x_{1}) $$ where $$x_1 = \forall x G(x,y), x_2 = G(z,x), x_3 = \exists x G(x,y)$$.

Claim: If $$\psi \in BF(\varphi)$$ and $$\psi$$ is valid, then $$\varphi$$ is valid.

Equality Axioms
Let $$t, t_{1}, \ldots, t_{k}, t'_{1}, \ldots, t'_{k}$$ be terms. The following are valid formulas.


 * $$t = t$$
 * $$(t_{1} = t'_{1} \land \cdots \land t_{k} = t'_{k}) \rightarrow (f(t_{1}, \ldots, t_{k}) = f(t'_{1}, \ldots, t'_{k}))$$, where $$f$$ is a $$k$$-ary function.
 * $$(t_{1} = t'_{1} \land \cdots \land t_{k} = t'_{k}) \rightarrow (R(t_{1}, \ldots, t_{k}) \rightarrow R(t'_{1}, \ldots, t'_{k}))$$, where $$R$$ is a $$k$$-ary relation.

Substitutions
Given a formula $$\varphi$$ in which variable $$x$$ occurs free (denoted by $$\varphi(x)$$) and a term $$t$$, we define the substitution of $$t$$ for $$x$$ in $$\varphi$$, denoted $$\varphi(x \leftarrow t)$$, as the formula that results from replacing every free occurrence of $$x$$ in $$\varphi$$ by $$t$$, subject to the constraint that $$t$$ contains no variable $$y$$ quantified in $$\varphi$$ such that $$x$$ occurs free within the scope of quantification of $$y$$. If $$x$$ does not occur free $$\varphi$$, then $$\varphi(x \leftarrow t)$$ is defined as $$\varphi$$.

Example: Let $$\varphi$$ be
 * $$((x = 1) \rightarrow \exists x (x = y))$$,

then
 * $$\varphi(x \leftarrow (y + 1))$$ is a valid substitution
 * $$\varphi(y \leftarrow (x + 1))$$ is not a valid substitution.

Quantification Axioms

 * 1) $$\forall x \varphi \rightarrow \varphi(x \leftarrow t)$$, where $$t$$ is a term and $$\varphi(x \leftarrow t)$$ is a valid substitution
 * 2) $$(\forall x (\varphi \rightarrow \psi)) \rightarrow ((\forall x \varphi) \rightarrow (\forall x \psi))$$
 * 3) $$\varphi \rightarrow \forall x \varphi$$
 * 4) $$\exists x \varphi \leftrightarrow \neg \forall x \neg \varphi$$

In summary, the axioms for validity are $${\mathbf {Ax}}$$:
 * 1) Axioms for boolean validity. These are inherited from propositional logic.
 * 2) Axioms for equality.
 * 3) Axioms for quantification.

Definition: A proof is a sequence $$\phi_{1}, \phi_{2}, \ldots, \phi_{k}$$ of FO sentences such that for each $$i \in \{1, \ldots, k\}$$ either $$\phi_{i} \in{\mathbf {Ax}}$$ or $$\exists j, k < i$$ such that $$\phi_{j} \equiv \psi$$ and $$\phi_{k} \equiv (\psi \rightarrow \phi_{i})$$.

Notation: If there is a proof of $$\phi$$ using $${\mathbf{Ax}}$$, we denote this fact by $$\vdash_{\mathbf {Ax}}\phi$$.

Fact (Soundness): If $$\vdash_{\mathbf {Ax}} \phi$$, then $$\phi$$ is valid.

Remark: The set of formulas which can be proven valid is recursively enumerable.

Example: Proof of the formula $$\forall x (\phi \land \psi) \rightarrow (\forall x \phi \land \forall x \psi)$$


 * $$\forall x (\phi \land \psi)$$
 * $$\forall x (\phi \land \psi) \rightarrow (\phi \land \psi) (x \leftarrow x)$$
 * $$\phi \land \psi$$
 * $$\phi$$
 * $$\forall x (\phi \land \psi) \rightarrow \phi$$
 * $$\forall x (\forall x (\phi \land \psi) \rightarrow \phi)$$
 * $$\forall x \forall x (\phi \land \psi) \rightarrow \forall x \phi$$
 * $$\forall x (\phi \land \psi) \rightarrow \forall x \phi$$
 * $$\forall x \phi$$

The proof of $$\forall x \psi$$ is symmetrical.
 * $$\forall x \phi \land \forall x \psi$$
 * $$\forall x (\phi \land \psi) \rightarrow (\forall x \phi \land \forall x \psi)$$

Useful Equivalences

 * $$\forall x (\phi \land \psi) \leftrightarrow (\forall x \phi) \land (\forall x \psi)$$
 * $$\forall x (\phi \lor \psi) \not\leftrightarrow (\forall x \phi) \lor (\forall x \psi)$$ e.g. $$\phi(x) \approx$$ "x is even" and $$\psi(x)\approx$$ "x is odd"
 * $$\forall x (\phi(x) \lor \neg \phi(x)) \not\leftrightarrow (\forall x \phi(x) \lor \forall x \neg \phi(x))$$
 * $$\exists x (\phi \lor \psi) \leftrightarrow (\exists x \phi) \lor (\exists x \psi)$$
 * $$\neg \exists x \phi \leftrightarrow \forall x \neg \phi$$
 * $$\neg \forall x \phi \leftrightarrow \exists x \neg \phi$$

Prenex Normal Form
Claim: Every FO sentence is equivalent to a FO sentence of the form $$Q_{1} x_{1} \cdots Q_{k} x_{k} \phi$$, where $$Q_{i} \in \{\exists, \forall \}$$ and $$\phi$$ is quantifier free.

This is proved using the distributive properties of quantifiers. It may be necessary to rename variables, to avoid ambiguities. Although you can always move all quantifiers to the left, the resulting formula can actually be exponentially larger than the original one.

Example:
 * $$\phi \equiv \forall x (G(x,x) \land (\forall y G(x,y) \lor \exists y \neg G(y,y))) \land G(x,0)$$
 * $$\equiv (\forall x (G(x,x) \land (\forall y G(x,y) \lor \exists y \neg G(y,y)))) \land G(w,0)$$
 * $$\equiv\forall x (G(x,x) \land (\forall y G(x,y) \lor \exists y \neg G(y,y)) \land G(w,0))$$
 * $$\equiv\forall x (G(x,x) \land (\forall y G(x,y) \lor \exists z \neg G(z,z)) \land G(w,0))$$
 * $$\equiv\forall x (G(x,x) \land \forall y (G(x,y) \lor \exists z \neg G(z,z)) \land G(w,0))$$
 * $$\equiv\forall x \forall y (G(x,x) \land (G(x,y) \lor \exists z \neg G(z,z)) \land G(w,0))$$
 * $$\equiv\forall x \forall y (G(x,x) \land \exists z (G(x,y) \lor \neg G(z,z)) \land G(w,0))$$
 * $$\equiv\forall x \forall y \exists z (G(x,x) \land (G(x,y) \lor \neg G(z,z)) \land G(w,0))$$

Recall the Axiomatic Method

 * 1) Describe the desired model as closely as possible using a (possibly infinite but recursive) set $$\Delta$$ of axioms (FO sentences).


 * 1) Prove things using deduction.

Notation: We say $$\phi$$ is a valid consequence of $$\Delta$$ (denoted $$\Delta \models \phi$$), if every model satisfying $$\Delta$$ also satisfies $$\phi$$.

Fact: $$\Sigma \models \phi$$ iff $$\Sigma \cup \{\neg \phi\}$$ is unsatisfiable.

Aside: Nonlogical axioms for number theory
The following fourteen first-order axioms describe the properties of arithmetic and numbers, i.e. addition ($$+$$), multiplication ($$\times$$), exponentiation ($$\uparrow$$, equality ($$=$$), ordering ($$<$$), successor function ($$\sigma$$) and remainder (mod). This example shows the expressive power of first-order statements, which were originally hoped to provide a basis for the "one true mathematics."

Question: Why are they called "nonlogical" axioms?


 * NT1: $$\forall x(\sigma(x) \ne 0)$$
 * NT2: $$\forall x \forall y (\sigma(x) = \sigma(y) \rightarrow x=y)$$
 * NT3: $$\forall x (x=0 \vee \exists y \sigma(y) = x)$$
 * NT4: $$\forall x (x+0=x) $$
 * NT5: $$\forall x \forall y(x+\sigma(y) = \sigma(x+y)) $$
 * NT6: $$\forall x (x\times 0 = 0) $$
 * NT7: $$\forall x \forall y (x\times \sigma(y) = (x \times y ) + x) $$
 * NT8: $$\forall x ( x \uparrow 0 = \sigma(0)) $$
 * NT9: $$\forall x \forall y (x \uparrow \sigma(y) = (x \uparrow y ) \times x ) $$
 * NT10: $$\forall x (x<\sigma(x)) $$
 * NT11: $$\forall x \forall y ( x < y \rightarrow (\sigma (x)\le y)) $$
 * NT12: $$\forall x \forall y ( \neg (x<y) \leftrightarrow y \le x) $$
 * NT13: $$\forall x \forall y \forall z (((x<y) \wedge (y<z)) \rightarrow x<z) $$
 * NT14: $$\forall x \forall y \forall z \forall z' (mod(x,y,z)\wedge mod(x,y,z') \rightarrow z = z') $$