22 Proof Systems, Syntax, Semantics, Equivalence, Satisfiability, Tautologies, Normal Forms, Types of Statements

Lecture from: 02.12.2024 | Video: Videos ETHZ

Formalizing Proof Systems within Logic

This last chapter aims to formalize the concepts of statements and proofs, treating them as mathematical objects rather than meta-concepts. We’ll build upon our previous exploration of proof systems and delve deeper into their structure, specifically within the context of logic.

Recall the core components of a proof system:

Statements ( $S$ ): The set of all possible statements, often represented as strings or formulas.
Proofs ( $P$ ): The set of all possible proofs, also typically represented as strings or structured sequences of steps.
Semantics ( $τ$ ): The interpretation function, assigning truth values to statements: $τ : S \to \{0, 1\}$ .
Verification Function ( $ϕ$ ): The function checking the validity of proofs: $ϕ : S \times P \to \{0, 1\}$ .

A well-defined proof system $Π = (S, P, τ, ϕ)$ should ideally possess two key properties:

Correctness (Soundness): If a proof is accepted, the corresponding statement must be true: for all $s \in S$ and $p \in P$ , $ϕ (s, p) = 1 ⟹ τ (s) = 1$ .
Completeness: If a statement is true, there exists a proof for it: for all $s \in S$ , $τ (s) = 1 ⟹ \exists p \in P \; (ϕ (s, p) = 1)$ .
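To make the two properties concrete, here is a minimal sketch of a toy proof system. The choice of system is an assumption made purely for illustration (it is not taken from the lecture): a statement is an integer $n \geq 2$ read as "$n$ is composite", and a proof is a pair of claimed nontrivial factors. The names `tau` and `phi` mirror $τ$ and $ϕ$; soundness and completeness are checked by brute force on a small finite range only.

```python
# Toy proof system (illustrative assumption, not from the lecture):
# statement s: an integer n >= 2, read as "n is composite"
# proof p:     a pair (a, b) of claimed nontrivial factors of n

def tau(s: int) -> int:
    """Semantics: 1 iff s is composite, decided by brute force."""
    return int(any(s % d == 0 for d in range(2, s)))

def phi(s: int, p: tuple) -> int:
    """Verification: 1 iff p = (a, b) with 1 < a, b < s and a * b == s."""
    a, b = p
    return int(1 < a < s and 1 < b < s and a * b == s)

statements = range(2, 60)
proofs = [(a, b) for a in range(1, 60) for b in range(1, 60)]

# Soundness: every accepted proof belongs to a true statement.
sound = all(tau(s) == 1 for s in statements for p in proofs if phi(s, p) == 1)

# Completeness: every true statement has at least one accepted proof.
complete = all(any(phi(s, p) == 1 for p in proofs)
               for s in statements if tau(s) == 1)

print(sound, complete)
```

Note that `phi` never calls `tau`: verification only inspects the proof, while the semantics only inspects the statement.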

Clarifying the Roles of $τ$ and $ϕ$

It’s essential to distinguish clearly between the roles of $τ$ (semantics) and $ϕ$ (verification function):

$τ$ (Semantics): $τ$ deals with the meaning of statements. It assigns a truth value to a statement based on its interpretation within the given logical system. For example, in propositional logic, $τ$ would evaluate the truth of a formula based on the truth values of its constituent propositions. Crucially, $τ$ doesn’t consider proofs; it operates solely on statements. It determines whether a statement is true or false according to the semantics of the system, regardless of whether anyone can currently produce a proof of it.
$ϕ$ (Verification Function): $ϕ$ deals with the validity of proofs. It checks whether a given proof $P$ is a valid justification for a statement $S$ . It doesn’t assess the truth of $S$ directly ; instead, it checks whether $P$ adheres to the rules of the proof system and convincingly demonstrates $S$ . It is thus linked to syntactic properties such as correct application of the rules.

Focusing on Logical Proof Systems

In the previous lecture, we introduced a specific type of proof system based on logical consequence. In these systems:

Statements: Take the form $(M, G)$ , where $M$ is a set of formulas (premises or axioms) and $G$ is a single formula (the goal).
Semantics: $τ ((M, G)) = 1$ if and only if $M ⊨ G$ ( $G$ is a logical consequence of $M$ ).
Proofs: Consist of sequences of derivation steps, each justified by a rule of the proof calculus.
Verification: $ϕ ((M, G), P)$ checks if $P$ is a valid derivation of $G$ from $M$ according to the rules of the system.

Rebuilding from Scratch

Our goal now is to rebuild the foundations of propositional and predicate logic within this formal framework of proof systems. We’ll precisely define the syntax, semantics, and proof calculi for these logical systems, establishing their soundness and, where possible, completeness. This rigorous approach will provide a deeper understanding of how logical reasoning can be formalized and automated.

Incompleteness and the Limits of Formal Systems

As discussed previously, sufficiently expressive proof systems have inherent limitations. Gödel’s first incompleteness theorem shows that any consistent, effectively axiomatized formal system capable of expressing basic arithmetic contains true statements that are unprovable within the system itself. The second incompleteness theorem shows that such a system cannot prove its own consistency (freedom from contradictions). Statements about the properties of the calculus itself often fall into this realm of unprovable truths, due to the circularity issue discussed earlier.

Defining Syntax: The Structure of Formulas

To work with logical statements formally, we need a precise syntax. Syntax defines the allowed structure and form of expressions within a logical system. It dictates which sequences of symbols constitute well-formed formulas.

We define an alphabet of symbols, $Λ$ , and formulas are sequences of symbols from this alphabet: $f_{1}, f_{2}, ..., f_{k} \in Λ^{*}$ . The syntax specifies which of these sequences are valid formulas. This is akin to defining the grammar of a programming language using EBNF notation, specifying how to correctly write for loops, if statements etc.

Inductive Definition of Formulas:

We often define the syntax of formulas inductively. For example, using the logical connectives $\neg$ (negation), $\land$ (conjunction), and $\lor$ (disjunction), we can define formulas as follows:

Atomic Formulas: The base case of the induction. These are the simplest formulas, often represented by letters (e.g., $A$ , $B$ , $C$ ). We can formalize this by saying atomic formulas are from the set $A = \{A_{1}, A_{2}, A_{3}, \ldots\}$ . We often use single capital letters as a more compact alternative.
Inductive Steps:
- If $F$ is a formula, then $\neg F$ is a formula.
- If $F$ and $G$ are formulas, then $(F \land G)$ is a formula.
- If $F$ and $G$ are formulas, then $(F \lor G)$ is a formula.

This inductive definition generates all syntactically valid formulas. It ensures that parentheses are used correctly and that formulas are built up from atomic formulas using the allowed connectives.
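The inductive definition maps directly onto a recursive data type. The following is a minimal sketch; the class names `Atom`, `Not`, `And`, `Or` and the helper `show` are names chosen here for illustration, not notation from the lecture.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    name: str            # base case: an atomic formula such as "A"

@dataclass(frozen=True)
class Not:
    sub: "Formula"       # if F is a formula, then ¬F is a formula

@dataclass(frozen=True)
class And:
    left: "Formula"      # if F and G are formulas, then (F ∧ G) is a formula
    right: "Formula"

@dataclass(frozen=True)
class Or:
    left: "Formula"      # if F and G are formulas, then (F ∨ G) is a formula
    right: "Formula"

Formula = Union[Atom, Not, And, Or]

def show(f: Formula) -> str:
    """Render a formula as a fully parenthesized string."""
    if isinstance(f, Atom):
        return f.name
    if isinstance(f, Not):
        return "¬" + show(f.sub)
    op = "∧" if isinstance(f, And) else "∨"
    return f"({show(f.left)} {op} {show(f.right)})"

example = Or(And(Atom("A"), Not(Atom("B"))), And(Atom("C"), Not(Atom("A"))))
print(show(example))
```

Every value that can be built from these constructors corresponds to a syntactically valid formula, so well-formedness holds by construction.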

Examples of Syntax

1. Propositional Logic

In propositional logic, atomic formulas are typically individual propositional variables (e.g., $A, B, C$ ). The inductive definition above specifies how to construct more complex formulas using connectives.

Example: $((A \land \neg B) \lor (C \land \neg A))$ is a syntactically valid propositional formula. We typically omit the outermost parentheses where these don’t change the meaning and write it as $(A \land \neg B) \lor (C \land \neg A)$ .
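Whether a string is a well-formed formula can be decided by a small recursive recognizer that follows the inductive definition. This sketch assumes an ASCII encoding chosen here for convenience: `~` for $\neg$, `&` for $\land$, `|` for $\lor$, and single capital letters as atomic formulas. It implements the strict syntax, so the convention of omitting outermost parentheses is deliberately not accepted.

```python
def parse(s: str, i: int = 0) -> int:
    """Return the index just past one formula starting at s[i].

    Raises ValueError if no formula starts at position i.
    """
    if i >= len(s):
        raise ValueError("unexpected end of input")
    c = s[i]
    if "A" <= c <= "Z":                      # atomic formula
        return i + 1
    if c == "~":                             # negation: ~F
        return parse(s, i + 1)
    if c == "(":                             # (F & G) or (F | G)
        j = parse(s, i + 1)
        if j >= len(s) or s[j] not in "&|":
            raise ValueError("expected & or |")
        k = parse(s, j + 1)
        if k >= len(s) or s[k] != ")":
            raise ValueError("expected )")
        return k + 1
    raise ValueError(f"unexpected symbol {c!r}")

def is_formula(s: str) -> bool:
    """True iff s (ignoring spaces) is exactly one well-formed formula."""
    s = s.replace(" ", "")
    try:
        return parse(s) == len(s)
    except ValueError:
        return False

print(is_formula("((A & ~B) | (C & ~A))"))   # strict syntax
print(is_formula("(A & ~B) | (C & ~A)"))     # outer parentheses omitted
```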

2. Predicate Logic

Predicate logic extends propositional logic with predicates, variables, quantifiers ( $\forall$ , $\exists$ ), and function symbols.

Example: $\exists x \forall y (P (x, f (y)))$ is a syntactically valid predicate logic formula. Here, $P$ is a predicate, $x$ and $y$ are variables, and $f$ is a function symbol.

3. Arithmetic Expressions

We can also define syntax for arithmetic expressions.

Example: Consider the expression $(a + b) \cdot (a - b)$ . The alphabet would be $Λ = \{(, ), +, -, \cdot, a, b, \ldots\}$ . We can define rules to specify valid arithmetic expressions (e.g., ensuring proper use of parentheses and operators).
A Note on Semantics: The statement $(a + b) \cdot (a - b) = a^{2} - b^{2}$ is not purely syntactic. It involves semantics: the meaning of the operators and of the equality relation. Its correctness depends on the underlying algebraic structure: it holds in every commutative ring (such as the integers or the reals) but can fail when multiplication is not commutative. We could treat the statement syntactically if we introduced “=” as another symbol together with derivation rules corresponding to the relevant mathematical axioms.

Semantics: Assigning Meaning to Formulas

Syntax defines the structure of formulas; semantics gives them meaning. Semantics specifies how to interpret formulas and determine their truth values. This process is more complex than it might initially seem and involves several key components.

Free Variables

The first step is to identify the free variables in a formula. Free variables are symbols within a formula that need to be assigned values before the formula’s truth value can be determined. We can define a function $\mathrm{free}(F)$ that returns the set of free variables in formula $F$ .

Examples of Free Variables

Propositional Logic: In propositional logic, the atomic propositions (e.g., $A, B, C$ ) are the free variables. For example, for the formula $(A \land B) \lor C$ we have $\mathrm{free}((A \land B) \lor C) = \{A, B, C\}$ .
Predicate Logic: In predicate logic, a variable is free if it is not bound by a quantifier. In these notes the function and predicate symbols are also counted as free, since they play the same role as free propositional variables in propositional logic. For instance, in $\exists x \forall y (P (x, f (y)) \lor Q (z))$ the free symbols are $z$ , $P$ , $f$ and $Q$ : $\mathrm{free}(\exists x \forall y (P (x, f (y)) \lor Q (z))) = \{P, f, Q, z\}$ . We usually assume a fixed universe (domain of discourse) for the interpretation. Once we assign concrete functions and predicates to these symbols and a value to $z$ , the formula becomes concrete and we can evaluate it.
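The function computing the free symbols can be sketched recursively. The tuple encoding of terms and formulas below is an assumption made for this example; following the convention of these notes, function and predicate symbols are returned alongside the unbound variables, and variable names are assumed to be distinct from function and predicate names.

```python
# Assumed encoding (hypothetical, for illustration only):
#   terms:    ("var", "x")  or  ("fn", "f", [terms])
#   formulas: ("pred", "P", [terms]), ("not", F), ("and", F, G), ("or", F, G),
#             ("forall", "x", F), ("exists", "x", F)

def free_term(t):
    """Free symbols of a term: its variables and function symbols."""
    if t[0] == "var":
        return {t[1]}
    _, name, args = t
    return {name}.union(*(free_term(a) for a in args))

def free(f):
    """Free symbols of a formula: unbound variables plus function/predicate symbols."""
    tag = f[0]
    if tag == "pred":
        _, name, args = f
        return {name}.union(*(free_term(a) for a in args))
    if tag == "not":
        return free(f[1])
    if tag in ("and", "or"):
        return free(f[1]) | free(f[2])
    if tag in ("forall", "exists"):
        _, x, body = f
        return free(body) - {x}          # the quantifier binds x
    raise ValueError(f"unknown formula tag {tag!r}")

# ∃x ∀y (P(x, f(y)) ∨ Q(z))
example = ("exists", "x", ("forall", "y", ("or",
           ("pred", "P", [("var", "x"), ("fn", "f", [("var", "y")])]),
           ("pred", "Q", [("var", "z")]))))
print(sorted(free(example)))
```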

Interpretations

An interpretation $A$ assigns meaning to the symbols in a formula. It consists of two parts:

Universe/Domain: The set of values that variables can range over. For instance, for a formula about inequalities the universe could be the natural numbers; for the predicate logic example above it could be the real numbers.
Assignments: Assigning concrete values to the free variables in the formula. These must come from the universe. Furthermore, interpretations also specify mappings for function symbols (to functions over the universe) and predicate symbols (to relations over the universe).

An interpretation is matching for a formula if it provides values for all free variables in the formula.

Valuation Function ( $σ$ )

The valuation function $σ (F, A)$ (or simply $A (F)$ ) assigns a truth value (0 or 1) to a formula $F$ under a given interpretation $A$ .

If $A (F) = 1$ , we say that $A$ models $F$ , written as $A ⊨ F$ . This means $F$ is true under the interpretation $A$ .

Examples of Interpretations and Valuation

Propositional Logic: For the formula $(A \land B) \lor C$ , a matching interpretation must provide truth values for $A$ , $B$ , and $C$ . For instance, if $A (A) = 1$ , $A (B) = 0$ , and $A (C) = 1$ (the outer $A$ is the interpretation, the inner symbols are the atomic formulas), then $A ((A \land B) \lor C) = 1$ , because the conjunction evaluates to 0 and the disjunction of 0 and 1 evaluates to 1. Thus $A ⊨ (A \land B) \lor C$ in this case.
Arithmetic Expression: For the formula $(a + b) (a - b) = (a \cdot a) - (b \cdot b)$ , an interpretation involves choosing a universe (e.g., real numbers, integers) and assigning values to $a$ and $b$ from that universe. The valuation function evaluates both sides of the equation according to the usual rules of arithmetic in the chosen universe and determines whether the equality holds. If the universe is the set of matrices in $\mathbb{R}^{n \times n}$ , then $A ((a + b) (a - b) = (a \cdot a) - (b \cdot b))$ can be either 1 or 0 depending on the choice of $a$ and $b$ , because matrix multiplication is not commutative. If $A (a)$ and $A (b)$ are diagonal matrices, they commute and the equation holds, so $A ⊨ F$ for this formula $F$ .
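The matrix case can be checked numerically. This is a small sketch using plain nested lists for $2 \times 2$ integer matrices; the helper names and the concrete matrices are choices made here for illustration.

```python
def mat_add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x))] for i in range(len(x))]

def mat_sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(len(x))] for i in range(len(x))]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity_holds(a, b):
    """Evaluate (a + b)(a - b) = a*a - b*b in the universe of square matrices."""
    lhs = mat_mul(mat_add(a, b), mat_sub(a, b))
    rhs = mat_sub(mat_mul(a, a), mat_mul(b, b))
    return lhs == rhs

# Diagonal matrices commute, so the identity holds.
print(identity_holds([[2, 0], [0, 3]], [[1, 0], [0, 5]]))

# These two matrices do not commute, and the identity fails.
print(identity_holds([[0, 1], [0, 0]], [[0, 0], [1, 0]]))
```

The same formula therefore has different truth values under different interpretations over the same universe.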

Defining Semantics for Logical Connectives

We define the semantics of logical connectives by specifying how their truth values are determined based on the truth values of their subformulas.

Example: Conjunction ( $\land$ ): $A (F \land G) = 1$ if and only if $A (F) = 1$ and $A (G) = 1$ . This is an inductive definition: the meaning of a complex formula is defined in terms of the meanings of its simpler subformulas. Note that the word “and” in this definition is not itself the logical operator; it is the ordinary meta-level “and” that we use to define the operator $\land$ .

By defining free variables, interpretations, and the valuation function, we establish a rigorous framework for assigning meaning to formulas and evaluating their truth values under different interpretations.
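For propositional logic, the inductive semantics can be written as a short recursive evaluator. The sketch below assumes a tuple encoding chosen for illustration: an atomic formula is a string, and compound formulas are `("not", F)`, `("and", F, G)` or `("or", F, G)`. An interpretation is a dictionary from atom names to 0 or 1.

```python
def evaluate(f, interp):
    """Truth value (0 or 1) of formula f under the interpretation interp."""
    if isinstance(f, str):
        return interp[f]                 # KeyError if interp is not matching for f
    op = f[0]
    if op == "not":
        return 1 - evaluate(f[1], interp)
    if op == "and":                      # 1 iff both subformulas evaluate to 1
        return min(evaluate(f[1], interp), evaluate(f[2], interp))
    if op == "or":                       # 1 iff at least one subformula evaluates to 1
        return max(evaluate(f[1], interp), evaluate(f[2], interp))
    raise ValueError(f"unknown connective {op!r}")

formula = ("or", ("and", "A", "B"), "C")          # (A ∧ B) ∨ C
interp = {"A": 1, "B": 0, "C": 1}
print(evaluate(formula, interp))
```

The Python operators `min` and `max` play the role of the meta-level “and” and “or” that define $\land$ and $\lor$.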

Showing Equivalence of Formulas

Having established the syntax and semantics of formulas, we can now explore how to demonstrate the equivalence of different formulas. Equivalence means that two formulas have the same truth value under all interpretations.

Logical Consequence (Entailment)

First, we need to define logical consequence (or entailment), denoted as $F ⊨ G$ . This signifies that whenever $F$ is true, $G$ must also be true, under any interpretation that makes sense for both formulas.

Logical Consequence ( $F ⊨ G$ )

$F ⊨ G$ if and only if for every interpretation $A$ that is matching for both $F$ and $G$ , if $A (F) = 1$ (i.e., $A ⊨ F$ is true), then $A (G) = 1$ (i.e., $A ⊨ G$ is also true).

This can be stated more concisely: every model of $F$ is also a model of $G$ .

Equivalence ( $\equiv$ )

Equivalence, denoted $F \equiv G$ , is defined as two-way logical consequence:

Equivalence ( $F \equiv G$ )

$F \equiv G$ if and only if $F ⊨ G$ and $G ⊨ F$ .

This means that for every matching interpretation $A$ , $A (F) = A (G)$ . The formulas $F$ and $G$ have the same truth value under all interpretations. They are semantically interchangeable.

Example: De Morgan’s Law

Let’s demonstrate how to show an equivalence using De Morgan’s Law:

$$\neg (F \land G) \equiv \neg F \lor \neg G$$

To prove this equivalence, we need to show both directions of the logical consequence:

$\neg (F \land G) ⊨ \neg F \lor \neg G$ : Assume an arbitrary interpretation $A$ such that $A (\neg (F \land G)) = 1$ . This means $A (F \land G) = 0$ . By the definition of conjunction, this implies that either $A (F) = 0$ or $A (G) = 0$ (or both). If $A (F) = 0$ , then $A (\neg F) = 1$ , and thus $A (\neg F \lor \neg G) = 1$ . Similarly, if $A (G) = 0$ , then $A (\neg G) = 1$ , and thus $A (\neg F \lor \neg G) = 1$ . Therefore $A (\neg (F \land G)) = 1 ⟹ A (\neg F \lor \neg G) = 1$ .
$\neg F \lor \neg G ⊨ \neg (F \land G)$ : Assume an arbitrary interpretation $A$ such that $A (\neg F \lor \neg G) = 1$ . This means either $A (\neg F) = 1$ or $A (\neg G) = 1$ (or both). If $A (\neg F) = 1$ , then $A (F) = 0$ . Consequently, $A (F \land G) = 0$ , and so $A (\neg (F \land G)) = 1$ . A similar argument applies if $A (\neg G) = 1$ . Therefore $A (\neg F \lor \neg G) = 1 ⟹ A (\neg (F \land G)) = 1$ .

Since both directions of the logical consequence hold, we have established the equivalence $\neg (F \land G) \equiv \neg F \lor \neg G$ . This method of demonstrating equivalence by proving both directions of logical consequence is a fundamental technique in logic.
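For propositional formulas, an equivalence can also be checked mechanically by enumerating all interpretations. The sketch below uses an assumed tuple encoding (atoms are strings; compound formulas are `("not", F)`, `("and", F, G)`, `("or", F, G)`) and treats $F$ and $G$ as atoms, which suffices here because the truth value of each side depends only on the truth values of $F$ and $G$.

```python
from itertools import product

def atoms(f):
    """Set of atomic formulas occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, interp):
    if isinstance(f, str):
        return interp[f]
    if f[0] == "not":
        return 1 - evaluate(f[1], interp)
    if f[0] == "and":
        return min(evaluate(f[1], interp), evaluate(f[2], interp))
    return max(evaluate(f[1], interp), evaluate(f[2], interp))   # "or"

def equivalent(f, g):
    """True iff f and g agree under every interpretation of their atoms."""
    names = sorted(atoms(f) | atoms(g))
    for bits in product((0, 1), repeat=len(names)):
        interp = dict(zip(names, bits))
        if evaluate(f, interp) != evaluate(g, interp):
            return False
    return True

lhs = ("not", ("and", "F", "G"))             # ¬(F ∧ G)
rhs = ("or", ("not", "F"), ("not", "G"))     # ¬F ∨ ¬G
print(equivalent(lhs, rhs))
```

This brute-force check examines $2^{n}$ interpretations for $n$ atoms, so it complements rather than replaces the argument by two-way logical consequence.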

Satisfiability, Unsatisfiability, and Tautology

We can now classify formulas based on their truth values under different interpretations:

Satisfiable: A formula $F$ is satisfiable if there exists at least one interpretation $A$ such that $A (F) = 1$ (i.e., $A ⊨ F$ ). In other words, there’s at least one way to assign values to the free variables that makes the formula true.
Unsatisfiable (or Contradictory): A formula $F$ is unsatisfiable if there is no interpretation $A$ such that $A (F) = 1$ . The formula is false under all interpretations. We often use $⊥$ to denote a generic unsatisfiable formula (a contradiction).
Tautology (or Valid): A formula $F$ is a tautology if $A (F) = 1$ for every interpretation $A$ . The formula is true under all interpretations. We often use $⊤$ to represent a generic tautology.
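For propositional formulas the three notions can be decided by enumerating all interpretations. A minimal sketch, again assuming a tuple encoding (atoms as strings; `("not", F)`, `("and", F, G)`, `("or", F, G)`); the function name `classify` and its return labels are choices made here.

```python
from itertools import product

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, interp):
    if isinstance(f, str):
        return interp[f]
    if f[0] == "not":
        return 1 - evaluate(f[1], interp)
    if f[0] == "and":
        return min(evaluate(f[1], interp), evaluate(f[2], interp))
    return max(evaluate(f[1], interp), evaluate(f[2], interp))   # "or"

def classify(f):
    """Classify f by its truth values over all interpretations of its atoms."""
    names = sorted(atoms(f))
    values = [evaluate(f, dict(zip(names, bits)))
              for bits in product((0, 1), repeat=len(names))]
    if all(values):
        return "tautology"                     # true everywhere (hence also satisfiable)
    if any(values):
        return "satisfiable, not a tautology"  # true somewhere, false somewhere
    return "unsatisfiable"                     # false everywhere

print(classify(("or", "A", ("not", "A"))))
print(classify(("or", "A", "B")))
print(classify(("and", "A", ("not", "A"))))
```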

Lemmas and Equivalences

Lemma 6.2: Negation of a Tautology

A formula $F$ is a tautology if and only if its negation $\neg F$ is unsatisfiable.

Proof

For every formula $F$ : $⊨ F$ (meaning $F$ is a tautology) if and only if $\neg F ⊨ ⊥$ (meaning $\neg F$ is unsatisfiable). This follows directly from the definitions. If $F$ is a tautology, then $A (F) = 1$ for all $A$ . Therefore, $A (\neg F) = 0$ for all $A$ , meaning $\neg F$ is unsatisfiable. Conversely, if $\neg F$ is unsatisfiable, then $A (\neg F) = 0$ for all $A$ , so $A (F) = 1$ for all $A$ , and thus $F$ is a tautology.

Lemma 6.3: Logical Consequence and Tautology

We’re often interested in whether a formula $G$ is a logical consequence of a set of formulas $\{F_{1}, F_{2}, \ldots, F_{k}\}$ .

$\{F_{1}, F_{2}, \ldots, F_{k}\} ⊨ G$ if and only if $(F_{1} \land F_{2} \land \ldots \land F_{k}) \to G$ is a tautology. Here $F \to G$ abbreviates $\neg F \lor G$ .

Proof

Let’s analyze why these formulations are equivalent.

$\{F_{1}, F_{2}, \ldots, F_{k}\} ⊨ G$ : This means that for any interpretation $A$ , if $A ⊨ F_{i}$ for all $i$ from 1 to $k$ , then $A ⊨ G$ .
$(F_{1} \land F_{2} \land \ldots \land F_{k}) \to G$ is a tautology: This means for any interpretation $A$ , $A ((F_{1} \land F_{2} \land \ldots \land F_{k}) \to G) = 1$ . By the definition of implication, this is equivalent to saying that if $A (F_{1} \land F_{2} \land \ldots \land F_{k}) = 1$ , then $A (G) = 1$ . And $A (F_{1} \land F_{2} \land \ldots \land F_{k}) = 1$ if and only if $A (F_{i}) = 1$ for all $i$ .

Thus both formulations express the same condition.

A Third Equivalent Formulation: Unsatisfiability

$\{F_{1}, F_{2}, \ldots, F_{k}\} ⊨ G$ if and only if the set $\{F_{1}, F_{2}, \ldots, F_{k}, \neg G\}$ is unsatisfiable.

Proof

This formulation is particularly relevant for resolution calculi, which focus on proving unsatisfiability.

$\{F_{1}, F_{2}, \ldots, F_{k}, \neg G\}$ is unsatisfiable: This means there’s no interpretation $A$ such that $A ⊨ F_{i}$ for all $i$ and $A ⊨ \neg G$ .
Reformulation: Equivalently, for every interpretation $A$ with $A ⊨ F_{i}$ for all $i$ , it cannot be the case that $A ⊨ \neg G$ , so we must have $A ⊨ G$ . This is precisely the definition of $\{F_{1}, F_{2}, \ldots, F_{k}\} ⊨ G$ .

These lemmas provide alternative ways to express logical consequence, which are useful in different contexts and for different proof techniques. They link the concepts of tautology, satisfiability, and logical entailment. This connection is particularly relevant for resolution-based proof methods, where the goal is to demonstrate the unsatisfiability of a set of formulas.
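The three formulations can be compared on small propositional examples by brute force. In this sketch the tuple encoding, the helper names, and the encoding of $F \to G$ as $\neg F \lor G$ are assumptions made for illustration.

```python
from functools import reduce
from itertools import product

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, interp):
    if isinstance(f, str):
        return interp[f]
    if f[0] == "not":
        return 1 - evaluate(f[1], interp)
    if f[0] == "and":
        return min(evaluate(f[1], interp), evaluate(f[2], interp))
    return max(evaluate(f[1], interp), evaluate(f[2], interp))   # "or"

def interpretations(formulas):
    """All interpretations matching every formula in the list."""
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))

def implies(f, g):
    return ("or", ("not", f), g)          # F → G as ¬F ∨ G

def entails(premises, goal):
    """Formulation 1: every model of all premises is a model of the goal."""
    return all(evaluate(goal, a) == 1
               for a in interpretations(premises + [goal])
               if all(evaluate(f, a) == 1 for f in premises))

def is_tautology(f):
    return all(evaluate(f, a) == 1 for a in interpretations([f]))

def is_unsatisfiable(formulas):
    return not any(all(evaluate(f, a) == 1 for f in formulas)
                   for a in interpretations(formulas))

def three_views(premises, goal):
    conj = reduce(lambda f, g: ("and", f, g), premises)
    return (entails(premises, goal),
            is_tautology(implies(conj, goal)),            # formulation 2
            is_unsatisfiable(premises + [("not", goal)]))  # formulation 3

print(three_views(["A", implies("A", "B")], "B"))   # modus ponens
print(three_views(["A"], "B"))                      # not a consequence
```

In both cases the three entries of the returned tuple agree, as the lemmas predict.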

Normal Forms in Propositional Logic

It’s often desirable to express propositional formulas in a standardized format called a normal form. This is analogous to simplifying algebraic expressions to a standard form like sums of products. Normal forms facilitate manipulation and comparison of formulas and are crucial for certain automated reasoning techniques.

Two common normal forms are Conjunctive Normal Form (CNF) and Disjunctive Normal Form (DNF). Before defining these, we introduce the concept of a literal.

Literals

A literal is either an atomic formula (e.g., $A$ , $B$ ) or the negation of an atomic formula (e.g., $\neg A$ , $\neg B$ ).

Conjunctive Normal Form (CNF)

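A formula is in CNF if it’s a conjunction of clauses, where each clause is a disjunction of literals. Structurally, it looks like this:

$$(L_{11} \lor \ldots \lor L_{1 m_{1}}) \land \ldots \land (L_{n 1} \lor \ldots \lor L_{n m_{n}})$$

where each $L_{ij}$ is a literal.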

Example: $(A \lor B \lor \neg C) \land (\neg A \lor D) \land (C \lor \neg B)$ is in CNF.

Disjunctive Normal Form (DNF)

DNF is the dual of CNF. A formula is in DNF if it’s a disjunction of conjunctions of literals.

Example: $(A \land B \land \neg C) \lor (\neg A \land D) \lor (C \land \neg B)$ is in DNF.

Clauses

A clause is a disjunction of literals. It forms a basic building block in CNF. In DNF, the corresponding building blocks are conjunctions of literals.
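The two shapes can be recognized structurally. The sketch below assumes a tuple encoding with binary connectives (atoms as strings; `("not", F)`, `("and", F, G)`, `("or", F, G)`), so a clause such as $A \lor B \lor \neg C$ is stored as nested binary disjunctions. The function names are chosen for illustration.

```python
def is_literal(f):
    """An atom or a negated atom."""
    return isinstance(f, str) or (f[0] == "not" and isinstance(f[1], str))

def is_clause(f):
    """A disjunction of literals."""
    return is_literal(f) or (f[0] == "or" and is_clause(f[1]) and is_clause(f[2]))

def is_term(f):
    """A conjunction of literals (the DNF building block)."""
    return is_literal(f) or (f[0] == "and" and is_term(f[1]) and is_term(f[2]))

def is_cnf(f):
    """A conjunction of clauses."""
    return is_clause(f) or (f[0] == "and" and is_cnf(f[1]) and is_cnf(f[2]))

def is_dnf(f):
    """A disjunction of conjunctions of literals."""
    return is_term(f) or (f[0] == "or" and is_dnf(f[1]) and is_dnf(f[2]))

# (A ∨ B ∨ ¬C) ∧ (¬A ∨ D) ∧ (C ∨ ¬B)
c1 = ("or", ("or", "A", "B"), ("not", "C"))
c2 = ("or", ("not", "A"), "D")
c3 = ("or", "C", ("not", "B"))
cnf_example = ("and", ("and", c1, c2), c3)

print(is_cnf(cnf_example), is_dnf(cnf_example))
```

A single literal, a single clause, and a single conjunction of literals are degenerate cases; under this sketch a lone literal counts as both CNF and DNF.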

Theorem: CNF and DNF Equivalence

Every propositional formula can be converted into an equivalent formula in CNF and an equivalent formula in DNF.

Constructing DNF and CNF from a Truth Table

One way to construct equivalent DNF and CNF formulas is using a truth table:

Constructing DNF: For each row of the truth table where the formula evaluates to 1, create a conjunction of literals that corresponds to that row. Then, take the disjunction of all these conjunctions. This results in an equivalent DNF formula. For example, if a row has $A = 1, B = 0, C = 1$ and $F = 1$ , the corresponding conjunction is $(A \land \neg B \land C)$ .
Constructing CNF: For each row where the formula evaluates to 0, create a disjunction of literals that corresponds to the negation of that row. For example, if we have $A = 1$ , $B = 0$ , $C = 1$ and $F = 0$ , we have $(\neg A \lor B \lor \neg C)$ . Then, take the conjunction of all these disjunctions. This results in an equivalent CNF formula. You can also construct the DNF of the negated formula, and use De Morgan’s laws to obtain an equivalent CNF formula.
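The truth table construction can be sketched in a few lines. The representation is an assumption made for this example: a literal is a pair `(name, positive)`, a DNF is a list of conjunctions (lists of literals), a CNF is a list of clauses (lists of literals), and the formula is given as a Python function on a row of the table.

```python
from itertools import product

def truth_table_normal_forms(names, f):
    """Return (dnf, cnf) equivalent to the Boolean function f over names."""
    dnf, cnf = [], []
    for bits in product((0, 1), repeat=len(names)):
        row = dict(zip(names, bits))
        if f(row) == 1:
            # Conjunction that is true exactly in this row.
            dnf.append([(n, row[n] == 1) for n in names])
        else:
            # Clause that is false exactly in this row.
            cnf.append([(n, row[n] == 0) for n in names])
    return dnf, cnf

def lit_value(lit, row):
    name, positive = lit
    return row[name] if positive else 1 - row[name]

def eval_dnf(dnf, row):
    return int(any(all(lit_value(l, row) for l in term) for term in dnf))

def eval_cnf(cnf, row):
    return int(all(any(lit_value(l, row) for l in clause) for clause in cnf))

names = ["A", "B"]
xor = lambda row: row["A"] ^ row["B"]       # example formula: exactly one of A, B
dnf, cnf = truth_table_normal_forms(names, xor)
print(dnf)
print(cnf)
```

For a row with $A = 1, B = 0, C = 1$ and $F = 0$, the CNF branch produces the literals `("A", False), ("B", True), ("C", False)`, which is the clause $(\neg A \lor B \lor \neg C)$ from the text.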

Efficiency Considerations

While the truth table method always yields an equivalent CNF or DNF, it is inefficient for formulas with many variables, because the truth table has $2^{n}$ rows for $n$ variables. Equivalence transformations systematically apply equivalences like De Morgan’s laws, the distributive laws, and double negation elimination to manipulate the formula into the desired normal form. They work directly on the formula, so they are often much faster than building the full table and often produce more compact CNF or DNF representations. They do not avoid the blow-up in general, though: for some formulas every equivalent CNF (or DNF) is exponentially larger than the original.
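One common way to organize such transformations is to first push negations inward and then distribute $\lor$ over $\land$. The following is a sketch of that approach under an assumed tuple encoding (atoms as strings; `("not", F)`, `("and", F, G)`, `("or", F, G)`); it is one possible procedure, not necessarily the one used in the lecture, and it can still produce exponentially large results.

```python
from itertools import product

def nnf(f):
    """Push negations inward using De Morgan and double negation elimination."""
    if isinstance(f, str):
        return f
    if f[0] in ("and", "or"):
        return (f[0], nnf(f[1]), nnf(f[2]))
    g = f[1]                                   # f is ("not", g)
    if isinstance(g, str):
        return f                               # negated atom is already a literal
    if g[0] == "not":
        return nnf(g[1])                       # ¬¬H ≡ H
    dual = "or" if g[0] == "and" else "and"    # De Morgan
    return (dual, nnf(("not", g[1])), nnf(("not", g[2])))

def distribute(f, g):
    """CNF of f ∨ g for f, g already in CNF: (H ∧ K) ∨ G ≡ (H ∨ G) ∧ (K ∨ G)."""
    if not isinstance(f, str) and f[0] == "and":
        return ("and", distribute(f[1], g), distribute(f[2], g))
    if not isinstance(g, str) and g[0] == "and":
        return ("and", distribute(f, g[1]), distribute(f, g[2]))
    return ("or", f, g)

def cnf_of_nnf(f):
    if isinstance(f, str) or f[0] == "not":
        return f
    if f[0] == "and":
        return ("and", cnf_of_nnf(f[1]), cnf_of_nnf(f[2]))
    return distribute(cnf_of_nnf(f[1]), cnf_of_nnf(f[2]))

def to_cnf(f):
    return cnf_of_nnf(nnf(f))

def atoms(f):
    return {f} if isinstance(f, str) else set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, interp):
    if isinstance(f, str):
        return interp[f]
    if f[0] == "not":
        return 1 - evaluate(f[1], interp)
    vals = (evaluate(f[1], interp), evaluate(f[2], interp))
    return min(vals) if f[0] == "and" else max(vals)

def equivalent(f, g):
    names = sorted(atoms(f) | atoms(g))
    return all(evaluate(f, dict(zip(names, b))) == evaluate(g, dict(zip(names, b)))
               for b in product((0, 1), repeat=len(names)))

formula = ("or", ("and", "A", "B"), ("not", ("or", "C", "D")))   # (A ∧ B) ∨ ¬(C ∨ D)
result = to_cnf(formula)
print(result)
print(equivalent(formula, result))
```

The final equivalence check is only a sanity test for this example; the transformation itself never builds a truth table.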

Types of Statements in Logic

While we’ve focused on formulas, it’s important to distinguish between formulas and statements about formulas. Statements express assertions or claims about formulas, their properties, or their relationships. We can categorize these statements into different types:

1. Logical Consequence from a Set of Formulas

This type of statement asserts that a formula $F$ is a logical consequence of a set of formulas $T$ .

Form: $T ⊨ F$ , where $T$ is a set of formulas and $F$ is a single formula.
Meaning: $F$ is true under every interpretation where all formulas in $T$ are true.
Example: In axiomatic systems, $T$ represents the set of axioms, and $F$ is a theorem we want to prove. Group theory or the Peano axioms are examples.

2. Satisfiability, Unsatisfiability, and Tautology

These statements assert properties of a single formula.

Forms:
- $F$ is satisfiable.
- $F$ is unsatisfiable.
- $F$ is a tautology.
Meanings:
- Satisfiable: There exists at least one interpretation where $F$ is true.
- Unsatisfiable: $F$ is false under every interpretation.
- Tautology: $F$ is true under every interpretation.
Example: Checking whether a propositional formula is satisfiable is a classic NP-complete problem.

3. Truth Value under a Specific Interpretation

These statements assert the truth value of a formula under a given interpretation.

Form: $A ⊨ F$ or $A (F) = 1$ , where $A$ is a specific interpretation.
Meaning: $F$ is true under the interpretation $A$ .
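Example: If the interpretation $A$ assigns $A (A) = 1$ and $A (B) = 0$ to the atomic formulas $A$ and $B$ , then $A ⊨ (A \lor \neg B)$ , since the first disjunct is already true.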

4. Meta-Statements about Logical Systems

These are statements about properties of logical systems or proof calculi themselves.

Form: These statements are not about the truth of specific formulas but about higher-level properties of the logic itself (soundness, completeness, consistency, etc.).
Example: “The resolution calculus is sound and complete for propositional logic.” This statement asserts a property of the resolution calculus, not the truth of any particular propositional formula. Another example: “Gödel’s incompleteness theorem shows that certain formal systems contain true but unprovable statements.” These “meta-statements” are typically proven at the meta-level, outside the system they describe, often within a different and more powerful logical framework.

Continue here: 23 Predicate Logic Reintroduced, Syntax, Semantics, Universe Size
