Mathematical Proof and the Principles of Mathematics/Sets/History

We're now leaving the subject of logic (finally) and entering mathematics proper. This introductory section on set theory includes some historical and philosophical context.

The invention of sets
If you grew up in a generation where sets were introduced to you in elementary school, then you may be surprised to learn that the modern concept of a set dates back less than 150 years. The concept was created, not as a way of creating a formal basis for mathematics, by as a byproduct of the study of the rather specialized area of Fourier analysis. Nevertheless, it was discovered after just one generation that sets could be used to provide a logical foundation for the concept of a number and from there all of mathematics. When we say 'discovered' though, we should add that that was what was believed at the time, and now that a few more generations have gone by the initial optimism has been tempered somewhat.

The idea of a set is actually very simple, you could even say boring. You can think of a set as an bag of objects with no internal structure or relationships. It's such a simple idea that it's rather surprising that no one came up with the idea much earlier, or even that, like the concept of number, it wasn't created independently by civilizations all over the world.

Perhaps the problem is that concept is too simple, so simple in fact that it seems like nothing interesting could come from it. Yet despite its simplicity, there are subtleties. For example for some object, say Socrates, we distinguish between the object itself and the set which contains that object; Socrates is not (Socrates} in this example.

The initial idea
At first, the idea of a set was similar to that of a predicate; an object $$x$$ for which a given predicate $$P(x)$$ was True belonged in the set, and those for which $$P(x)$$ was False did not. The difference was that the set corresponding to $$P(x)$$ became a new object
 * $$\{x: P(x)\}$$

which was now part of the universe of discourse. This idea is so simple that it was thought, by some at the time at least, that sets could be subsumed into logic, and since mathematics could be subsumed into set theory, all of mathematics could be derived as a branch of logic. This program was actually attempted in the early part of the 20th century, mainly by Gottlob Frege and Bertrand Russell.

Russell's paradox
It wasn't long before cracks in this supposed foundation for mathematics became evident. At first they seemed minor and easily plastered over, but then Russell himself found a major paradox and it would require some major restructuring to salvage what had been produced so far.

The problem really comes from not being careful enough to avoid self referential statements. In fact the paradox can be stated without using the language of sets at all but using only predicates. If predicates are allowed to be objects, then it's possible to talk about predicates applied to other predicates. Once could then ask whether a predicate is true of itself or not. So you could define a predicate $$P(x)$$ meaning $$x(x)$$, in other words $$P(x)$$ is True if $$x$$ is a predicate and True for $$x$$. This isn't a problem yet but we've already got some self-reference going on. From here it's a small step to define the predicate $$Q(x)$$ meaning not $$x(x)$$, in other words $$Q(x)$$ is True if $$x$$ is a predicate and False for $$x$$.

The paradox kicks in when we try to determine whether $$Q(Q)$$ is True or False. If $$Q(Q)$$ is True then, since one of the requirements is that $$Q(Q)$$ be False, it cannot be said to be True of itself, and therefore $$Q(Q)$$ is False. But if $$Q(Q)$$ is False then we can check the requirements, $$Q(x)$$ is a predicate and $$Q(Q)$$ is False, so $$Q(Q)$$ is True. Either way we arrive at a contradiction. The mechanics of this are very similar to the Liar paradox which we mentioned when we were first talking about what a statement is. The root cause here, just as with the earlier paradox, is self-reference, though it's not as immediately apparent.

If you do use the language of set theory then the statement becomes a bit simpler. Write $$x \in y$$ when $$x$$ belongs to the set $$y$$. (We'll make this more formal on the next page.) We define $$Q$$ to be the set $$\{x: \operatorname{not} x \in x\}$$ and this time ask if $$Q \in Q$$. Again, by a similar argument, $$Q \in Q$$ is True implies it is False, and $$Q \in Q$$ is False implies it is True.

Resolving the paradox
Since the root cause of the paradox is self-reference, the way to avoid it is to prevent self-reference in the first place. Russell did this by introducing a hierarchy of types. Roughly, simple objects are at the bottom with type 0. Predicates on simple objects are at the next level with type 1. This scheme continues indefinitely with predicates at each level only applicable to objects at a lower level. Using this idea, Russell and his co-author, Alfred North Whitehead, carried through his program to put mathematics on a logical basis, resulting in the multi-volume Principia Mathematica. While a monumental and influential work, it was not entirely successful in its goal to give mathematics a purely logical foundation. For one thing, the hierarchy of types is cumbersome to work with and simpler alternatives are available. A second issue is that some of the axioms used have a distinctly mathematical, rather than logical, flavor and the position that the work is based on logic alone is hard to defend.

Another way to avoid self-reference, and the way that has been adopted as standard via the Zermelo–Fraenkel axioms of set theory, is to put severe restrictions on which predicates can be turned into sets. There are a few general axioms in the theory, and nearly a dozen set constructors which allow you to create new sets for various specific predicates. Due to the restrictions, self-reference is eliminated but there is enough variety in the constructors to still form a basis for most of mathematics. The disadvantage over Russel and Whitehead's hierarchy of types is that a separate axiom is needed for each constructor and most of the constructors seem be custom made for specific mathematical structures. This means that the axioms are definitely no longer based on logic alone and whether they are even true is still a matter of debate.

We'll introduce the general axioms next and add constructor axioms a few at a time, more or less as needed.