User:TakuyaMurata/Set theory

Sets and functions
It is tricky to define sets. For example, is $$\{ x ; x \not\in x \}$$ a set? (The answer is negative according to the standard axioms of set theory. Read Russell's paradox to find out why.) Luckily for us, we can still study most problems in analysis with a naive understanding of sets, except for the Axiom of Choice, which we need to obtain several important results in analysis. Instead of studying set theory for its own sake, we shall define some basic concepts (e.g., functions) and study properties of sets, in particular, those of infinite sets.

A pair $$(x, y)$$ is, by definition, the set $$\{ x, \{ y \} \}$$. Thus, $$(x, y) \ne (y, x)$$ whenever $$x \ne y$$. An $$n$$-tuple is defined recursively: $$(x_1, x_2, ..., x_n) = \{ (x_1, x_2, ..., x_{n-1}), \{x_n\} \}$$. Given a finite set $$A$$, $$|A|$$ denotes the number of elements in $$A$$. So, for example, $$|\{x, x \}| = 1$$ while $$|(x, x)| = 2$$.

A function $$f$$ is a nonempty set consisting of pairs such that $$(a, b), (a, c) \in f$$ implies that $$b = c$$. We thus understand that $$f$$ uniquely maps an element $$a$$ to $$b$$, which we denote by $$f(a)$$. We write $$f:A \to B$$ to refer to a function $$f$$ such that $$f(x)$$ is defined and $$f(x) \in B$$ for every $$x \in A$$. $$A$$ is called the domain of $$f$$. Given a set $$S$$, define $$f(S) = \{ f(x) | x \in S \}$$; this is called the image of $$S$$ under $$f$$. This means that $$f$$ induces a mapping from the power set of its domain into the power set of its range. When $$A$$ is the domain of $$f$$, $$f(A)$$ is called the image or range of $$f$$.
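The definition above can be tried out on finite examples. The following is a minimal sketch, assuming Python tuples stand in for the set-theoretic pairs; the helper names `is_function`, `domain` and `image` are hypothetical and chosen only for illustration.

```python
def is_function(pairs):
    """Check that (a, b) and (a, c) in pairs implies b == c, and that pairs is nonempty."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False  # a would be mapped to two different values
        seen[a] = b
    return len(seen) > 0  # the text requires a function to be a nonempty set

def domain(f):
    """The set of first coordinates of the pairs in f."""
    return {a for a, _ in f}

def image(f, S):
    """f(S) = { f(x) | x in S }, restricted to points where f is defined."""
    return {b for a, b in f if a in S}

f = {(1, 10), (2, 20), (3, 20)}
print(is_function(f))          # f is single-valued
print(image(f, {2, 3}))        # image of a subset of the domain
print(image(f, domain(f)))     # the range of f
```

Here `image(f, domain(f))` plays the role of $$f(A)$$, the range of $$f$$.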

It is of theoretical (or even philosophical) importance that a function need not be given by a formula. For a function such as $$f(x) = 2x$$, we know how it maps elements: $$f$$ sends 1 to 2, $$\pi$$ to $$2\pi$$, etc. In general, however, there need not be a way to specify precisely (e.g., in closed form) how a function maps each element. Often, we only know of or hypothesize the existence of a function having certain properties without knowing exactly how it is constructed.

The most troublesome kind of function is a choice function. Let $$\Omega$$ be a nonempty collection of nonempty sets. A choice function
 * $$f: \Omega \to \bigcup \Omega$$

is a function such that $$f(s) \in s$$ for every $$s \in \Omega$$. A choice function can be defined easily if $$\Omega$$ is finite. Say, $$\Omega = \{ \{ 1, 2, 3 \}, \{ 2, 4, 6 \}, \{ 10, 20 \} \}$$. Define $$f$$ by
 * $$f(\{1, 2, 3\}) = 1$$, $$f(\{2, 4, 6\}) = 2$$ and $$f(\{10, 20\}) = 10$$.

On some occasions, even if $$\Omega$$ is infinite, we can still define a choice function without much trouble: suppose $$\Omega$$ consists of nonempty subsets of $$\mathbf{N}$$. Let $$f(S) = \min S$$ for $$S \in \Omega$$; the minimum exists because every nonempty set of natural numbers has a least element. Then $$f$$ is a choice function.
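The finite example can be written out directly. This is a small sketch, assuming the members of $$\Omega$$ are nonempty finite sets of natural numbers represented as `frozenset` values; the name `min_choice` is hypothetical.

```python
def min_choice(omega):
    """Return a choice function on omega as a dict sending each set to its least element."""
    return {s: min(s) for s in omega}

omega = {frozenset({1, 2, 3}), frozenset({2, 4, 6}), frozenset({10, 20})}
f = min_choice(omega)

# The defining property of a choice function: f(s) is a member of s.
print(all(f[s] in s for s in omega))
```

The rule "take the least element" is what makes the choice explicit; for collections of sets without such a uniform rule, no formula of this kind need be available.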

If $$I$$ is any index set, and $$\{ X_i | i \in I \}$$ is a collection of sets indexed by $$I$$, then the Cartesian product of the sets $$X_i$$ is defined to be


 * $$\prod_{i \in I} X_i = \{ f : I \to \bigcup_{i \in I} X_i\ |\ (\forall i)(f(i) \in X_i)\},$$

that is, the set of all functions defined on the index set such that the value of the function at a particular index $$i$$ is an element of $$X_i$$.
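For a finite index set and finite factors, this description of the product as a set of functions can be made concrete. The sketch below assumes each function $$f$$ is represented as a Python dict from indices to values; `cartesian_product` and the sample `family` are hypothetical names.

```python
from itertools import product

def cartesian_product(family):
    """family maps each index i to a finite set X_i.

    Returns every function f on the index set with f[i] in X_i, each as a dict.
    """
    indices = list(family)
    return [dict(zip(indices, values))
            for values in product(*(family[i] for i in indices))]

family = {"a": {0, 1}, "b": {"x", "y", "z"}}
elements = cartesian_product(family)

print(len(elements))  # 2 * 3 functions in total
print(all(f[i] in family[i] for f in elements for i in family))
```

An element of the product is exactly a choice function on the indexed family, so the product is empty as soon as one factor is empty.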

The pre-image of $$S$$ under $$f: A \to B$$, denoted by $$f^{-1}(S)$$, is the set $$\{ x \in A | f(x) \in S \}$$. Thus, $$f^{-1}$$ is a map from sets to subsets of $$A$$. (There is a slight technical problem in that any set is allowed, but that's really a non-issue since $$f^{-1}(S) = f^{-1}(S \cap B)$$.) A function $$f$$ is said to be (i) injective, (ii) surjective and (iii) bijective when, for every $$x \in B$$, (i) $$|f^{-1}(\{x\})| \le 1$$, (ii) $$|f^{-1}(\{x\})| \ge 1$$ and (iii) $$|f^{-1}(\{x\})| = 1$$, respectively. Equivalently, $$f$$ is injective if and only if $$f(x) = f(y)$$ implies $$x = y$$ for every $$x, y \in A$$, and $$f$$ is surjective if and only if $$f(A) = B$$. Clearly, a function is bijective if and only if it is injective and surjective.
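The three conditions on pre-images translate almost word for word into code. This is a minimal sketch for finite sets, assuming $$f: A \to B$$ is stored as a dict whose keys form $$A$$ and that the codomain $$B$$ is passed explicitly; all function names are hypothetical.

```python
def preimage(f, S):
    """The set { x in A | f(x) in S } for a function f given as a dict."""
    return {x for x, y in f.items() if y in S}

def is_injective(f, B):
    return all(len(preimage(f, {y})) <= 1 for y in B)

def is_surjective(f, B):
    return all(len(preimage(f, {y})) >= 1 for y in B)

def is_bijective(f, B):
    return all(len(preimage(f, {y})) == 1 for y in B)

f = {1: 10, 2: 20, 3: 20}
B = {10, 20, 30}

print(preimage(f, {20}))      # two points are sent to 20, so f is not injective
print(is_surjective(f, B))    # 30 has an empty pre-image
```

Note that surjectivity depends on the choice of $$B$$: the same dict is surjective onto `{10, 20}` but not onto `{10, 20, 30}`.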

Given two functions $$f, g$$, we write $$(f \circ g)(x) = f(g(x))$$ and this is called a composition. It is clear that $$f \circ (g \circ h) = (f \circ g) \circ h$$. On the other hand, compositions need not commute. In fact, let $$f(x) = 3$$ and $$g(x) = 5$$. Then $$(g \circ f)(x) = 5 \ne 3 = (f \circ g)(x)$$ for all $$x$$. The identity function, or the identity for short, $$\operatorname{id}_A:A \to A$$ is the function such that $$\operatorname{id}_A(x) = x$$ for all $$x \in A$$. A bijection $$f:A \to B$$ induces the map
 * $$x \mapsto f^{-1}(\{x\}): B \to A$$,

which we denote by $$f^{-1}$$, and we have: $$f^{-1} \circ f = \operatorname{id}_A$$ and $$f \circ f^{-1} = \operatorname{id}_B$$. Note that every injection is bijective between its domain and its range. This can be put more precisely as follows:

1.1 Theorem ''Let $$f:A \to B$$ be a function. Then''
 * (i) $$f$$ is left-invertible; i.e., $$g \circ f$$ is the identity for some g if and only if $$f$$ is injective.
 * (ii) $$f$$ is right-invertible; i.e., $$f \circ g$$ is the identity for some g if and only if $$f$$ is surjective.

Proof: (i) ($$\Leftarrow$$) As remarked above, $$f : A \to f(A)$$ is bijective, and so we take g to be the inverse of this function, extended to B arbitrarily. ($$\Rightarrow$$) If f(x) = f(y), then $$x = (g \circ f) (x) = (g \circ f) (y) = y$$. (ii) ($$\Rightarrow$$) can be proved similarly: $$x = f(g(x)) \in f(A)$$ for every $$x \in B$$. The proof of the converse, however, is nontrivial because we need a choice function. Let $$\Omega = \{ f^{-1}(\{x\}) | x \in B \}$$. That f is surjective ensures that $$\Omega$$ doesn't contain the empty set. Let $$\varphi$$ be a choice function on $$\Omega$$. (The existence of this function comes from the Axiom of Choice.) Since
 * $$\varphi(f^{-1}(\{x\})) \in f^{-1}(\{x\})$$,

or equivalently,
 * $$f(\varphi(f^{-1}(\{x\}))) = x$$,

if we define $$g: B \to A$$ by $$g(x) = \varphi(f^{-1}(\{x\}))$$, then g is exactly the required function. $$\square$$
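In the finite case no appeal to the Axiom of Choice is needed, and the construction of $$g$$ from a choice function $$\varphi$$ can be carried out explicitly. The sketch below assumes $$f$$ is a dict and uses `min` as a stand-in for $$\varphi$$; the names `fibers` and `right_inverse` are hypothetical.

```python
def fibers(f, B):
    """Map each x in B to its fiber f^{-1}({x})."""
    return {y: {a for a in f if f[a] == y} for y in B}

def right_inverse(f, B, choose=min):
    """Build g: B -> A with f(g(x)) = x by choosing one point from every fiber.

    `choose` stands in for the choice function; it must return a member of its argument.
    """
    omega = fibers(f, B)
    if any(len(fiber) == 0 for fiber in omega.values()):
        raise ValueError("f is not surjective onto B")
    return {y: choose(fiber) for y, fiber in omega.items()}

f = {1: "a", 2: "b", 3: "b", 4: "a"}
B = {"a", "b"}
g = right_inverse(f, B)

print(all(f[g[y]] == y for y in B))  # f o g is the identity on B
```

Passing `choose=max` gives a different right inverse, which illustrates that $$g$$ depends on the choice function and is not unique.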

The restriction of $$f: A \to B$$ to $$C \subset A$$, denoted by $$f \mid_C$$, is the function $$g: C \to B$$ such that $$g(x) = f(x)$$ for every $$x \in C$$. Likewise, an extension of $$f: A \to B$$ to $$C \supset A$$ is any function $$g: C \to B$$ such that $$g(x) = f(x)$$ for every $$x \in A$$. Thus, a restriction is necessarily unique, while an extension need not be. Consider this example. Let $$f(1) = 10, f(2) = 20, f(3) = 20$$. Then the domain (resp. the image) of $$f$$ is $$\{ 1, 2, 3 \}$$ (resp. $$\{ 10, 20 \}$$). The function $$f$$ is an example of a non-injection, while the restriction $$g$$ of $$f$$ to $$\{ 1, 2 \}$$ is injective.

Note that two finite sets have the same number of elements if and only if there is a bijection between the two sets. This gives a way to tell whether two sets have the same number of elements without the notion of natural numbers. It thus generalizes to infinite sets; we shall write $$|A| = |B|$$ when there is a bijection between sets $$A$$ and $$B$$. It is immediate that $$|A| = |B|$$ and $$|B| = |C|$$ imply $$|A| = |C|$$. More generally, we have:

1 Theorem (Bernstein) ''Let $$f: A \to B$$ and $$g: B \to A$$ be injections. Then $$|A| = |B|$$.''

Proof: Define
 * $$S = \bigcup_{n=0}^\infty (g \circ f)^n (A \backslash g(B))$$

and
 * $$h(x) = \begin{cases} f(x) & \mbox{ if } x \in S \\ (g|_{g(B)})^{-1}(x) & \mbox{ if } x \in A \backslash S \end{cases}$$

We claim $$h$$ is bijective. For that, first note that the following are equivalent for every $$x \in B$$.
 * $$\{ y \in S | x = f(y) \}$$ is nonempty.
 * $$x = f \circ (g \circ f)^n (z)$$ for some $$z \in A \backslash g(B)$$ and some $$n \ge 0$$.
 * $$g(x) \in S$$.

Also note that $$h^{-1} (\{ x \})$$ is the disjoint union of $$S \cap f^{-1}(\{x\})$$ and $$(A \backslash S) \cap \{ g(x) \}$$ for any $$x \in B$$. By the equivalence above, exactly one of these two sets is nonempty, and it is a singleton since $$f$$ is injective. Hence, $$h^{-1} (\{ x \})$$ contains exactly one element for every $$x \in B$$. $$\square$$
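The construction of $$h$$ in the proof can be followed on a concrete infinite example. The sketch below is only an illustration under stated assumptions: it takes $$A = B = \mathbf{N}$$ and the hypothetical injections $$f(n) = g(n) = n + 1$$, for which membership in $$S$$ can be decided by tracing a point backwards through $$g$$ and $$f$$ alternately.

```python
def f(a):
    return a + 1  # assumed injection A -> B, not surjective (misses 0)

def g(b):
    return b + 1  # assumed injection B -> A, not surjective (misses 0)

def f_inv(b):
    """Inverse of f on its image; None when b is not in f(A)."""
    return b - 1 if b >= 1 else None

def g_inv(a):
    """Inverse of g on its image; None when a is not in g(B)."""
    return a - 1 if a >= 1 else None

def in_S(x):
    """Decide whether x = (g o f)^n (z) for some z in A minus g(B) and n >= 0."""
    while True:
        b = g_inv(x)
        if b is None:
            return True   # the chain ends in A minus g(B)
        a = f_inv(b)
        if a is None:
            return False  # the chain ends in B minus f(A)
        x = a             # step back once through g o f

def h(x):
    """The map from the proof: f on S, the inverse of g elsewhere."""
    return f(x) if in_S(x) else g_inv(x)

print([h(x) for x in range(10)])
```

For these particular injections $$S$$ is the set of even numbers, so $$h$$ swaps each even number with its successor. The backward trace terminates here because every step decreases the value; for general injections a chain may be infinite, and the loop is not a general decision procedure.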

1 Corollary A countable union of at most countable sets is at most countable.

1 Theorem (Cantor, 1891) ''Let $$A$$ be a set. Then $$|A| < |\mathcal{P}(A)|$$. Here, $$\mathcal{P}(A) = \{ S | S \subset A \}$$, the power set of $$A$$.''

Proof : Define $$i: A \to \mathcal{P}(A)$$ by $$i(x) = \{ x \}$$. Then $$i(x) = i(y)$$ means $$\{x\} = \{y\}$$, which implies $$x = y$$. This is to say, $$i$$ is injective. Next, to show that no function $$f: A \to \mathcal{P}(A)$$ is surjective, let $$s = \{ x \in A | x \not\in f(x) \}$$. Then $$s \in \mathcal{P}(A)$$, but $$s$$ is not in the image of $$f$$. In fact, suppose $$s = f(x_0)$$ for some $$x_0 \in A$$. If $$x_0 \in f(x_0)$$, then $$x_0 \in s$$, which means, by definition, $$x_0 \not\in f(x_0)$$. On the other hand, if $$x_0 \not\in f(x_0)$$, then $$x_0 \in s = f(x_0)$$ by definition and so $$x_0 \in f(x_0)$$. We conclude that $$s = f(x_0)$$ only leads to a contradiction. $$\square$$
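The diagonal set $$s$$ from the proof can be computed for any explicitly given $$f$$ on a finite set. This is a small sketch, assuming $$f$$ is a dict from elements of $$A$$ to `frozenset` subsets of $$A$$; `diagonal_set` and the sample data are hypothetical.

```python
def diagonal_set(A, f):
    """Return s = { x in A | x not in f(x) }."""
    return frozenset(x for x in A if x not in f[x])

A = {0, 1, 2}
f = {
    0: frozenset({0, 1}),
    1: frozenset(),
    2: frozenset({0, 2}),
}
s = diagonal_set(A, f)

print(s)                    # only 1 fails to belong to its own image
print(s in f.values())      # s is missed by f
```

Whatever values are assigned to `f`, the set `s` differs from `f[x]` at the element `x` itself, which is why it can never appear among the values.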

In particular, the set of all functions from $$\mathbf{N}$$ to $$\{ 0, 1 \}$$ is uncountable.

Note that by definition an infinite set is a set that is not finite. This definition is negative. Following Dedekind, we now give a positive definition of infinite sets.

1 Theorem A set $$A$$ is infinite if and only if $$A$$ contains a proper subset $$B$$ such that $$|A| = |B|$$.

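Proof (using Axiom of Choice): If $$A$$ is finite, then clearly $$|B| < |A|$$ when $$B$$ is a proper subset of $$A$$. Conversely, suppose $$A$$ is infinite. Using a choice function on the nonempty subsets of $$A$$, pick distinct elements $$a_0, a_1, a_2, ...$$ of $$A$$ recursively; this is possible since $$A \backslash \{ a_0, ..., a_{n-1} \}$$ is never empty. Let $$B = A \backslash \{ a_0 \}$$ and define $$h: A \to B$$ by $$h(a_n) = a_{n+1}$$ and $$h(x) = x$$ for every other $$x$$. Then $$h$$ is a bijection. $$\square$$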

1 Corollary Every infinite set contains a countable subset.

If $$P$$ is a collection of sets, then we say that $$a \in P$$ is maximal in $$P$$ if $$a \subset b \in P$$ implies $$a = b$$.

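1 Theorem The following are equivalent.
 * Axiom of Choice.
 * Zorn's Lemma: If $$P$$ is a nonempty collection of sets, and if for every $$Q \subset P$$ that is totally ordered by $$\subset$$ there is an element $$a \in P$$ such that $$b \subset a$$ for all $$b \in Q$$, then $$P$$ has a maximal element.

Proof: To show the converse (that Zorn's Lemma implies the Axiom of Choice), let $$S$$ be a collection of nonempty sets and $$P$$ be the family of all functions $$f$$ such that $$f$$ is a choice function on some $$Z \subset S$$. The union of a totally ordered subfamily of $$P$$ is again a member of $$P$$, so Zorn's Lemma applied to $$P$$ gives a maximal element $$f$$. If $$f$$ were not defined at some $$s \in S$$, we could enlarge it by a pair $$(s, x)$$ with $$x \in s$$, contradicting maximality. Hence, $$f$$ is a choice function on $$S$$.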

We say a nonempty subset $$f$$ of the power set of $$X$$ is a filter (over $$X$$) if
 * (i) $$\varnothing \not\in f$$
 * (ii) If $$A$$ and $$B$$ are in $$f$$, then $$A \cap B \in f$$.
 * (iii) If $$A \in f$$ and $$A \subset B \subset X$$, then $$B \in f$$.

$$f$$ is called an ultrafilter if, in addition to the above,
 * (iv) If $$A \cup B \in f$$, then either $$A \in f$$ or $$B \in f$$.
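On a small finite set $$X$$, conditions (i) to (iv) can be checked by brute force. The following is a minimal sketch, assuming subsets of $$X$$ are represented as `frozenset` values; the helper names, including `principal_filter` (all supersets of a fixed set $$M$$), are hypothetical.

```python
from itertools import combinations

def power_set(X):
    """All subsets of X as frozensets."""
    xs = list(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(F, X):
    """Check nonemptiness and conditions (i)-(iii) for a collection F of subsets of X."""
    F = set(F)
    if not F or frozenset() in F:
        return False
    closed_under_meets = all((a & b) in F for a in F for b in F)
    closed_upwards = all(b in F for a in F for b in power_set(X) if a <= b)
    return closed_under_meets and closed_upwards

def is_ultrafilter(F, X):
    """Check that F is a filter satisfying condition (iv)."""
    F = set(F)
    P = power_set(X)
    splits = all(a in F or b in F for a in P for b in P if (a | b) in F)
    return is_filter(F, X) and splits

def principal_filter(M, X):
    """The collection of all subsets of X that contain M."""
    M = frozenset(M)
    return {s for s in power_set(X) if M <= s}

X = {1, 2, 3}
print(is_ultrafilter(principal_filter({1}, X), X))     # generated by a single point
print(is_ultrafilter(principal_filter({1, 2}, X), X))  # fails (iv) for {1} and {2}
```

The second collection is a filter but not an ultrafilter: $$\{1\} \cup \{2\}$$ belongs to it while neither $$\{1\}$$ nor $$\{2\}$$ does.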

1 Proposition ''Let $$f$$ be a filter over $$X$$. If there is a finite set $$S \in f$$, then $$f$$ is principal; that is, $$f = \{ A \subset X | M \subset A \}$$ for some $$M \subset X$$.''

1 Theorem ''Let $$\mathcal{F}$$ be a nonempty collection. Then the following are equivalent:''
 * (i) $$\mathcal{F}$$ is an ultrafilter.
 * (ii) $$\mathcal{F}$$ has the finite intersection property, and if $$\mathcal{F} \subset \mathcal{G}$$ and $$\mathcal{G}$$ has the finite intersection property, then $$\mathcal{F} = \mathcal{G}$$. (i.e., $$\mathcal{F}$$ is maximal.)
 * (iii) $$\mathcal{F}$$ is a filter and if $$A \cup B \in \mathcal{F}$$, then either $$A \in \mathcal{F}$$ or $$B \in \mathcal{F}$$.

Proof: Suppose (i), and let $$\mathcal{F} \subset \mathcal{G}$$ where $$\mathcal{G}$$ has the finite intersection property. If $$A \in \mathcal{G}$$, then since $$\mathcal{F}$$ is an ultrafilter, $$A$$ or $$A^c$$ is in $$\mathcal{F}$$. If $$A^c \in \mathcal{F}$$, then $$A$$ and $$A^c$$ are both in $$\mathcal{G}$$ and $$A \cap A^c = \varnothing$$, contradicting that $$\mathcal{G}$$ has the finite intersection property. Hence, $$A \in \mathcal{F}$$. Thus, (i) $$\Rightarrow$$ (ii).

To show (ii) implies (iii), let $$\mathcal{G}$$ consist of sets $$A$$ such that $$A_1 \cap A_2 \cap ... \cap A_n \subset A$$ for some nonempty finite sequence $$A_1, A_2, ..., A_n \in \mathcal{F}$$. Claim: $$\mathcal{G}$$ is a filter. Indeed, (1) $$\varnothing \not\in \mathcal{G}$$ since $$\mathcal{F}$$ has the finite intersection property. (2) For $$A, B \in \mathcal{G}$$, each of $$A$$ and $$B$$ contains a finite intersection of members of $$\mathcal{F}$$, so $$A \cap B$$ contains the intersection of the two intersections; thus, $$A \cap B \in \mathcal{G}$$. (3) For $$A \in \mathcal{G}$$ and $$A \subset B$$, since $$A$$ contains the finite intersection so does $$B$$; hence, $$B \in \mathcal{G}$$. Since $$\mathcal{F} \subset \mathcal{G}$$ and $$\mathcal{G}$$, being a filter, has the finite intersection property, using (ii) we have: $$\mathcal{F} = \mathcal{G}$$. In particular, $$\mathcal{F}$$ is a filter.

Now suppose $$A \cup B \in \mathcal{F}$$. Let $$\mathcal{F}_A = \mathcal{F} \cup \{A\}$$ and $$\mathcal{F}_B = \mathcal{F} \cup \{B\}$$ and claim: either $$\mathcal{F}_A$$ or $$\mathcal{F}_B$$ has the finite intersection property. To get a contradiction, suppose not; that is, since $$\mathcal{F}$$ is closed under finite intersections, we can find $$C$$ and $$D$$ in $$\mathcal{F}$$ such that $$C \cap A = \varnothing = D \cap B$$. But this then means: by the distributive law $$(C \cap D) \cap (A \cup B) = C \cap ( (D \cap A) \cup (D \cap B )) = C \cap D \cap A = \varnothing$$, a contradiction to the axioms of filters since both $$C \cap D$$ and $$A \cup B$$ are in $$\mathcal{F}$$. Since the claim holds, appealing to (ii) again we get either $$\mathcal{F} = \mathcal{F} \cup \{A\}$$ or $$\mathcal{F} = \mathcal{F} \cup \{B\}$$; that is, $$A \in \mathcal{F}$$ or $$B \in \mathcal{F}$$. We conclude that (ii) implies (iii).

Finally, (iii) implies (i) since for any $$A$$, $$A \cup A^c = \varnothing^c \in \mathcal{F}$$ and by (iii) $$A$$ or $$A^c$$ is in $$\mathcal{F}$$. $$\square$$

With regard to (ii) in the theorem, the real question is whether such an ultrafilter indeed exists. We shall prove that it does, using the Axiom of Choice in the form of Zorn's Lemma.

1 Lemma (Ultrafilter lemma) Every filter $$s$$ is contained in an ultrafilter.

Proof : Let $$\Gamma$$ be the set of all filters that contain $$s$$. Let $$\gamma \subset \Gamma$$ be a nonempty linearly ordered subcollection. We claim that the union of $$\gamma$$, which we call $$u$$, is in $$\Gamma$$. Trivially, $$s \subset u$$. Next, since $$u$$ is the union of filters, it doesn't contain the empty set. If $$A, B \in u$$, then $$A \in g_1$$ and $$B \in g_2$$ for some $$g_1, g_2 \in \gamma$$. By linearity, either $$g_1 \subset g_2$$ or $$g_2 \subset g_1$$. Either way, we have $$A \cap B \in u$$. Finally, if $$A \in u$$ and $$A \subset B \subset X$$, then $$A \in g$$ for some $$g \in \gamma$$ and so $$B \in g \subset u$$. By Zorn's lemma, $$\Gamma$$ contains a maximal element. This maximal filter is an ultrafilter by the preceding theorem, since every collection with the finite intersection property is contained in a filter. $$\square$$