Sets are not the end all be all

Disclaimer: I am by no means an expert on foundations, so I cannot guarantee that mistakes, pedagogical/philosophical oversights, etc are not present.

Mathematicians study collections of objects. (Here when we say “collection” we mean it in a very informal sense.) For instance, we can make claims about every even integer, or every isosceles triangle.

Once upon a time, there was this naive conception of collections as sets, and in order to describe as many collections as possible using the language of naive set theory, a notion of unrestricted comprehension was adopted. (We will get into what that means shortly.)

However, this wasn’t good enough: with unrestricted comprehension we got Russell’s Paradox. This is probably old news for most readers of this blog, but we will quickly and informally cover Russell’s Paradox again.

Unrestricted comprehension says that given some predicate ϕ\phi, there exists a set SS whose members are precisely those that satisfy ϕ\phi. Because xxx \not \in x is a valid predicate, there exists a set SS consisting of sets that do not contain themselves. Yet the existence of this SS is self-contradictory:

The axiom of unrestricted comprehension was strong. Too strong, in fact, which is why it led to a contradiction. So rather than having a set of implicit, “obvious” axioms with which you could construct sets, mathematicians actually shored up the exact axioms. They comprise Zermelo–Fraenkel Set Theory with the Axiom of Choice, abbreviated to ZFC.

So how does our newly constructed set theory really work, exactly? In other words, what is ZFC? This is where the YouTube videos all stop, and where the meat really begins.

I am sure that at many points in your life, you have referred to the set of all natural numbers. But step back for a second. How do we even know such a set exists? In other words, what justifies us taking this collection of all natural numbers (which may or may not be a set) and decreeing it to be a set? It is ZFC that precisely allows us to do this.

Our set theory is constructive in nature. Here is how ZFC works: given some pre-existing sets, we can apply an axiom of ZFC to explicitly construct another set.1 For example, given two sets AA and BB, we may construct the set A,B{A, B}. Furthermore, ZFC gives us two sets to start with: the empty set, i.e. the set with no elements, and an infinite set. And we may only decree a collection as a set once we have shown it can be constructed via the axioms of ZFC.2

Furthermore, in ZFC, the only objects of study are sets. On the surface, this seems inadequate: how can we define the natural numbers, let alone functions, shapes, etc? But the answer is simple: we define the natural numbers as a sequence of sets. In particular, we define

(Note that this implies n+1=0,1,,nn + 1 = {0, 1, \ldots, n} as well.)

We also define functions as subsets of Cartesian Products. Say we have a function f:{0,1}{0,1}f \colon \{0, 1\} \to \{0, 1\} where f(x)=xf(x) = x. Then we may define ff to be the set {(0,0),(1,1)}\{(0, 0), (1, 1)\}. Note that there is exactly one ordered pair with 00 as the xx-coordinate, and likewise for 11. The yy-coordinate is the value associated with the function.

Of course, you might ask what an ordered pair is. And the answer, again, is a set: specifically, (x,y)={x,{x,y}}(x, y) = \{x, \{x, y\}\}. (It is left as an exercise to show that (x1,y1)=(x2,y2)(x_1, y_1) = (x_2, y_2) if and only if x1=x2x_1 = x_2 and y1=y2y_1 = y_2, in other words, that ordered pairs indeed uniquely identify the values of the two coordinates.)

You get the idea. We now can explicitly show that certain sets exist and study them. Furthermore, we can encode the rest of mathematics within the language of set theory.

But there is one major catch. Let’s go back to the reason mathematicians created ZFC in the first place: because naive set theory led to a contradiction (Russell’s Paradox). And all of our work with ZFC is largely pointless if ZFC turns out to also lead to a contradiction. So is ZFC consistent? That is, is ZFC free of contradictions?

I have two pieces of bad news for you. The first is that we don’t know whether ZFC is consistent. And the second much worse piece of news is that if ZFC is consistent, we will never know. This is because of Godel’s Second Incompleteness Theorem: because ZFC is sufficiently strong,

Therefore, the only way we will ever know whether ZFC is consistent is by finding an inconsistency. And this is true of any logical system that is strong enough to perform arithmetic. So we will never truly know whether our model of set theory is self-consistent, regardless of what model we use.

But ZFC seems reasonable enough and no one has found a contradiction yet. Having done the best we can, we move on.3

Classes

I can prove to you that any singleton set SS satisfies the (trivial) statement “xSySx=yx \in S \land y \in S \implies x = y”. Similarly, I can make claims about every group, every homeomorphism, and so on. So the collection of, say, singleton sets is worth making statements about.

But the collection of all singleton sets is not a set! For if it was, then we may take {{a},{b},}\bigcup \{\{a\}, \{b\}, \ldots\} and get {a,b,}\{a, b, \ldots\}, i.e. the set of all sets, which leads to a contradiction (as that set would then contain itself).

It is possible to just be satisfied with statements like “for every singleton set…” and move on. But it is quite natural to ask what the collection of singleton sets really is, and whether it has anything in common with other non-set collections like the collection of groups. To that end, we define a notion of a class.

Very briefly, a class is just a collection of sets that can be described. To really understand what a class is, we return to unrestricted comprehension.

We cannot define {SS is a singleton set}\{S \mid S \text{ is a singleton set}\} as a set. But we may define it as a class. And this is how classes are defined in general: it is just the collection of sets satisfying a certain predicate ϕ\phi. So saying that set SS belongs to class CC really just means that SS satisfies ϕ\phi.

A little more detail about ZFC: because unrestricted comprehension leads to Russell’s paradox, we instead have a notion of restricted comprehension. That is, we may not say {xx satisfies ϕ}\{x \mid x \text{ satisfies } \phi\}, but given some “parent set” SS, we may say {xSx satisfies ϕ}\{x \in S \mid x \text{ satisfies } \phi\}, which means that we can construct subsets of an already existing set whose members satisfy some predicate. This is why it is a big deal to talk about collections formed by unrestricted comprehension, because they cannot be sets.

You might be asking why classes don’t run into Russell’s paradox. Why isn’t there a class of all classes that do not contain themselves? Because by definition, classes may only contain sets as members. So it does not make sense to talk about classes of classes, and the hope is we sidestep such paradoxes.

Probably. Recall by Godel’s Second Incompleteness Theorem that we have no way of actually knowing whether our universe of sets and classes leads to a paradox. And again, there are multiple ways of creating new axioms to define classes, each of which may be subtly different. The same way set theory does not necessarily have to be formalized via ZFC, class theory does not have to be formalized via any particular set of axioms.

There does exist a conservative extension of ZFC with classes known as NBG. By “conservative” I mean that any proof in NBG that solely refers to sets is also valid in ZFC. And there is a very good StackExchange post concerning the relative advantages of ZFC and NBG.

Finally, I must mention that not every collection of objects may be modeled as a class. Why? As a dumb example, the collection of all classes is surely something worth studying. But it is not a class, because every member of a class must be a set. The existence of proper classes means the collection of classes cannot be a class itself.

More generally, classes are “describable” by a predicate, but not all collections are describable.

Why bother?

Quick note: when I say “proper class” I mean a class that cannot be modeled as a set.

In most fields of math, as far as I am aware, there is no reason to prefer discussing something like “the class of all continuous functions” as opposed to just saying “if a function is continuous…” So as one would expect, these fields implicitly use ZFC as their set theory and make no mention of classes.

But there is one field that explicitly does mention such collections: category theory. For example, the collection of objects in the category of sets is “the class of all sets”. And by making a distinction between sets and classes, we may describe a category as “small” if the collection of its objects and the collection of its arrows are both sets, and “large” if one of these collections is a proper class. Then we may discuss, for instance, the category of all small categories, which avoids the issue of self-containment (as this category would be large).


  1. The biggest exception is the Axiom of Choice, which infamously only gives the existence of a choice function rather than explicitly creating one.↩︎

  2. There is one important descriptive rather than constructive constraint: the Axiom of Foundation, which states that in every non-empty set SS there exists some element xSx \in S such that xSx \cap S is the empty set. From this we may conclude that no set contains itself. The rules which we use to construct sets show that certain classes of sets exist; it can be much harder to show that certain classes of sets do not exist. And it can actually be shown that the Axiom of Foundation is independent of the rest of ZFC. As many set-theoretic constructions rely on the Axiom of Foundation, it is convenient to have it. However, it is not really necessary for some definition of “necessary”.

    There is also another descriptive constraint: the Axiom of Extensionality, which states that two sets are equivalent if and only if they have the same elements. It is much more obvious why this is an axiom.↩︎

  3. Of course, given that mathematics has such diversity of thought, this is not the end of the story for everyone. Alternative foundations have been proposed. Some reject the Law of Excluded Middle, which informally states that a proposition is either true or false (formally, x¬x=x \lor \lnot x = \top). Others wish to use a type-theoretic model, an idea closely tied to computer science. Still others build their foundations from category theory, rather than set theory.

    Foundations are also not dogmatic. You do not need to “believe in” ZFC and study it to the exclusion of all else. After all, when we study ZFC set theory, we are not really proving that “no sets contain themselves”, we are proving that “under the axioms of ZFC, no sets contain themselves”. Thus it is perfectly fine to make a statement like “in category theory, there exists a category of all categories”, because it makes no claims about ZFC. But of course, even in category theory we must put a large asterisk next to the “category of all categories”, because depending on the foundations your category theory rests on, self-containment is still an issue that needs to be addressed.↩︎