Learning is About Conviction

For a long time, I was never able to sing in the G4-A4 range consistently and effortlessly. Part of the reason was that I really didn’t try. Recently the choir has been working on a few pieces that require me to go up into that range in chest voice.

I finally figured it out when I decided to “go low to go high” (more technically, relax the jaw and drop the larynx to get a more relaxed sound). To put it into simpler terms: I convinced myself, fully believed that G4 was not actually that high.

And then I got it.

Beyond mantras

None of the things I said like “drop the larynx” were new to me. But to me they were just words. It wasn’t until I started at a note that actually was easy for me (a C4) and slid up to around a G4 that they became more than words. The whole time I convinced myself “this note isn’t very high.” Only once I actually, fully believed it did I hit the G4.

I think math is much the same. You can repeat mantras like “category theory is about universal properties”, or “Yoneda’s lemma says an object is determined by the morphisms into/out of it” — and you should know them — but it is not enough. You really have to believe them, and when you do, they almost become second nature.

How do you get this conviction?

For singing, it took a willingness to believe and a couple of successful attempts before I got G4. A♭4 came a couple of days later. And I swear I’ll have A4 by the end of this week.

If you’ve done a physical activity like rock climbing, you know this all too well. You can be shown the “beta” (the intended route), but it takes truly believing that you can and ought to do a certain move in a certain way to actually execute. Sure, if it is a sufficiently easy route, having this conviction is almost trivial. You and I both believe we can climb a ladder, whatever you consider a “ladder” to be. But often the last barrier to climb a route is the barrier of belief, when our physical strength is enough and we know (but don’t quite believe in) the important moves.

But maybe this can all be chalked up to muscle memory. Your body can learn quickly and deeply, but math doesn’t really exercise your muscles. How can you build this conviction then?

I think the answer is through pain. We all despise pain, even mental pain. Consider how agonizing certain constructions would be if we did not have the right abstractions to perform them.

I will present a few examples, and for each of them, I want you to think about how annoying it would be to prove these statements without the correct abstractions. Sorry, they are all going to be from algebraic geometry: I think it is the best example of a field that seems to have a lot of nonsense abstractions, only for the abstractions to prove obviously useful once you hit a certain pain point.

Example 1: Sheaf on a Base

For example, I can tell you that a sheaf can be defined on just a base. You can repeat the mantra after me: “it’s because you can describe regular functions locally via stalks at every point”. But you’ll believe this is just some abstraction I made up until I tell you what an affine scheme is.

At that point you will hopefully understand how horrible the definition would be if we couldn’t define a sheaf on a base. Then you see: oh, the expected behavior of the sheaf $\mathcal{O}_{\text{Spec}\, R}$ is horrible to explicitly describe on arbitrary opens, but if we just restrict to distinguished affine opens, our lives are good again.
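To make the pain point concrete, here is a sketch (in notation I am introducing here: $D(f) \subseteq \text{Spec}\, R$ is the distinguished open where $f$ does not vanish) of all the data we actually have to specify:

$$\mathcal{O}_{\text{Spec}\, R}(D(f)) = R_f, \qquad \text{res}^{D(f)}_{D(g)} : R_f \to R_g \ \text{ whenever } D(g) \subseteq D(f),$$

where the restriction map is just further localization (if $D(g) \subseteq D(f)$, then some power of $g$ is a multiple of $f$, so $f$ is already invertible in $R_g$). Every open set is a union of distinguished opens, so the sheaf-on-a-base machinery extends this to all of $\text{Spec}\, R$ for us, and we never have to write down $\mathcal{O}_{\text{Spec}\, R}(U)$ for an arbitrary open $U$ by hand.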

Example 2: Kernels are Arrows

Another great example is the idea that a kernel is not actually an object. For example, you may think of the kernel of an $R$-module map $\phi : M \to N$ as $\ker \phi$, the submodule of $M$ consisting of the elements mapped to $0$. But I think of it as the inclusion arrow $\ker \phi \hookrightarrow M$ satisfying some universal property.
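Concretely, here is the universal property I have in mind, written out as a sketch (with $T$ an arbitrary test $R$-module and $\iota : \ker\phi \hookrightarrow M$ the inclusion): $\phi \circ \iota = 0$, and

$$\text{for every } \psi : T \to M \text{ with } \phi \circ \psi = 0, \text{ there is a unique } \tilde{\psi} : T \to \ker\phi \text{ such that } \iota \circ \tilde{\psi} = \psi.$$

This pins down the arrow $\iota$ up to unique isomorphism without ever mentioning elements of $M$.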

You may even choose to take it at face value when I tell you this. But if you are to actually believe me, I think it is best to convince you that right adjoints are left-exact. Let me actually define these terms for you (besides an adjunction). A sequence $A \xrightarrow{f} B \xrightarrow{g} C$ in, say, the category of $R$-modules (but abelian groups, vector spaces, etc. work; anything with “an abelian group structure”, more formally an abelian category, will suffice) is exact if $\ker g = \operatorname{im} f$. A longer sequence such as $0 \to A \to B \to C$ is exact if the kernel of each map is equal to the image of the map preceding it.
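For a concrete (short) exact sequence, take abelian groups:

$$0 \to \mathbb{Z} \xrightarrow{\times 2} \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0.$$

Multiplication by $2$ is injective, its image (the even integers) is exactly the kernel of the quotient map, and the quotient map is surjective.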

A functor $F$ is left-exact if it preserves exact sequences $0 \to A \to B \to C$, i.e. applying $F$ yields an exact sequence $0 \to F(A) \to F(B) \to F(C)$. I will now tell you

  1. right adjoints preserve limits,
  2. kernels are limits,
  3. and crucially, $0 \to A \to B \to C$ is exact if and only if $A \to B$ is a kernel of $B \to C$.

Example 3: Why do we care about exact sequences and functors?

It is the combination of these three facts that immediately implies right adjoints are left-exact. In particular, fact 3 is the important one, and the one less commonly stated in the literature. (I told you considering kernels as arrows would be useful!)
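Spelled out (a sketch, writing $G$ for a right adjoint), the argument is:

$$
\begin{aligned}
0 \to A \to B \to C \text{ exact}
&\iff A \to B \text{ is a kernel of } B \to C && \text{(fact 3)}\\
&\implies G(A) \to G(B) \text{ is a kernel of } G(B) \to G(C) && \text{(facts 1 and 2)}\\
&\iff 0 \to G(A) \to G(B) \to G(C) \text{ exact.} && \text{(fact 3 again)}
\end{aligned}
$$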

In particular, it is very useful to know localization is an exact functor in algebraic geometry. It is used to prove the “qcqs” lemma, which asserts that given a “nice” (i.e. quasicompact and quasiseparated) scheme $X$, a global function $f$ on $X$, and a “nice” (i.e. quasicoherent) $\mathcal{O}_X$-module $\mathcal{F}$, the modules $\mathcal{F}(X)_f$ and $\mathcal{F}(X_f)$ are isomorphic.

The idea is that on affine opens $\text{Spec}\, R \subseteq X$, we have that $\mathcal{F}(\text{Spec}\, R_f)$ is isomorphic to $\mathcal{F}(\text{Spec}\, R)_f$ for any $f \in R$. Somehow we take the exact sequence characterizing a sheaf and then localize at $f$. And we know this localization preserves the exact sequence precisely because of the discussion before.

This theorem’s proof is another reason we care about exact sequences and expressing the sheaf condition as an exact sequence: the usual way we build an isomorphism between schemes is by taking an affine open covering of each and building isomorphisms between each affine open patch. But here $\mathcal{F}(X)_f$ does not quite come from any sheaf. But if we just want an isomorphism on global sections, an isomorphism of exact sequences is enough to induce that.
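Here is a sketch of the two exact sequences being compared, assuming a finite affine cover $X = \bigcup_i U_i$ (this is where quasicompactness enters; quasiseparatedness is what lets you treat the intersections $U_i \cap U_j$ the same way):

$$
\begin{aligned}
0 \to \mathcal{F}(X)_f &\to \Big(\textstyle\prod_i \mathcal{F}(U_i)\Big)_f \to \Big(\textstyle\prod_{i,j} \mathcal{F}(U_i \cap U_j)\Big)_f && \text{(sheaf sequence, localized at } f\text{)}\\
0 \to \mathcal{F}(X_f) &\to \textstyle\prod_i \mathcal{F}((U_i)_f) \to \textstyle\prod_{i,j} \mathcal{F}((U_i \cap U_j)_f) && \text{(sheaf sequence for the induced cover of } X_f\text{).}
\end{aligned}
$$

Because the cover is finite, localizing commutes with these products, and on affines $\mathcal{F}(U_i)_f \cong \mathcal{F}((U_i)_f)$ by the affine case above; matching the two sequences term by term past the first entry, exactness forces the kernels $\mathcal{F}(X)_f$ and $\mathcal{F}(X_f)$ to agree.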

With all of that said, how do we prove localization is exact, i.e. preserves exact sequences $0 \to A \to B \to C \to 0$? Actually, we already know from the localization-forgetful adjunction that localization is right-exact, as it is a left adjoint. So we only need to check exactness of $0 \to S^{-1}A \to S^{-1}B$, i.e. injectivity of $S^{-1}A \to S^{-1}B$, which is immediately obvious either through the explicit construction of localization or by applying the universal property.
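For the record, here is the explicit check of that injectivity (a sketch using the usual description of elements of a localization as fractions): suppose $\phi : A \to B$ is injective and $a/s \in S^{-1}A$ maps to $\phi(a)/s = 0$ in $S^{-1}B$. Then there is some $t \in S$ with

$$t\,\phi(a) = 0 \ \text{in } B \implies \phi(ta) = 0 \implies ta = 0 \implies \frac{a}{s} = \frac{ta}{ts} = 0 \ \text{in } S^{-1}A.$$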

Example 4: Yoneda’s Lemma and the Functor of Points

The whole idea of category theory is to consider objects through the vantage point of some universal property so we don’t have to make arbitrary choices and suffer from their consequences.

But the same problem can come into play when constructing these categorical objects. As an example, what is the fibered product of schemes? Constructing it explicitly is kind of a pain. The whole point of category theory is to construct as many objects as possible in a universal/choice-free manner, and only then check the desired properties arise out of the universal description.

So here is what we do. Given scheme morphisms $X \to Z$ and $Y \to Z$, we take the fibered product of $\text{Hom}(-, X)$ and $\text{Hom}(-, Y)$ over $\text{Hom}(-, Z)$. (We know the fibered product always exists in the category of functors from $\text{Scheme}^{\text{op}}$ to $\text{Set}$; this is part of the work we do with representables and the functor category.) Now we just need to check the fibered product, a universally constructed object that we can actually work with, is representable.
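Concretely, fibered products of functors are computed pointwise in $\text{Set}$, so for any test scheme $T$ we have (a sketch in the notation above):

$$\big(\text{Hom}(-, X) \times_{\text{Hom}(-, Z)} \text{Hom}(-, Y)\big)(T) = \{\, (f, g) : f \in \text{Hom}(T, X),\ g \in \text{Hom}(T, Y),\ \text{the two composites } T \to Z \text{ agree} \,\}.$$

All the real work, gluing affine patches, goes into showing this functor is representable; once it is, the representing scheme is what we call $X \times_Z Y$.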

Yes, the ideas are largely the same as in the direct construction. But crucially, you do not have to perform a somewhat cumbersome check that your object does not implode on itself as you define it.

Metaexample: Category theory is useful

All of this put together should convince you that category theory is useful. I wouldn’t have even been able to formulate what a tensor product is without category theory. Sure, understanding a tensor product as some quotient space might require less intellectual work in the short term. (Heck, the even lazier option is to understand tensors as an array of numbers. But you already know this doesn’t end well for matrices; why should the higher-dimensional analogue be any different?) But could you imagine trying to prove the tensor product commutes with the direct sum with these definitions? And how would we ever be able to see these kinds of facts or frameworks are useful if we can’t even talk about them properly?
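For what it is worth, here is the sort of two-line argument the universal-property definition buys you (a sketch over a commutative ring $R$, using the tensor-Hom adjunction): for every $R$-module $P$,

$$\text{Hom}\Big(M \otimes \bigoplus_i N_i,\, P\Big) \cong \text{Hom}\Big(\bigoplus_i N_i,\, \text{Hom}(M, P)\Big) \cong \prod_i \text{Hom}\big(N_i, \text{Hom}(M, P)\big) \cong \prod_i \text{Hom}(M \otimes N_i,\, P) \cong \text{Hom}\Big(\bigoplus_i (M \otimes N_i),\, P\Big),$$

naturally in $P$, so by Yoneda $M \otimes \bigoplus_i N_i \cong \bigoplus_i (M \otimes N_i)$. No element-chasing through quotient constructions required.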