May 7th, 2021

On preorders, partial orders, directed sets, and Zorn’s lemma

Originally published at 狗和留美者不得入内. You can comment here or there.

I learned of nets and ultrafilters in 2017 but in 2021 I could not remember them in detail. I have never written about them before. In fact, I would say that I never genuinely appreciated the significance of partially ordered sets. It was in the process of doing gymnastics on subsets of [公式] for the construction of the Lebesgue measure that I developed some non-trivial intuition for posets. Sets are more difficult than naturals or integers or rationals or reals in that they are not totally ordered. The lack of total ordering results in more ambiguity in some sense, making it more difficult to develop intuition. The binary relation most lacking in ambiguity in this sense is of course the equivalence relations, which are defined by three properties, which are reflexive, symmetric, and transitive, which even the average person devoid of mathematical maturity has at least an intuitive idea of. It is similar for the total order, which is characterized by the reflexive, transitive, and connected properties.

I just read that a net is a preorder on a set [公式] with the additional property that for all [公式], there exists [公式] such that [公式] . Having never known before the precise definition or preorder, I shall know that there is the prefix “pre-” because it is the more general type of order-like binary relation, or equivalently, the one with the fewest constraints. Every order-like binary relation must be reflexive and transitive. It is common sense that in any order, every element is bounded above by itself, and moreover if [公式] is bounded above by [公式] , it is also bounded above any element by which [公式] is bounded above. The property not universally posessed by a preorder that is required for a partial order is anti-symmetry, which states that if [公式] , then [公式] , which of course holds for most orders that people typically work with. Imposing the equivalence relation corresponding to this on a preorder turns the equivalence classes into a well-defined partial order.

Another aspect of partial orders that counters naive intuition is that there are pairs of elements for which neither one is definitely greater than or equal to the other. The best example with which to illustrate this is of course is arguably set inclusion. The set of functions with the same totally ordered codomain such as the integers or the reals is another good one, in the case of which it is apparent that one can have two such functions [公式] such that [公式] for some [公式] and also [公式] for some [公式], which means that neither is an upper bound for the other everywhere. In this case if [公式] is the relation corresponding to the partial order, we have that [公式] . Somebody once noted to me that smartness or intelligence across individuals is more or less a partial order. I also noticed that one can alternatively describe it as a preorder. To prove this, after noting that the reflexivity and transitivity properties are more or less trivial, one notes how for any two people, you either have cases such as me and Jack Ma, wherein one is unambiguously strictly smarter than the other, or you have cases where the two are about equally smart (both have their own intellectual strengths relative to the other equally as much that one can’t reasonably regard one as smarter than the other in absolute terms), for which an example would be Tomonaga and Schwinger. If one regards intelligence as a preorder, then for any two elements [公式] , [公式] and/or [公式] . Alternatively, one can define an equivalence relation based on classes of intelligence, wherein everyone is assigned to a class, for [公式] in the same class, either [公式] or [公式] , with the latter occurring iff [公式] prior to imposing the equivalence relations of approximately the same leve of intelligence. In other words, when one defines as a partial order, there are cases wherein when comparing two distinct people, we give a verdict iff one is strictly smarter than the other, and we leave the comparison inconclusive in the other way round. One observes that “strictly smarter than” or “comparing person [公式] with person [公式] ” is a partial order, while “as smart as” is a preorder. Overall, the partial order intepretation of intelligence is better since the relation corresponds directly to a qualitatively stronger result. Of course, given that one can always derive one from the other deterministically, the two interpretations are formally equivalent.

In order that the partial order is irreflexive, we define the following.

Definition 1 A strict partial order is a binary relation [公式] on a set [公式] which satisfies the following properties.

  1. For all [公式] , [公式] is false (irreflexivity).
  2. For all [公式] , if [公式] and [公式] , then [公式] (transitivity).

Proposition 1 A strict partial order is assymmetric. Formally, if [公式] , then [公式] is necessarily false.

Proof: If by contradiction, [公式] and [公式] , then by transitivity [公式] , which violates the first property of a strict partial order. [公式]

Definition 2 A chain of an order defined on set [公式] is a subset [公式] such that the order restricted elements of [公式] is a total order.

Definition 3 A maximal chain of an order is a chain such that there does not exist any element outside the chain that, per the order, is greater than or equal to every element of the chain.

Definition 4 A maximal element of an order is an element [公式] such that [公式] iff [公式] .

Theorem 1 (Zorn’s lemma) A partial order on [公式] such that every one of its chains is bounded above by some element of [公式] has at least one maximal element.

Proof: By hypothesis, for every chain [公式], here indexed by [公式] , there exists [公式] such that [公式] for all [公式]. There exists such a [公式] iff [公式] is not a maximal chain, by the definition of maximal chain. If there were no maximal chains in [公式] , then for any chain [公式] , we can construct chain [公式], wherein [公式] and [公式] for all [公式] . Since we have assumed that there are no maximal chains in [公式] , there are no elements outside [公式] greater than or equal to all the elements in it. At the same time, there is no [公式] such that for all [公式] , [公式] for all [公式] . This violates our hypothesis, which means there exists a maximal chain in [公式] .

Take a maximal chain in [公式] . By our hypothesis and definition of maximal chain, there exists some [公式] in the chain that is an upper bound for all elements of the chain per this partial order. Since there no elements [公式] outside the chain such that [公式] , [公式] is a maximal element. This completes our proof. [公式]

Definition 5 A directed set is a set [公式] with a preorder binary relation on it such that for all [公式], there exists [公式] such that [公式].

Corollary 1 We call a directed set wherein every chain is bounded above by some element in the directed set a bounded directed set. We have that

  1. If a poset is a bounded directed set, there exists exactly one maximal element.
  2. For any pair [公式] in the set of maximal elements of a bounded directed set in which every chain is bounded above by some element in the directed set, [公式] and [公式].

Proof: For (1), by Zorn’s lemma, there exists at least one maximal element. If there were distinct maximal elements [公式] , then the property of directed set tells us there exists an [公式] such that [公式] . Thus, either [公式] or [公式] or both of them are not maximal elements.

Every directed set by definition is a preorder. We can via an aforementioned equivalence relation on preorders transform this preorder into a partial order on its equivalence classes. By (1), there exists exactly one maximal equivalence class, and the definition of our equivalence relation gives us the property stated in (2). [公式]

Zorn’s lemma is used to prove Tychonoff’s theorem, which states that the product of (possibly uncountably many) compact topological spaces is compact. It is said that if mathematicians were to vote on the most important theorem in general topology (or point set topology), they would either choose Tychonoff’s theorem or Urysohn’s lemma. Urysohn’s theorem was proven in the 1920s and Tychonoff’s theorem was proven in the 1930s, both by Russian/Soviet mathematicians. I had mentioned before that Russian mathematicians arguably played the biggest role in the founding of the field of topology, especially point set topology. The next are the French and then the Germans. The separation axioms of point set topology are named after the following mathematicians: Kolmogorov, Frechet, Hausdorff, Urysohn, and Tychonoff. I also know of Russian/Soviet mathematician Pavel Alexandrov who made monumental contributions to topology though I know very little about what he actually did. Interestingly, Alexandrov was advised by Dmitry Egorov and Nikolai Luzin. As for Egorov, I know and several years ago re-proved myself his theorem regarding uniform convergence on all but [公式] measure of a finite measure set. I know nothing about what Luzin did, though I know that he was the doctoral advisor of Kolmogorov and Alexandrov, both of whom testified against him in 1936 during the Great Purge.

On the definition of a topological space and convergence in it

Originally published at 狗和留美者不得入内. You can comment here or there.

Understanding the delta epsilon definition of limit is major milestone in the initial stage of development of non-trivial mathematical maturity. However, one who knows topology is well aware that limit and convergence can be defined in much more generality in the context of topological spaces. In the initial stage, limits were studied almost exclusively on the reals, as limits of a discrete sequence or of a real function wherein the input variable tends continuously to some value. The limit’s being equal to infinity was also separately defined.

There is also that sequences were not generalized to nets in the 1920s and ultrafilters in the 1930s after point-set topology had reached a certain stage of maturity. They are of course more abstract and in some sense seemingly elusive, but with a certain level of depth of thought, one should view them as more or less a natural development or generalization. Another major generalization was going from a concrete metrizable space to an abstract topological space for which convergence of a sequence or net is defined in terms of neighborhoods.

It is not uncommon for a student to initially struggle with topology, getting lost in the abstract formalism devoid of any meaningful intuition for the definitions. I certainly was one of them. So here, I will explain how one would naturally arrive at the definition of open and closed sets, neighborhoods, topological spaces, and convergence.

Intuitively, a neighborhood of a point is a set containing it and its most immediate surroundings. An open set is a set that is a neighborhood of all of its points. A sequence of points [公式] in a topological space converges to a point [公式] in it if and only if for every neighborhood of [公式] , there exists an [公式] such that for all [公式] , [公式] lies in that neighborhood. This is basically saying that no matter how small a set containing [公式] , as long as it contains it and its most immediate surroundings, the sequence past some point will be constrained to that set.

What are the requirements for a neighborhood of a point [公式] then, without reference to the definition of open set? Well, of course, it must contain the point [公式] itself. If it includes a neighborhood of [公式] , it should also be a neighborhood, which is also a no-brainer. The last two requirements are more non-trivial. They are that

  • The intersection of two neighborhoods of [公式] is also a neighborhood. This makes sense because intuitively the neighborhood of a point contains the most immediate vicinity of it. The intersection of two sets which satisfy this containment criteria naturally also satisfies it.
  • If [公式] is a neighborhood of [公式] , then there exists a neighborhood of [公式] , [公式] such that [公式] is a neighborhood of all the points of [公式] . Crudely speaking, this is saying that any set containing the most immediate vicinity of an arbitrary point [公式] should be regarded as containing the most immediate vicinity of every point in the most immediate vicinity of [公式] .

The usage of “most immediate vicinity” is imperfect or sloppy in the sense that in topological spaces generally thought of, such as [公式] , there is no strictly most immediate vicinity. Via the ball of radius [公式] for [公式] large, one gets a very immediate vicinity, but no matter how large [公式] is here, there always exists an even more immediate vicinity. Yet the intersection of all such balls gives the singleton set [公式] itself, which does not contain any immediate vicinity of [公式] and is thus not a neighborhood.

Naively, one might expect the intersection of infinitely many neighborhood to also be a neighborhood, especially when one thinks in terms of most immediate vicinity. Thus, we modify “most immediate vicinity” to “all arbitrarily immediate vicinities”. Then, it occurs that if we take a sequence of successively strictly more immediate vicinities such that for any set that counts as an immediate vicinity, there is a strictly even more immediate vicinity in the sequence, their intersection should not necessarily contain all arbitrarily immediate vicinities. One notes that mathematical induction, when applied to intersection of two finite sets, extends to intersection of a finite number of sets but not to infinity, and there is analogous is measure theory, with countable additivity, usually involving taking limits of sequences of non-negative numbers, much harder to prove than finite additivity.

As for definition of topological spaces via open sets, one who has studied topology should remember that a topological space on [公式] , as a family of subsets of [公式] , must contain both [公式] . The universal set [公式] of the space should be regarded as a neighborhood of all the points in it. Another way of viewing this is that if [公式] were not in the topological space, then some point in the space would not have a neighborhood, which would be absurd. As for [公式] , because there are no elements in it, it need not be a neighborhood of any point, and thus, nothing here violates the definition of an open set. This is like how you initialize your boolean variable to true when evaluating logical conjunction on a list of booleans. If there is no boolean to be evaluated, you should return true.

As for closure under unions, including uncountably many, this is obvious since for each point in the union, the union must contain one of the open sets passed as an argument to the union operator, which by definition of open set, is a neighborhood of it. For finite intersection, we simply have that the intersection is a neighborhood of all the points in the intersection.

After defining topological space and convergence in it, one can define the set of accumulation points of a subset [公式] of the set [公式] on which the topological space is defined to be the set of points sequences in [公式] can converge to. The set of accumulation points of [公式] is sometimes also called the closure of [公式] or [公式] . [公式] is a closed set iff [公式] .

Another fact of convergence in point-set topology that might be counterintuitive to a person who has learned convergence of sequences in the reals, where a sequence converges to either one point or none, too “dogmatically” is that in a topological space, one can have convergence to multiple points. Even on [公式], there is the trivial topology [公式] . Every point in this topological space has only one neighborhood [公式] or only one immediate vicinity, which is all of [公式] dimensional space, wherein one can go arbitrarily far away if one thinks in terms of the Euclidean metric. Here, in the coarsest topology possible, every sequence converges to every element of [公式] dimensional real space.

Two points [公式] are called topologically indistinguishable if a set is a neighborhood of [公式] iff it is a neighborhood of [公式] . One can easily verify that topologically indistinguishability is an equivalence relation. Moreover, if two points are topologically indistinguishable, a sequence converging to one of them necessarily also converges to the other. In the trivial topology, all points are mutually topologically indistinguishable.