Originally published at 狗和留美者不得入内. You can comment here or there.

I’ve pretty much always sucked at inequalities. Hopefully, now that I am older and more mature, I can develop some deeper intuition for them. I feel like the biggest difference between the me after age 25 and the me before then is that the older version of me is much better at thinking deeply and at learning and understanding pretty much everything systematically, which is of course closely connected with self-discipline and self-control. The younger me, when faced with something that did not come extremely naturally, would often be totally at a loss as to where to start. The older me is also much better at problem decomposition as well as at working on problems of a more ambiguous nature. I am leagues better at learning from experience. In contrast, in undergrad, not only was I often unable to solve problems, I could not even understand their solutions deeply enough to pretty much never forget them. The older me is also much more creative and independent-thinking. As for this “creativity” matter, I will acknowledge, however, that I have never composed my own math or programming contest problems despite having solved a good number. I may well succeed if I try now, though I certainly wouldn’t have succeeded before age 25, when I struggled too much even to solve them.

As for inequalities, I was too dumb in high school to solve any actual olympiad problem in that category, though I was aware of AM-GM (the arithmetic mean-geometric mean inequality) and to a lesser extent Cauchy’s inequality (which comes up more in real math). Later, I learned of the rearrangement inequality; it is immediate to me now that its proof, which I’ve never done completely myself, involves a swap argument. There is also Jensen’s inequality, which is trivial in some sense, though I also feel like I’ve never really thought about it independently, until perhaps now.

**Definition** **1**: A function $f : \mathbb{R} \to \mathbb{R}$ is *convex* if for any $a < b$, for any $x \in [a, b]$, given the function $g$ defining the line joining $(a, f(a))$ and $(b, f(b))$, we have $f(x) \leq g(x)$. To picture this, one can draw a line between two points on the plane and then draw a curve joining the same two points underneath it with non-decreasing first derivative. If instead $f(x) \geq g(x)$, we say that $f$ is *concave*.

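**Theorem 2 (Jensen’s inequality)**: If $f$ is *convex*, then for any $x_1, x_2$ and any $\lambda \in [0, 1]$,

$$f(\lambda x_1 + (1 - \lambda) x_2) \leq \lambda f(x_1) + (1 - \lambda) f(x_2).$$

*Proof*: This follows directly from the definition of convexity in that $(\lambda x_1 + (1 - \lambda) x_2, \ \lambda f(x_1) + (1 - \lambda) f(x_2))$ represents an arbitrary point on the line segment joining the two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$.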

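**Theorem 3 (Jensen’s inequality for finite set)**: Suppose $f$ is *convex*. Take any $x_1, \ldots, x_n$ and any probability distribution on it with $p_1, \ldots, p_n$ as the corresponding probabilities. Then,

$$f\left(\sum_{i=1}^n p_i x_i\right) \leq \sum_{i=1}^n p_i f(x_i).$$

*Proof*: We prove this by induction on $n$, with the two-point Jensen’s inequality as the base case $n = 2$. If $p_n = 1$ there is nothing to prove, so assume $p_n < 1$. We note how

$$\sum_{i=1}^n p_i x_i = (1 - p_n) \sum_{i=1}^{n-1} \frac{p_i}{1 - p_n} x_i + p_n x_n$$

gives rise to a binary distribution. As our inductive hypothesis, we can assume that

$$f\left(\sum_{i=1}^{n-1} \frac{p_i}{1 - p_n} x_i\right) \leq \sum_{i=1}^{n-1} \frac{p_i}{1 - p_n} f(x_i).$$

This in combination with Jensen’s inequality on two variables completes our inductive proof.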
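As a quick numerical sanity check of the finite form of Jensen’s inequality (not a proof), here is a minimal Python sketch. The helper names `jensen_gap` and `random_distribution` are made up for this illustration, and the test functions $e^x$ and $x^2$ are just two convenient convex choices.

```python
import math
import random


def jensen_gap(f, xs, ps):
    """Return sum_i p_i f(x_i) - f(sum_i p_i x_i).

    For convex f and a probability vector ps, this gap is non-negative.
    """
    mean = sum(p * x for p, x in zip(ps, xs))
    return sum(p * f(x) for p, x in zip(ps, xs)) - f(mean)


def random_distribution(n, rng):
    """Draw n positive weights and normalize them so they sum to 1."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = []
for _ in range(1000):
    n = rng.randint(2, 8)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ps = random_distribution(n, rng)
    gaps.append(jensen_gap(math.exp, xs, ps))          # exp is convex
    gaps.append(jensen_gap(lambda x: x * x, xs, ps))   # x^2 is convex

# Allow a tiny negative tolerance for floating-point rounding.
print(min(gaps) >= -1e-12)
```

For $f(x) = x^2$ the gap is exactly the variance of the distribution, which is one way to remember which direction the inequality goes.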

**Theorem 4 (Jensen’s inequality for continuous probability distribution)**: Given a probability density function $p$ characterizing a random variable $X$ and a convex function $f$ on the reals,

$$f\left(\int_{-\infty}^{\infty} x \, p(x) \, dx\right) \leq \int_{-\infty}^{\infty} f(x) \, p(x) \, dx,$$

that is, $f(\mathbb{E}[X]) \leq \mathbb{E}[f(X)]$.

*Proof*: If we are using the Riemann integral, we simply apply Theorem 3 to partitions and take the limit. If we are using the Lebesgue integral, we can argue analogously with simple functions.

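**Theorem 5 (Young’s inequality)**: Given non-negative $a, b$ and $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$,

$$ab \leq \frac{a^p}{p} + \frac{b^q}{q}.$$

*Proof*: If $ab = 0$ the inequality is trivial, so assume $a, b > 0$. One who is familiar with Jensen’s inequality should, upon seeing the RHS, notice that it is a convex combination of $a^p$ and $b^q$ with weights $\frac{1}{p}$ and $\frac{1}{q}$, so that the inequality is a special case of Jensen’s. The logarithm is a concave function, in which case the inequality in Jensen’s is reversed:

$$\log\left(\frac{a^p}{p} + \frac{b^q}{q}\right) \geq \frac{1}{p} \log a^p + \frac{1}{q} \log b^q = \log a + \log b = \log(ab).$$

Monotonicity of the logarithm yields our desired result.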
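To see Young’s inequality numerically, including the equality case $a^p = b^q$, a small Python sketch follows. The function name `young_gap` and the sampling ranges are assumptions chosen only for illustration.

```python
import random


def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q = p/(p-1) is conjugate to p > 1.

    Young's inequality says this is non-negative for a, b >= 0.
    """
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b


rng = random.Random(1)  # fixed seed so the run is repeatable
worst = min(
    young_gap(rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0), rng.uniform(1.2, 5.0))
    for _ in range(10_000)
)

# The smallest sampled gap should be non-negative up to rounding error.
print(worst >= -1e-9)

# Equality holds when a^p == b^q, for example a = b = 2 with p = q = 2.
print(young_gap(2.0, 2.0, 2.0))
```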

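**Theorem 6 (Hölder’s inequality)**: Given positive $p, q$ such that $\frac{1}{p} + \frac{1}{q} = 1$ and $f \in L^p$, $g \in L^q$, we have

$$\int |fg| \leq \left(\int |f|^p\right)^{1/p} \left(\int |g|^q\right)^{1/q},$$

or equivalently,

$$\|fg\|_1 \leq \|f\|_p \|g\|_q.$$

*Proof*: If $\|f\|_p = 0$ or $\|g\|_q = 0$, then $fg = 0$ almost everywhere and both sides vanish. Otherwise, since the norm operator on $L^p$ for any $p$ is absolutely homogeneous, we can assume WLOG that $\|f\|_p = \|g\|_q = 1$. With monotonicity of the integration operator, we have by Young’s inequality,

$$\int |fg| \leq \int \left(\frac{|f|^p}{p} + \frac{|g|^q}{q}\right) = \frac{1}{p} + \frac{1}{q} = 1 = \|f\|_p \|g\|_q.$$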
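Hölder’s inequality can be checked numerically in the special case of finite sequences, that is, $L^p$ with respect to the counting measure on $\{1, \ldots, n\}$. This is only a sketch under that assumption; `p_norm` and `holder_sides` are hypothetical helper names.

```python
import random


def p_norm(values, p):
    """(sum_i |v_i|^p)^(1/p): the L^p norm under the counting measure."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def holder_sides(f, g, p):
    """Return (||fg||_1, ||f||_p * ||g||_q), with q conjugate to p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    return lhs, p_norm(f, p) * p_norm(g, q)


rng = random.Random(2)  # fixed seed so the run is repeatable
violations = 0
for _ in range(2000):
    n = rng.randint(1, 10)
    f = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    g = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    p = rng.uniform(1.2, 5.0)
    lhs, rhs = holder_sides(f, g, p)
    if lhs > rhs + 1e-9:  # small tolerance for floating-point rounding
        violations += 1

print(violations)
```

With $p = q = 2$ this reduces to the Cauchy-Schwarz inequality, and taking $f = g$ gives equality.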

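**Theorem 7 (Minkowski’s inequality)**: For $p \geq 1$ and $f, g \in L^p$,

$$\|f + g\|_p \leq \|f\|_p + \|g\|_p.$$

*Proof*: For $p = 1$ this is the pointwise triangle inequality integrated, so take $p > 1$ with conjugate exponent $q$, and assume $\|f + g\|_p > 0$ since otherwise there is nothing to prove. We have

$$\|f + g\|_p^p = \int |f + g|^p \leq \int |f| |f + g|^{p-1} + \int |g| |f + g|^{p-1} \leq \left(\|f\|_p + \|g\|_p\right) \left(\int |f + g|^{(p-1)q}\right)^{1/q},$$

where the last step is Hölder’s inequality applied to each term. This gives us

$$\|f + g\|_p^p \leq \left(\|f\|_p + \|g\|_p\right) \|f + g\|_p^{p/q},$$

wherein we used $(p - 1)q = p$. Dividing by $\|f + g\|_p^{p/q}$, which is finite because $|f + g|^p \leq 2^{p-1}(|f|^p + |g|^p)$, and using $p - \frac{p}{q} = 1$ completes our proof.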
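The same kind of numerical check works for Minkowski’s inequality, again restricted to finite sequences (counting measure) as an assumption. The names `p_norm` and `minkowski_sides` are hypothetical.

```python
import random


def p_norm(values, p):
    """(sum_i |v_i|^p)^(1/p): the L^p norm under the counting measure."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def minkowski_sides(f, g, p):
    """Return (||f + g||_p, ||f||_p + ||g||_p) for p >= 1."""
    lhs = p_norm([a + b for a, b in zip(f, g)], p)
    return lhs, p_norm(f, p) + p_norm(g, p)


rng = random.Random(3)  # fixed seed so the run is repeatable
violations = 0
for _ in range(2000):
    n = rng.randint(1, 10)
    f = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    g = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    p = rng.uniform(1.0, 5.0)
    lhs, rhs = minkowski_sides(f, g, p)
    if lhs > rhs + 1e-9:  # small tolerance for floating-point rounding
        violations += 1

print(violations)
```

Equality occurs when one sequence is a non-negative multiple of the other, for instance $f = g$.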

I’ll be honest and say that after trying for an hour and failing, I looked up the proof of Hölder’s inequality. From it, I learned of Young’s inequality. Due to never having internalized Jensen’s inequality, I couldn’t easily prove Young’s inequality myself, and even went astray along the futile AM-GM direction. However, after reading the proof of Young’s inequality and also that one can assume WLOG that $\|f\|_p = \|g\|_q = 1$, Hölder’s inequality became more or less trivial. I’m now aware that after first realizing to simplify by assuming unit norms WLOG, one then ought to realize that one is free to take any power of the norms on the RHS without altering its value. Using expressions that are powers of that unit norm as a medium of comparison then becomes the natural next step. With monotonicity of the integral on the partial order of non-negative functions in mind, one then finds the symmetric expression to which Jensen’s inequality can be applied.

Too bad, I was too dumb to figure this out myself. My lack of any non-trivial understanding of inequalities in the past, of course, also had much to do with it. On the plus side, I did within an hour figure out Minkowski’s inequality on my own, which may be the first non-trivial and important inequality I proved myself without ever having looked up the proof. I might have seen the proof before of course, but only in a very cursory and superficial way, such that I did not remember anything about it.