I’ll start off by briefly describing my lifelong experience with special relativity. I learned about Einstein in grade school, and even read a biography of him. Most memorable was how it was said that when the public asked Einstein for an explanation of relativity, his reply was along the lines of:
It’s like how when you’re in physically extremely uncomfortable conditions, time seems to take forever to pass, while on the other hand, when you’re with a cute girl, time passes by so fast.
I read Stephen Hawking’s A Brief History of Time in my first year of high school. At that age, I could only feel awe about physics, which ranged from galaxies to subatomic particles. Those genius physicists mentioned in that book, from Newton to Einstein to Feynman to CN Yang/TD Lee, seemed essentially otherworldly to me. I began teaching myself calculus near the end of my first year of high school. Naturally, since calculus is intimately connected to physics, I became interested in physics as well, but predictably, physics was much harder for me than calculus, the computational and problem solving side of which was quite easy for me.
In my second year of high school, I took a physics course. It was certainly not well taught, and I didn’t really learn all that much and in fact felt quite frustrated with the elusiveness of the subject. Most of the other kids in the class were much dumber than me, especially at mathematics. A few of them would go, “physics!”, trying to come across as cool, though of course they were obviously much more devoid of rigorous and formal understanding than I was. We used a bullshit non-calculus based high school physics textbook that emphasized concrete real world examples, including in technology, without any serious presentation of the physical principles behind them. Special relativity was covered via the likes of trains and clocks and signals, with some derivations of time dilation and length contraction using only algebra. I easily confused myself in the process of trying to do the mental gymnastics, working with a form that is in some sense rather disorganized and unsystematic, not to mention full of sense-based distractions included to appeal more to high schoolers. Dissatisfied with the textbook, I actually downloaded and printed an English translation of Einstein’s 1905 paper on the school library computer; needless to say, the material went way over my head.
In university, I had to, in order to satisfy a requirement, take a stupid non-calculus based physics course in mechanics and electromagnetism with a bunch of idiots, covered by a shitty textbook with mostly pointless homework assignments. Special relativity was covered in much the same way as in high school. Again, aside from lacking in formality, it had too many distractions of a directly physical nature. I was stupid enough to mentally dizzy myself by thinking of how time would be measured in practice, taking into account the correction for the time it would take for the signal traveling at the speed of light to reach the observer. After some frustrations, I asked a graduate student who did his physics undergrad at a top school in China to explain it to me. I recall that though he emphasized transformations between frames given the spacetime invariant, I was afterwards unable to proceed with that in any meaningful way and eventually gave up. By then I had already learned linear algebra and some vector calculus and real analysis, and by all means, physics felt way harder for me than mathematics.
After university, I befriended a guy with absolutely stellar credentials who at that time knew way more than I did about pretty much everything and who considered electromagnetism and special relativity to be a speciality of his in physics. When I asked him about contracting rods and time dilation in special relativity, he responded that serious physics thinkers don’t really think that way, instead working with the Lorentz group. He also said that the Feynman Lectures are too redundant and informal for him. Despite the hint, I still never proceeded in any meaningful way. I did, however, sort of derive, or at least go through, the example of an $\mathbf{E}$ field appearing as a $\mathbf{B}$ field under a change of reference frame and vice versa: applying length contraction appropriately to the moving charges in the current results in a density difference between the positive and negative charges in the wire, and hence an $\mathbf{E}$ field in the frame of the moving point particle. The force on the particle is $\mathbf{F} = q\mathbf{v} \times \mathbf{B}$ in the frame of the wire and $\mathbf{F} = q\mathbf{E}$ in the frame of the particle, respectively.

I also gained a better and clearer understanding of the example of a muon moving at relativistic speed and appearing to the earth observer to have a longer half-life, which was likely mentioned in the high school physics textbook, but which I never really meaningfully understood. I knew it was due to length contraction, but I had difficulty thinking it out clearly. Now it is apparent that in the muon frame the earth and the atmosphere above it are moving towards it at relativistic speed, which implies length contraction of that space, which means that by the time it has reached the earth, much less time has actually passed in the muon frame than appears to an observer on earth based on non-relativistic mechanics or the Galilean transformations.
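The muon argument can be put into numbers with a minimal sketch. The height of 10 km and the speed of 0.998c are illustrative assumptions of mine, not values from the text above, and `travel_times` is a hypothetical helper name.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(beta):
    """Lorentz factor for a speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def travel_times(height_m, beta):
    """Return (earth_frame_time, muon_frame_time) in seconds.

    Earth frame: the muon crosses the full height at speed beta*c.
    Muon frame: the atmosphere is contracted by a factor 1/gamma and
    rushes past the muon at the same speed.
    """
    v = beta * C
    t_earth = height_m / v
    t_muon = (height_m / gamma(beta)) / v
    return t_earth, t_muon


# Illustrative values only (assumptions, not taken from the text).
t_earth, t_muon = travel_times(10_000.0, 0.998)
print(f"gamma               = {gamma(0.998):.2f}")
print(f"time in earth frame = {t_earth * 1e6:.2f} microseconds")
print(f"time in muon frame  = {t_muon * 1e6:.2f} microseconds")
```

The ratio of the two times is exactly the Lorentz factor, which is why the earth observer can equally well describe the same outcome as time dilation of the muon's clock.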
In China, after I had some free time, I tried to understand special relativity again, after I had learned some statistical mechanics in a serious way from the textbook of Kerson Huang. (See gmachine1729：我的关于物理学的写作（My writings about physics) for a collection of my writings about physics, almost all in Chinese.) This time, perhaps much with the benefit of having studied statistical mechanics from a serious book, coupled with the much saner and more realistic and materialist cultural and political atmosphere in China, it was not difficult. As for the process behind that, I had seen that serious textbooks like Landau’s would derive the Lorentz transformations rather tersely with hyperbolic functions, which I was rather uncomfortable with. Though of course I had known of the existence of hyperbolic functions long ago, I never developed any meaningful understanding of them. There were a bunch of identities associated with $\sinh$ and $\cosh$, but I had no idea how anybody would have come up with them, even though I had read a bit about hyperbolic geometry before. So I set out to derive the Lorentz transformations on my own using the most elementary methods. With a bit of help from the Wikipedia page, I succeeded in deriving them in a straightforward and easily understandable way not actually included on the Wikipedia page or in any textbook I had seen, unless in a very tacit way whose equivalence is not noticeable to the beginner. I am well aware that textbook or lecture note presentations are often terse and hide the intuition or process behind the derivations, as well as certain details which would make it easier for the reader to follow and understand more deeply. So in this sense, the writeup of the derivation of the Lorentz transformation, gmachine1729：从狭义相对论的简单假定得到时空间隔不变量及洛伦兹变换 (Obtaining the spacetime interval invariant and the Lorentz transformations from the simple postulates of special relativity), might actually provide some value not already easily found elsewhere on the internet.
Subsequently, with some help from various internet sources, none of which I regard as having explained it very well, I did a derivation of the celebrated and in some sense popularized and widely misunderstood $E = mc^2$, published at gmachine1729：如何推导出相对质量及 E = mc^2 (How to derive relativistic mass and E = mc^2). I remember it was in some sense tricky not to misinterpret force in the relativistic context.
The year after university, when I saw the relativistic electromagnetism in the textbook by Griffiths, I felt like I did not quite have the brainpower to maintain a clear head when thinking about such abstract physical phenomena, and especially I was not confident at all in my ability to do detailed calculations there. Special relativity itself felt hard enough; incorporating it into electromagnetism felt simply way beyond me. In 2020, though I felt quite comfortable doing writeups for derivations of the Lorentz transformation and $E = mc^2$ that were by no means more or less copied from another source, I still would have felt mostly at a loss on how to relate special relativity with electromagnetism in a detailed way. Yes, I had seen four-vectors, but there was no actual non-superficial understanding of them. But maybe it was more that the people who had developed the covariant formulation of electromagnetism (or later breakthroughs heavily involving it like the Dirac equation and Yang-Mills theory) were simply too smart, rather than that I was not smart enough, given that I remember a physics professor at a top school had once said to me that though he had already learned the covariant formulation of electromagnetism three times, he would not be able to say anything about it at that given moment.
It was an encounter with the electromagnetic field tensor

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$$

in the Landau-Lifshitz shorter course (book 1, Mechanics and Electrodynamics) that brought me to think in a non-superficial way about four-vectors for the first time. I recall that the text mentioned that $F_{\mu\nu}$ transforms like the product $a_\mu b_\nu$ of the components of two four-vectors, but there was no explicit derivation of that. I decided that I would carry it out on my own. I saw in the text that the electromagnetic four-potential $A^\mu$ is also Lorentz covariant, i.e. the scalar associated with the Minkowski metric of it, $A_\mu A^\mu$, is invariant under Lorentz transformations of the four-potential. I also saw on Wikipedia that the 4-current $J^\mu$ is Lorentz covariant too, noticing crudely an analogy between it and the four-potential. From there, it occurred to me that proving in detail the Lorentz covariance of all those four-vectors, starting from four-velocity, four-momentum, and four-force, would be a suitable exercise.
However, doing so would be in some sense premature if I hadn’t already derived the Lorentz transformations directly from the Lorentz covariance of spacetime coordinates in a way cleverer and more connected to the mathematical structure of the Lorentz group than solving for all four variables in the transformation matrix, as was done in my earlier writeup, which is more of a brute force manipulation that does not really shed any further light as far as I can tell, though it does obtain the desired Lorentz transformation.
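So how would we guess the Lorentz transformation, which must leave the spacetime interval invariant, in a much simpler and more insightful way than with the brute force approach? We first notice that the Minkowski metric, if indices are disregarded and we restrict to the $t$ and $x$ coordinates, is represented by

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

in contrast to the identity matrix for the Euclidean metric. Under the Euclidean metric, the orientation preserving rotation or isometry group is given in the form

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix};$$

apply any matrix in the group to any vector and the Euclidean norm of the output vector will equal that of the input vector. Our strategy to guess the matrices corresponding to the Lorentz group consists of modifying the above Euclidean norm preserving matrix to one that preserves the Minkowski norm, noting that only a sign change is required. We notice that due to $\cos^2\theta + \sin^2\theta = 1$, the square of the Euclidean norm of

$$\begin{pmatrix} t\cos\theta - x\sin\theta \\ t\sin\theta + x\cos\theta \end{pmatrix}$$

is $t^2 + x^2$. However, in the Minkowski norm, $x'^2$ is subtracted instead of added, which means that in order for the $tx$ factor to vanish (we will use $\widetilde{\cos}$ and $\widetilde{\sin}$ to represent the modified functions in the case of the Minkowski norm), the matrix must be symmetric, and a way to alter it is such that

$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \begin{pmatrix} \widetilde{\cos}\,\theta & \widetilde{\sin}\,\theta \\ \widetilde{\sin}\,\theta & \widetilde{\cos}\,\theta \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix},$$

in which case the Minkowski norm becomes

$$t'^2 - x'^2 = \left(\widetilde{\cos}^2\theta - \widetilde{\sin}^2\theta\right)\left(t^2 - x^2\right).$$

This means the existence of real functions satisfying $\widetilde{\cos}^2\theta - \widetilde{\sin}^2\theta = 1$ everywhere would suffice for our desired Lorentz transformation. We shall now seek such functions in terms of $v$, which is the velocity of the $x'$-axis with respect to the unprimed frame given the parameter $\theta$. In order that $v = x/t$, we set the point of the spacetime event to be at the origin of the $x'$-axis. Then, we have

$$0 = x' = t\,\widetilde{\sin}\,\theta + x\,\widetilde{\cos}\,\theta \quad\Longrightarrow\quad v = \frac{x}{t} = -\frac{\widetilde{\sin}\,\theta}{\widetilde{\cos}\,\theta}.$$

Substituting into $\widetilde{\cos}^2\theta - \widetilde{\sin}^2\theta = 1$ gives $\widetilde{\cos}\,\theta = 1/\sqrt{1 - v^2}$ and $\widetilde{\sin}\,\theta = -v/\sqrt{1 - v^2}$ (taking the positive root, so that $v = 0$ leaves the coordinates unchanged), and with them the desired Lorentz transformations

$$t' = \frac{t - vx}{\sqrt{1 - v^2}}, \qquad x' = \frac{x - vt}{\sqrt{1 - v^2}}.$$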
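As a quick numerical sanity check, the following sketch builds the symmetric 2x2 matrix with entries $1/\sqrt{1 - v^2}$ and $-v/\sqrt{1 - v^2}$ (units with $c = 1$) and confirms that it preserves $t^2 - x^2$ and sends an event on the line $x = vt$ to $x' = 0$. The function names `boost`, `apply` and `interval` are my own, chosen for illustration.

```python
import math


def boost(v):
    """2x2 boost matrix acting on (t, x), in units with c = 1."""
    if not -1.0 < v < 1.0:
        raise ValueError("need |v| < 1 in units with c = 1")
    g = 1.0 / math.sqrt(1.0 - v * v)
    # Symmetric matrix: diagonal entries g, off-diagonal entries -g*v.
    return ((g, -g * v), (-g * v, g))


def apply(m, event):
    """Multiply a 2x2 matrix by the column vector (t, x)."""
    t, x = event
    return (m[0][0] * t + m[0][1] * x, m[1][0] * t + m[1][1] * x)


def interval(event):
    """Minkowski norm squared, t^2 - x^2."""
    t, x = event
    return t * t - x * x


event = (3.0, 1.0)
moved = apply(boost(0.6), event)
print("interval before:", interval(event))
print("interval after: ", interval(moved))

# An event on the worldline x = v*t of the primed origin lands at x' = 0.
on_axis = apply(boost(0.6), (2.0, 0.6 * 2.0))
print("x' of primed origin:", on_axis[1])
```

The two printed intervals agree up to floating point rounding, and the last value is zero up to rounding.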
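As for the matrix

$$\Lambda(v) = \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix}$$

of the Lorentz transformation, $v = 0$ gives the identity, and moreover we note that $\Lambda(v)$ is to be viewed here as an element of the Lorentz group or as a change of reference frame. If $\Lambda(v_1), \Lambda(v_2)$ are elements of the Lorentz group, then $\Lambda(v_2)\Lambda(v_1)$ corresponds to the change of reference frame that is obtained by first changing reference frame via $\Lambda(v_1)$ and then via $\Lambda(v_2)$. The corresponding velocity addition formula, which can be obtained by a composition of Lorentz transformations, is

$$v = \frac{v_1 + v_2}{1 + v_1 v_2},$$

which is actually a hyperbolic identity, namely the addition formula for $\tanh$.

We have in some sense via $\widetilde{\cos}^2\theta - \widetilde{\sin}^2\theta = 1$ already implicitly defined the hyperbolic functions and corresponding hyperbolic angles of rotation. The reader is of course welcome to investigate this in more detail on his own.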
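For readers who want to start that investigation numerically, here is a small sketch, again in units with $c = 1$. It composes two boost matrices, reads the resulting velocity off the product, and compares it with the velocity addition formula and with adding hyperbolic angles via `math.atanh`. The helper names (`matmul`, `velocity_of`, `add_velocities`) are assumptions of mine.

```python
import math


def boost(v):
    """2x2 boost matrix acting on (t, x), in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return ((g, -g * v), (-g * v, g))


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def velocity_of(m):
    """Recover v from a boost matrix: entries are g and -g*v."""
    return -m[1][0] / m[0][0]


def add_velocities(v1, v2):
    """Relativistic velocity addition for collinear velocities."""
    return (v1 + v2) / (1.0 + v1 * v2)


v1, v2 = 0.5, 0.8
composed = matmul(boost(v2), boost(v1))  # first v1, then v2

print("from matrix product:", velocity_of(composed))
print("from addition law:  ", add_velocities(v1, v2))

# Hyperbolic angles (rapidities) simply add under composition.
phi = math.atanh(v1) + math.atanh(v2)
print("from tanh of sum:   ", math.tanh(phi))
```

All three printed values agree up to rounding, and the result stays below 1 even though $v_1 + v_2 > 1$.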
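Finally, I shall note that the time coordinate in the result of the derivation,

$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \frac{1}{\sqrt{1 - v^2}} \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix},$$

does not include the speed of light factor, and if we include it, the above matrix equation is transformed to

$$\begin{pmatrix} ct' \\ x' \end{pmatrix} = \frac{1}{\sqrt{1 - v^2/c^2}} \begin{pmatrix} 1 & -v/c \\ -v/c & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \end{pmatrix},$$

which is equivalent to

$$t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}, \qquad x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}.$$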
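The equivalence of the two forms with the speed of light restored can also be checked numerically. This is only a sketch in SI units; the function names `lorentz_components` and `lorentz_matrix_form` and the sample event are hypothetical choices of mine.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_components(t, x, v, c=C):
    """Component form: t' = g*(t - v*x/c^2), x' = g*(x - v*t)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t - v * x / c**2), g * (x - v * t)


def lorentz_matrix_form(t, x, v, c=C):
    """Symmetric matrix acting on (c*t, x), converted back to (t', x')."""
    beta = v / c
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    ct_new = g * (c * t - beta * x)
    x_new = g * (-beta * c * t + x)
    return ct_new / c, x_new


# Sample event: one microsecond and 100 metres, boost at 0.6c.
t, x, v = 1e-6, 100.0, 0.6 * C
print(lorentz_components(t, x, v))
print(lorentz_matrix_form(t, x, v))
```

The two printed pairs agree up to rounding. Working with $(ct, x)$ keeps both coordinates in the same unit, which is what makes the matrix symmetric.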