Defining covariance and contravariance of vectors
I read about covariance and contravariance of vectors as an undergraduate, but I struggled to understand the concepts and to retain them; they were easily scrambled and lost in the mental gymnastics required. In this piece, I will explain why the names and definitions are reasonable. I also have in mind that a concrete example of classic physical significance is likely to leave a firmer imprint on anyone who wishes to solidify their understanding.
Some people who learn about covariance and contravariance of vectors have trouble remembering which is which. With this in mind, I will emphasize that covariant, by virtue of the co-, means the quantity changes in the same way as the basis, while contravariant means it changes by the inverse of the change of basis. Per this definition, basis vectors are necessarily covariant vectors.
Then, one might ask: why not define it such that covariance corresponds to the coordinates (which are actually contravariant, by the way), since coordinates appear far more often than basis vectors? It is because coordinates are only well defined with respect to a basis, the existence of which is often implicit. For instance, a tuple of coordinates only has meaning given the basis vectors. A bare number, without any explicit unit attached to it, is likewise defined with respect to an implicit unit element. A real number is a scalar, which is a vector of dimension $1$, defined with respect to a multiplicative identity element conventionally denoted as $1$, which is its one and only basis element. If we multiplied the basis of the real numbers by a nonzero factor, say $k$, then the same real number, which is a contravariant one-dimensional vector, would be transformed in reverse of the basis change: a coordinate $x$ in the implicit standard basis would become $x/k$. To be more explicit, we have $x \cdot 1 = (x/k) \cdot (k \cdot 1)$. This one-dimensional scale change is the simplest example of the definition of covariance and contravariance. From it, it is easy to see that for the actual quantity to remain the same irrespective of basis, its representation as a contravariant vector with respect to a changed basis must undergo a transformation that nullifies or cancels out the change of basis.
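The one-dimensional scale change can be checked with a few lines of Python. This is only a minimal sketch: the function name `rescale_coordinate` and the sample numbers are hypothetical choices made for illustration, not values from the text above.

```python
from fractions import Fraction

def rescale_coordinate(coord, basis_scale):
    """Coordinate of the same quantity after the basis element is multiplied by basis_scale."""
    if basis_scale == 0:
        raise ValueError("a basis element cannot be scaled by zero")
    return coord / basis_scale

# Illustrative values (assumptions): the standard basis element 1 and the coordinate 6.
old_basis = Fraction(1)
old_coord = Fraction(6)
scale = Fraction(3)

new_basis = old_basis * scale                      # the basis changes with the scale (covariant)
new_coord = rescale_coordinate(old_coord, scale)   # the coordinate changes inversely (contravariant)

quantity_before = old_coord * old_basis
quantity_after = new_coord * new_basis
print(new_basis, new_coord, quantity_before == quantity_after)  # prints: 3 2 True
```

The product of coordinate and basis element is the same before and after, which is the cancellation described above.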
In Einstein notation, components of covariant vectors (such as basis vectors), with respect to which contravariant vectors are well defined, are denoted with lower indices. This makes sense in that, in typical algebraic notation, subscripts (as opposed to the superscripts used to denote exponents) serve as an optional or “lower scope” part of variable names. On the other hand, it also makes sense to denote components of contravariant vectors, which do not exist intrinsically, with upper indices.
In matrix multiplication, the convention is to put the vector at the right of the matrix, in the form of . Taking as the identity matrix that implicitly sets the basis to
we observe that conventionally, covariant (or basis) vectors are row vectors, while contravariant vectors are column vectors. We emphasize again that the basis exists intrinsically, independent of coordinates, though in practice we assign row vectors forming a linearly independent set to the basis vectors as an explicit representation, with the above identity matrix being of course the most convenient choice. Under our intrinsically defined basis, is represented via coordinates as .
Moreover, we can regard
as a change of basis wherein the rows of become the vectors of our new basis. Let be the coordinates with respect to our new basis. Since we desire it to be equal to or in coordinates with respect to our old basis, we have
which is essentially a statement of contravariance of coordinate or column vectors.
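A small numerical sketch of this change of basis follows. It assumes one particular convention, which may differ in detail from the notation used above: the new basis vectors are the rows of a matrix (called `new_basis` here) written in old coordinates, and coordinates are columns. Under that assumption, the old coordinates equal the transpose of the matrix applied to the new coordinates, so the new coordinates are obtained with the inverse. All names and numbers are hypothetical.

```python
from fractions import Fraction as F

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def combine(coords, basis_rows):
    """Linear combination sum_i coords[i] * basis_rows[i], expressed in old coordinates."""
    dim = len(basis_rows[0])
    return [sum(c * row[k] for c, row in zip(coords, basis_rows)) for k in range(dim)]

# Assumed example: rows are the new basis vectors in terms of the old (identity) basis.
new_basis = [[F(2), F(0)],
             [F(1), F(1)]]
old_coords = [F(4), F(2)]

# Coordinates transform with the inverse of the basis change (contravariance).
new_coords = mat_vec(inverse_2x2(transpose(new_basis)), old_coords)

# The vector itself is unchanged: recombining the new coordinates with the
# new basis vectors reproduces the old coordinates.
reconstructed = combine(new_coords, new_basis)
print(new_coords, reconstructed == old_coords)
```

Here the new coordinates come out as (1, 2), since 1·(2, 0) + 2·(1, 1) = (4, 2).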
Application of covariance and contravariance to the Lorentz transformation
be the contravariant spacetime coordinates in one reference frame and
be the contravariant spacetime coordinates in another reference frame. For simplicity, we also assume that in this change of reference frame. It was derived in gmachine1729：How I systematically guessed hyperbolic functions and the Lorentz group from the spacetime invariant  that
We now define the gradient operator with respect to the spacetime coordinates in the respective frames as
By the multivariate chain rule, we have
Upon noticing that
is the inverse of the Jacobian of the Lorentz transformation corresponding to as given in , it becomes apparent that if we regard the spacetime coordinates as contravariant, the gradient operator is essentially a basis composed of covariant vectors, which correspond to directional derivative operators. To be more explicit, the transformation given in , being contravariant per the use of upper indices, is the inverse of the transformation of the spacetime basis associated with the change in reference frame. The inverse of that inverse is the basis change transformation itself, which is exactly what we obtained for the transformation of the gradient operator to another reference frame.
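The chain-rule argument can be checked numerically. The sketch below makes several assumptions that are not taken from the text: one time and one space dimension, units in which the speed of light is 1, a boost velocity of 0.6, and an arbitrary scalar field `phi`. It transforms the gradient components with the transpose of the inverse Jacobian and compares the result against finite differences taken directly in the primed frame.

```python
import math

def boost(v):
    """1+1 dimensional Lorentz boost acting on (t, x); units with c = 1 are assumed."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v], [-gamma * v, gamma]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_vec(m, u):
    return [sum(a * b for a, b in zip(row, u)) for row in m]

def phi(t, x):
    """Arbitrary scalar field, chosen only for illustration."""
    return t * t * x + 3.0 * x

def grad_phi(t, x):
    """Analytic gradient (d/dt, d/dx) of phi in the unprimed frame."""
    return [2.0 * t * x, t * t + 3.0]

v = 0.6
jacobian = boost(v)        # Jacobian of the coordinate transformation
jacobian_inv = boost(-v)   # its inverse is the boost with the opposite velocity

event = [1.0, 2.0]
event_prime = mat_vec(jacobian, event)  # coordinates transform with the Jacobian

# Gradient components transform with the transpose of the inverse Jacobian.
grad_prime = mat_vec(transpose(jacobian_inv), grad_phi(*event))

def phi_prime(t_p, x_p):
    """The same scalar field expressed in primed coordinates."""
    t, x = mat_vec(jacobian_inv, [t_p, x_p])
    return phi(t, x)

# Independent check: central finite differences in the primed frame.
h = 1e-6
t_p, x_p = event_prime
grad_prime_fd = [
    (phi_prime(t_p + h, x_p) - phi_prime(t_p - h, x_p)) / (2 * h),
    (phi_prime(t_p, x_p + h) - phi_prime(t_p, x_p - h)) / (2 * h),
]
print(grad_prime, grad_prime_fd)
```

The two results agree up to finite-difference error, so the coordinates and the gradient components transform by mutually inverse matrices.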
In , “Lorentz covariant” and “covariant formulation of electromagnetism” were mentioned explicitly. The reader might ask why it is not named Lorentz contravariant instead, given that position vectors are contravariant with respect to a change of basis. The answer is that if we think more generally and intrinsically, spacetime coordinates can also be regarded as covariant, in which case the set of all directional derivative operators would form a vector space of contravariant vectors, and the underlying basis, set of dimensions, or scale of our physical system would be one that specifies coordinates for directional derivative operators. In any case, whether for spacetime coordinates or gradient operators, we can find a basis with respect to which the quantity transforms under a representation of the Lorentz group, e.g. the matrix corresponding to or its generalization to all four coordinates, or its inverse, the elements of which are parametrized by the velocity. Moreover, a Lorentz scalar, or Lorentz invariant, such as the norm of any four-velocity vector (the speed of light) or the spacetime interval (the norm of the position vector), is a scalar which remains unchanged under any Lorentz transformation.
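As a quick illustration of a Lorentz invariant, the following sketch boosts one event by several velocities and evaluates the interval each time. It assumes one spatial dimension, units with the speed of light equal to 1, and the sign convention t² − x² for the interval; the function names and sample values are hypothetical.

```python
import math

def boost_event(t, x, v):
    """Apply a boost with velocity v to the event (t, x); units with c = 1 are assumed."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

def interval(t, x):
    """Spacetime interval with the assumed sign convention t^2 - x^2."""
    return t * t - x * x

t, x = 5.0, 3.0
velocities = (-0.9, -0.3, 0.0, 0.5, 0.8)
intervals = [interval(*boost_event(t, x, v)) for v in velocities]
print(intervals)
```

The coordinates of the event change with every velocity, while the interval stays at 16 up to floating-point rounding.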
The Minkowski metric tensor
The norm for Minkowski spacetime coordinates is given by , and per this, we define the Minkowski inner product (which, despite satisfying linearity and conjugate symmetry, lacks positive definiteness and so is in some sense not a true inner product) via the following explicit bilinear form
In Einstein notation, however, the metric tensor, in contrast to the above matrix form, is actually a rank-2 covariant tensor such that, with respect to the standard orthonormal basis for Minkowski space denoted by ,
Expressing in terms of tensors of row and column vectors and using the tensor identity
in which we lowered the index with the Minkowski metric. Moreover,
Raising an index is similar, and the reader is welcome to work out the details.
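Lowering and raising an index can also be sketched in code. The signature (+, −, −, −), units with the speed of light equal to 1, and the sample components are assumptions made for this illustration; with the opposite signature every sign below flips. The helper names `lower_index`, `raise_index`, and `contract` are hypothetical.

```python
# Minkowski metric with the assumed signature (+, -, -, -).
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower_index(x_upper, metric=ETA):
    """x_mu = eta_{mu nu} x^nu, summing over the repeated index nu."""
    return [sum(metric[mu][nu] * x_upper[nu] for nu in range(4)) for mu in range(4)]

def raise_index(x_lower, inverse_metric=ETA):
    """x^mu = eta^{mu nu} x_nu; for this diagonal metric the inverse has the same entries."""
    return [sum(inverse_metric[mu][nu] * x_lower[nu] for nu in range(4)) for mu in range(4)]

def contract(covector, vector):
    """Sum over one lower and one upper index: x_mu x^mu."""
    return sum(a * b for a, b in zip(covector, vector))

x_upper = [5, 1, 2, 3]              # illustrative contravariant components (t, x, y, z)
x_lower = lower_index(x_upper)      # [5, -1, -2, -3]
norm_squared = contract(x_lower, x_upper)  # 25 - 1 - 4 - 9 = 11

print(x_lower, norm_squared, raise_index(x_lower) == x_upper)
```

Contracting the lowered components with the original ones gives the Minkowski norm squared, and raising the index again recovers the starting components.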