Special relativity: electromagnetism

From Scholarpedia
Wolfgang Rindler (2012), Scholarpedia, 7(7):10906. doi:10.4249/scholarpedia.10906 revision #125879

Curator: Wolfgang Rindler

Special relativity (SR) is a physical theory based on Einstein's Relativity Principle, which states that all laws of physics (including, for example, electromagnetism, optics, thermodynamics, etc.) should be equally valid in all inertial frames; and on Einstein's additional postulate that the speed of light should be the same in all inertial frames. The present article describes electromagnetism in Special relativity; see the articles SR: kinematics and SR: mechanics for the prerequisites.



We were fortunate that (non-gravitational) Newtonian mechanics allowed itself so readily to be extended to a manifestly Lorentz-invariant new mechanics in terms of 4-vectors (see SR: mechanics). As we mentioned earlier, Maxwell's theory (at least in vacuum) needs no extension. It already is Lorentz-invariant, though checking this directly is somewhat laborious. Nevertheless, from what we have seen of relativistic mechanics, one would surely hope that Maxwell's theory, too, could be expressed in some elegant 4-dimensional manner, that fits into the concept of spacetime and makes its Lorentz-invariance manifest. One might perhaps expect that the electric and magnetic field 3-vectors, \(\mathbf{E}\) and \(\mathbf{B}\), henceforth written \(\mathbf{e}\), \(\mathbf{b}\), could be extended to corresponding 4-vectors along the lines of the 3-momentum \(\mathbf{p}\). But the fact that they intermingle as one changes IFs, a priori prevents this. It turns out that \(\mathbf{e}\) and \(\mathbf{b}\) together give rise to a single 4-tensor. And in terms of that, an elegant 4-dimensional and manifestly Lorentz-invariant formulation of Maxwell's theory results.

So what are 4-tensors? They are a step up from 4-vectors. While 4-vectors are visualizable as displacements in spacetime, 4-tensors can be visualized as various more complex geometric objects in spacetime, whose details are best not even thought of. One main characteristic of 4-tensors is that they allow themselves to be described by components like \(A_{\mu},B_{\mu \nu },C_{\nu}^{\mu},D_{\nu \rho}^{\mu},\) etc., where, here and throughout, Greek indices will range from 1 to 4. The main characteristic of 4-tensors, for our purposes, is that equations between 4-tensors of equal type are Lorentz-invariant. There is no limit to the number of indices that tensor components may carry, and thus no limit to the number of components that may be required for the full description of some tensorial object. The level of each index – up or down – will have significance. Four-vectors are actually 4-tensors of type \(A^{\mu}\).

The key idea is that a tensorial object is described by a distinct set of components in every standard IF (namely, one in which standard coordinates \(x,y,z,ct\,\) have been set up); but what makes a tensor a tensor is the way its components transform from one IF to another – that is, how they transform under general LTs. But before we elaborate on this, we need to set up some notational and other preliminaries.

Since we need to deal with sets of components in various inertial reference systems \(S,S',S''\cdots\), we reserve different index alphabets for the different IFs (the values of the indices always run from 1 to 4): \[ \mu ,\nu ,\rho ,\cdots \; \; \text{for}\; S\] \[ \mu' ,\nu' ,\rho' ,\cdots \; \; \text{for}\; S'\] \[\tag{1} \mu'' ,\nu'' ,\rho'' ,\cdots \; \; \text{for}\; S''\;\;\text{etc.}\]

Thus if \(A_{\mu \nu }\) denotes the components in \(S\), \(A_{\mu'\nu'}\) denotes them in \(S'\), etc. The index alphabets are completely independent, so that in the same equation \({\mu }\) can be 3 while \({{\mu }'}\) is 1. For reasons that will become apparent, we denote the coordinates themselves with superscripts, though they are not tensors:

\[\tag{2} x^{\mu}=(x,y,z,ct),\;\;\; x^{\mu'}=(x',y',z',ct'), \;\;\text{etc.} \]

Another notational convention is the summation convention (invented by Einstein!): if any index appears twice in a given term, once as a superscript and once as a subscript, summation over all values of that index is implied. Thus, for example,

\[A_{\mu}B^{\mu}=\sum_{\mu}A_{\mu}B^{\mu} =A_{1}B^{1}+A_{2}B^{2}+A_{3}B^{3}+A_{4}B^{4},\]

\[ \tag{3}{A_{\mu \nu \rho}B^{\nu \rho} = \sum_{\nu}\sum_{\rho} A_{\mu \nu \rho}B^{\nu \rho} = A_{\mu 11}B^{11} + A_{\mu 12}B^{12} + A_{\mu 21}B^{21} + \cdots + A_{\mu 44}B^{44} },\]

and so on. By a slight extension, we also understand summation in such expressions as

\[ \tag{4} {\frac{\partial u^{\mu }}{\partial x^{\mu }}\;, \qquad \frac{\partial q}{\partial x^{\mu }}\frac{dx^{\mu }}{\mathit{d\tau }} }\;, \;\; \text{etc.}\]

Since all these sums are finite, all elementary rules apply, such as the allowed change of the order of summation in multiple sums, or differentiating under the imagined summation sign. The repeated indices signaling summation are called "dummy" indices, since they can be replaced by any other pair of indices not already used in the expression: \(A_{\mu }B^{\mu }=A_{\nu }B^{\nu }\). In fact, such replacement is often indicated when in a calculation we find ourselves headed for a confusing triple occurrence of the same index.
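
The summation convention is easy to exercise numerically. The following NumPy sketch (component values arbitrary; note that indices run 0–3 in code rather than 1–4) checks a one-index contraction and the two-index sum of Eq. (3) via `np.einsum`, which implements exactly this convention:

```python
import numpy as np

# A_mu B^mu: a repeated index, once up and once down, implies a sum.
A = np.array([1.0, 2.0, 3.0, 4.0])   # covariant components A_mu (illustrative values)
B = np.array([5.0, 6.0, 7.0, 8.0])   # contravariant components B^mu

contracted = np.einsum('m,m->', A, B)            # A_mu B^mu
spelled_out = sum(A[m] * B[m] for m in range(4)) # the sum written out
assert np.isclose(contracted, spelled_out)

# The two-index example of Eq. (3): A_{mu nu rho} B^{nu rho}
A3 = np.arange(64.0).reshape(4, 4, 4)            # A_{mu nu rho}
B2 = np.arange(16.0).reshape(4, 4)               # B^{nu rho}
C = np.einsum('mnr,nr->m', A3, B2)               # one free index mu remains
assert C.shape == (4,)
```

Renaming a dummy pair, as in \(A_{\mu}B^{\mu}=A_{\nu}B^{\nu}\), corresponds in `einsum` to merely relettering the subscript string.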

The transformation of tensor components from $S$ to $S'$ will involve the partial derivatives of the coordinates in the LT that links the two frames, just as it did in the 4-vector transformation analogous to Equation 9 of SR: mechanics. We introduce the following streamlined notation for these partial derivatives:

\[ \tag{5}\frac{\partial x^{\mu'}} {\partial x^{\mu}}=p_{\mu }^{\mu'}\;, \qquad \frac{\partial x^{\mu}}{\partial x^{\mu'}} = p_{\mu'}^{\mu}\;, \]

and note, by the chain rule, that

\[ \tag{6}{p_{\mu'}^{\mu}\,p_{\mu''}^{\mu'}=p_{\mu''}^{\mu}\;, \qquad p_{\mu'}^{\mu}\,p_{\nu}^{\mu'}=\delta_{\,\nu}^{\,\mu}}\;,\]

where \(\delta_{\,\nu}^{\,\mu}\), the Kronecker delta, equals 1 or 0 according as \({\mu =\nu }\) or \({\mu \neq \nu }\). It is of importance to note the "index substitution" role of \(\delta\), exemplified by \({A_{\mu \nu \rho}\delta_{\,\sigma }^{\,\nu}=A_{\mu \sigma \rho}}\) (in the summation all terms vanish except when \(\nu =\sigma \)). Also, we can "flip" \(p\)s in an equation from one side to the other, for example

\[ \tag{7}{ A_{\mu}\,p_{\mu'}^{\mu}= B_{\mu '} \Rightarrow A_{\mu} = B_{\mu'}\,p_{\mu}^{\mu'} }.\]

For proof, multiply the original equation by \(p_{\nu}^{\mu'}\). Finally we note that under linear transformations, which the general LTs are, all the ps are constant.
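
The algebra of the \(p\)s lends itself to a quick numerical check. The following NumPy sketch (a standard boost with \(v = 0.6c\) assumed, coordinates ordered \(x,y,z,ct\), indices 0–3 in code) verifies the chain-rule identity (6)(ii) and the "flipping" rule (7):

```python
import numpy as np

beta = 0.6                            # v/c for the assumed boost
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# p[mu', mu] = dx^{mu'}/dx^{mu};  p_inv[mu, mu'] = dx^{mu}/dx^{mu'}
p = np.array([[gamma, 0, 0, -gamma * beta],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [-gamma * beta, 0, 0, gamma]])
p_inv = np.array([[gamma, 0, 0, gamma * beta],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [gamma * beta, 0, 0, gamma]])

# Eq. (6)(ii): p^{mu}_{mu'} p^{mu'}_{nu} = delta^{mu}_{nu}
assert np.allclose(p_inv @ p, np.eye(4))

# Eq. (7): if A_mu p^{mu}_{mu'} = B_{mu'}, then A_mu = B_{mu'} p^{mu'}_{mu}
A = np.array([1.0, -2.0, 0.5, 3.0])   # arbitrary covariant components
B = np.einsum('m,mp->p', A, p_inv)    # A_mu p^{mu}_{mu'}
assert np.allclose(np.einsum('p,pm->m', B, p), A)
```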

We are now ready to define 4-tensors. An object having components \(A_{\sigma \dots \tau}^{\mu \dots \rho}\) in $S$ and components \(A_{\sigma'\dots \tau'}^{\mu'\dots \rho'}\) in $S'$ is said to transform tensorially from $S$ to $S'$ if

\[ \tag{8} A_{\sigma'\dots \tau'}^{\mu'\dots \rho'} = A_{\sigma \dots \tau}^{\mu \dots \rho} \;p_{\mu}^{\mu'} \dots p_{\rho}^{\rho'}\,p_{\sigma'}^{\sigma} \dots p_{\tau'}^{\tau}.\]

A quantity transforming tensorially between any two IFs (that is, under general LTs) is said to be a 4-tensor. Note how the components in $S'$ are linear combinations of the components in $S$, and the same linear combinations for all tensors of the same type. So if two tensors are equal (have equal components) in one frame, they are equal in all frames. This is why a physical law expressed as a tensor equality is automatically Lorentz-invariant. Observe how the transformation pattern Eq. (8) is almost impossible to forget: the free indices must balance on the two sides. Tensor transformations form a group – that is, they include the identity (from $S$ to $S$, when the \(p\)s are \(\delta\)s) and they are symmetric and transitive. Symmetry: if an object transforms tensorially from $S$ to $S'$, then also from $S'$ to $S$. For example,

\[ \tag{9} A_{\nu'}^{\mu'} = A_{\nu}^{\mu}\,p_{\mu}^{\mu'}\,p_{\nu'}^{\nu} \Rightarrow A_{\nu}^{\mu}=A_{\nu'}^{\mu'}\,p_{\mu'}^{\mu}\,p_{\nu}^{\nu'};\]

for proof, "flip" the ps, see Eq. (7). Transitivity: if an object transforms tensorially from $S$ to $S'$ and from $S'$ to $S''$, then also from $S$ to $S''$. For example,

\[ \tag{10} A_{\nu'}^{\mu'}=A_{\nu}^{\mu}\,p_{\mu}^{\mu'}\,p_{\nu'}^{\nu}\;\; \text{and} \;\; A_{\nu''}^{\mu''}=A_{\nu'}^{\mu'} \,p_{\mu'}^{\mu''}\,p_{\nu''}^{\nu'}\Rightarrow A_{\nu''}^{\mu''}=A_{\nu}^{\mu}\, p_{\mu}^{\mu''}\,p_{\nu''}^{\nu};\]

for proof, substitute the first equation into the second and use Eq. (6)(i). A tensor can thus be fully specified by prescribing its components in one frame, say $S$; the components in any other frame $S'$ will be determined by tensorially transforming away from $S$. The tensorial relation between two arbitrary frames $S'$ and $S''$ is then assured by transitivity through $S$.
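
Transitivity can be confirmed numerically. In the sketch below (two collinear boosts with \(v/c = 0.5\) and \(0.3\), values assumed purely for illustration) a mixed tensor \(A^{\mu}_{\nu}\) is transformed per Eq. (8) first \(S \to S' \to S''\) and then directly \(S \to S''\), using the fact from Eq. (6)(i) that the direct \(p\)-matrix is the product of the two intermediate ones:

```python
import numpy as np

def boost(beta):
    """Boost matrix L[mu', mu] = p^{mu'}_{mu} along x, in (x, y, z, ct) order."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = g
    L[0, 3] = L[3, 0] = -g * beta
    return L

A = np.arange(16.0).reshape(4, 4)     # A^{mu}_{nu} in S (arbitrary values)
p1, p2 = boost(0.5), boost(0.3)

# S -> S' -> S'': apply Eq. (8) twice...
A1 = np.einsum('an,ma,nb->mb', A, p1, np.linalg.inv(p1))
A2 = np.einsum('an,ma,nb->mb', A1, p2, np.linalg.inv(p2))

# ...and compare with the direct transformation S -> S'' (Eq. (6)(i): p = p2 p1).
p12 = p2 @ p1
A_direct = np.einsum('an,ma,nb->mb', A, p12, np.linalg.inv(p12))
assert np.allclose(A2, A_direct)
```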

A tensor having only top indices is called (fully) contravariant: the ps in its transformation are the derivatives of the new with respect to the old coordinates. If a tensor has bottom indices only, it is called (fully) covariant and the relevant ps are the derivatives of the old with respect to the new coordinates. The most basic contravariant tensor is the coordinate differential \(dx^{\mu}\). For, by the chain rule of differentiation, we have

\[ \tag{11} dx^{\mu'}=p_{\mu}^{\mu'}dx^{\mu}.\]

Under LTs, differentials $dx^\mu$ and differences $\Delta x^\mu$ transform alike, and 4-vectors transform like the latter; so 4-vectors are contravariant one-index 4-tensors \(A^{\mu}\)! In general, one-index tensors are called vectors. The most basic covariant vector is the gradient of a function of position (a "scalar") \(\phi(x,y,z,ct)\); for, if we write

\[ \tag{12} \frac{\partial \phi}{\partial x^{\mu}}=\phi_{,\,\mu}\;,\]

we have, again by the chain rule,

\[ \tag{13} \phi_{,\, \mu'} = \phi_{,\,\mu}\,p^{\mu}_{\mu'}. \]

The Kronecker delta is a "mixed" tensor. For, by use of Eq. (6)(ii), we have

\[ \tag{14}\delta_{\,\nu}^{\,\mu}\,p_{\mu}^{\mu'}\,p_{\nu'}^{\nu} = p_{\nu}^{\mu'} \,p_{\nu'}^{\nu} = \delta_{\,\nu'}^{\,\mu'}. \]

Tensor algebra and differentiation

“Tensor operations” are operations that start with tensors and end up with tensors. Tensor algebra consists of just four such operations: sum, outer product, contraction, and index permutation. All are defined by the relevant operations on the components, but must be checked – once and for all – for their tensor character.

The sum \(C^{\mu\cdots }_{\sigma\cdots } \) of two tensors \(A^{\mu\cdots }_{\sigma\cdots } \) and \(B^{\mu\cdots }_{\sigma\cdots }\) of equal type is defined in all IFs thus: \[ \tag{15}C^{\mu\cdots }_{\sigma\cdots } = A^{\mu\cdots }_{\sigma\cdots } + B^{\mu\cdots }_{\sigma\cdots }.\] By looking at the primed version of this equation and substituting for \( A^{\mu'\cdots }_{\sigma'\cdots } \) and \( B^{\mu'\cdots }_{\sigma'\cdots } \) from Eq. (8), we see that the sum is a tensor.

If \( A^{\cdots }_{\cdots }\) and \( B^{\cdots }_{\cdots } \) are tensors of arbitrary type, simple juxtaposition of their components (but with all different indices) defines, in all IFs, their outer product. Thus, for example,

\[\tag{16} C^{\mu \nu}_{\rho \sigma \tau} = A^{\mu}_{\rho}\, B^{\nu}_{\sigma \tau}, \]

the outer product of the \(A\) and \(B\) tensors, is a tensor of the type indicated by its five indices, as can be seen at once by writing down its primed version and substituting for the primed \(A\) and \(B\) tensors from the relevant versions of Eq. (8).

We note that a scalar function \( \phi(x,y,z,ct) \) – which could be a constant – is, in fact, by the definition (8), a “zero-index” tensor: no indices, no \( p\)s, no change; at each point in spacetime \( \phi \) stays the same as we change coordinates. Thus multiplication by a scalar is a particular case of outer multiplication (16). Another tensor that falls somewhat outside the norm, again by the definition (8), is the zero-tensor: there is one for every index type, sloppily denoted just by \(0\), and defined by having all its components equal to zero.

The third operation – contraction – depends crucially on the different transformation patterns implied by co- and contravariant indices. It can only be performed on tensors having both co- and contravariant indices. And it consists in summing over a selected pair of them, replacing this pair by a dummy-index pair. The result is a tensor of the type corresponding to the remaining free indices of the original tensor. For example, if \( A^{\mu}_{\nu \sigma}\) is a tensor, then so is

\[ \tag{17} B_{\nu} = A^{\mu}_{\nu \mu}.\]

For proof, just set \(\tau' = \mu' \) in Eq. (8) and use Eq. (6)(ii). Contraction in conjunction with outer product is called inner product, for example \( C_{\mu \rho \sigma} = A_{\mu \nu}\,B^{\nu}_{\rho \sigma}\). A most important case of contraction or inner multiplication arises when no free indices remain: the result is a scalar (an invariant). For example, $A_{\mu}^{\mu}\;$, $B_{\mu \nu}C^{\mu \nu}$ are invariants. A particular case: \( \delta^{\,\mu}_{\,\mu} = 4 \).
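
That a full contraction yields an invariant is easy to check numerically. In the NumPy sketch below (a boost with \(v/c = 0.8\) and random components assumed for illustration) the trace \(A^{\mu}_{\mu}\) is unchanged by the transformation (8):

```python
import numpy as np

beta = 0.8
gam = 1.0 / np.sqrt(1.0 - beta**2)
p = np.eye(4)
p[0, 0] = p[3, 3] = gam
p[0, 3] = p[3, 0] = -gam * beta                   # p^{mu'}_{mu}, boost along x
p_inv = np.linalg.inv(p)                          # p^{mu}_{mu'}

A = np.random.default_rng(0).normal(size=(4, 4))  # A^{mu}_{nu}, arbitrary
A_prime = np.einsum('an,ma,nb->mb', A, p, p_inv)  # Eq. (8)

assert np.isclose(np.trace(A_prime), np.trace(A)) # A^{mu'}_{mu'} = A^{mu}_{mu}
assert np.trace(np.eye(4)) == 4                   # delta^{mu}_{mu} = 4
```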

The last of the algebraic tensor operations is index permutation. For example, from a given set of tensor components \( A_{\mu \nu \rho} \) we can form differently ordered sets like \( B_{\mu \nu \rho} = A_{\mu \rho \nu} \) or \( C_{\mu \nu \rho} = A_{\nu \rho \mu}\), etc., all of which constitute tensors, as is immediately clear from Eq. (8). Similar index permutations are permissible among superscripts. As a result, “symmetry” relations among tensor components, such as \( A_{\mu \nu} = A_{\nu \mu} \) or \( B_{\mu \nu \rho} + B_{\nu \rho \mu} + B_{\rho \mu \nu} = 0\), are tensor equations and thus coordinate-independent.

There is one more crucial tensor operation: differentiation. It applies to tensor fields, i.e. tensorial objects which are defined not only at one point in spacetime ("point tensors"), but continuously throughout a region. For example, the 4-velocity of a particle is defined all along the particle's worldline, and the 4-velocity of a fluid is defined throughout its volume. Of course, at each point $P$ of spacetime the transformation of a field tensor is identical to that of a similar point tensor at $P$. But, unlike point tensors, field tensors can be differentiated, as we contemplate their infinitesimal change from one point in spacetime to the next. Now, because of the constancy of the \(p\)s, it is clear from Eq. (8) that the derivative \( \frac{d}{d \tau} (A^{\cdots }_{\cdots }) \) of any 4-tensor \( A^{\cdots }_{\cdots } \) with respect to a scalar \( \tau \) is itself a 4-tensor of the same type. That is unsurprising. But a less expected result occurs with partial differentiation. Suppose we denote a partial derivative by a comma, as we did in Eq. (12). Applying the chain rule, as we did in Eq. (13), we now find from (8):

\[\tag{18} A^{\mu'\cdots \rho'}_{\sigma'\cdots \tau',\,\omega'} = A^{\mu\cdots \rho}_{\sigma\cdots \tau,\,\omega}\; p^{\mu'}_{\mu} \,p^{\sigma}_{\sigma'}\cdots p^{\rho'}_{\rho}\,p^{\tau}_{\tau'}\,p^{\omega}_{\omega'}.\]

Thus partial differentiation makes tensors out of tensors, simply adding one extra covariant index to the tensor type. [Of course, Eq. (13) was a harbinger of this.] By a repetition of the argument, all higher-order partial derivatives, which we denote by a single comma:

\[ \tag{19}A^{\cdots }_{\cdots ,\,\mu \nu} = A^{\cdots }_{\cdots ,\,\mu,\,\nu}\;\;\;\text{etc.,} \]

are tensors as well, each derivative adding one more covariant index to the tensor type.

The metric tensor

We now come to a very basic tensor that serves as a link between covariance and contravariance, and which stems from the metric structure of spacetime. That structure, as we have seen, is determined by the Minkowskian metric Equation 15 of SR: mechanics, which can now be written as

\[ \tag{20} \Delta s^2 = g_{\mu \nu} \Delta x^{\mu} \Delta x^{\nu}, \] with \[ \tag{21} g_{11} = g_{22} = g_{33} = -g_{44} = -1, \;\;\;g_{\mu \nu} = 0 \;\; \text{when} \;\;\; \mu \neq \nu\]

in all IFs. To see that \(g_{\mu \nu}\) is, in fact, a covariant 4-tensor -- the metric tensor -- we need merely Lorentz-transform the \( \Delta x^{\mu}\) in Eq. (20) , which we know to be tensorial:

\[ \tag{22} \Delta s^2 = g_{\mu \nu}\, p^{\mu}_{\mu'}\,p^{\nu}_{\nu'}\, \Delta x^{\mu'}\,\Delta x^{\nu'} = g_{\mu'\nu'}\, \Delta x^{\mu'}\,\Delta x^{\nu'};\]

comparing coefficients shows \( g_{\mu \nu}\) to be a tensor. We shall also need the quantities \(g^{\mu \nu}\) defined by

\[ \tag{23} g^{\mu \nu}g_{\nu \rho} = \delta^{\,\mu}_{\,\rho}. \]

In matrix language, these are the elements of the matrix inverse to \( g_{\mu \nu}\). It is easy to verify that numerically they are identical to the \(g_{\mu \nu} \) :

\[ \tag{24} g_{\mu \nu} = g^{\mu \nu} = \;\text{diag}(-1,-1,-1,1). \]

Moreover, they constitute a contravariant tensor. For, in any $S'$, they are defined by

\[\tag{25} g^{\mu' \nu'} g_{\nu' \rho'} = \delta^{\,\mu'}_{\,\rho'};\]

but this equation must also be satisfied by the tensor transforms of the \(g^{\mu \nu}\) , since the other two members of Eq. (23) transform tensorially. Hence the \( g^{\mu' \nu'}\) are these tensor transforms. We now have the two metric tensors (24) available for the operations of raising and lowering indices. For example, given a contravariant vector \( A^{\mu}\), we define its covariant components \(A_{\mu}\) as follows:

\[ \tag{26} A_{\mu} = g_{\mu \nu} A^{\nu} .\]

Conversely, given a covariant vector \(B_{\mu}\) , we define its contravariant components \( B^{\mu} \) thus:

\[ \tag{27} B^{\mu} = g^{\mu \nu} B_{\nu}. \]

These operations are consistent, in that raising a lowered index (or vice versa) restores the original:

\[ \tag{28} A^{\mu} = g^{\mu\nu}A_{\nu} = g^{\mu\nu}g_{\nu\rho}A^{\rho} = \delta^{\,\mu}_{\,\rho} A^{\rho} = A^{\mu}. \]

Consistency extends to differentiation: given \( A^{\mu}\), the symbol \(A_{\mu,\,\nu}\) could mean that we differentiated and then lowered \(\mu\), or vice versa; but because the \(g\)s are constant, there is no difference.

Raising and lowering of indices can be extended to tensors with any number of indices, and can be applied multiply. When such operations are anticipated, we must write the indices in staggered form, e.g., \( {{A^{\mu}}_{\nu}}^{\rho} \) ; then, for example,

\[ \tag{29} A_{\mu \nu \rho} = g_{\mu \alpha}g_{\rho \beta} {{A^{\alpha}}_{\nu}}^{\beta}\;,\;\; {A^{\mu \nu}}_{\rho} = g^{\nu \alpha}g_{\rho \beta} {{A^{\mu}}_{\alpha}}^{\beta}\;,\;\;\; \text{etc.} \]

(It sometimes helps the eye if – as in (29) – we denote dummy index pairs by letters from a different part of the alphabet, e.g., \(\alpha,\beta,\cdots \).) We regard all these versions of \( {{A^{\mu}}_{\nu}}^{\rho} \) as denoting the same tensorial object; for purely formal reasons the one or the other form of its components may at times be preferred. It is of interest to see the product \(\mathbf{A}\mathop{\mathbf{.}}\mathbf{B}\) (cf Equation 18 of SR: mechanics) of two 4-vectors expressed in terms of the metric tensor:

\[ \tag{30} \mathbf{A}\mathop{\mathbf{.}}\mathbf{B} = g_{\mu \nu}A^{\mu}B^{\nu} = A^{\mu}B_{\mu} = A_{\mu}B^{\mu}. \]

The last two expressions illustrate a clearly universal rule: one can always “see-saw” any dummy-index pair.

What happens if we raise an index on \( g_{\mu \nu} \) itself? We get a delta:

\[ \tag{31} g^{\mu}_{\nu} = g^{\mu \alpha}g_{\alpha \nu} = \delta^{\,\mu}_{\,\nu}. \]

In fact, some authors write all \(\delta\)s as \(g\)s.

Evidently the same free index in each term of a tensor equation can be shifted up or down at will. Hence, in particular, symmetries in the covariant indices hold equally in the contravariant indices and vice versa; for example,

\[ \tag{32}A_{\mu \nu \rho}+A_{\nu \rho \mu} + A_{\rho \mu \nu} = 0 \Longleftrightarrow A^{\mu \nu \rho}+A^{\nu \rho \mu} + A^{\rho \mu \nu} = 0. \]

Lastly, consider what happens when we raise or lower an individual numerical index in the full list of components of a tensor. The rule is very simple: raising or lowering a 4 has no numerical effect, while raising or lowering a 1, 2, or 3 changes the sign of the component:

\[ \tag{33}A_i = g_{i\mu}A^{\mu} = -A^i \;\;(i=1,2,3),\quad A_4 = g_{4\mu}A^{\mu} = A^4 . \]

As we have seen above, 4-vectors are 4-tensors of type \( A^{\mu}\). Earlier we also saw (cf Equation 14 of SR: mechanics) that 4-vectors can be written in the form \( \mathbf{A} = (\mathbf{a}, A_4) \). What is then the covariant form? By (33),

\[ \tag{34} A^{\mu} = (\mathbf{a},A_4) \Longleftrightarrow A_{\mu} = (-\mathbf{a},A_4).\]
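
These sign rules are mechanical enough to verify in a few lines. The following NumPy sketch of Eqs. (24), (26), (28) and (34) uses arbitrary components and the \((x,y,z,ct)\) ordering (indices 0–3 in code):

```python
import numpy as np

g = np.diag([-1.0, -1.0, -1.0, 1.0])  # g_{mu nu}, Eq. (21)
g_inv = np.linalg.inv(g)              # g^{mu nu}, Eq. (23)
assert np.allclose(g_inv, g)          # Eq. (24): numerically identical

A_up = np.array([1.0, 2.0, 3.0, 4.0]) # A^{mu} = (a, A_4), arbitrary values
A_down = g @ A_up                     # Eq. (26): A_mu = g_{mu nu} A^{nu}

# Eq. (34): lowering flips the sign of the spatial part only.
assert np.allclose(A_down, [-1.0, -2.0, -3.0, 4.0])

# Eq. (28): raising the lowered index restores the original.
assert np.allclose(g_inv @ A_down, A_up)
```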

Maxwell’s Theory in Tensor Form

The commonly used SI system of units for electrodynamics is extremely inconvenient when it comes to its relativistic formulation, since it masks the inherent pseudo-symmetry between the electric and magnetic fields. Accordingly we here use the older Gaussian or cgs (centimeter-gram-second) system. Maxwell’s equations then read as follows: \[ \tag{35} \nabla \mathop{\mathbf{.}} \mathbf{e} = 4 \pi \rho,\quad \nabla \times \mathbf{b} = \frac{1}{c} \frac{\partial \mathbf{e}}{\partial t} + \frac{4\pi}{c}\mathbf{j} \] \[\tag{36} \nabla \mathop{\mathbf{.}} \mathbf{b} = 0,\quad \nabla \times \mathbf{e} = -\frac{1}{c} \frac{\partial \mathbf{b}}{\partial t}. \] The first equation in each line is a scalar equation, while the second is a 3-vector equation, giving us eight differential equations in all. The first two equations connect the electric and magnetic fields, here denoted by \(\mathbf{e}\) and \(\mathbf{b}\), to the sources – the charge density \(\rho \) and the current density \(\mathbf{j} \). Equations (36), on the other hand, represent restrictions on the field which, as we shall see, amount to the necessary and sufficient conditions for the existence of the usual potentials. Note that Maxwell’s equations are linear, which has the consequence that two solutions \( (\mathbf{e},\mathbf{b},\rho,\mathbf{j})\) can be simply added to form a third. None of these equations says anything about the action of the field on the motion of a charged particle. That is the role of Lorentz’s force law:

\[\tag{37} \mathbf{f} = q\, \Bigl(\mathbf{e} + \frac{1}{c}\mathbf{u}\times\mathbf{b}\Bigr), \]

in which \( q\) is the charge and \(\mathbf{u} \) the velocity of the particle. This law also provides a practical way to determine \( \mathbf{e}\) and \(\mathbf{b}\). Unlike the mass \(m_u\), the charge \( q \) is an invariant. We shall find Eq. (37) to be a convenient starting point for the 4-dimensional formulation of the theory. We have already noted (cf after Equation 68 of SR: mechanics) that in general a relativistic force must involve the velocity of the particle on which it acts. From a 4-dimensional point of view, the simplest dependence of a 4-force \(F^{\mu} \) on the 4-velocity \( U^{\mu}\) of the particle is a linear one: \(F^{\mu} = A^{\mu}_{\nu} U^{\nu}\), where the \( A^{\mu}_{\nu}\) are tensorial coefficients. [Of course, Eq. (37) is linear in the 3-velocity, too.] Let us also assume that the force is proportional to the charge \( q\). Then, lowering the index \(\mu \) and introducing a factor \(c \) for later convenience, we can “guess” the tensor equation

\[ \tag{38} F_{\mu} = \frac{q}{c} E_{\mu \nu} U^{\nu},\]

thereby introducing the electromagnetic field tensor \( E_{\mu \nu}\) . We would surely want the force \( F^{\mu}\) to be rest-mass preserving, which, according to Equation 64 of SR: mechanics and (30), requires \(F_{\mu}U^{\mu}=0\) . So we need

\[ \tag{39} E_{\mu \nu} U^{\mu}U^{\nu} = 0 \]

for all \(U^{\mu}\), and hence the anti-symmetry of the field tensor:

\[ \tag{40} E_{\mu \nu} = -E_{\nu \mu}.\]

Now, anti-symmetric 4-tensors have the pleasant property that their six independent components split into two sets of three which transform as 3-vectors under rotations. [Compare this with the first three components of 4-vectors transforming as 3-vectors under rotations.] We can define these two 3-vectors in the case of \(E_{\mu \nu}\) as \(\mathbf{e}\) and \(\mathbf{b}\) in every inertial frame:

\[\tag{41}E_{\mu \nu}= \begin{pmatrix} 0&-b_3&b_2&-e_1\\ b_3&0&-b_1&-e_2 \\ -b_2&b_1& 0& -e_3\\ e_1&e_2&e_3&0 \end{pmatrix},\qquad E^{\mu \nu}= \begin{pmatrix} 0&-b_3&b_2&e_1\\ b_3&0&-b_1&e_2 \\ -b_2&b_1& 0& e_3\\ -e_1&-e_2&-e_3&0 \end{pmatrix}. \]

We exhibit the contravariant form for future reference. Whenever 2-index tensor components are exhibited as a matrix, the first or top index denotes the row.
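
The passage between the two forms in (41) is a good occasion to exercise the index-raising rules of the previous section. A NumPy sketch with arbitrary field values (indices 0–3 in code, \((x,y,z,ct)\) ordering):

```python
import numpy as np

def field_tensor(e, b):
    """E_{mu nu} built from the 3-vectors e and b per Eq. (41)."""
    e1, e2, e3 = e
    b1, b2, b3 = b
    return np.array([[0, -b3, b2, -e1],
                     [b3, 0, -b1, -e2],
                     [-b2, b1, 0, -e3],
                     [e1, e2, e3, 0.0]])

e, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # illustrative field values
E_low = field_tensor(e, b)
assert np.allclose(E_low, -E_low.T)       # anti-symmetry, Eq. (40)

# Raise both indices: E^{mu nu} = g^{mu a} g^{nu b} E_{ab} (g is its own inverse).
g = np.diag([-1.0, -1.0, -1.0, 1.0])
E_up = g @ E_low @ g

# Spatial-spatial entries keep their sign (two sign flips); the space-time
# entries flip once -- exactly the pattern displayed in Eq. (41).
assert np.allclose(E_up[:3, :3], E_low[:3, :3])
assert np.allclose(E_up[:3, 3], -E_low[:3, 3])
```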

We can now verify that our “suspected” tensor law (38) is completely equivalent to Lorentz’s force law (37). By reference to Equation 27 of SR: mechanics, Eq. (38) can be written as

\[ F_{\mu} = (q/c)\; \gamma(u)\; E_{\mu \nu}\; (\mathbf{u},c)^{\nu},\]

where the last factor denotes the \(\nu \)th component of \((\mathbf{u},c).\) When we perform the implied summation over \( \nu\), this becomes

\[ F_{\mu} = (q/c)\, \gamma(u)\,\bigl(-b_3u_2 + b_2u_3-ce_1,b_3u_1-b_1u_3-ce_2,-b_2u_1+b_1u_2-ce_3,\mathbf{e}\mathop{\mathbf{.}}\mathbf{u}\bigr),\]

which is at once recognized as the covariant version [cf (34)] of the \( \mathbf{F}\) as given in Equation 64 of SR: mechanics with \(\mathbf{f} \) precisely the Lorentz force (37).
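
This equivalence can also be confirmed numerically. The NumPy sketch below (units with \(c = 1\) and arbitrary values of \(q\), \(\mathbf{u}\), \(\mathbf{e}\), \(\mathbf{b}\) assumed purely for illustration) evaluates \(F_{\mu}\) from Eq. (38) and compares it with the Lorentz 3-force of Eq. (37):

```python
import numpy as np

c, q = 1.0, 2.0
u = np.array([0.1, 0.2, 0.3])                 # particle 3-velocity
e = np.array([1.0, -1.0, 0.5])                # electric field
b = np.array([0.2, 0.4, -0.6])                # magnetic field
gamma = 1.0 / np.sqrt(1.0 - (u @ u) / c**2)

e1, e2, e3 = e
b1, b2, b3 = b
E_low = np.array([[0, -b3, b2, -e1],          # E_{mu nu}, Eq. (41)
                  [b3, 0, -b1, -e2],
                  [-b2, b1, 0, -e3],
                  [e1, e2, e3, 0.0]])
U = gamma * np.array([*u, c])                 # U^{mu} = gamma(u) (u, c)

F_low = (q / c) * E_low @ U                   # F_mu, Eq. (38)
f = q * (e + np.cross(u, b) / c)              # Lorentz 3-force, Eq. (37)

assert np.allclose(F_low[:3], -gamma * f)     # covariant spatial part: -gamma f, cf (34)
assert np.isclose(F_low[3], gamma * q * (e @ u) / c)  # the power component
assert np.isclose(F_low @ U, 0.0)             # rest-mass preserving, Eq. (39)
```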

Next we consider the sources. Since Maxwell’s equations are differential equations, we contemplate a continuous distribution of charges, and, at first, one that has a unique 3-velocity \(\mathbf{u} \) at each event. We define the proper charge density \( \rho_0\) of this continuum as the charge density \(\rho\) measured in the local (comoving, inertial) rest-frame. Then in the lab frame, where the charges move with velocity \(\mathbf{u} \), we shall have, because of length contraction,

\[ \tag{42} \rho = \rho_0 \gamma(u). \]

We define the 3-current density as

\[ \tag{43} \mathbf{j} = \rho \mathbf{u},\]

and recall that the conservation of charge is expressed by the following equation of continuity:

\[ \tag{44} \frac{\partial \rho}{\partial t} + \nabla\mathop{\mathbf{.}}\mathbf{j} = 0 . \]

The 4-current density \(\mathbf{J} \) is defined by the first of the following equations (provisionally – hence the brackets):

\[\tag{45} \mathbf{J} = [\;\rho_0 \mathbf{U} = \rho_0 \gamma(u) (\mathbf{u},c)\;] = (\mathbf{j},c\rho), \]

and it allows us to express the equation of continuity (44) 4-dimensionally:

\[\tag{46} J^{\mu}_{,\,\mu} = 0.\]

In real life, as in a current-carrying copper wire, the local velocities of the charges are not all the same. But we assume that we can divide them into classes which do have unique velocities. We then define the effective \(\rho \), \( \mathbf{j}\) , and \( \mathbf{J}\) as the sum of the \( \rho\), \( \mathbf{j}\) , and \(\,\mathbf{J} \) of all the classes. The effective \( \mathbf{J}\), being then a sum of 4-vectors, is itself a 4-vector, and is related to the effective \( \rho\) and \( \mathbf{j}\) as in the extremities of Eq. (45). Eqs. (44) and (46) are still valid for the effective quantities. But the two middle terms of (45) do not generalize: the effective \( \mathbf{J}\) is not necessarily timelike, as illustrated by the sum of two currents with equal and opposite \( \rho \).
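 
That the effective \(\mathbf{J}\) need not be timelike is easy to see numerically. A minimal sketch, assuming \(c = 1\) and two streams of proper density \(\pm 1\) drifting oppositely at \(0.5c\) (all values illustrative):

```python
import numpy as np

def J(rho0, ux):
    """4-current rho0 U^mu = (j, c rho) of a stream drifting at ux (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - ux**2)
    return rho0 * gamma * np.array([ux, 0.0, 0.0, 1.0])

# Equal and opposite charge densities, opposite drifts: net current, zero net charge.
J_total = J(+1.0, 0.5) + J(-1.0, -0.5)

g = np.diag([-1.0, -1.0, -1.0, 1.0])
norm2 = J_total @ g @ J_total          # the invariant J_mu J^mu
assert norm2 < 0                       # spacelike effective 4-current
```

Each stream's own \(\mathbf{J}\) is timelike (it is \(\rho_0 U^{\mu}\)), but their sum here has vanishing charge component and a non-zero current, hence \(J_{\mu}J^{\mu} < 0\).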

With these preliminaries out of the way, we are ready to “guess” the 4-tensor forms of Maxwell’s equations:

\[\tag{47} E^{\mu \nu}_{,\,\mu} = \frac{4 \pi}{c} J^{\nu} \]


\[\tag{48}E_{\mu \nu,\, \sigma} + E_{\nu \sigma,\, \mu} + E_{\sigma \mu,\, \nu} = 0. \]

By reference to the definitions (41) one easily sees that the first three of the four equations (47) (\( \nu = 1,2,3\;\)) are indeed equivalent to Maxwell’s equations (35)(ii), while the fourth \((\nu = 4) \) is equivalent to (35)(i). Giving values 1,2,3, respectively, to the indices \(\mu,\nu,\sigma \) in (48) results in Maxwell’s equation (36)(i), while the values 2,3,4; 3,4,1; 4,1,2 give the three components of (36)(ii). All other sets of values either yield one of the equations already obtained or 0=0.

It is very satisfactory that the field equation (47) immediately implies the equation of continuity (46), since \(E^{\mu \nu}_{,\,\mu \nu} \) is symmetric in its subscripts but antisymmetric in its superscripts, and must therefore vanish (\( E^{12}_{,\,12} = - E^{21}_{,\,21}\) etc.). This, in 3-dimensional form, had been Maxwell’s reason for adding the “displacement current” \( \partial \mathbf{e}/\partial t\) to his equations, which had so puzzled some of his contemporaries.
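
The vanishing of \(E^{\mu \nu}_{,\,\mu \nu}\) rests on a purely algebraic fact: contracting an antisymmetric index pair against a symmetric one gives zero. A minimal NumPy check, with random matrices standing in for \(E^{\mu\nu}\) and the symmetric second partials:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = M - M.T     # antisymmetric, like E^{mu nu}
S = M + M.T     # symmetric, like the second-derivative pair of indices

# A^{mu nu} S_{mu nu} = 0: each term cancels against its index-swapped partner.
assert np.isclose(np.einsum('mn,mn->', A, S), 0.0)
```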

The 4-potential

The electromagnetic field tensor \( E_{\mu \nu}\) allows itself to be expressed in terms of a covariant 4-vector potential \( \Phi_{\mu}\) as follows (the overall choice of sign being conventional):

\[ \tag{49}E_{\mu \nu} = \Phi_{\nu,\,\mu} - \Phi_{\mu,\,\nu}.\]

This \(\Phi_{\mu} \) will presently be seen to correspond to the usual scalar and vector potentials which provide such a convenient mathematical tool for solving Maxwell’s equations. Of course, (49) immediately implies Maxwell’s equation (48). But in the theory of differential equations the converse is also known: Eq. (48) is precisely the necessary and sufficient condition for being able to express \(E_{\mu \nu} \) as in (49). [This, incidentally, is analogous to \(g_{i,j} = g_{j,i} \) being the necessary and sufficient condition for the existence of a scalar potential \(\Phi \) such that \(g_i = \Phi_{,\,i} = \nabla \Phi \) -- which actually holds in all dimensions.]

Although the potential \( \Phi_{ \mu}\) turns out to be not uniquely determined by \( E_{\mu \nu}\), picking any potential \(\Phi_{ \mu} \) in one IF and its tensor transforms in all other IFs clearly guarantees (49) in all IFs. We may therefore take \( \Phi_{ \mu}\) to be a tensor. Any two potentials \( \Phi_{ \mu}\) and \( \tilde{\Phi}_{ \mu}\) satisfying (49) can differ by at most a gradient. For if \( \Phi_{ \mu} - \tilde{\Phi}_{ \mu} = \Psi_{\mu} \), we must have \( \Psi_{\nu,\,\mu} - \Psi_{\mu,\,\nu} = 0\), whence \( \Psi_{\mu} = \Psi_{ ,\,\mu}\) for some scalar \( \Psi\).

Now, given an arbitrary potential \( \tilde{\Phi}_{\mu} \) , we can always find another, \( \Phi_{ \mu}\) , which satisfies the so-called Lorenz (not Lorentz!) gauge condition

\[ \tag{50}\Phi^{\mu}_{ ,\,\mu} = 0. \]

It is merely necessary to find a scalar \(\Psi \) such that

\[\tag{51} \square \Psi = - \tilde{\Phi}^{\mu}_{ ,\,\mu}, \]

where \(\square \) is the D’Alembertian operator defined as

\[ \tag{52}\square \equiv \,{}_{ ,\,\mu \nu}\, g^{\mu \nu} = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}.\]

[Eq. (51) is in principle solvable, see after (55) below.] For then, with \( \Phi_{ \mu} = \tilde{\Phi}_{ \mu} + \Psi_{ ,\,\mu}\),

\[ \tag{53} \Phi^{\mu}_{,\, \mu} = \tilde{\Phi}^{\mu}_{,\,\mu} + \Psi_{,\,\mu}{}^{\mu} = \tilde{\Phi}^{\mu}_{,\,\mu} + g^{\mu \nu} \Psi_{ ,\,\mu \nu} = 0. \]

If we substitute (49) with (50) into the first of the tensor field equations, (47), which is now all that remains to be satisfied, we find that it reduces to

\[ \tag{54}\square \Phi_{ \mu} = \frac{4\pi}{c} J_{\mu} . \]

The Lorenz gauge has decoupled the field equations! Finally, the standard solution of (54) is

\[\tag{55}\Phi_{ \mu} (P) = \frac{1}{c} \int \frac{[J_{\mu}] dV}{r}, \]

provided \( J_{\mu}\) is “sufficiently small” at infinity; here [\(J_{\mu} \)] denotes the value of \( J_{\mu}\) “retarded” by the light travel time to the origin P from the position \( \mathbf{r}\) of \(dV\), and the integral extends over all of 3-space. [Eq. (51) is solved analogously.] Note how the field is “built up” at the speed of light. It can be shown that the solution (55) (i) automatically satisfies the Lorenz gauge condition (50) because of the equation of continuity (46), (ii) is tensorial, and (iii) is unique in the absence of “incoming radiation”.

In charge-free regions (\(J_{\mu} = 0 \)), by (54), we have \(\square \Phi_{ \mu} = 0 \), and with that, using (49) and the commutativity of partial derivatives,

\[ \tag{56} \square E_{\mu \nu} = 0 .\]

This is the wave equation. Hence disturbances of the field propagate in vacuum at the speed of light. This result was the basis for Maxwell’s hypothesis that light consisted of electromagnetic waves.
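A quick symbolic check of the propagation claim: with \(x^4 = ct\), any profile translating at speed \(c\) annihilates the d'Alembertian. The check below uses a scalar \(F\); each component of \(E_{\mu\nu}\) in (56) behaves the same way.

```python
# Symbolic check that an arbitrary profile moving at speed c satisfies the
# wave equation: with x4 = ct, a scalar F(x - x4) has box F = 0.
import sympy as sp

x, x4 = sp.symbols('x x4')            # x4 = ct
F = sp.Function('F')
f = F(x - x4)                         # arbitrary disturbance moving at speed c

# One-dimensional d'Alembertian of Eq. (52): d^2/d(ct)^2 - d^2/dx^2
box_f = sp.diff(f, x4, 2) - sp.diff(f, x, 2)
```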

We finally translate our 4-dimensional work on the potential back into the more familiar 3-dimensional formalism. We define the Maxwell scalar and vector potentials \( \phi \), \(\mathbf{w}\) (often written as \(\phi,\mathbf{A} \) or \( V,\mathbf{A}\) ) by

\[\tag{57} \Phi^{\mu} = (\mathbf{w},\phi) \Leftrightarrow \Phi_{\mu} = (-\mathbf{w},\phi). \]

The expression (49) of the field in terms of the potential then becomes

\[ \tag{58}\mathbf{e} = -\nabla \phi - \frac{1}{c}\frac{\partial \mathbf{w}}{\partial t} , \;\;\; \mathbf{b} = \nabla \times \mathbf{w}. \]

And from equations (50) and (54) we find

\[ \tag{59} \frac{\partial \phi} {\partial t} + c \nabla\mathop{\mathbf{.}}\mathbf{w} = 0 \qquad \text{(Lorenz gauge condition)} \] \[ \tag{60} \square \phi = 4 \pi \rho, \; \; \square \mathbf{w} = \frac{4 \pi}{c} \mathbf{j} \qquad \text{(field equations)}. \]
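The representation (58) makes the source-free half of Maxwell's equations (in Gaussian units, \(\nabla\cdot\mathbf{b} = 0\) and \(\nabla\times\mathbf{e} + \frac{1}{c}\partial\mathbf{b}/\partial t = 0\)) hold identically, for any \(\phi\) and \(\mathbf{w}\); a minimal symbolic sketch of that fact:

```python
# Sketch: e and b built from potentials as in (58) automatically satisfy
# div b = 0 and curl e + (1/c) db/dt = 0, for ANY phi and w.
import sympy as sp

t, x, y, z, c = sp.symbols('t x y z c')
phi = sp.Function('phi')(t, x, y, z)
w = [sp.Function(f'w{i+1}')(t, x, y, z) for i in range(3)]
R = (x, y, z)

def curl(v):
    return [sp.diff(v[2], y) - sp.diff(v[1], z),
            sp.diff(v[0], z) - sp.diff(v[2], x),
            sp.diff(v[1], x) - sp.diff(v[0], y)]

# Eq. (58):  e = -grad phi - (1/c) dw/dt,  b = curl w
e = [-sp.diff(phi, R[i]) - sp.diff(w[i], t) / c for i in range(3)]
b = curl(w)

div_b = sp.simplify(sum(sp.diff(b[i], R[i]) for i in range(3)))
faraday = [sp.simplify(curl(e)[i] + sp.diff(b[i], t) / c) for i in range(3)]
```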

Transformation of \(\mathbf{e} \) and \( \mathbf{b}\)

In Newton’s theory, a mass point carries its isotropic inverse-square field in all its movements – gravity is instantaneous and force is invariant. The situation in electromagnetism is more complicated. It is the tensor property of \(E^{\mu \nu} \) that directly gives us the transformation of \(\mathbf{e} \) and \( \mathbf{b}\) – let us say under a standard LT between the usual two IFs $S$ and $S'$ in standard configuration. Consider, for example, \( E^{1'2'}\), the (1,2) component of \(E^{\mu'\nu'} \). From (8) we have

\[\tag{61}E^{1'2'} = E^{\mu \nu}\, p^{1'}_{\mu}\, p^{2'}_{\nu},\]

while the non-zero \( p\)s can be read off from Equation 4 of SR: Kinematics:

\[\tag{62} p^{1'}_1 = p^{4'}_4 = \gamma, \;\;\; p^{1'}_4 = p^{4'}_1 = -\gamma\; v/c, \;\;\; p_{2}^{2'} = p_3^{3'} = 1.\]

This yields

\[\tag{63}E^{1'2'} = E^{\mu \nu}\, p^{1'}_{\mu}\, p^{2'}_{\nu} = E^{\mu 2}\, p^{1'}_{\mu} = \gamma\;\bigl( E^{12} - \frac{v}{c} E^{42}\bigr),\]

or, by reference to (41),

\[\tag{64}-b_{3'} = \gamma\;(-b_3 + v e_2/c).\]

In the same way we obtain the other entries in the following list:

\[e_{1'} = e_1, \;\;\; e_{2'} = \gamma\;(e_2 - v\, b_3/c),\;\;\; e_{3'} = \gamma\;(e_3 + v\, b_2/c). \] \[\tag{65} b_{1'} = b_1, \;\;\; b_{2'} = \gamma\;(b_{2} + v\,e_3/c), \;\;\; b_{3'} = \gamma\;(b_3 - v\, e_2/c). \]
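The list above can be checked numerically against the tensor transformation (61)-(62). The component layout of \(E^{\mu\nu}\) used below (\(E^{i4} = e_i\), \(E^{12} = -b_3\) cyclically, index 4 the time index) is taken to be the one implied by (63)-(64); Eq. (41) itself lies outside this excerpt, so treat the layout as an assumption.

```python
# Numerical sketch: the matrix transformation E' = L E L^T, with L the boost
# coefficients of Eq. (62), reproduces the component list (65).  Units c = 1.
import numpy as np

def field_tensor(e, b):
    """Assumed layout of E^{mu nu}: E^{i4} = e_i, E^{12} = -b_3 cyclically."""
    e1, e2, e3 = e; b1, b2, b3 = b
    return np.array([[0.0, -b3,  b2, e1],
                     [ b3, 0.0, -b1, e2],
                     [-b2,  b1, 0.0, e3],
                     [-e1, -e2, -e3, 0.0]])

def boost(beta):
    """The p^{mu'}_nu of Eq. (62): x-boost with speed beta, c = 1."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = g
    L[0, 3] = L[3, 0] = -g * beta
    return L

rng = np.random.default_rng(0)
e, b = rng.normal(size=3), rng.normal(size=3)   # arbitrary field values
beta = 0.6
gam = 1.0 / np.sqrt(1.0 - beta**2)

L = boost(beta)
Ep = L @ field_tensor(e, b) @ L.T     # E^{mu' nu'} = p^{mu'}_mu p^{nu'}_nu E^{mu nu}

# Primed fields as listed in (65), units with c = 1:
e_prime = [e[0], gam * (e[1] - beta * b[2]), gam * (e[2] + beta * b[1])]
b_prime = [b[0], gam * (b[1] + beta * e[2]), gam * (b[2] - beta * e[1])]

assert np.allclose(Ep, field_tensor(e_prime, b_prime))
```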

As usual, the inverse transformation is obtained by a \( v\)-reversal [see the paragraph before Eq. 5 of SR: Kinematics]. It is now clear that there is no such thing as a “pure” \(\mathbf{e}\)-field or a “pure” \(\mathbf{b}\)-field in Maxwell’s theory: even if all the \(\mathbf{b}\,\)s [ \(\mathbf{e}\,\)s ] are zero in one IF $S$, they will not all be zero in $S'$. There are two quantities, however, which are invariant from frame to frame, namely

\[\tag{66}X = \mathbf{b}^2 - \mathbf{e}^2,\;\;\; \text{and} \;\;\; Y = \mathbf{e}\mathop{\mathbf{.}}\mathbf{b},\]

as can be readily verified from (65). [There are more elegant tensor proofs of this invariance, e.g. $X=\frac{1}{2}E_{\mu \nu}E^{\mu \nu}$ and $Y=\frac{1}{4}E_{\mu \nu}B^{\mu \nu}$, where $B^{\mu \nu}$ is the "dual" of $E^{\mu \nu}$, not discussed here.] Since $X,Y$ are clearly invariant also under rotations and translations, they are invariant under general LTs. It follows that each of the three relations \(\vert \mathbf{e} \vert > \vert \mathbf{b} \vert,\) \(\vert \mathbf{e} \vert = \vert \mathbf{b} \vert,\) \(\vert \mathbf{e} \vert < \vert \mathbf{b} \vert\) is invariant, as is the acuteness or obtuseness of the angle between \( \mathbf{e}\) and \( \mathbf{b}\) (according to the sign of \(\mathbf{e}\mathop{\mathbf{.}}\mathbf{b}\)) and, in particular, their orthogonality.
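The verification "from (65)" mentioned above can also be done numerically; the sketch below, in units with \(c = 1\) and with arbitrary field values, checks both invariants for several boost speeds.

```python
# Numerical check that X = b^2 - e^2 and Y = e.b of (66) come out the same
# in S and S', using the transformation list (65) directly.  Units c = 1.
import numpy as np

rng = np.random.default_rng(1)
e, b = rng.normal(size=3), rng.normal(size=3)   # arbitrary field values
X0, Y0 = b @ b - e @ e, e @ b                   # the invariants in S

Xp, Yp = [], []
for beta in (0.1, 0.5, 0.9):
    g = 1.0 / np.sqrt(1.0 - beta**2)
    ep = np.array([e[0], g * (e[1] - beta * b[2]), g * (e[2] + beta * b[1])])
    bp = np.array([b[0], g * (b[1] + beta * e[2]), g * (b[2] - beta * e[1])])
    Xp.append(bp @ bp - ep @ ep)                # X evaluated in S'
    Yp.append(ep @ bp)                          # Y evaluated in S'
```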

A typically relativistic use of the transformation equations (65) is to find the fields of a uniform straight current and of a uniformly moving point charge, by “translating” from the respective rest frames of the charges. SR offers this possibility of “frame-hopping”, that is, of solving the problem in a special IF where conditions are simpler, and then translating the result to the frame of interest. Spotting a tensor equation valid in the special frame and re-interpreting it in the lab frame is perhaps the neatest way of doing this, but not always feasible. The more pedestrian method of direct translation is perfectly acceptable and often more straightforward.

Consider an infinite straight current \( I\) as an infinite line of charge moving at uniform velocity \(v\) through the lab frame $S$, and having line charge density \( \lambda\) in $S$, so that \( I = \lambda v\). This line charge has line density \( \lambda_0 = \lambda/ \gamma(v) \) in its rest frame $S'$, by length contraction. Now consider the field in $S'$. Because the charge is at rest, it must be purely electric, and by symmetry it must be radial. Then, by applying Gauss’s outflux theorem [the integral version of (35)(i)] to a cylinder having the line charge as axis, we find for the field strength in $S'$:

\[ \tag{67}e_0 = \frac{2 \lambda_0}{r}, \]

where \(r \) is the radial distance from the axis. Lastly we transform this to the lab frame $S$ by the inverse of (65). Taking \( e_0 = e_{2'}\) as the only nonvanishing field component at a typical point of $S'$, we find, for the only nonvanishing field components in $S$,

\[ \tag{68}e_2 = \frac{2\,\gamma\; \lambda_0}{r} = \frac{2 \,\lambda}{r},\;\;\; b_3 = \frac{2\, \gamma\; \lambda_0 v}{c\, r} = \frac{2\, I}{c\,r},\]

which indicates a radial \(\mathbf{e} \) field and a circular \( \mathbf{b}\) field. A real current in a copper wire corresponds to two superimposed line charges, one at rest and one in motion, with equal and opposite line charge densities in the lab (the total charge being zero). Hence the resultant \( \mathbf{e}\) field vanishes, while the \(\mathbf{b} \) field is as given in (68). The motion of the electrons in such a current is surprisingly slow (a few millimeters per second). Nevertheless the \(\mathbf{b} \) field arises from this velocity. And it is the workhorse of electromagnetic machinery. As A.P. French has remarked, “Who says that relativity is important only for velocities close to the speed of light?” The smallness of the velocity is here balanced by the relative hugeness of the moving charge.
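The frame-hop of (67)-(68) is short enough to spell out numerically; the values below are arbitrary test numbers in Gaussian units.

```python
# Frame-hopping check for the line current, Eqs. (67)-(68): transform the
# purely radial rest-frame field e0 = 2*lam0/r back to the lab frame with
# the inverse of (65) and compare with the direct lab-frame formulas.
import math

lam0, c, r = 2.0, 3e10, 5.0          # rest density (esu/cm), c (cm/s), r (cm)
v = 0.01 * c                         # drift speed of the line charge
gam = 1.0 / math.sqrt(1.0 - (v / c) ** 2)

e2_prime = 2.0 * lam0 / r            # Eq. (67): radial field in rest frame S'

# Inverse of (65) (a v-reversal): e2 = gam*(e2' + v b3'/c), b3 = gam*(b3' + v e2'/c)
e2 = gam * e2_prime                  # b3' = 0 in S'
b3 = gam * (v / c) * e2_prime

lam, I = gam * lam0, gam * lam0 * v  # lab density lam = gam*lam0; current I = lam*v
```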

We can use the same kind of argument to find the \( \mathbf{e}\) and \(\mathbf{b}\) fields of a point charge \( q\) moving at uniform velocity \( v\) through the lab frame $S$, say along the \( x\) axis. We now face an extra complication, namely that the lab field is not static, and we would wish to know what it is at a single instant in $S$, say at \( t = 0\), when the charge may be assumed to pass through the origin. In the rest frame $S'$, with the charge at the origin, we have the usual Coulomb field:

\[\tag{69} \mathbf{e}' = (q/{r'}^3)\;(x',y',z'),\;\;\; \mathbf{b}' = 0,\;\;\; {r'}^2 = {x'}^2 + {y'}^2 + {z'}^2 .\]

Lorentz-transforming the coordinates (cf. Equation 4 of SR: Kinematics) at \(t=0\) gives \((x',y',z') = (\gamma\; x,y,z)\). Then, using the second line of (65) with \(\mathbf{b}' = 0\), and the inverse of the first line, we find, successively,

\[\mathbf{b} = (v/c)\,(0,-e_3,e_2),\;\;\; \mathbf{e} = (e'_1,\gamma\; e'_2,\gamma\; e'_3) = (q\, \gamma/{r'}^3)\;(x,y,z),\] \[\tag{70} {r'}^2 = \gamma^2\, x^2 + y^2 + z^2 = \gamma^2\, r^2 - (\gamma^2-1)\,(y^2+z^2) = \gamma^2\,r^2\,\bigl[1-(v^2/c^2) \,\sin^2(\theta) \bigr],\]

where \(\theta\) is the angle between the vector \(\mathbf{r} = (x,y,z)\) and the \(x\) axis. Hence the result

\[\tag{71}\mathbf{b} = \frac{1}{c} \mathbf{v} \times \mathbf{e},\;\;\; \mathbf{e} = (q\,\mathbf{r})/\Bigl({\gamma^2 \,r^3\,\bigl[1-(v^2/c^2)\,\sin^2(\theta)\bigr]^{3/2}}\Bigr).\]

Note that the instantaneous electric field is still radial [in spite of the fact that it was “caused” before the charge arrived at the origin!] and that its strength still falls off as the inverse square of the distance along any radius. But the field is now stronger at the sides than in front and back. In fact, if we use the standard field-strength representation by the number of field lines crossing unit area (as is permitted by Gauss’s outflux theorem), one can prove the interesting result that the field line pattern of the moving charge is simply the isotropic Coulomb pattern shrunk by a Lorentz factor – as if by length contraction! [This result is perhaps not quite so surprising when we think of the analogous situation of a magnetic monopole and its field lines marked by iron filings – looked at from different frames.] The \(\mathbf{b}\) field is circular, and also strongest at the sides but, of course, zero on the line of motion.
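The field-line statement above rests on Gauss's theorem: although the field (71) is stronger at the sides and weaker fore-and-aft, its total flux through any sphere around the charge must still be \(4\pi q\). A quick numerical quadrature sketch of that fact:

```python
# Consistency check on the field (71) of a uniformly moving charge: the flux
# of e through a sphere around the charge is still 4*pi*q (Gauss's theorem),
# verified here by direct quadrature.  q and beta are arbitrary test values.
import math

q, beta = 1.7, 0.8                   # charge and v/c
g2 = 1.0 / (1.0 - beta ** 2)         # gamma^2

def e_radial(theta):
    """|e| from Eq. (71) at unit distance, angle theta from the motion."""
    return q / (g2 * (1.0 - beta ** 2 * math.sin(theta) ** 2) ** 1.5)

# flux = 2*pi * Integral_0^pi e_r sin(theta) d(theta), by the midpoint rule:
n = 20000
h = math.pi / n
flux = 2.0 * math.pi * sum(
    e_radial((k + 0.5) * h) * math.sin((k + 0.5) * h) * h for k in range(n))
```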

The electromagnetic energy tensor

A number of early researchers, beginning around the mid-1800s, had discovered various “mechanical” properties of the electromagnetic field (then still thought of as the ether) – its energy, its energy flux, its momentum, and then finally Maxwell, in 1873, its stress components. But it was one of Minkowski’s great achievements, in 1908, to incorporate all these mechanical properties into a single 4-tensor equation. In this way all the various conservation laws of continuum mechanics could be extended to interactions between charged matter and electromagnetic fields. For example, if we simultaneously release two oppositely charged particles from rest, they accelerate toward each other; where does their kinetic energy come from? If we look at the same situation from another frame, one particle accelerates before the other; where does its new momentum come from? There are good reasons for considering the field itself as able to exchange energy and momentum with matter, so that all the familiar conservation laws remain satisfied.

Consider a charged “fluid” in the presence of an electromagnetic field, but subject to no other external forces. We define the 4-force density \(\tilde{K}^{\mu}\) by a procedure analogous to that we used in the definition of \(J^{\mu}\) [see after (45)], namely by dividing the fluid into classes of unique \(\rho_0\) and \(\mathbf{u}\) (hence \(\mathbf{U}\)) and summing over all classes. For each class,\(\tilde{K}^{\mu}\) is the Lorentz force (38) on unit proper (co-moving) volume of fluid, and thus on a charge \(\rho_0\). Each partial \(\tilde{K}^{\mu}\) is a vector, hence so is the sum, the effective \(\tilde{K}^{\mu}\):

\[\tag{72} \tilde{K}^{\mu} = \sum \frac{\rho_0}{c} {E^{\mu}}_{\nu} U^{\nu} = \frac{1}{c} {E^{\mu}}_{\nu} J^{\nu}.\]

If we write \(\tilde{\mathbf{k}}\) for the partial 3-force per unit proper volume, we have, from Eq. 74 (iii) of SR: Mechanics,

\[\tag{73} \tilde{K}^{\mu} = \sum \,\gamma(u)\, (\tilde{\mathbf{k}},\tilde{\mathbf{k}}\mathop{\mathbf{.}}\mathbf{u}/c) = (\mathbf{k},\,c^{-1}\partial W / \partial t), \]

where \(\mathbf{k} \) is the effective (total) 3-force per unit lab volume, and \(\partial W/ \partial t \) the effective rate of work done by the field on unit lab volume of fluid; for each \( \gamma\) -factor converts from unit proper to unit lab volume. We shall eventually equate the RHS’s of (72) and (73).

But first, we eliminate via Maxwell’s Eq. (47) all reference to the sources from the RHS of (72), after lowering the index \( \mu \), and then begin a further conversion using the Leibniz rule backwards:

\[\tag{74} \tilde{K}_{\mu} = \frac{1}{4\pi}E_{\mu \nu}E^{\sigma\nu}_{ ,\,\sigma} = \frac{1}{4\pi} \left[(E_{\mu\nu} E^{\sigma \nu})_{ ,\, \sigma} - E_{\mu \nu,\,\sigma}E^{\sigma\nu} \right]. \]

The final aim is to turn the entire RHS into a derivative, starting with the last term:

\[\tag{75}E_{\mu \nu,\,\sigma}E^{\sigma \nu} = \frac{1}{2}(E_{\mu \nu,\,\sigma}- E_{\mu \sigma,\,\nu})E^{\sigma \nu} = \frac{1}{2}E_{\sigma \nu,\,\mu}E^{\sigma \nu} = \frac{1}{4} (E_{\sigma \nu} E^{\sigma \nu})_{ ,\,\mu}; \]

in the first step we used the antisymmetry of \( E^{\sigma \nu}\), in the second step the Maxwell (48), and finally the see-saw rule. Substituting in (74), we now have

\[ \tag{76}\tilde{K}^{\mu} = - M^{\mu \nu}_{,\,\nu}, \;\;\; M^{\mu \nu}:= \frac{1}{4\pi} \bigl( {E^{\mu}}_{\lambda} E^{\lambda \nu} + \frac{1}{4}\, g^{\mu \nu} E_{\lambda \rho}E^{\lambda \rho}\bigr).\]

The tensor \(M^{\mu \nu}\) which arises here, and which is easily shown to be symmetric and trace-free,

\[\tag{77} M^{\mu \nu} = M^{\nu \mu},\;\;\; {M^{\mu}}_{\mu} = 0,\]

is the fundamentally important (Minkowski-) energy tensor of the electromagnetic field. Also note, from (72) and (76)(i), that when \(J^{\mu} = 0\), it has zero divergence. For its components we find, directly from the definition and from (41), \begin{align} M^{44} &= \frac{1}{8\pi} (e^2+b^2) := \sigma &&\text{= energy density} \nonumber \\ c M^{4i} &= \frac{c}{4\pi} (\mathbf{e}\times\mathbf{b})_i := s_i &&\text{= energy-current density (Poynting vector)} \nonumber\\ &&&= c^2 \times \text{momentum density} \nonumber \\ \tag{78} M^{ij} &= -\frac{1}{4\pi}\left[ e_ie_j+b_ib_j + \frac{1}{2}g_{ij} (e^2+b^2)\right]:=p_{ij} &&=\text{Maxwell stress tensor}. \end{align} To justify the alleged physical significance of these terms, let us look at the separate components of Eq. (76)(i), using (72). First, after a sign change, \(\mu = 4\) gives

\[ \tag{79} -\frac{\partial W}{\partial t} = \frac{\partial \sigma}{\partial t} + \nabla\mathop{\mathbf{.}}\mathbf{s},\]

which leads us to identify \(\sigma\) with the energy density and \(\mathbf{s}\) with the energy-current density of the field. For, if \(\partial W /\partial t\) is the rate of work done by the field on the fluid in unit volume, \(-\partial W/\partial t\) can be regarded as the work done by the fluid on the field. And this should equal the rate of increase of energy, \(\partial \sigma/\partial t\), of the field in the unit volume, plus the outflux of field energy, \(\nabla . \mathbf{s}\), from that volume, in unit time.

Next, set \(\mu=i\) in (76)(i), again after a sign change. This gives

\[\tag{80} -k_i = \frac{\partial(c^{-2} s_i)}{\partial t} + \frac{\partial p_{ij}}{\partial x^{j}}.\]

Since \(\mathbf{k}\) is the force of the field on the fluid, \(-\mathbf{k}\) can be regarded as the force of the fluid on the field, and this should equal the rate at which field momentum is generated inside a unit volume. Accordingly we recognize \(c^{-2}s_i\) as the i-momentum density of the field, and \(p_{ij}(= p_{ji})\) as the flux of i-momentum in the j-direction. This flux is precisely Maxwell’s stress tensor of the electromagnetic field. Why? Because a momentum flux is equivalent to a force. If a machine gun fires bullets into a wooden block, that block experiences a force equal to the momentum absorbed per unit time. Maxwell accordingly regarded \(p_{ij}\) as the \(i\)-component of the total force which the field (!) on the negative side of a unit area normal to the \(j\)-direction exerts on the field on the positive side.
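The component identifications (78) can be confirmed numerically from the definition of \(M^{\mu\nu}\). The sketch below assumes the same component layout of \(E^{\mu\nu}\) as before (\(E^{i4} = e_i\), \(E^{12} = -b_3\) cyclically, metric \(\mathrm{diag}(-1,-1,-1,+1)\)); note that with \(g_{ij} = -\delta_{ij}\), the stress entry of (78) reads \(-\frac{1}{4\pi}[e_ie_j + b_ib_j - \frac{1}{2}\delta_{ij}(e^2+b^2)]\).

```python
# Sketch: build M^{mu nu} = (1/4pi)(E^mu_lam E^{lam nu} + (1/4) g^{mu nu} E.E)
# for random fields and confirm the identifications (78) plus the symmetry
# and tracelessness (77).  The E^{mu nu} layout is an assumed convention.
import numpy as np

def field_tensor(e, b):
    """Assumed layout of E^{mu nu}: E^{i4} = e_i, E^{12} = -b_3 cyclically."""
    e1, e2, e3 = e; b1, b2, b3 = b
    return np.array([[0.0, -b3,  b2, e1],
                     [ b3, 0.0, -b1, e2],
                     [-b2,  b1, 0.0, e3],
                     [-e1, -e2, -e3, 0.0]])

g = np.diag([-1.0, -1.0, -1.0, 1.0])     # metric, index 4 = time

rng = np.random.default_rng(3)
e, b = rng.normal(size=3), rng.normal(size=3)

E_up = field_tensor(e, b)                # E^{mu nu}
E_dn = g @ E_up @ g                      # E_{mu nu}
inv = np.sum(E_dn * E_up)                # E_{lam rho} E^{lam rho} = 2(b^2 - e^2)

M = ((E_up @ g) @ E_up + 0.25 * g * inv) / (4.0 * np.pi)

sigma = (e @ e + b @ b) / (8.0 * np.pi)          # energy density of (78)
s = np.cross(e, b) / (4.0 * np.pi)               # M^{4i}: Poynting vector / c
stress = -(np.outer(e, e) + np.outer(b, b)
           - 0.5 * np.eye(3) * (e @ e + b @ b)) / (4.0 * np.pi)
```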

Note how the symmetry of Minkowski’s energy tensor implies the equality of the energy current (from the last row) with \(c^2\) times the momentum density of the field (from the last column), just as Einstein’s \(E=m_uc^2\) requires for a real fluid. This holds in spite of the fact that no unique velocity can be associated with the electromagnetic energy flow that would be consistent from frame to frame.

The mechanical energy tensor

Minkowski’s discovery, that the mechanical characteristics of the electromagnetic field – its energy, momentum, and stress – combine into a single tensor, showed how inextricably these quantities were intertwined, how each must enter into the transformation of the others, and how these transformations must be just right for them to form a 4-tensor. This realization, ironically, quickly led to the tensorial reformulation of the continuum mechanics of ordinary matter. It was now clear that the mass density \(\rho\), the momentum density \(\textbf{g}\) , and the stress 3-tensor \(p_{ij}\) must combine into a symmetric 4-tensor \(T^{\mu \nu}\) as in (78), called the energy tensor (after its generally dominant component):

\[\tag{81} T^{\mu \nu} = \begin{pmatrix} p_{ij} & c\,\mathbf{g}\\ c\,\mathbf{g}&c^2 \rho \end{pmatrix}. \]

[Just as a 4-vector under rotations splits into a 3-vector plus a scalar – see Eq. 14 of SR: mechanics – so a 2-index 4-tensor splits into a 3-tensor as the leading matrix, two 3-vectors along the edges, and a scalar in the far corner.]

Now, just as in (73), let \(\tilde{K}^{\mu}\) be the rest-mass preserving 4-force acting on unit proper volume of the material fluid. However, our present fluid is the analog of the previous field, and the force on that (action = - reaction) was \(-\tilde{K}^{\mu}\). So the analog of (76)(i) is

\[ \tag{82}\tilde{K}^{\mu} = (\mathbf{k},\,c^{-1}\partial W/\partial t) = {T^{\mu \nu}}_{,\,\nu}.\]

This is now essentially the form of Newton’s second law, when extended to Special Relativity and to mechanical continua.

When the force on the fluid (now considered charged) is entirely due to an electromagnetic field, then \(\tilde{K}^{\mu}\) is as given in (76)(i), and Eq. (82) becomes

\[\tag{83} (T^{\mu \nu} + M^{\mu \nu})_{,\,\nu} = 0,\]

which says that the 4-divergence of the total energy tensor is zero. This is the very neat equation governing the dynamical interaction of electromagnetic fields and charged mechanical fluid. It ensures the conservation of total energy and momentum, as in (79) and (80).

We note from the definition (81) that once again the last row of \(T^{\mu \nu}\) is identical to the last column, and this is required by Einstein’s \(E = m_uc^2\), since (apart from multiples of \(c\)) the one serves as energy current density and the other as momentum density. Such symmetry, however, is possible frame-independently only if the entire tensor is symmetric:

\[\tag{84} T^{\mu \nu} = T^{\nu \mu},\]

and so we must have, in all frames, \(p_{ij} = p_{ji}\).

Summarizing our survey, we may observe that while SR has not changed the physical content of Maxwell’s theory in vacuum, it has added immeasurably to our insight into the structure of the theory through the use of tensor formalism. One should never underestimate the power of “mere” formalism – examples of which abound in physics and mathematics. Additionally, SR has given us an important new tool for solving electromagnetic problems, namely the possibility of solving a problem not necessarily in the frame in which it is presented, but in a frame of our own choosing, where conditions might be simpler, and only at the end transforming back to the frame of interest. However, Minkowski’s extension of Maxwell’s theory to the interior of “ponderable” media (like dielectrics) in 1908, into which we did not enter here, was indeed new physics, unachievable without relativity.

See also

Special relativity: kinematics, Special relativity: mechanics
