Special relativity: mechanics

From Scholarpedia
Wolfgang Rindler (2012), Scholarpedia, 7(1):10905. doi:10.4249/scholarpedia.10905 revision #137229

Curator: Wolfgang Rindler

$ \newcommand{\sp}[2]{\mathbf{#1\,.\!#2} } \newcommand{\SP}[2]{\mathbf{#1.\!#2} } $ Special relativity (SR) is a physical theory based on Einstein's Relativity Principle, which states that all laws of physics (including, for example, electromagnetism, optics, thermodynamics, etc.) should be equally valid in all inertial frames; and on Einstein's additional postulate that the speed of light should be the same in all inertial frames. The present article describes mechanics in Special relativity. (See the article "Special relativity: kinematics" (SR:kinematics) for the prerequisites to the present article, and "Special relativity: electromagnetism" for the further development of the theory.)


Spacetime and 4-vectors

Spacetime is of great conceptual and practical utility in SR, and, in its generalized form, essential in general relativity. We shall need it in our development of relativistic mechanics. The idea of spacetime is simple enough. It is the space of all events. In fact, our Figures 3 and 4 in SR:kinematics are maps of 2-dimensional spacetime, namely of the events (\({x,t}\)) taking place on the spatial \({x}\) axis of some frame \(S\ .\) We can imagine (though not draw, in the three dimensions available to us) two more axes, of \({y}\) and \({z}\ ,\) allowing us then to represent as points all the events (\({x,y,z,t}\)) in the world – or rather, in our SR model of the world, coordinatized by an inertial frame \(S\ .\)

Clearly, the same thing can be done in Newtonian theory. But Newtonian spacetime as a 4-space lacks the one important feature that makes, for example, Euclidean 3-space so interesting and useful: the ability to support a versatile vector calculus. We shall review 3-vectors as a preliminary to the 4-vectors of SR.

A detour: 3-vectors

The 3-vector calculus rests on the existence in Euclidean 3-space of a metric,

\[\tag{1} {\Delta r}^{2}={\Delta x}^{2}+{\Delta y}^{2}+{\Delta z}^{2} \]

which is invariant under rotations and translations of the Cartesian coordinates. We define a 3-vector \(\mathbf{a}\) (a vector, for short) as a 3-component quantity

\[\tag{2} \mathbf{a} = {(a_{1},a_{2},a_{3})} \]

which can be represented invariably by a displacement

\[\tag{3} {\Delta } \mathbf{r} = {({\Delta x},{\Delta y},{\Delta z})} \]

and which must therefore transform like a displacement under rotations and translations. Then such a vector has a magnitude \(a\), or \({|\mathbf{a}|}\ ,\) defined by

\[\tag{4} {|\mathbf{a}|} = {a=\sqrt{a_{1}^{2}+a_{2}^{2}+a_{3}^{2} } }, \]

which must be invariant, since that of the corresponding displacement is invariant. If \({k}\) is a scalar (i.e. a number which is the same in all inertial frames), we can define a scalar multiple of a vector,

\[\tag{5} {k}\,\mathbf{a} = (k\,{a}_{1}, k\,{a}_{2}, k\,{a}_{3}), \]

which is obviously also a vector. A sum of vectors is defined like a sum of displacements,

\[\tag{6} \mathbf{a} + \mathbf{b} = {(a_{1}+b_{1},a_{2}+b_{2},a_{3}+b_{3})}, \]

and is itself a vector. Therefore its squared magnitude

\[ \begin{array}{rcl} |\mathbf{a}+\mathbf{b}|^{2} &=& (a_{1}+b_{1})^{2}+(a_{2}+b_{2})^{2}+(a_{3}+b_{3})^{2} \\ ~ &=& (a_{1}^{2}+a_{2}^{2}+a_{3}^{2})+(b_{1}^{2}+b_{2}^{2}+b_{3}^{2})+2\,(a_{1}b_{1}+a_{2}b_{2}+a_{3}b_{3}) \end{array} \]

must be invariant. In this equation all items except the last are known to be invariant, and so the last must be invariant too. This allows us to define an invariant (scalar) product of two vectors:

\[\tag{7} \sp{a}{b} = {a_{1}b_{1}+a_{2}b_{2}+a_{3}b_{3} }. \]

It is easily seen to obey the commutative law \(\sp{a}{b} = \sp{b}{a}\ ,\) and the distributive law \(\mathbf{a}\mathbf{.}(\mathbf{b+c})= \sp{a}{b} +\sp{a}{c}\ .\) Also note that \(\sp{a}{a}= {a^{2} }\ ,\) which is usually written

\[\tag{8} {\mathbf{a}^{2}=a^{2} }. \]
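
The invariance argument leading to (7) can be checked numerically. The following is a minimal sketch, assuming a rotation about the z-axis as the sample coordinate change; the helper names `dot` and `rotate_z` and the sample vectors are illustrative choices, not part of the article.

```python
import math

def dot(a, b):
    """Scalar product (7) of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def rotate_z(a, angle):
    """Rotate the 3-vector a about the z-axis by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * a[0] - s * a[1], s * a[0] + c * a[1], a[2])

a = (1.0, 2.0, 3.0)
b = (-4.0, 0.5, 2.0)
angle = 0.7  # arbitrary rotation angle, chosen only for illustration

# The two printed numbers should agree up to rounding.
print(dot(a, b), dot(rotate_z(a, angle), rotate_z(b, angle)))
```

The individual components of the rotated vectors differ from the original ones; only the combination (7) stays the same.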

It must be stressed that not every quantity that allows itself to be represented by a number triple \({(a_{1},a_{2},a_{3})}\) is a vector. To be a vector it must transform (under rotations and translations) like the displacement \({ {\Delta \mathbf{r} } }\) and thus like the differential \(d\mathbf{r} = ({dx,dy,dz}) = {({dx}_{1},{dx}_{2},{dx}_{3})}\ ,\) where we have introduced the index notation for the three coordinates:

\[\tag{9} d{x}_{i}^{\prime}=\sum_j\frac{\partial {x}_{i}^{\prime} }{\partial x_{j} }{ {dx}_{j} },\;\;\; {a}_{i}^{\prime}=\sum_j\frac{\partial {x}_{i}^{\prime} }{\partial x_{j} }a_{j}. \]

Consider a moving particle, \({x_{i}=x_{i}(t)}\ ,\) where \({t}\) is absolute time. Dividing (9)(i) by the scalar \({dt}\ ,\) we see that the velocity \(\mathbf{u} = { {dx}_{i}/{dt} }\) is a vector. Consider any vector field \(\mathbf{a} = {a_{i} }( {\tau})\ ,\) \({\tau}\) being any scalar. Now, under the linear transformations between two given coordinate systems (rotations are linear), the partial derivatives in (9)(ii) are constants. Hence, by differentiating (9)(ii) with respect to \({\tau }\ ,\) we see that scalar derivatives of vectors, defined by \(d\mathbf{a}/d\tau = da_{i}/d\tau \ ,\) are vectors themselves. So all four of the basic vectors of mechanics, velocity \(\mathbf{u}={dx}_{i}/{ {dt} }\ ,\) acceleration \(\mathbf{a}={du}_{i}/{ {dt} }\ ,\) momentum \(\mathbf{p} = m\mathbf{u}\ ,\) and force \(\mathbf{f} = m\mathbf{a}\ ,\) are indeed vectors.

From the component-wise definition of derivatives we can easily check the Leibniz rule

\[\tag{10} {d} (\mathbf{a}\mathbf{.}\mathbf{b}) = (d\mathbf{a})\mathbf{.}\mathbf{b}+\mathbf{a}\mathbf{.}d\mathbf{b}\,. \]

As a particular case, since \(\sp{a}{a} = {a^{2} }\) (or \({\mathbf{a}^{2}=a^{2} }\)), we get a most useful result by differentiating this identity:

\[\tag{11} \sp{a}{ {d}a}={a}\,{da}. \]

The 4-vector calculus

The 4-vector calculus can be developed in almost complete analogy with 3-vector calculus, as was first shown, and utilized to great effect, by Hermann Minkowski (Minkowski 1908). Four-vector calculus plays out on spacetime, which, in its Minkowskian quadratic form (Equation 8 of SR: kinematics), possesses a natural metric. Note that nothing comparable is available to us in Newtonian spacetime; the only invariant there is the degenerate quadratic \({ {\Delta t}^{2} }\ .\)

Four-vectors have one glaring difference from 3-vectors: the basic quadratic form (Equation 8 of SR: kinematics) is "indefinite", that is to say, it can be positive, negative, or zero, for completely ordinary pairs of events. It is therefore not to be regarded as the square of a number, but that hardly affects the calculus. For dimensional reasons we now take \({ct}\) rather than just \({t}\) as the fourth coordinate, and accordingly write the fundamental displacement vector as

\[\tag{12} { {\Delta \mathbf{s} }=({\Delta x},{\Delta y},{\Delta z},\Delta {ct}){.} } \]

This serves as the prototype for all 4-vectors \(\mathbf{A} = (A_{1},A_{2},A_{3},A_{4})\ ,\) which therefore transform under general LTs (Poincaré transformations) like the coordinate \({\Delta }\)'s. Under standard LTs (Equation 4 of SR: kinematics), in particular, we have

\[\tag{13} {A}_{1}^{\prime}=\gamma\, [A_{1}-(v/c)A_{4}],\;\;\; {A}_{2}^{\prime}=A_{2},\;\;\; {A}_{3}^{\prime}=A_{3},\;\;\; {A}_{4}^{\prime}=\gamma\, [A_{4}-(v/c)A_{1}]\;\;\; \]

Note also from (12) that under pure spatial rotations (keeping \({t}\) constant) the first three components of a 4-vector transform like a 3-dimensional displacement, and hence like a 3-vector. For this reason one often lumps together the first three components into a 3-vector and writes

\[\tag{14} \mathbf{A}=(\mathbf{a},A_{4}). \]

Our usual convention is to denote 4-vectors with bold capitals and 3-vectors with bold lower case letters.

Making a conventional sign choice (some authors make the opposite choice), we define our metric as

\[\tag{15} {\Delta \mathbf{s} }^{2}=c^{2}{\Delta t}^{2}-{\Delta x}^{2}-{\Delta y}^{2}-{\Delta z}^{2}. \]

We shall regard it, as we justify below, as the square of a vector – the "squared displacement" – rather than as the square of a possibly complex number.

In analogy with (5), (6), and particularly (7), we define the sum, scalar multiple, and product of 4-vectors: \[\tag{16} \mathbf{A}+\mathbf{B}=(A_{1}+B_{1},A_{2}+B_{2},A_{3}+B_{3},A_{4}+B_{4}), \] \[\tag{17} k\,\mathbf{A} = (k\,A_1,k\,A_2,k\,A_3,k\,A_4), \] \[\tag{18} \mathbf{A.B}= (A_{4}B_{4}-A_{1}B_{1}-A_{2}B_{2}-A_{3}B_{3}). \] Analogously to the corresponding 3-vector results, we have

\[\tag{19} \SP{A}{B} = \SP{B}{A},\;\;\;\mathbf{A}\mathbf{.}(\mathbf{B}+\mathbf{C}) = \SP{A}{B} + \SP{A}{C}. \]

When \(\SP{A}{B} = 0\ ,\) we say that the vectors are orthogonal. As in the case of 3-vectors, we write

\[\tag{20} {\mathbf{A}^{2} } = \SP{A}{A} = {A_{4}^{2} }-{A_{1}^{2} }-{A_{2}^{2} }-{A_{3}^{2} } \]

and define the magnitude as

\[\tag{21} {A =} {|\mathbf{A}^{2}|^{1/2}\ge 0}. \]

We see from (20) the justification of regarding (15) as a squared vector. Its magnitude,

\[\tag{22} { {\Delta s}=|c^{2}{\Delta t}^{2}-{\Delta x}^{2}-{\Delta y}^{2}-{\Delta z}^{2}|^{1/2} } , \]

is called the interval between the corresponding events.
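
As a numerical illustration of (13), (18) and (21), the sketch below applies a standard LT to two sample 4-vectors and compares their product before and after. It assumes units with \(c = 1\) by default and the component order \((A_{1},A_{2},A_{3},A_{4})\) used above; the function names `boost_x`, `mdot` and `magnitude` and the sample values are hypothetical, chosen only for this example.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor for relative frame velocity v."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = gamma(v, c)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def mdot(A, B):
    """Product (18): A4*B4 - A1*B1 - A2*B2 - A3*B3."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def magnitude(A):
    """Magnitude (21): |A.A|^(1/2)."""
    return math.sqrt(abs(mdot(A, A)))

A = (1.0, 2.0, 0.5, 3.0)
B = (0.3, -1.0, 2.0, 1.5)
v = 0.6  # sample frame velocity in units of c

# The product and the magnitude should be unchanged by the boost (up to rounding).
print(mdot(A, B), mdot(boost_x(A, v), boost_x(B, v)))
print(magnitude(A), magnitude(boost_x(A, v)))
```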

The sign of \({\mathbf{A}^{2} }\) is significant. We divide all 4-vectors into three types – timelike, null, and spacelike – according to the sign of their squares:

\[\tag{23} {\mathbf{A}^{2}>0}: \textrm{timelike, }\;\;\; {\mathbf{A}^{2}=0}: \textrm{null,} \;\;\; {\mathbf{A}^{2}<0}: \textrm{spacelike} \]

Figure 1: The light cone at a typical event \(\mathcal{P}\) in spacetime

Figure 1 is a spacetime diagram (with one dimension more than the Minkowski diagrams 3,4,5 of SR:kinematics, but the \(z\)-dimension still suppressed) of the neighborhood of some event \(\mathcal{P}\ .\) It clarifies the geometry implicit in (23). If we write our earlier Equation 11 of SR: kinematics with \({\Delta }\)'s rather than \({d }\)'s, we see from just one of its sides that a straight worldline segment with \({ {\Delta s}^{2}>0}\) (representing a timelike vector) corresponds to a velocity \({u} < c\ ,\) while null and spacelike displacements correspond, respectively, to \({u} = c\ ,\) \({u} > c\ .\) Geometrically, therefore, these vectors subtend angles less than, equal to, and greater than \({ {45}^{\circ} }\) with the \({t}\)-direction. We say that they lie inside, on, or outside the light-cone. Importantly, no LT can change the sign of \({A_{4} }\) if \({\mathbf{A}^{2}\ge 0}\ .\) For all observers agree on the time sequence along a signal as long as \({u\le c}\ .\) A vector inside (or on) the top half of the cone therefore stays there under any active LT (see, for example, Fig. 6 of SR:kinematics). We speak of such vectors as future-pointing and of that region as the absolute future of the vertex event. Similarly the lower half is called the absolute past. The events outside the cone constitute the "present" – each of them can be reached by a superluminal signal from the vertex, which has infinite speed in some IF, in which the event is then simultaneous with the vertex.
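
The classification (23), and the statement that the sign of \(A_{4}\) is frame-independent only for timelike and null vectors, can be tried out on sample vectors. This is a sketch under the assumption \(c = 1\), with `beta` standing for \(v/c\); the names `classify`, `boost_x` and `mdot` and the tolerance `tol` are illustrative.

```python
import math

def mdot(A, B):
    """Product (18) with the time component last."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def boost_x(A, beta):
    """Standard LT (13) with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    A1, A2, A3, A4 = A
    return (g * (A1 - beta * A4), A2, A3, g * (A4 - beta * A1))

def classify(A, tol=1e-12):
    """Type of A according to the sign of its square, as in (23)."""
    s = mdot(A, A)
    if s > tol:
        return "timelike"
    if s < -tol:
        return "spacelike"
    return "null"

timelike = (1.0, 0.0, 0.0, 2.0)
null = (1.0, 0.0, 0.0, 1.0)
spacelike = (2.0, 0.0, 0.0, 1.0)

for beta in (-0.9, -0.5, 0.0, 0.5, 0.9):
    for A in (timelike, null, spacelike):
        Ap = boost_x(A, beta)
        # type before, type after, and whether A4' is still positive
        print(beta, classify(A), classify(Ap), Ap[3] > 0)
```

For the spacelike sample the last column changes with `beta`, which is the numerical counterpart of events outside the cone having no absolute time order relative to the vertex.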

One important fact to bear in mind about spacetime diagrams like Figure 1 here and Figs. 3,4 and 5 of SR:kinematics is that they are mere maps of spacetime with its events and vectors, not the spacetime itself. It is like mapping the surface of the earth onto a Euclidean plane: since that surface also has a non-Euclidean metric, the map necessarily distorts its true metric relations. So in a spacetime diagram, for example, vectors that are orthogonal in the Minkowski sense \(\SP{A}{B} = 0\ ,\) like \((2,0,0,1)\) and \((1,0,0,2)\ ,\) or the \({x}^{\prime}\) and \({t}^{\prime}\) axes in Fig. 4 of SR:kinematics, do not necessarily look orthogonal. And vectors that have the same Minkowski magnitude (length) \({|\mathbf{A}^{2}|^{1/2} }\) do not necessarily look as if they did, like \((0,0,0,3)\) and \((4,0,0,5)\ ,\) or the whole set of vectors \(\mathcal{OB}\) in Fig. 5 of SR:kinematics, joining the origin to the calibrating hyperbola. Only in the usual diagrams of Euclidean geometry do map and space coincide.
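
The two numerical examples in the preceding paragraph are quickly verified. The sketch below also computes the ordinary Euclidean quantities that the eye reads off the diagram, for contrast; `euclid_dot` is a helper introduced here only for that comparison.

```python
import math

def mdot(A, B):
    """Minkowski product (18), time component last."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def magnitude(A):
    """Minkowski magnitude (21)."""
    return math.sqrt(abs(mdot(A, A)))

def euclid_dot(A, B):
    """Ordinary dot product, i.e. what the paper diagram suggests."""
    return sum(a * b for a, b in zip(A, B))

A, B = (2, 0, 0, 1), (1, 0, 0, 2)
print(mdot(A, B))        # 0: orthogonal in the Minkowski sense
print(euclid_dot(A, B))  # 4: not perpendicular as drawn

P, Q = (0, 0, 0, 3), (4, 0, 0, 5)
print(magnitude(P), magnitude(Q))   # 3.0 3.0: equal Minkowski magnitudes
print(math.sqrt(euclid_dot(Q, Q)))  # drawn length of Q, larger than 3
```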

The zero-component lemma

The important lemma referred to here (and whose 3-dimensional analog is intuitive) is this: If a 4-vector has a particular one of its four components zero in all IFs, then the entire vector must vanish. Let the vector be \(\mathbf{A}=(A_{1},A_{2},A_{3},A_{4})\ ,\) and suppose, typically, that \({A_{2} }\) is always zero. Consider a LT in the \({y}\)-direction out of any IF \(S\ :\) \({A}_{2}^{\prime}=\gamma\, [A_{2}-(v/c)A_{4}]\ .\) Since \({A}_{2}^{\prime}\) and \({A_{2} }\) both vanish for arbitrary \(v\ ,\) this shows that also \({A_{4} }\) must always vanish. But then a LT in the \({x}\)-direction, \({A}_{4}^{\prime}=\gamma\, [A_{4}-(v/c)A_{1}]\ ,\) shows that \({A_{1} }\) must always vanish, as must \({A_{3} }\) by an analogous argument.

Four-velocity and four-acceleration

General LTs are linear transformations, and so the coordinate differentials \({ {dx}_{\mu } }\) and the 4-vector components \({A_{\mu } }\) (with Greek indices taking the values 1,2,3,4) satisfy equations analogous to (9). As in 3-vector theory, we define differentials and scalar derivatives of 4-vectors by the corresponding operations on the components. From the analogs of (9)(i) and (9)(ii) it then follows, respectively, that the scalar derivative \({dx}_{\mu }/{d\tau }\) of the coordinates is a 4-vector, and that the scalar derivative \({dA}_{\mu }/{d\tau }\) of any 4-vector is a 4-vector. But now the coordinate \({t}\) is no longer an invariant scalar. What is?

In particle mechanics we are concerned with particle worldlines. An important invariant – just from physical considerations – along such a worldline is the elapsed proper time \(\tau\) (Equation 22 of SR: kinematics) indicated by an imaginary ideal clock carried by the particle. From the definition (22) of the interval, this proper time \({\tau }\) is the same (except for a factor \(1/{c}\)) as the evidently invariant cumulative interval along the worldline. For, expressing (22) in terms of differentials and, for a particle, assuming \((dx^{2}+dy^{2}+dz^{2})/dt^{2}=u^{2}\le c^{2}\ ,\) we have

\[\tag{24} s=\int(1-u^{2}/c^{2})^{1/2}c\,dt=c\tau. \]

Hence we can define the particle's 4-velocity vector \(\mathbf{U}\) and its 4-acceleration vector \(\mathbf{A}\) by the equations

\[\tag{25} \mathbf{U}=\frac{ {dx}_{\mu } }{ {d\tau } },\;\;\; \mathbf{A}=\frac{ {d}\mathbf{U} }{ {d\tau } }=\frac{ {dU}_{\mu } }{ {d\tau } }=\frac{d^{2}x_{\mu } }{ {d\tau }^{2} }. \]

Now, from (24) we have the important equation

\[\tag{26} \frac{ {dt} }{ {d\tau } }=\bigl(1-\frac{u^{2} }{c^{2} }\bigr)^{-1/2}=\gamma (u). \]

Since \({dx}/{d\tau }=({dx}/{dt})({dt}/{d\tau })=u_{1}\gamma(u)\ ,\) etc., and \({x_{4}={ {ct} } }\ ,\) we see that

\[\tag{27} \mathbf{U} = \gamma (u)\,(u_{1},u_{2},u_{3},c)=\gamma (u)\,(\mathbf{u},c). \]

Note that a particle moving with velocity \({u}\) has a different \({\gamma }\)-factor \({\gamma (u)}\) in each IF. An interesting relation between its \({\gamma }\)-factors in \(S\) and \(S'\ ,\) \({\gamma (u)}\) and \({\gamma ({u}^{\prime}),}\) and \({\gamma (v)}\ ,\) the \({\gamma }\)-factor between the frames, can be obtained by applying (13)(iv) to \({U_{4} }\ .\) [See Eq.(78) below.]
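
Applying (13)(iv) to the fourth component of (27) gives \(\gamma(u') = \gamma(u)\,\gamma(v)\,(1-u_{1}v/c^{2})\). The sketch below checks this numerically and also recovers the 3-velocity in \(S'\) from the boosted 4-velocity. It assumes \(c = 1\); the names `four_velocity`, `boost_x`, `mdot` and the sample velocities are illustrative only.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_velocity(u):
    """4-velocity (27) from the 3-velocity u = (u1, u2, u3)."""
    g = gamma(math.sqrt(sum(ui * ui for ui in u)))
    return (g * u[0], g * u[1], g * u[2], g * C)

def boost_x(A, v):
    """Standard LT (13)."""
    g = gamma(v)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / C) * A4), A2, A3, g * (A4 - (v / C) * A1))

def mdot(A, B):
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

u = (0.5, 0.3, 0.0)  # particle velocity in S
v = 0.4              # velocity of S' relative to S

U = four_velocity(u)
U_prime = boost_x(U, v)

speed_u = math.sqrt(sum(ui * ui for ui in u))
gamma_from_U4 = U_prime[3] / C
gamma_product = gamma(speed_u) * gamma(v) * (1.0 - u[0] * v / C**2)

# 3-velocity in S', read off from U' = gamma(u') (u', c)
u_prime = tuple(C * Ui / U_prime[3] for Ui in U_prime[:3])
speed_u_prime = math.sqrt(sum(ui * ui for ui in u_prime))

print(gamma_from_U4, gamma_product, gamma(speed_u_prime))
print(mdot(U, U), mdot(U_prime, U_prime))  # both should equal c^2
```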

The last form of \(\mathbf{U}\) in (27) is noteworthy. It foreshadows the kind of relation that often exists between a familiar 3-vector and its generalization to 4 dimensions. A somewhat more complicated relation holds between the 3-acceleration \(\mathbf{a} = d\mathbf{u}/dt\) and the 4-acceleration \(\mathbf{A}\ .\) From the definition (25)(ii) and Eqs.(26),(27) we have

\[\tag{28} \mathbf{A}=\gamma\, \frac{ {d}\mathbf{U} }{ {dt} }=\gamma\, \frac{d}{ {dt} }({\gamma\,\mathbf{u} },{\gamma\, c})=\gamma\; (\dot{\gamma }\,\mathbf{u}+{\gamma}\,\mathbf{a},\dot{\gamma }\,c), \]

where \(\dot{\gamma }={d\gamma }/{ {dt} }\ .\) An important (and easily verified) identity satisfied by \({\gamma (u)}\) is

\[\tag{29} c^{2}{d\gamma }=\gamma ^{3}{u\,du}=\gamma ^{3}\mathbf{u}{.}{d}\mathbf{u}. \]

In particular, this implies \({\dot{\gamma }=0}\) when \({u} = 0\ .\) Hence in the instantaneous rest frame of the particle (where \({u = }0\) and \({\gamma =1}\)) the components of \(\mathbf{A}\) reduce to

\[\tag{30} \mathbf{A} = (\mathbf{a}, 0). \]
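
Equations (28) to (30) translate directly into a few lines of code. The sketch below builds \(\mathbf{A}\) from a given 3-velocity and 3-acceleration, using (29) in the form \(\dot{\gamma}=\gamma^{3}\,\mathbf{u}\cdot\mathbf{a}/c^{2}\), and prints its rest-frame form together with the products \(\mathbf{U}\cdot\mathbf{A}\) and \(\mathbf{A}\cdot\mathbf{A}\). Units with \(c = 1\) and all function names and sample values are assumptions made for the example.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def dot3(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def gamma(u):
    """Lorentz factor of the 3-velocity u."""
    return 1.0 / math.sqrt(1.0 - dot3(u, u) / C**2)

def four_velocity(u):
    g = gamma(u)
    return (g * u[0], g * u[1], g * u[2], g * C)

def four_acceleration(u, a):
    """4-acceleration (28); gamma-dot is taken from (29) divided by dt."""
    g = gamma(u)
    g_dot = g**3 * dot3(u, a) / C**2
    spatial = tuple(g * (g_dot * ui + g * ai) for ui, ai in zip(u, a))
    return spatial + (g * g_dot * C,)

def mdot(A, B):
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

a = (0.2, -0.1, 0.05)  # sample 3-acceleration

# Instantaneous rest frame: the result should be (a, 0), as in (30).
A_rest = four_acceleration((0.0, 0.0, 0.0), a)
print(A_rest)

# A moving particle with the same 3-acceleration.
u = (0.6, 0.2, 0.0)
U, A = four_velocity(u), four_acceleration(u, a)
print(mdot(U, A))  # vanishes up to rounding
print(mdot(A, A))  # negative
```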

So \(\mathbf{A} = 0\) if and only if the 3-acceleration in the rest frame vanishes. The 4-velocity \(\mathbf{U}\ ,\) on the other hand, never vanishes – it even has the same magnitude, \({c}\ ,\) always. (Geometrically, it represents \({c}\) times the unit tangent vector to the worldline.) In fact, we have

\[\tag{31} {\mathbf{U}^{2}=c^{2} },\;\;\;{\mathbf{A}^{2}=-\alpha ^{2} }, \]

\({\alpha }\) being the proper acceleration, namely the magnitude of the 3-acceleration in the particle's rest frame. The simplest way to get these squares (since they are invariants!) is to calculate them in the rest frame. By reference to (23), we see that \(\mathbf{U}\) is timelike, while \(\mathbf{A}\) is spacelike. Also the two are orthogonal,

\[\tag{32} \SP{U}{A} = 0 , \]

as is again most easily seen in the rest frame. Consider next the product \(\SP{U}{V}\) of the 4-velocities of two particles which either move uniformly or whose worldlines just cross. Calculating in the rest frame of the \(\mathbf{U}\)-particle, relative to which the \(\mathbf{V}\)-particle has velocity \({v}\ ,\) we find

\[\tag{33} \SP{U}{V} = {c^{2}\gamma (v)}. \]

This provides us with a very neat way to find the relative velocity \({v}\) of the two particles.
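
A minimal sketch of that procedure: invert (33) for \(\gamma(v)\) and hence for \(v\). Units with \(c = 1\) are assumed, and the function name `relative_speed` and the sample velocities are illustrative. The `max(..., 1.0)` guard is a numerical precaution introduced here, since rounding can push the computed \(\gamma\) marginally below 1 for nearly equal velocities.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def four_velocity(u):
    g = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u) / C**2)
    return (g * u[0], g * u[1], g * u[2], g * C)

def mdot(A, B):
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def relative_speed(U, V):
    """Relative speed of two particles from U.V = c^2 gamma(v), Eq. (33)."""
    g = max(mdot(U, V) / C**2, 1.0)  # guard against rounding below 1
    return C * math.sqrt(1.0 - 1.0 / g**2)

U = four_velocity((0.6, 0.0, 0.0))
V = four_velocity((-0.6, 0.0, 0.0))  # head-on
W = four_velocity((0.0, 0.6, 0.0))   # at right angles

print(relative_speed(U, V))
print(relative_speed(U, W))
```

For the head-on pair the result agrees with the collinear velocity-composition value \(1.2/1.36 = 15/17\); no separate treatment of the perpendicular case is needed.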

The wave 4-vector

What is the equation of a set of moving parallel planes representing loci of wavecrests? A stationary plane with unit normal \(\mathbf{n} = ({l,m,n})\) at distance \(p\) from some arbitrary origin \(P = {(x_{0},y_{0},z_{0})}\) satisfies the equation \(\sp{n}{r} = p\) or

\[\tag{34} l\,(x-x_{0})+m\,(y-y_{0})+n\,(z-z_{0})=p. \]

If this plane propagates in the direction of its normal with velocity \({w},\,\) \({p}\) becomes \(w\,(t-t_{0})\ .\) And if, instead of a single plane, we have a set of equally spaced parallel planes a distance \({\lambda }\) apart, we add \({N}{\lambda }\) to \({p},\) \({N}\) running through the positive and negative integers. In \({\Delta }\)-notation, then,

\[\tag{35} l\,\Delta x+m\,\Delta y+n\,\Delta z-(w/c)\,c\,\Delta t=N\lambda . \]

In vector notation, this could be written as

\[\tag{36} \mathbf{L}\mathbf{.}\Delta\mathbf{s}=N, \]

where

\[\tag{37} {\mathbf{L}=\frac{1}{\lambda }\,\bigl(\mathbf{n},\frac{w}{c}\bigr)=\nu\, \bigl(\frac{\mathbf{n} }{w},\frac{1}{c}\bigr)} , \]

\({\lambda }\) being the wavelength and \({\nu =w/\lambda }\) the frequency. But is \(\mathbf{L}\) really a 4-vector? The answer is yes. Proof: the wave will have a similar equation in \(S'\ ,\) say \(\mathbf{M'}\mathbf{.}\Delta\mathbf{s'}=N\ .\) (Just Lorentz-transform the \(S\)- \({\Delta }\)'s of (36) to \(S'\)- \({\Delta }\)'s. By the linearity of the LTs, each \(S\)- \({\Delta }\) is a linear form in the \(S'\)- \({\Delta }\)'s, so the result is some linear form in the \(S'\)- \({\Delta }\)'s, which we can write as \(\mathbf{M'}\mathbf{.}\Delta\mathbf{s'}\).) But the Lorentz transform \(\mathbf{L'}\) of \(\mathbf{L}\) must also satisfy this equation, by the invariance of the product of vectors. Hence \((\mathbf{M'}-\mathbf{L'})\mathbf{.}\Delta\mathbf{s'}=0\ .\) Since \(\mathbf{M'}-\mathbf{L'}\) cannot be orthogonal to all the \(\Delta\mathbf{s'}\ ,\) it follows that \(\mathbf{M'} = \mathbf{L'}\ .\)

We note that our results for a space-filling plane wave train can also be applied locally to an arbitrary wave train (e.g. to one spreading spherically), provided that a sufficiently small portion of it has the appearance of a plane wave train.

The recognition of \(\mathbf{L}\) as a 4-vector allows us to know how its various components transform under a Lorentz transformation, namely as in (13) above. This, in turn, allows us to calculate the important transformation equations for \(\mathbf{n}\ ,\) \(\lambda\ ,\) and \(w\) (see, for example, Rindler 2006).
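
As an illustration, the sketch below transforms the wave 4-vector (37) of a light wave with a standard LT and reads off the new frequency, direction, speed and wavelength. It assumes \(c = 1\) and a wave normal in the \(x,y\)-plane; the names `wave_4vector` and `decompose` are hypothetical helpers. For light, applying (13)(iv) to \(L_{4}\) gives \(\nu'=\gamma\,\nu\,(1-(v/c)\cos\theta)\), which the code reproduces.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def wave_4vector(nu, n, w):
    """L of (37): nu * (n / w, 1 / c)."""
    return (nu * n[0] / w, nu * n[1] / w, nu * n[2] / w, nu / C)

def boost_x(A, v):
    """Standard LT (13)."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / C) * A4), A2, A3, g * (A4 - (v / C) * A1))

def decompose(L):
    """Recover (frequency, unit normal, wave speed, wavelength) from L."""
    k = math.sqrt(L[0] ** 2 + L[1] ** 2 + L[2] ** 2)  # equals 1 / wavelength
    nu = C * L[3]
    n = (L[0] / k, L[1] / k, L[2] / k)
    return nu, n, nu / k, 1.0 / k

theta = math.radians(60.0)  # angle between wave normal and x-axis in S
L = wave_4vector(1.0, (math.cos(theta), math.sin(theta), 0.0), C)
v = 0.5                     # velocity of S' relative to S

nu_p, n_p, w_p, lam_p = decompose(boost_x(L, v))
print(nu_p)   # Doppler-shifted frequency
print(n_p)    # aberrated direction
print(w_p)    # still c for light
print(lam_p)  # transformed wavelength
```

The same functions accept any wave speed `w`, so waves slower or faster than light can be explored by changing the last argument of `wave_4vector`.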

Relativistic Mechanics

Special Relativity, first of all, is a new theory of space and time – spacetime – and so far we have outlined this part of it, merely elaborating the kinematic consequences of the LTs, augmented by the speed-limit axiom. But SR also claims obedience – Lorentz invariance – from all the laws of physics. As we shall see later, Maxwell's theory (at least in vacuum) was already Lorentz invariant. And so this very substantial part of classical physics was a welcome first inhabitant of the new spacetime. Newton's theory, being Galileo invariant, did not fit. Even though no experimental shortcomings of it had yet been discovered, confidence in the new model still required that a new Lorentz invariant mechanics be found. What was needed was the judicious invention of new axioms. There is no logically binding way to derive them. But an obvious requirement is that for slow motions (compared to the speed of light) the new mechanics must overlap with the old, since for over two centuries that had held up to the most stringent tests. Somewhat miraculously, the new "relativistic" mechanics was easily found, was simple and elegant within the new 4-dimensional formalism, and predicted hugely different results from Newton's theory for particle collisions near the speed of light, all of which were eventually confirmed.

The 4-vector calculus will be our guide. Any equation between two 4-vectors, \(\mathbf{A} = \mathbf{B}\) (equal components) is automatically Lorentz invariant, since the components of both sides transform equally. A special case of this is \(\mathbf{A} = 0\ ,\) 0 being a somewhat sloppy but accepted notation for the zero-vector \((0,0,0,0)\ .\) Four-vector equations are therefore natural candidates for relativistic laws (though not the only ones – some are tensor or even spinor equations.) In Newtonian theory, force is the primary concept. Relativistic mechanics can be approached from many angles. But since particle collisions play an important role in it, momentum is here perhaps the most convenient starting point.

We assume that associated with every particle there is an intrinsic positive scalar, \({m_{0} }\ ,\) namely its Newtonian or rest-mass. This allows us to define the 4-momentum \( \mathbf{P}\) of a particle in analogy to its 3-momentum,

\[\tag{38} \mathbf{P} = m_{0}\mathbf{U}, \]

\(\mathbf{U}\) being its 4-velocity. Like \(\mathbf{U}\ ,\) \(\mathbf{P}\) is timelike and future-pointing. And we take as the basic axiom of collision mechanics the conservation of this 4-vector quantity: the sum of the 4-momenta of all the particles going into a point-collision equals the sum of the 4-momenta of all those coming out. (The collision may or may not be elastic, and there may be more, or fewer, or other particles coming out than going in.) We can write this conservation law in the form

\[\tag{39} \sideset{}{^\ast}\sum \mathbf{P}_{n}=0, \]

where a different value of \({n} = 1,2, {\dots}\) is assigned to every particle going in and every particle coming out (so if 3 go in and 4 come out, \({n}\) runs from 1 to 7), and \(\sideset{}{^\ast}\sum\) is a sum that counts all pre-collision terms positively and all post-collision terms negatively. The LHS of (39) is thus a 4-vector, which makes our axiom automatically Lorentz invariant. And this is the entire basis of relativistic mechanics; the rest is just working out its consequences!

Using the component form (27) of \(\mathbf{U}\ ,\) we find from (38) the following components for \( \mathbf{P}\ :\)

\[\tag{40} { \mathbf{P}=m_{0}\mathbf{U}=m_{0}\gamma (u)\,(\mathbf{u},c)=:( \mathbf{p},{m}_u{c})}, \]

where, in the last equation, we have introduced the symbols

\[\tag{41} {m}_u=\gamma (u)\,m_{0}, \] \[\tag{42} \mathbf{p} = {m}_u\mathbf{u} . \]

The formalism thus leads us naturally to this quantity \({m}_u\ ,\) which we shall call the relativistic mass, or usually just the mass of the particle, and to \( \mathbf{p}\ ,\) which we shall call the relativistic momentum, or usually just the momentum of the particle. Observe that \({m}_u\) increases with speed; when \({u} = 0\) it is least, namely \({m_{0} }\ ,\) which is why we call \({m_{0} }\) the rest-mass of the particle. On the other hand, \({m}_u\) becomes infinite as \({u}\) approaches \({c}\ ,\) which is Nature's way of avoiding superluminal velocities.

In terms of these quantities the original conservation law (39) splits componentwise into the conservation of relativistic momentum,

\[\tag{43} \sideset{}{^\ast}\sum \mathbf{p}=0, \;\;\;\text{that is}, \;\;\; \sideset{}{^\ast}\sum {m}_u\mathbf{u}=0, \]

and the conservation of relativistic mass,

\[\tag{44} \sideset{}{^\ast}\sum {m}_u=0,\;\;\;\text{that is,}\;\;\;\sideset{}{^\ast}\sum\gamma (u)\,m_0 = 0 \]

where, for brevity, we have omitted the summation index \({n}\ .\) Evidently, in the slow-motion limit these are the corresponding Newtonian conservation laws, and so our proposed relativistic law satisfies the three basic criteria: Lorentz invariance, simplicity, and Newton-conformity. We also note from the "zero-component lemma" discussed earlier, that the validity of either (43) or (44) in all IFs implies the full conservation law (39). Finally, in the formal limit \({c\rightarrow \infty}\) the relativistic laws become the exact Newtonian laws, with \(m_0\) being the Newtonian mass, which can often serve as a check on our work.
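
A simple worked case of (39): two equal particles collide head-on and stick together. The sketch below, which assumes \(c = 1\) and sample values of our own choosing, adds the incoming 4-momenta and extracts the rest mass of the composite from \(\mathbf{P}^{2}=m_{0}^{2}c^{2}\) (which follows from (38) and \(\mathbf{U}^{2}=c^{2}\)). It then repeats the calculation in a second frame. The helper names are illustrative.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_momentum(m0, u):
    """4-momentum (40) for rest mass m0 and 3-velocity u."""
    m = gamma(math.sqrt(sum(ui * ui for ui in u))) * m0  # relativistic mass (41)
    return (m * u[0], m * u[1], m * u[2], m * C)

def total(momenta):
    """Componentwise sum of a list of 4-momenta."""
    return tuple(sum(comp) for comp in zip(*momenta))

def mdot(A, B):
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def rest_mass(P):
    return math.sqrt(mdot(P, P)) / C

def boost_x(A, v):
    g = gamma(v)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / C) * A4), A2, A3, g * (A4 - (v / C) * A1))

incoming = [four_momentum(1.0, (0.6, 0.0, 0.0)),
            four_momentum(1.0, (-0.6, 0.0, 0.0))]

# By (39) the single outgoing particle carries the summed 4-momentum.
P_out = total(incoming)
M0 = rest_mass(P_out)
print(P_out, M0)

# Same collision seen from a frame moving with the first particle.
boosted = [boost_x(P, 0.6) for P in incoming]
M0_boosted = rest_mass(total(boosted))
print(M0_boosted)
```

In this example the composite's rest mass comes out as 2.5, larger than the sum 2.0 of the incoming rest masses, and it is the same in both frames.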

The equivalence of mass and energy

There is much more in Eq.(44) than at first meets the eye. It cannot possibly say what it seems to say: that mass in the Newtonian sense of "quantity of matter" is conserved. By now we know empirically that matter is not conserved, as when an electron and a positron annihilate each other. Moreover, the quantity of matter cannot be velocity dependent, as \({m}_u\) is. But there is a velocity-dependent quantity in classical mechanics that is conserved: the kinetic energy in elastic collisions. Could it be that \({m}_u\) (or a multiple of it) is a measure of the total energy of the particle? And that (44) asserts the universal conservation of energy? The answer turns out to be yes, and this was regarded by Einstein, who found it, as one of the most significant results of SR. Yet Einstein's assertion of the full equivalence of mass and energy according to the famous formula

\[\tag{45} E={m}_u{c}^{2} \]

was and is in part a hypothesis, as we shall see.

Consider the following expansion for the mass as defined by (41):

\[\tag{46} {m}_u=m_{0}\bigl(1-\frac{u^{2} }{c^{2} }\bigr)^{-1/2}=m_{0}+\frac{1}{c^{2} }\bigl(\frac{1}{2}m_{0}u^{2}\bigr)+{.}{.}{.} \]

This shows that the relativistic mass of a slowly moving particle exceeds its rest mass by \(1/ {c^{2} }\) times its kinetic energy (assuming the approximate validity of the Newtonian expression for the latter). So kinetic energy contributes to the mass in a way that is consistent with (45). In fact, it is equation (46) that supplies the constant of proportionality in (45). And it is the enormity of this constant that explains why the mass increase corresponding to the easily measurable kinetic energies of particles in classical collisions had never been observed.
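
The quality of the approximation in (46) can be seen by tabulating both sides. The sketch below uses SI units with a sample rest mass of 1 kg; the function names are illustrative. Very small speeds are avoided on purpose, because subtracting \(m_{0}\) from \(m_{u}\) in floating point loses precision when the two are nearly equal.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def relativistic_mass(m0, u):
    """m_u of (41)."""
    return m0 / math.sqrt(1.0 - (u / C) ** 2)

def mass_excess(m0, u):
    """m_u - m0, the left-hand side of (46) minus the rest mass."""
    return relativistic_mass(m0, u) - m0

def newtonian_ke_over_c2(m0, u):
    """Leading correction term of (46): (1/2) m0 u^2 / c^2."""
    return 0.5 * m0 * u**2 / C**2

m0 = 1.0  # sample rest mass in kg
for beta in (0.01, 0.1, 0.5, 0.9):
    u = beta * C
    print(beta, mass_excess(m0, u), newtonian_ke_over_c2(m0, u))
```

At one percent of the speed of light the two columns agree to better than one part in ten thousand; at \(0.9\,c\) the Newtonian term is too small by more than a factor of three.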

At this stage in the argument it is still possible to suppose that energy contributes to the mass, without causing all of it. There could be a residue of intrinsic mass that is separately conserved. To equate all the mass with energy, especially in Einstein's time, required an act of aesthetic faith very characteristic of Einstein. His hypothesis has stood up magnificently to the test of time. Note that Einstein's equation determines a zero-point of energy. In Newton's theory one could, in principle, extract an infinite amount of energy from a finite mass by letting it collapse indefinitely under its own gravity. According to Einstein's equation, nature must find a way of preventing this. (This requires general relativity, but the rescue comes from the formation of a black hole.)

A WORD OF CAUTION: there is a school of thought (mainly among particle physicists who, after all, are the main consumers of collision mechanics) that rejects the concept of relativistic mass altogether. Wherever we have an \({m}_u\, ,\) its adherents would replace it by \({E/c^{2} }\ ;\) our \({m_{0} }\) becomes their \(m\, ,\) simply called the mass; and our \(E={ {m}_u{c}^{2} }\) becomes their \(E=\gamma(u)m{c}^{2}\). This has nothing to do with physics. It is simply a choice between two alternative conventions, ours being that of Einstein.

Einstein's mass-energy equivalence allows us to include even particles of zero rest mass (photons, \({\dots}\)) into the scheme of mechanics. If such a particle has finite energy [all of it being kinetic energy, \({({m}_u-m_{0})c^{2} }\)] it has finite mass \({m}_u=E/c^{2}\ ;\) thus, because of (41), it must move at the speed of light and then, by (42), it will also have momentum \(p = {m}_u c = E/c\ .\) Formally we can regard its mass as the limit of a product, \({ {\gamma m}_{0} }\ ,\) whose first factor has gone to infinity, while its second factor has gone to zero. In all cases we shall have, from (40),

\[\tag{47} { \mathbf{P}^{2}=m_{0}^{2}c^{2}={m}_u^{2}c^{2}-p^{2}=E^{2}/c^{2}-p^{2} }. \]

Note that for zero-rest-mass particles, and only for those, \( \mathbf{P}\) becomes a null vector. When two particles with respective 4-momenta \({ \mathbf{P}_{1} }\) and \({ \mathbf{P}_{2} }\) are involved, and \(v_{12}\) is their relative speed, we have, by calculating in the rest frame of either,

\[\tag{48} \mathbf{P}_{1}\mathbf{.}\mathbf{P}_{2}=m_{01}E_{2}=m_{02}E_{1}=c^{2}\gamma(v_{12})\,m_{01}m_{02} , \]

where, typically, \(m_{01}\) is the rest-mass of the first particle and \({E_{2} }\) the energy of the second particle in the rest-frame of the first. Note that the first equation holds even if the second particle is a photon. If both are photons, the equation is inapplicable (but see Equation (53) below). In the particular case of an elastic (that is, rest-mass preserving) collision of two particles with pre-collision momenta \( \mathbf{P}\) and \(\mathbf{Q}\) and post-collision momenta \(\mathbf{P'}\) and \(\mathbf{Q'}\ ,\) we find, on squaring the conservation equation \( \mathbf{P} + \mathbf{Q} = \mathbf{P'}+\mathbf{Q'}\ ,\) and using (47), that

\[\tag{49} \SP{P}{Q}=\SP{P'}{Q'}, \]

so that, by reference to (48), the relative speed of the particles is conserved. Since this result is independent of the value of \({c}\ ,\) it must hold in Newtonian theory also!
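
Equations (48) and (49) can be tried out on sample data. The sketch below assumes \(c = 1\). The elastic collision is set up, for convenience, in a frame where the two 3-momenta are equal and opposite, so that energy and momentum conservation are satisfied by construction when the common momentum magnitude is kept and only its direction changes. The collinear velocity-composition formula is taken over from kinematics. All function names and numbers are illustrative.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this example)

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_momentum(m0, u):
    """4-momentum (40) of a particle with rest mass m0 > 0 and 3-velocity u."""
    m = gamma(math.sqrt(sum(ui * ui for ui in u))) * m0
    return (m * u[0], m * u[1], m * u[2], m * C)

def photon_momentum(E, n):
    """Null 4-momentum of a photon with energy E and unit direction n."""
    return (E * n[0] / C, E * n[1] / C, E * n[2] / C, E / C)

def mdot(A, B):
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def opposite_pair(m1, m2, p, phi):
    """Two 4-momenta with opposite 3-momenta of size p at angle phi to the x-axis."""
    E1 = math.sqrt((m1 * C**2) ** 2 + (p * C) ** 2)  # energy from (47)
    E2 = math.sqrt((m2 * C**2) ** 2 + (p * C) ** 2)
    px, py = p * math.cos(phi), p * math.sin(phi)
    return (px, py, 0.0, E1 / C), (-px, -py, 0.0, E2 / C)

# (48) for two massive particles moving collinearly
P1 = four_momentum(2.0, (0.6, 0.0, 0.0))
P2 = four_momentum(3.0, (-0.6, 0.0, 0.0))
v12 = (0.6 + 0.6) / (1.0 + 0.6 * 0.6 / C**2)  # collinear velocity composition
print(mdot(P1, P2), C**2 * gamma(v12) * 2.0 * 3.0)

# First equality of (48) when the second particle is a photon
at_rest = four_momentum(2.0, (0.0, 0.0, 0.0))
photon = photon_momentum(3.0, (1.0, 0.0, 0.0))
print(mdot(at_rest, photon), 2.0 * 3.0)

# (49): elastic collision, same rest masses and same p before and after
P, Q = opposite_pair(2.0, 3.0, 1.5, 0.0)
P_after, Q_after = opposite_pair(2.0, 3.0, 1.5, 1.1)
print(mdot(P, Q), mdot(P_after, Q_after))
```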

Particles and Waves

As a last resort to avoid the notorious "ultraviolet catastrophe" of blackbody radiation theory, Planck in 1900 had made the suggestion that radiation of frequency \({\nu }\) might be emitted only in distinct "quanta" of energy

\[\tag{50} E={h\nu }, \]

where \(h\) is a universal constant, now known as Planck's constant. And then Einstein found that the photoelectric effect would fit the further assumption that radiation of frequency \({\nu }\) is not only emitted in quanta of energy \({h\nu }\ ,\) but also travels and is received as such quanta, which were eventually called photons. Finally, in 1923, de Broglie – at first only as a formal possibility – showed how waves could be associated with all particles according to Planck's formula (50). This eventually led to Schrödinger's wave mechanics and much more. In a beautiful application of SR, de Broglie proposed the following relation between the particle's 4-momentum \( \mathbf{P}\) and the wave 4-vector of the associated wave (see (40), (45), and (37)):

\[\tag{51} \mathbf{P} = {h}\,\mathbf{L},\;\;\;\text{that is},\;\;\; {E\,\bigl(\frac{\mathbf{u} }{c^{2} },\frac{1}{c}\bigr)={h\nu\, }\bigl(\frac{\mathbf{n} }{w},\frac{1}{c}\bigr)} . \]

In fact, if Planck's relation (50) is to be maintained for a material particle and its associated wave, then (51) is inevitable. For then the 4th components of the 4-vectors on either side of (51) are equal; by our earlier "zero-component lemma", the entire 4-vectors must therefore be equal! From (51) it then follows that the wave travels in the direction of the particle ( \({\mathbf{n}\propto \mathbf{u} }\)), but with a larger velocity \(w\), given by de Broglie's relation

\[\tag{52} { {u\,w}=c^{2} }, \]

as can be seen by comparing the magnitudes of the leading 3-vectors. (However, the group velocity of the wave, which carries the energy, can be shown to be still \(u\).) The wave must necessarily travel at a speed different from that of the particle unless that speed is \(c\), for waves and particles aberrate differently, and a particle comoving with its wave would slide across it sideways in another frame.
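As a numerical illustration of (52) and of the remark about the group velocity, the following Python sketch works in units with \(c=\hbar=1\) and uses the dispersion relation that follows from (51) together with (47). The function names, the choice of units, and the finite-difference estimate of the group velocity are assumptions made for this sketch, not part of the original development.

```python
import math

def omega(k, m0=1.0):
    """Angular frequency of the de Broglie wave for wave number k.

    Units with c = hbar = 1 (an assumption of this sketch), so E = omega and
    p = k, and E^2 = p^2 + m0^2 gives the dispersion relation.
    """
    return math.sqrt(k * k + m0 * m0)

def particle_speed(k, m0=1.0):
    """Particle speed u = p c^2 / E."""
    return k / omega(k, m0)

def phase_speed(k, m0=1.0):
    """Phase speed w = omega / k of the associated wave."""
    return omega(k, m0) / k

def group_speed(k, m0=1.0, dk=1e-6):
    """Group speed d(omega)/dk, estimated by a central difference."""
    return (omega(k + dk, m0) - omega(k - dk, m0)) / (2.0 * dk)

k = 0.75                      # gives omega = 1.25 for m0 = 1
u = particle_speed(k)         # slower than light
w = phase_speed(k)            # faster than light
print(u * w)                  # de Broglie's relation: u w = c^2 = 1
print(group_speed(k) - u)     # close to zero: the group moves with the particle
```

For the sample value the particle speed is about 0.6 and the phase speed about 5/3, so their product is 1 in these units, while the estimated group speed agrees with \(u\) to within the accuracy of the finite difference.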

We can now complete formula (48) for the case when both particles are photons. If their paths subtend an angle \({\theta }\ ,\) Eq.(51), with \(\mathbf{n}_{1}\mathbf{.}\mathbf{n}_{2}=\cos\theta\) and \({w_{1}=w_{2}=c}\ ,\) yields

\[\tag{53} \mathbf{P}_{1}\mathbf{.}\,\mathbf{P}_{2}=h^{2}c^{-2}\nu_{1}\nu_{2}(1-\cos\theta ). \]
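Equation (53) can be checked numerically by building the two photon 4-momenta from (51) and forming their scalar product directly. The sketch below is only illustrative: the helper names and sample frequencies are made up for the example, components are stored with the time component last, and SI values of \(h\) and \(c\) are supplied by hand.

```python
import math

H = 6.62607015e-34      # Planck's constant in J s
C = 299_792_458.0       # speed of light in m/s

def photon_four_momentum(nu, theta):
    """4-momentum (p_x, p_y, p_z, E/c) of a photon of frequency nu that
    moves in the xy-plane at angle theta to the x axis (w = c in Eq. 51)."""
    e_over_c = H * nu / C
    return (e_over_c * math.cos(theta), e_over_c * math.sin(theta), 0.0, e_over_c)

def minkowski_dot(a, b):
    """Scalar product A.B = A4*B4 - a.b, with the time component stored last."""
    return a[3] * b[3] - (a[0] * b[0] + a[1] * b[1] + a[2] * b[2])

def photon_product_formula(nu1, nu2, theta):
    """Right-hand side of Eq. (53)."""
    return H**2 * nu1 * nu2 * (1.0 - math.cos(theta)) / C**2

nu1, nu2, theta = 5.0e14, 6.0e14, math.pi / 3      # illustrative values
direct = minkowski_dot(photon_four_momentum(nu1, 0.0),
                       photon_four_momentum(nu2, theta))
formula = photon_product_formula(nu1, nu2, theta)
comoving = minkowski_dot(photon_four_momentum(nu1, 0.3),
                         photon_four_momentum(nu2, 0.3))
print(direct, formula, comoving)
```

The last value illustrates the comoving case \(\theta=0\), where the product vanishes up to rounding.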

The Zero-Momentum Frame

Consider an arbitrary inertial frame \(S\) and in it a system of occasionally colliding particles, which could also be photons, subject to no forces other than the very short-range forces during collisions, and thus moving uniformly between collisions. We define the total mass \({\bar{m} }\ ,\) the total momentum \({\bar{ \mathbf{p} } }\ ,\) and the total 4-momentum \({\bar{ \mathbf{P} } }\) of the system in \(S\) as the instantaneous sum of the respective quantities. Then (see (40))

\[\tag{54} {\bar{ \mathbf{P} }=\sum { \mathbf{P} }=\sum \,( \mathbf{p},{m}_u{c})=(\bar{ \mathbf{p} },\bar{m}c)}. \]

Because of the conservation laws, each of the barred quantities remains constant in time.

The quantity \({\bar{ \mathbf{P} } }\ ,\) being a sum of 4-vectors, seems assured of 4-vector status itself. But there is a problem. If all observers agreed on which \( \mathbf{P}\)s make up the sum \({\bar{ \mathbf{P} } }\ ,\) then \({\bar{ \mathbf{P} } }\) would clearly be a 4-vector. But in each frame the sum is taken at one instant, which may result in different \( \mathbf{P}\)s making up the \({\bar{ \mathbf{P} } }\) of different observers. A spacetime diagram such as Figure 1, even an imagined one, is helpful. Our particle system will correspond to a lattice of straight worldline segments, meeting at various knots (collisions), where two or more segments come together and one or more emerge, or vice versa. A simultaneity in \(S\) corresponds to a "horizontal" plane in the diagram, and a simultaneity in a second frame \(S'\) corresponds to a "tilted" plane. In \(S\ ,\) \({\bar{ \mathbf{P} } }\) is summed over horizontal planes, and in \(S'\) over tilted planes. However, even in \(S\ ,\) the same \({\bar{ \mathbf{P} } }\) results no matter what plane we sum over. For imagine a continuous motion of a horizontal plane into a tilted one. Each \( \mathbf{P}\) remains the same until the first collision, since the particles move uniformly between collisions. As the plane sweeps over that collision, the sub-sum of \({\bar{ \mathbf{P} } }\) that enters the collision (as all other collisions) is conserved, by 4-momentum conservation. Thus, without affecting the value of \({\bar{ \mathbf{P} } }\ ,\) all observers could sum their \( \mathbf{P}\)s over the same plane, whence \({\bar{ \mathbf{P} } }\) is indeed a 4-vector.

This 4-vector \({\bar{ \mathbf{P} } }\) is timelike and future-pointing, except in the negligible special case of nothing but comoving photons, when it is obviously null. For consider the expansion

\[\tag{55} {\bar{ \mathbf{P} } }^{2}= {( \mathbf{P}_{1}+ \mathbf{P}_{2}+\cdots)^{2}= \mathbf{P}_{1}^{2} }+ \mathbf{P}_{2}^{2}+\cdots+2 \,\mathbf{P}_{1}\mathbf{.}\, \mathbf{P}_{2}+\cdots . \]

By reference to (47), (48), and (53), all terms on the RHS are non-negative. The presence of even a single non-zero-rest-mass particle or of a single pair of non-comoving photons will make the RHS positive. So \({\bar{ \mathbf{P} } }\) is timelike. That it is also future-pointing is clear from the positivity of the fourth components of all the summands. Thus, by choosing an IF with time axis along \({\bar{ \mathbf{P} } }\ ,\) we can make its spatial components all zero: \({\bar{ \mathbf{p} }=0}\ .\) This is the zero-momentum frame \(S_{ { {ZM} } }\) for our system (analogous to the classical center-of-mass frame). In \(S_{ { {ZM} } }\) the 4-velocity \(\mathbf{U}_{ { {ZM} } }\) of \(S_{ { {ZM} } }\) is \((0,0,0,{c})\ ,\) so that, by (54),

\[\tag{56} {\bar{ \mathbf{P} }=}(0,0,0, \bar{m}_{ {ZM} }\,c) = \bar{m}_{ {ZM} }\mathbf{U}_{ZM}, \]

where \({\bar{m} }_{ { {ZM} } }\) is \({\bar{m} }\) in \(S_{ { {ZM} } }\ ,\) obviously an invariant. Compare (56) with (40)(i): this shows that \({\bar{m} }_{ { {ZM} } }\) and \(\mathbf{U}_{ { {ZM} } }\) are for the system what \({m_{0} }\) and \(\mathbf{U}\) are for a particle. Note how the kinetic energy contributes to the "rest-mass" of the system.

In the general IF, relative to which \(S_{ { {ZM} } }\) has velocity \(\mathbf{u}_{ZM}\ ,\) say, Eq.(56) reads (cf.(27))

\[\tag{57} \bar{ \mathbf{P} }=(\bar{ \mathbf{p} },\bar{m}c)={\bar{m} }_{ {ZM} }\,\gamma ( u_{ZM})\,(\mathbf{u}_{ {ZM} },c) , \]

which yields

\[\tag{58} \bar{m}=\gamma(u_{ {ZM} })\,{\bar{m} }_{ { {ZM} } }, \]

\[\tag{59} \bar{ \mathbf{p} }=\bar{m}\mathbf{u}_{ { {ZM} } },\;\;\;\text{or}\;\;\; \mathbf{u}_{ {ZM} }=\bar{ \mathbf{p} }/{\bar{m} } . \]
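Equations (54), (58) and (59) translate directly into a short calculation. The following Python sketch, in units with \(c=1\), is offered only as an illustration: the helper names and the two-particle example (one particle moving at \(u=0.6\), one at rest, both of unit rest-mass) are invented for the purpose.

```python
import math

def four_momentum(m0, u):
    """4-momentum (p_x, p_y, p_z, m_u c) of a particle of rest-mass m0 > 0
    with 3-velocity u, in units with c = 1 (an assumption of this sketch)."""
    speed_sq = sum(ui * ui for ui in u)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    return tuple(gamma * m0 * ui for ui in u) + (gamma * m0,)

def total_four_momentum(momenta):
    """Component-wise sum of the 4-momenta, as in Eq. (54)."""
    return tuple(sum(parts) for parts in zip(*momenta))

def zero_momentum_frame(momenta):
    """Return (u_ZM, m_ZM): the velocity of S_ZM from Eq. (59) and the
    invariant 'rest-mass' of the system, the magnitude of the total P."""
    px, py, pz, m_bar = total_four_momentum(momenta)
    u_zm = (px / m_bar, py / m_bar, pz / m_bar)
    m_zm = math.sqrt(m_bar**2 - (px**2 + py**2 + pz**2))
    return u_zm, m_zm

particles = [
    four_momentum(1.0, (0.6, 0.0, 0.0)),   # moving particle
    four_momentum(1.0, (0.0, 0.0, 0.0)),   # particle at rest
]
u_zm, m_zm = zero_momentum_frame(particles)
gamma_zm = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u_zm))
m_bar = total_four_momentum(particles)[3]
print(u_zm, m_zm, gamma_zm * m_zm, m_bar)
```

In this example \(S_{ZM}\) moves at one third of the speed of light, \(\bar{m}_{ZM}=\sqrt{4.5}\approx 2.12\) exceeds the sum 2 of the rest-masses (the kinetic energy contributes), and \(\gamma(u_{ZM})\,\bar{m}_{ZM}\) reproduces \(\bar{m}=2.25\) as required by (58).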

Threshold Energies

An important application of relativistic mechanics occurs in so-called threshold problems. Suppose a stationary proton is to be struck by a moving proton so as to create an extra pion \({(p+p\rightarrow p+p+\pi ^{0})}\ .\) What is the minimum energy of the incoming proton to make this reaction possible? It is not enough for its kinetic energy \({({m}_u-m_{0})c^{2} }\) to merely equal the rest energy of the pion we want to create! For, by the conservation of 3-momentum, there must be motion and thus "waste" energy after the collision.

In all such cases, the minimum expenditure of energy occurs when all the end-products travel "as a lump". Suppose a given stationary target particle is to be struck by a given bullet particle and we know the rest masses of all the desired end-products. Relative to the lab, the bullet's initial velocity determines \(\mathbf{u}_{ { {ZM} } }\) via Eq.(59). But this is also \(\mathbf{u}_{ { {ZM} } }\) after the collision, since both \({\bar{ \mathbf{p} } }\) and \({\bar{m} }\) are conserved. We want there to be a minimum of kinetic (waste) energy after collision. Eq.(58) – where \({\gamma }\) is now fixed – shows that this indeed occurs when all the end-products are at rest in \(S_{ZM}\ .\)

However, to get the actual threshold formula, it is easiest to proceed as follows. Let \({ \mathbf{P}_{B} }\) and \({ \mathbf{P}_{T} }\) be the pre-collision 4-momenta of bullet and target, and \({ \mathbf{P}_{i} }\) (\({i} = 1,2,{\dots}\)) the 4-momenta of all post-collision particles. Then

\[\tag{60} \mathbf{P}_{B}+ \mathbf{P}_{T}=\sum { \mathbf{P}_{i} }. \]

Squaring this equation along the lines of (55), and using once more Eqs.(47) and (48), we find, in a self-explanatory notation,

\[\tag{61} {m_{0B}^{2}+m_{0T}^{2} }+2\;c^{-2}\,m_{0T}E_{B}=\sum {m_{0i}^{2}+2\,\sum_{i<j}{m_{0i} }m_{0j}\gamma (v_{ {ij} })}. \]

The only variable on the LHS is \({E_{B} }\ ,\) the energy of the bullet relative to the rest frame of the target, and thus relative to the lab. Once again we see that this will be a minimum when all the \({\gamma }\)-factors on the RHS are unity, that is, when there is no relative motion between the outgoing particles. The RHS then equals \({\bigl(\sum {m_{0i} }\bigr)^{2} }\ ,\) and so, solving for \({E_{B} }\ ,\) now the minimum or threshold energy, we find

\[\tag{62} {E_{B}=\frac{c^{2} }{2\,m_{0T} }\bigl[\bigl(\sum {m_{0i} }\bigr)^{2}-m_{0B}^{2}-m_{0T}^{2}\bigr]} . \]

This formula applies even when the bullet is a photon, provided it gets absorbed in the collision (since it cannot be part of a post-collision "lump").

Because of the inevitable waste kinetic energy, this method of creating new particles is generally not very efficient. A way out is the 100% efficient method of head-on colliding beams.
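To put numbers on the pion-production example, formula (62) can be evaluated with all masses expressed as rest energies, so that the factors of \(c\) drop out. In the Python sketch below the function name is hypothetical and the rest energies (proton about 938.272 MeV, neutral pion about 134.977 MeV) are approximate values supplied for the illustration, not taken from the text.

```python
def threshold_energy(m_bullet, m_target, product_masses):
    """Threshold total energy E_B of the bullet in the target's rest frame,
    following Eq. (62).

    All arguments are rest energies m0*c^2 in one common unit (MeV below),
    so the explicit factors of c drop out.
    """
    total = sum(product_masses)
    return (total**2 - m_bullet**2 - m_target**2) / (2.0 * m_target)

# Approximate rest energies in MeV (values supplied for this sketch).
PROTON = 938.272
PION_0 = 134.977

e_b = threshold_energy(PROTON, PROTON, [PROTON, PROTON, PION_0])
kinetic = e_b - PROTON          # kinetic energy the incoming proton must carry
efficiency = PION_0 / kinetic   # fraction that ends up as pion rest energy
print(kinetic, efficiency)
```

With these inputs the incoming proton needs a kinetic energy of roughly 280 MeV, about twice the pion's rest energy, so slightly less than half of it goes into the new particle.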

The Compton Effect

An extraordinary validation of Einstein's idea that photons can behave mechanically like little billiard balls with (relativistic) mass and momentum was provided by Compton's famous scattering experiment of 1922, in which X-ray photons were the bullets and electrons in graphite surfaces the target.

Suppose a photon of frequency \({\nu }\) strikes a stationary electron of rest-mass \({m_{0} }\) and comes away with altered frequency \({\nu }^{\prime}\) at an angle \({\theta }\) with its incident direction. Let \( \mathbf{P}\) and \( \mathbf{P'}\) be the pre- and post-collision 4-momenta of the photon, and \(\mathbf{Q}\) and \(\mathbf{Q'}\) those of the electron. Then from the conservation equation \(\mathbf{P + Q = P' +Q'}\) we can separate out the unwanted vector \(\mathbf{Q'}\) and square to get rid of it:

\[\tag{63} {( \mathbf{P}+\mathbf{Q}- \mathbf{P^\prime})^{2}=\mathbf{Q}^{\prime 2} }. \]

By (47), \({\mathbf{Q}^{2}=\mathbf{Q'}^{2} }\ ,\) and \({ \mathbf{P}^{2}= \mathbf{P}^{\prime 2}=0,}\) so we are left with

\[\tag{64} \mathbf{P}\mathbf{.}\mathbf{P'}=\mathbf{Q}\mathbf{.}(\mathbf{P}-\mathbf{P'}), \]

from which, by reference to (48) and (53), we find at once the desired and experimentally confirmed relation

\[\tag{65} h\,c^{-2}\,\nu\, {\nu }^{\prime}(1-\cos\theta)=m_{0}\,(\nu -{\nu }^{\prime}). \]

In terms of the corresponding wavelengths \({\lambda }\ ,\) \({\lambda }^{\prime}\ ,\) the half-angle \({\theta /2}\ ,\) and the Compton wavelength \(l=h/(c m_0)\ ,\) this may be rewritten in the more familiar form

\[\tag{66} { {\lambda }^{\prime}-\lambda=\frac{2h}{c m_{0} }\,\sin^{2}(\theta /2)=2l\,\sin^{2}(\theta /2)} . \]
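As a check on the algebra, Eq. (65) can be solved for \({\nu }^{\prime}\) and the resulting increase in the photon's wavelength compared with \(2l\,\sin^{2}(\theta /2)\). The Python sketch below does this for a single illustrative case. The function names, the sample X-ray wavelength, and the SI constants (including an approximate electron rest-mass) are assumptions of the example.

```python
import math

H = 6.62607015e-34        # Planck's constant in J s
C = 299_792_458.0         # speed of light in m/s
M_E = 9.1093837015e-31    # electron rest-mass in kg (approximate)

def scattered_frequency(nu, theta, m0=M_E):
    """Solve Eq. (65) for the post-collision frequency nu'."""
    return nu / (1.0 + H * nu * (1.0 - math.cos(theta)) / (m0 * C**2))

def wavelength_shift(wavelength, theta, m0=M_E):
    """Increase of the photon wavelength caused by the scattering."""
    nu = C / wavelength
    return C / scattered_frequency(nu, theta, m0) - wavelength

compton_wavelength = H / (C * M_E)                  # l = h / (c m0)
theta = math.pi / 2                                 # scattering through 90 degrees
shift = wavelength_shift(7.1e-11, theta)            # illustrative X-ray wavelength
expected = 2.0 * compton_wavelength * math.sin(theta / 2) ** 2
print(shift, expected)
```

For scattering through a right angle the shift equals one Compton wavelength, about \(2.43\times 10^{-12}\) m, independently of the incident wavelength.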

Scattering of photons by stationary electrons is called Compton scattering and clearly always results in an energy loss for the photon. The opposite is the case in inverse Compton scattering, where a photon collides with a fast ("relativistic") electron or other charged particle, and often experiences a spectacular gain in energy. For simplicity, we shall consider only the case of a head-on collision along the \({x}\) axis. Eq.(64) is still applicable. But now

\[\tag{67} {\mathbf{Q}=\gamma (u)\,m_{0}(u,0,0,c)},\;\;\;{ \mathbf{P}=({h\nu }/c)(-1,0,0,1)}, \;\;\; \mathbf{P^\prime}=(h{\nu }^{\prime}/c)(1,0,0,1), \]

where \(u\) is the velocity of the electron. Then (64) yields

\[\tag{68} 2\,h\,c^{-2}\,{\nu}\,{\nu}^{\prime}= \nu\,{\gamma}\,m_{0}\,(1+u/c)-{\nu}^{\prime}\,{\gamma}\,{m}_{0}\,(1-u/c). \]

If we now set \({1+u/c\approx 2}\) and \({1-u/c\approx 1/(2\,\gamma ^{2})}\) (since the product is \({\gamma ^{-2} }\)), we get

\[\tag{69} \frac{ {\nu }^{\prime} }{\nu }=\frac{4\,\gamma^{2} }{1+\bigl(4\,\gamma\, h\, \nu /(m_{0}\,c^{2})\bigr)}. \]

For a low-energy photon, the second term in the denominator can be quite small, so its energy can be amplified by a factor of the order of \({\gamma ^{2} }\ .\) For example, when a photon of the cosmic microwave background (\({h\nu }\approx {10}^{-3}\,\mathrm{eV}\)) collides with a high-energy cosmic ray proton (\(m_{0}c^{2}\approx {10}^{9}\,\mathrm{eV}\ ,\) \(\gamma \approx {10}^{11}\)), its energy could be boosted to about \({10}^{19}\,\mathrm{eV}\)!
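The quality of the approximation (69) can be examined by solving (68) for \({\nu }^{\prime}\) without approximation and comparing. The Python sketch below works with energies \(h\nu\) and \(m_{0}c^{2}\) in eV. The function names are invented for the example, and the proton rest energy of about \(9.38\times 10^{8}\) eV is a value assumed here. Because \(1-u/c\) cannot be formed by direct subtraction at such large \(\gamma\) in floating-point arithmetic, it is computed from the identity \((1-u/c)(1+u/c)=\gamma^{-2}\).

```python
import math

def boosted_energy_exact(eps, gamma, rest_energy):
    """Post-collision photon energy h*nu' from Eq. (68), head-on collision.

    eps = h*nu and rest_energy = m0*c^2 must be in the same unit. The factor
    1 - u/c is computed as 1 / (gamma^2 (1 + u/c)) to avoid round-off.
    """
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    one_minus_beta = 1.0 / (gamma**2 * (1.0 + beta))
    numerator = eps * gamma * rest_energy * (1.0 + beta)
    denominator = 2.0 * eps + gamma * rest_energy * one_minus_beta
    return numerator / denominator

def boosted_energy_approx(eps, gamma, rest_energy):
    """The same quantity from the ultra-relativistic approximation, Eq. (69)."""
    return eps * 4.0 * gamma**2 / (1.0 + 4.0 * gamma * eps / rest_energy)

EPS_CMB = 1.0e-3        # photon energy in eV, as in the example above
GAMMA = 1.0e11          # Lorentz factor of the proton, as in the example above
PROTON = 9.38272e8      # proton rest energy in eV (assumed value)

exact = boosted_energy_exact(EPS_CMB, GAMMA, PROTON)
approx = boosted_energy_approx(EPS_CMB, GAMMA, PROTON)
print(exact, approx)
```

Both expressions give a few times \(10^{19}\) eV for these inputs. Here the second term in the denominator of (69) is about 0.4, so the gain falls somewhat short of the full factor \(4\gamma^{2}\).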

Four-force and three-force

There are at hand only two reasonable definitions for the 4-force \(\mathbf{F}\) on a particle, \({\mathbf{F}=m_{0}\mathbf{A} }\) or \({\mathbf{F}=(d/{d\tau }) \mathbf{P} }\ .\) The accepted choice is the latter:

\[\tag{70} \mathbf{F}=\frac{d}{ {d\tau } } \mathbf{P}=\frac{d}{ {d\tau } }(m_{0}\mathbf{U})=m_{0}\mathbf{A}+\frac{ {dm}_{0} }{ {d\tau } }\mathbf{U}, \]

though the two coincide when \(m_{0}=\mathrm{const.}\) We then speak of a rest-mass-preserving force, which will be the expected norm. It particularly applies to the Lorentz force of electrodynamics. From (70), with (26),(40), and (45), we have

\[\tag{71} \mathbf{F}=\frac{d}{ {d\tau } } \mathbf{P}=\gamma (u)\frac{d}{ { {dt} } }( \mathbf{p},{m}_u{c})=\gamma (u)(\mathbf{f},\frac{1}{c}\frac{ {dE} }{ {dt} }) , \]

where we have introduced the relativistic 3-force \(\mathbf{f}\) defined by

\[\tag{72} \mathbf{f}=\frac{ {d} \mathbf{p} }{ {dt} }=\frac{d({m}_u\mathbf{u})}{ {dt} }. \]

Note that the power \({dE/dt}\) is the complement of \(\mathbf{f}\) in the formation of \(\mathbf{F}\ ,\) just as the energy \({E}\) is the complement of \( \mathbf{p}\) in the formation of \( \mathbf{P}\ .\)

From (70) we find, by use of (31) and (32), the first of the following equations; the second results from forming the scalar product (cf. Equation (18)) of the right-most member of (71) with \({\mathbf{U}=\gamma\, (\mathbf{u},c)}\ :\)

\[\tag{73} \mathbf{F}\mathbf{.}\mathbf{U}=c^{2}\frac{ {dm}_{0} }{d\tau }=\gamma ^{2}(u)(\frac{ {dE} }{ {dt} }-\mathbf{f}\mathbf{.}\mathbf{u}). \]

This shows that \(\mathbf{F.U}\) is the proper rate at which the particle's internal energy is being increased. If the force is rest-mass-preserving – as will be assumed from now on – it thus satisfies

\[\tag{74} \mathbf{F.U} = 0,\;\;\; \mathbf{f}\mathbf{.}\mathbf{u}=\frac{ {dE} }{ {dt} },\;\;\; \text{and}\;\;\;{\mathbf{F}=\gamma (u)(\mathbf{f},\mathbf{f}\mathbf{.}\mathbf{u}/c)}, \]

where, for the last equation, we once again used (71). In particular, multiplying the middle equation by \({dt}\) we see that \(\mathbf{f}\) satisfies the Newtonian relation

\[\tag{75} \mathbf{f}\mathbf{.}\,{d}\mathbf{r }= {dE. } \]

But not Newton's second law: for, by (72) and (74)(ii),

\[\tag{76} \mathbf{f}={m}_u\mathbf{a}+\frac{ {d}{m}_u}{ {dt} }\mathbf{u} ,\;\;\;{m}_u\mathbf{a}=\mathbf{f}-\frac{\mathbf{f}\mathbf{.}\mathbf{u} }{c^{2} }\mathbf{u}. \]

Now \(\mathbf{a}\) is necessarily coplanar with \(\mathbf{f}\) and \(\mathbf{u}\ ,\) but it is parallel to \(\mathbf{f}\) only when \(\mathbf{u}\) is either parallel or orthogonal to \(\mathbf{f}\ .\)
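The second equation of (76) is easy to explore numerically. The following Python sketch, in units with \(c=1\) and with unit rest-mass, computes \(\mathbf{a}\) for a force along the \(x\) axis and three sample velocities. The helper names and sample values are illustrative assumptions only.

```python
import math

def dot(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def acceleration(f, u, m0=1.0):
    """3-acceleration from Eq. (76): m_u a = f - (f.u) u / c^2, with c = 1
    and m_u = gamma(u) m0 (rest-mass-preserving force assumed)."""
    gamma = 1.0 / math.sqrt(1.0 - dot(u, u))
    f_dot_u = dot(f, u)
    return tuple((fi - f_dot_u * ui) / (gamma * m0) for fi, ui in zip(f, u))

def is_parallel(a, b, tol=1e-12):
    """True if the cross product of a and b vanishes to within tol."""
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    return all(abs(c) < tol for c in cross)

f = (1.0, 0.0, 0.0)
a_par = acceleration(f, (0.6, 0.0, 0.0))     # u parallel to f
a_perp = acceleration(f, (0.0, 0.6, 0.0))    # u orthogonal to f
a_obl = acceleration(f, (0.36, 0.48, 0.0))   # oblique u, same speed 0.6
print(a_par, a_perp, a_obl)
print(is_parallel(a_par, f), is_parallel(a_perp, f), is_parallel(a_obl, f))
```

In the parallel case the result is \(\mathbf{f}/(\gamma^{3}m_{0})\) and in the orthogonal case \(\mathbf{f}/(\gamma m_{0})\), the two values sometimes referred to as longitudinal and transverse mass behaviour. In the oblique case \(\mathbf{a}\) stays in the plane of \(\mathbf{f}\) and \(\mathbf{u}\) but is no longer parallel to \(\mathbf{f}\).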

The important transformation of the 3-force \(\mathbf{f}\) under a standard change of inertial frames – analogous to the transformations Equations 25 and 34 in SR:kinematics of \(\mathbf{u}\) and \(\mathbf{a}\) – is most easily obtained by applying the transformation pattern (13) to the 4-vector \(\mathbf{F }\) in (74)(iii). We shall write \({D}\) for \({1-u_{1}v/c^{2} }\) as we did in Equations 31 - 34 in SR:kinematics. We also need the formula

\[\tag{77} {\frac{\gamma ({u}^{\prime})}{\gamma (u)}=\gamma (v)D} , \]

whose derivation was outlined in the remark following (27). This is what then results:

\[\tag{78} {f}_{1}^{\prime}=\frac{f_{1}-{v}\mathbf{f}\mathbf{.}\mathbf{u}/c^{2} }{D} ,\;\;\; {f}_{2}^{\prime}=\frac{f_{2} }{ {\gamma D} } ,\;\;\; {f}_{3}^{\prime}=\frac{f_{3} }{ {\gamma D} } ,\;\;\; \gamma =\gamma (v){.} \]

Note that the transformed force in general depends not only on the original force but also on the velocity \(\mathbf{u}\) of the particle on which the force acts. Thus a velocity-independent force (like Newton's gravitational force field) is no longer a Lorentz-invariant concept. The velocity-dependent Lorentz force of electromagnetism, on the other hand, is a typical relativistic force. Nevertheless, Newton's relation \(\mathbf{f'}= \mathbf{f}\) still holds in the purely one-dimensional case, that is, among IFs with mutual velocity along the common direction of \(\mathbf{f}\) and \(\mathbf{u}\ .\) For, let that be the \({x}\) direction, so that \({f_{2}=f_{3}=0}\) and consequently \({f}_{2}^{\prime}={f}_{3}^{\prime}=0\ ;\) but \(\mathbf{f.u}\) is now \({f_{1}u_{1} }\ ,\) whence \({f}_{1}^{\prime}=f_{1}\) also. As an example, consider a parallel constant electric field and a charged particle moving in the direction of the field lines. In its rest-frame it always feels the same force, to which, by (76) with \(\mathbf{u} = 0\) and \({ {m}_u=m_{0} }\ ,\) it responds with constant proper acceleration, that is, with hyperbolic motion. (The rest-frame is an important concept throughout SR: it is always an IF and thus always moves uniformly, so that an accelerating particle co-moves with its rest-frame for only one instant. But at that instant the rest-frame measures the particle's proper acceleration.)
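The transformation (78) can be cross-checked against the route by which it was obtained, namely a Lorentz transformation of the 4-force in (74)(iii) followed by division by \(\gamma(u')\). The Python sketch below does both in units with \(c=1\). The function names and the sample force and velocities are hypothetical, and the velocity transformation used is the standard one referred to above as Equation 25 of SR:kinematics.

```python
import math

def gamma(speed_sq):
    """Lorentz factor for a given squared speed (c = 1)."""
    return 1.0 / math.sqrt(1.0 - speed_sq)

def dot(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def transform_force(f, u, v):
    """Eq. (78): 3-force in S', which moves with velocity v along x in S."""
    d = 1.0 - u[0] * v
    g = gamma(v * v)
    return ((f[0] - v * dot(f, u)) / d, f[1] / (g * d), f[2] / (g * d))

def transform_force_via_four_force(f, u, v):
    """Same result by Lorentz-transforming F = gamma(u) (f, f.u)."""
    gu, gv = gamma(dot(u, u)), gamma(v * v)
    big_f = (gu * f[0], gu * f[1], gu * f[2], gu * dot(f, u))
    # Standard Lorentz transformation of the 4-vector components.
    big_f_prime = (gv * (big_f[0] - v * big_f[3]), big_f[1], big_f[2],
                   gv * (big_f[3] - v * big_f[0]))
    # Velocity of the particle in S' (standard velocity transformation).
    d = 1.0 - u[0] * v
    u_prime = ((u[0] - v) / d, u[1] / (gv * d), u[2] / (gv * d))
    gu_prime = gamma(dot(u_prime, u_prime))
    return tuple(component / gu_prime for component in big_f_prime[:3])

f, u, v = (1.0, 2.0, -0.5), (0.3, 0.4, 0.0), 0.5   # illustrative values
direct = transform_force(f, u, v)
via_four_force = transform_force_via_four_force(f, u, v)
one_dim = transform_force((2.0, 0.0, 0.0), (0.3, 0.0, 0.0), v)
print(direct, via_four_force, one_dim)
```

The two routes agree to rounding accuracy, and in the one-dimensional case the force comes out unchanged, in line with the remark that \(\mathbf{f'}=\mathbf{f}\) there.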


All the figures in this article are taken from the author's book "Relativity: Special, General, and Cosmological" (2nd ed., 2006) Oxford University Press, by kind permission of the publishers.


References

  • Einstein, A. (1905) Annalen der Physik, 17, 891
  • Ignatowski, W. V. (1910) Phys. Zeits., 11, 972
  • Minkowski, H. (1908) Göttinger Nachr. 53. English translation in Lorentz, Einstein, Minkowski and Weyl (1923) The Principle of Relativity, Methuen/Dover.
  • Rindler, W. (1966) Special Relativity (2nd. ed.), Oliver and Boyd
  • Rindler, W. (1979) Essential Relativity (2nd. ed.), p. 51, Springer-Verlag
  • Rindler, W. (1991) Introduction to Special Relativity (2nd. ed.), Oxford U. P.
  • Rindler, W. (2006) Relativity: Special, General, and Cosmological (2nd. ed.), Oxford U. P.

Further reading

  • "Spacetime Physics", E. F. Taylor and J. A. Wheeler (W.H.Freeman, 1992)
  • "Special Relativity", A.P. French (Norton, 1968)
  • "Special Theory of Relativity", C. W. Kilmister (Pergamon Press, 1970)
  • "Special Relativity", W. G. Dixon (Cambridge University Press, 1978)
  • "Relativity: The Special Theory", J. L. Synge (North Holland, 1956)

See also

Special relativity: kinematics, Special relativity: electromagnetism
