# Special relativity: mechanics

Wolfgang Rindler (2012), Scholarpedia, 7(1):10905. | doi:10.4249/scholarpedia.10905 | revision #137229

$
\newcommand{\sp}[2]{\mathbf{#1\,.\!#2} }
\newcommand{\SP}[2]{\mathbf{#1.\!#2} }
$
**Special relativity** (SR) is a physical theory based on Einstein's Relativity Principle, which states that all laws of physics (including, for example, electromagnetism, optics, thermodynamics, etc.) should be equally valid in all inertial frames; and on Einstein's additional postulate that the speed of light should be the same in all inertial frames. The present article describes mechanics in Special relativity. (See the article "Special relativity: kinematics" (SR:kinematics) for the prerequisites to the present article, and "Special relativity: electromagnetism" for the further development of the theory.)


## Spacetime and 4-vectors

Spacetime is of
great conceptual and practical utility in SR, and, in its generalized
form, essential in general relativity. We shall need it in our
development of relativistic mechanics. The idea of spacetime is simple
enough. It is the space of all events. In fact, our Figures 3 and 4 in SR:kinematics are maps of 2-dimensional spacetime, namely of the events
(\({x,t}\)) taking place on the spatial \({x}\) axis of some
frame \(S\ .\) We can imagine (though not *draw*, in the three
dimensions available to us) two more axes, of \({y}\) and
\({z}\ ,\) allowing us then to represent as points *all*
the events (\({x,y,z,t}\)) in the world – or rather, in our SR
model of the world, coordinatized by an inertial frame \(S\ .\)

Clearly, the same thing can be done in Newtonian theory. But Newtonian spacetime as a 4-space lacks the one important feature that makes, for example, Euclidean 3-space so interesting and useful: the ability to support a versatile vector calculus. We shall review 3-vectors as a preliminary to the 4-vectors of SR.

### A detour: 3-vectors

The 3-vector
calculus rests on the existence in Euclidean 3-space of a
*metric*,

\[\tag{1} {\Delta r}^{2}={\Delta x}^{2}+{\Delta y}^{2}+{\Delta z}^{2} \]

which is invariant under rotations and translations of the Cartesian coordinates. We define a 3-vector \(\mathbf{a}\) (a vector, for short) as a 3-component quantity

\[\tag{2} \mathbf{a} = {(a_{1},a_{2},a_{3})} \]

which can be represented invariably by a displacement

\[\tag{3} {\Delta } \mathbf{r} = {({\Delta x},{\Delta y},{\Delta z})} \]

and which must therefore *transform like a displacement* under rotations and translations. Then such a vector
has a *magnitude* \(a\), or \({|\mathbf{a}|}\ ,\) defined by

\[\tag{4} {|\mathbf{a}|} = {a=\sqrt{a_{1}^{2}+a_{2}^{2}+a_{3}^{2} } }, \]

which must be invariant, since that of the
corresponding displacement is invariant. If \({k}\) is a scalar
(i.e. a number which is the same in all inertial frames), we can define
a *scalar multiple* of a vector,

\[\tag{5} {k}\,\mathbf{a} = (k\,{a}_{1}, k\,{a}_{2}, k\,{a}_{3}), \]

which is obviously also a vector. A *sum* of vectors is
defined like a sum of displacements,

\[\tag{6} \mathbf{a} + \mathbf{b} = {(a_{1}+b_{1},a_{2}+b_{2},a_{3}+b_{3})}, \]

and is itself a vector. Therefore its squared magnitude

\[ \begin{array}{rcl} |\mathbf{a}+\mathbf{b}|^{2} &=& (a_{1}+b_{1})^{2}+(a_{2}+b_{2})^{2}+(a_{3}+b_{3})^{2} \\ ~ &=& (a_{1}^{2}+a_{2}^{2}+a_{3}^{2})+(b_{1}^{2}+b_{2}^{2}+b_{3}^{2})+2\,(a_{1}b_{1}+a_{2}b_{2}+a_{3}b_{3}) \end{array} \]

must be invariant. In this equation all items except the last
are known to be invariant, and so the last must be invariant too.
This allows us to define an invariant *(scalar) product* of
two vectors:

\[\tag{7} \sp{a}{b} = {a_{1}b_{1}+a_{2}b_{2}+a_{3}b_{3} }. \]

It is easily seen to obey the commutative law \(\sp{a}{b} = \sp{b}{a}\ ,\) and the distributive law \(\mathbf{a}\mathbf{.}(\mathbf{b+c})= \sp{a}{b} +\sp{a}{c}\ .\) Also note that \(\sp{a}{a}= {a^{2} }\ ,\) which is usually written

\[\tag{8} {\mathbf{a}^{2}=a^{2} }. \]
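As a quick numerical illustration of the invariance argument leading to (7), the following Python sketch rotates two arbitrary vectors about the \(z\)-axis and compares their scalar product before and after. The helper names `rotate_z` and `dot` and the sample values are assumptions chosen purely for illustration.

```python
import math

def rotate_z(a, angle):
    """Rotate the 3-vector a about the z-axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * a[0] - s * a[1], s * a[0] + c * a[1], a[2])

def dot(a, b):
    """Scalar product (7) of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

a = (1.0, 2.0, 3.0)
b = (-4.0, 0.5, 2.0)
angle = 0.7  # arbitrary rotation angle

a_rot, b_rot = rotate_z(a, angle), rotate_z(b, angle)

print(dot(a, b), dot(a_rot, b_rot))  # equal up to rounding
print(dot(a, a), dot(a_rot, a_rot))  # the squared magnitude (8) is unchanged too
```

A rotation about a single axis is used only to keep the sketch short; any rotation matrix would serve equally well.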

It must be stressed that not every quantity that allows itself to be represented by a number triple \({(a_{1},a_{2},a_{3})}\) is a vector. To be a vector it must transform (under rotations and translations) like the displacement \({ {\Delta \mathbf{r} } }\) and thus like the differential \(d\mathbf{r} = ({dx,dy,dz}) = {({dx}_{1},{dx}_{2},{dx}_{3})}\ ,\) where we have introduced the index notation for the three coordinates:

\[\tag{9} d{x}_{i}^{\prime}=\sum_j\frac{\partial {x}_{i}^{\prime} }{\partial x_{j} }{ {dx}_{j} },\;\;\; {a}_{i}^{\prime}=\sum_j\frac{\partial {x}_{i}^{\prime} }{\partial x_{j} }a_{j}. \]

Consider a moving particle, \({x_{i}=x_{i}(t)}\ ,\) where \({t}\) is absolute time. Dividing (9)(i) by the scalar \({dt}\ ,\) we see that the velocity \(\mathbf{u} = { {dx}_{i}/{dt} }\) is a vector. Consider any vector field \(\mathbf{a} = {a_{i} }( {\tau})\ ,\) \({\tau}\) being any scalar. Now, under the linear transformations between two given coordinate systems (rotations are linear), the partial derivatives in (9)(ii) are constants. Hence, by differentiating (9)(ii) with respect to \({\tau }\ ,\) we see that scalar derivatives of vectors, defined by \(d\mathbf{a}/d\tau = da_{i}/d\tau\ ,\) are vectors themselves. So all four of the basic vectors of mechanics, velocity \(\mathbf{u}={dx}_{i}/{ {dt} }\ ,\) acceleration \(\mathbf{a}={du}_{i}/{ {dt} }\ ,\) momentum \(\mathbf{p} = m\mathbf{u}\ ,\) and force \(\mathbf{f} = m\mathbf{a}\ ,\) are indeed vectors.

From the component-wise definition of derivatives we can easily check the Leibniz rule

\[\tag{10} {d} (\mathbf{a}\mathbf{.}\mathbf{b}) = (d\mathbf{a})\mathbf{.}\mathbf{b}+\mathbf{a}\mathbf{.}d\mathbf{b}\,. \]

As a particular case, since \(\sp{a}{a} = {a^{2} }\) (or \({\mathbf{a}^{2}=a^{2} }\)), we get a most useful result by differentiating this identity:

\[\tag{11} \sp{a}{ {d}a}={a}\,{da}. \]

### The 4-vector calculus

The 4-vector calculus can be developed in almost complete analogy with 3-vector calculus, as was first shown, and utilized to great effect, by Hermann Minkowski (Minkowski 1908). Four-vector calculus plays out on spacetime, which, in its Minkowskian quadratic form (Equation 8 of SR: kinematics), possesses a natural metric. Note that nothing comparable is available to us in Newtonian spacetime; the only invariant there is the degenerate quadratic \({ {\Delta t}^{2} }\ .\)

Four-vectors have one glaring difference from 3-vectors: the basic quadratic form (Equation 8 of SR: kinematics) is "indefinite", that is to say, it can be positive, negative, or zero, for completely ordinary pairs of events. It is therefore not to be regarded as the square of a number, but that hardly affects the calculus. For dimensional reasons we now take \({ct}\) rather than just \({t}\) as the fourth coordinate, and accordingly write the fundamental displacement vector as

\[\tag{12} \Delta \mathbf{s}=({\Delta x},{\Delta y},{\Delta z},c\,{\Delta t}). \]

This serves as the prototype for *all*
4-vectors \(\mathbf{A= } {(A_{1},A_{2},A_{3},A_{4})}, \) which therefore transform under general LTs (Poincaré
transformations) like the coordinate \({\Delta }\)'s. Under
standard LTs (Equation 4 of SR: kinematics), in particular, we have

\[\tag{13} {A}_{1}^{\prime}=\gamma\, [A_{1}-(v/c)A_{4}],\;\;\; {A}_{2}^{\prime}=A_{2},\;\;\; {A}_{3}^{\prime}=A_{3},\;\;\; {A}_{4}^{\prime}=\gamma\, [A_{4}-(v/c)A_{1}]\;\;\; \]

Note also from (12) that under pure spatial rotations (keeping \({t}\) constant) the first three components of a 4-vector transform like a 3-dimensional displacement, and hence like a 3-vector. For this reason one often lumps together the first three components into a 3-vector and writes

\[\tag{14} \mathbf{A}=(\mathbf{a},A_{4}). \]

Our usual convention is to denote 4-vectors with bold capitals and 3-vectors with bold lower case letters.

Making a conventional sign
choice (some authors make the opposite choice), we define our **metric**
as

\[\tag{15} \Delta \mathbf{s}^{2}=c^{2}{\Delta t}^{2}-{\Delta x}^{2}-{\Delta y}^{2}-{\Delta z}^{2}. \]

We shall regard it, as we justify below, as the square of a vector – the "squared displacement" – rather than as the square of a possibly complex number.

In analogy with (5),
(6), and particularly (7), we define the **sum**, **scalar multiple**, and
**product** of 4-vectors:
\[\tag{16}
\mathbf{A}+\mathbf{B}=(A_{1}+B_{1},A_{2}+B_{2},A_{3}+B_{3},A_{4}+B_{4}),
\]
\[\tag{17}
k\,\mathbf{A} = (k\,A_1,k\,A_2,k\,A_3,k\,A_4),
\]
\[\tag{18}
\mathbf{A.B}= (A_{4}B_{4}-A_{1}B_{1}-A_{2}B_{2}-A_{3}B_{3}).
\]
Analogously to the corresponding 3-vector
results, we have

\[\tag{19} \SP{A}{B} = \SP{B}{A},\;\;\;\mathbf{A}\mathbf{.}(\mathbf{B}+\mathbf{C}) = \SP{A}{B} + \SP{A}{C}. \]

When \(\SP{A}{B} = 0\ ,\) we say that the vectors
are **orthogonal**. As in the case of 3-vectors, we
write

\[\tag{20} {\mathbf{A}^{2} } = \SP{A}{A} = {A_{4}^{2} }-{A_{1}^{2} }-{A_{2}^{2} }-{A_{3}^{2} } \]

and define the **magnitude** as

\[\tag{21} {A =} {|\mathbf{A}^{2}|^{1/2}\ge 0}. \]

We see from (20) the justification of regarding (15) as a squared vector. Its magnitude,

\[\tag{22} { {\Delta s}=|c^{2}{\Delta t}^{2}-{\Delta x}^{2}-{\Delta y}^{2}-{\Delta z}^{2}|^{1/2} } , \]

is called the **interval** between the
corresponding events.
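The invariance of the product (18) and of the magnitude (21) under the standard LT (13) can be spot-checked numerically. The sketch below assumes units with \(c = 1\) and stores the components in the order \((A_1, A_2, A_3, A_4)\) used in the text; the function names and the sample vectors are illustrative assumptions.

```python
import math

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def mdot(A, B):
    """Minkowski product (18); the time-like component is stored last."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def magnitude(A):
    """Magnitude (21): square root of the absolute value of the square."""
    return math.sqrt(abs(mdot(A, A)))

A = (1.0, 2.0, 0.5, 4.0)
B = (0.3, -1.0, 2.0, 1.5)
v = 0.6  # frame velocity in units with c = 1

A_p, B_p = boost_x(A, v), boost_x(B, v)
print(mdot(A, B), mdot(A_p, B_p))    # same value in both frames
print(magnitude(A), magnitude(A_p))  # same magnitude in both frames
```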

The sign of
\({\mathbf{A}^{2} }\) is significant. We divide all 4-vectors into three types
– **timelike**, **null**, and **spacelike** –
according to the sign of their squares:

\[\tag{23} {\mathbf{A}^{2}>0}: \textrm{timelike, }\;\;\; {\mathbf{A}^{2}=0}: \textrm{null,} \;\;\; {\mathbf{A}^{2}<0}: \textrm{spacelike} \]

Figure 1 is a spacetime diagram (with one dimension more than the Minkowski diagrams 3,4,5 of SR:kinematics, but the \(z\)-dimension still suppressed) of the neighborhood of some event \(\mathcal{P}\ .\) It clarifies the geometry implicit in (23). If we write our earlier Equation 11 of SR: kinematics with \({\Delta
}\)'s rather than \({d }\)'s, we see from just one of its sides
that a straight worldline segment with \({ {\Delta s}^{2}>0}\)
(representing a timelike vector) corresponds to a velocity \({u}
< c\ ,\) while null and spacelike displacements correspond,
respectively, to \({u} = c\) and \({u} > c\ .\)
Geometrically, therefore, these vectors subtend angles less than,
equal to, and greater than \({ {45}^{\circ} }\) with the
\({t}\)-direction. We say that they lie inside,
on, or outside the *light-cone*. Importantly, *no LT*
can change the sign of \({A_{4} }\) if
\({\mathbf{A}^{2}\ge 0}\ .\) For all observers agree on the
time sequence along a signal as long as \({u\le c}\ .\) A vector
inside (or on) the top half of the cone therefore stays there under any
active LT (see, for example, Fig. 6 of SR:kinematics). We speak of such vectors as **future-pointing** and
of that region as the **absolute future** of the vertex
event. Similarly the lower half is called the **absolute past.** The events outside the cone constitute the
*"present"* – each of them can be reached by a superluminal
signal from the vertex, which has *infinite* speed in
*some* IF, in which the event is then simultaneous with the
vertex.
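The statement about the sign of \(A_4\) lends itself to a small numerical experiment. The sketch below (units with \(c = 1\), hypothetical helper names, arbitrarily chosen sample vectors and boost speeds) classifies vectors according to (23) and records whether the fourth component stays positive under several standard LTs.

```python
import math

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def classify(A, tol=1e-12):
    """Type of a 4-vector according to the sign of its square, as in (23)."""
    s = A[3] ** 2 - A[0] ** 2 - A[1] ** 2 - A[2] ** 2
    if s > tol:
        return "timelike"
    if s < -tol:
        return "spacelike"
    return "null"

timelike = (1.0, 0.0, 0.0, 2.0)
spacelike = (2.0, 0.0, 0.0, 1.0)
speeds = [-0.9, -0.6, 0.0, 0.6, 0.9]

# Is the 4th component still positive after each boost?
timelike_signs = [boost_x(timelike, v)[3] > 0 for v in speeds]
spacelike_signs = [boost_x(spacelike, v)[3] > 0 for v in speeds]

print(classify(timelike), timelike_signs)    # sign never changes
print(classify(spacelike), spacelike_signs)  # sign flips for large enough v
```

For the spacelike sample the fourth component changes sign once \(v/c\) exceeds \(1/2\), which is the frame dependence of time order described above.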

One important
fact to bear in mind about spacetime diagrams like Figure 1 here and Figs. 3,4 and 5 of SR:kinematics is that they
are mere *maps* of spacetime with its events and vectors, not
the spacetime itself. It is like mapping the surface of the earth
onto a Euclidean plane: since that surface also has a non-Euclidean
metric, the map necessarily distorts its true metric relations. So in
a spacetime diagram, for example, vectors that are orthogonal in the
Minkowski sense \(\SP{A}{B} = 0\ ,\) like \((2,0,0,1)\)
and \((1,0,0,2)\ ,\) or the \({x}^{\prime}\) and
\({t}^{\prime}\) axes in Fig. 4 of SR:kinematics, do not necessarily look orthogonal. And
vectors that have the same Minkowski magnitude (length)
\({|\mathbf{A}^{2}|^{1/2} }\) do not necessarily look as if they did, like
\((0,0,0,3)\) and \((4,0,0,5)\ ,\) or the whole set of vectors \(\mathcal{OB}\) in Fig. 5 of SR:kinematics, joining the origin to the calibrating hyperbola. Only in
the usual diagrams of Euclidean geometry do map and space coincide.
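The two numerical examples in this paragraph can be verified directly. The sketch below compares the Minkowski product and magnitude with the Euclidean quantities one would read off a diagram drawn in the \((x, ct)\) plane; treating the page as that plane, and the helper names, are assumptions made for illustration.

```python
import math

def mdot(A, B):
    """Minkowski product (18), components ordered (A1, A2, A3, A4)."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def magnitude(A):
    """Minkowski magnitude (21)."""
    return math.sqrt(abs(mdot(A, A)))

def drawn_dot(A, B):
    """Euclidean product of the (x, ct) components, i.e. what the page shows."""
    return A[0] * B[0] + A[3] * B[3]

def drawn_length(A):
    """Euclidean length of the arrow as drawn in the (x, ct) plane."""
    return math.hypot(A[0], A[3])

A, B = (2, 0, 0, 1), (1, 0, 0, 2)
print(mdot(A, B), drawn_dot(A, B))  # Minkowski product 0, drawn product 4

C, D = (0, 0, 0, 3), (4, 0, 0, 5)
print(magnitude(C), magnitude(D))        # equal Minkowski magnitudes
print(drawn_length(C), drawn_length(D))  # unequal lengths on the page
```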

### The zero-component lemma

The important lemma here referred to (and whose 3-dimensional analog is intuitive) is this: If a 4-vector has a particular one of its four components zero in all IFs, then the entire vector must vanish. Let the vector be \({A=(A_{1},A_{2},A_{3},A_{4})}\ ,\) and suppose, typically, that \({A_{2} }\) is always zero. Consider a LT in the \({y}\)-direction out of any IF \(S\ :\) \({A}_{2}^{\prime}=\gamma\, [A_{2}-(v/c)A_{4}]\ .\) This shows that also \({A_{4} }\) must always vanish. But then a LT in the \({x}\)-direction, \({A}_{4}^{\prime}=\gamma\, [A_{4}-(v/c)A_{1}]\) shows that \({A_{1} }\) must always vanish, as must \({A_{3} }\) by an analogous argument.
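The first step of the argument can be made concrete with one number. In the sketch below (units with \(c = 1\); `boost_y` is a hypothetical helper implementing a LT in the \(y\)-direction), a vector with \(A_2 = 0\) but \(A_4 \neq 0\) in \(S\) acquires a nonzero second component in \(S'\), so \(A_2\) can vanish in all IFs only if \(A_4\) does too.

```python
import math

def boost_y(A, v, c=1.0):
    """LT in the y-direction: mixes A2 with A4, as used in the lemma."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (A1, g * (A2 - (v / c) * A4), A3, g * (A4 - (v / c) * A2))

A = (0.0, 0.0, 0.0, 1.0)   # A2 = 0 in S, but A4 is not zero
A_prime = boost_y(A, 0.5)

print(A_prime[1])  # nonzero: A2 does not remain zero in S'
```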

### Four-velocity and four-acceleration

General LTs are linear transformations, and so the coordinate differentials \({ {dx}_{\mu } }\) and the 4-vector components \({A_{\mu } }\) (with Greek indices taking the values 1,2,3,4) satisfy equations analogous to (9). As in 3-vector theory, we define differentials and scalar derivatives of 4-vectors by the corresponding operations on the components. From the analog of (9)(i) it then follows that the scalar derivative \({dx}_{\mu }/{d\tau }\) of the coordinates is a 4-vector, and that the scalar derivative \({dA}_{\mu }/{d\tau }\) of any 4-vector is a 4-vector. But now the coordinate \({t}\) is no longer an invariant scalar. What is?

In particle mechanics we are concerned with particle worldlines. An important invariant – just from physical considerations – along such a worldline is the elapsed proper time \(\tau\) (Equation 22 of SR: kinematics) indicated by an imaginary ideal clock carried by the particle. From the definition (22) of the interval, this proper time \({\tau }\) is the same (except for a factor \(1/{c}\)) as the evidently invariant cumulative interval along the worldline. For, expressing (22) in terms of differentials and, for a particle, assuming \((dx^{2}+dy^{2}+dz^{2})/dt^{2}=u^{2}\le c^{2}\ ,\) we have

\[\tag{24} s=\int(1-u^{2}/c^{2})^{1/2}c\,dt=c\tau. \]

Hence we can define the particle's 4-velocity vector \(\mathbf{U}\) and its 4-acceleration vector \(\mathbf{A}\) by the equations

\[\tag{25} \mathbf{U}=\frac{ {dx}_{\mu } }{ {d\tau } },\;\;\; \mathbf{A}=\frac{ {d}\mathbf{U} }{ {d\tau } }=\frac{ {dU}_{\mu } }{ {d\tau } }=\frac{d^{2}x_{\mu } }{ {d\tau }^{2} }. \]

Now, from (24) we have the important equation

\[\tag{26} \frac{ {dt} }{ {d\tau } }=\bigl(1-\frac{u^{2} }{c^{2} }\bigr)^{-1/2}=\gamma (u). \]
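The integral (24) is easy to evaluate numerically for any given speed history \(u(t)\). The following sketch uses a simple midpoint rule in units with \(c = 1\); the function name `proper_time`, the step count, and the oscillating speed profile are assumptions made only for illustration.

```python
import math

def proper_time(speed, t_start, t_end, steps=10_000, c=1.0):
    """Midpoint-rule estimate of tau = integral of sqrt(1 - u^2/c^2) dt, cf. (24)."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for k in range(steps):
        t_mid = t_start + (k + 0.5) * dt
        u = speed(t_mid)
        total += math.sqrt(1.0 - (u / c) ** 2) * dt
    return total

# Uniform motion at u = 0.6c for 10 units of coordinate time: tau = 10 / gamma = 8.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0)

# A hypothetical speed profile oscillating between 0 and 0.8c.
tau_varying = proper_time(lambda t: 0.8 * abs(math.sin(t)), 0.0, 10.0)

print(tau_uniform)  # 8 up to rounding
print(tau_varying)  # less than the 10 units of coordinate time elapsed
```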

Since \({dx}/{d\tau }=({dx}/{dt})({dt}/{d\tau })=u_{1}\gamma(u)\ ,\) etc., and \({x_{4}={ {ct} } }\ ,\) we see that

\[\tag{27} \mathbf{U} = \gamma (u)\,(u_{1},u_{2},u_{3},c)=\gamma (u)\,(\mathbf{u},c). \]

Note that a particle moving with velocity \({u}\) has a different \({\gamma }\)-factor \({\gamma (u)}\) in each IF. An interesting relation between its \({\gamma }\)-factors in \(S\) and \(S'\ ,\) \({\gamma (u)}\) and \({\gamma ({u}^{\prime}),}\) and \({\gamma (v)}\ ,\) the \({\gamma }\)-factor between the frames, can be obtained by applying (13)(iv) to \({U_{4} }\ .\) [See Eq.(78) below.]

The last form of \(\mathbf{U}\) in (27) is noteworthy. It foreshadows the kind of relation that often exists between a familiar 3-vector and its generalization to 4 dimensions. A somewhat more complicated relation holds between the 3-acceleration \(\mathbf{a} = d\mathbf{u}/dt\) and the 4-acceleration \(\mathbf{A}\ .\) From the definition (25)(ii) and Eqs.(26),(27) we have

\[\tag{28} \mathbf{A}=\gamma\, \frac{ {d}\mathbf{U} }{ {dt} }=\gamma\, \frac{d}{ {dt} }({\gamma\,\mathbf{u} },{\gamma\, c})=\gamma\; (\dot{\gamma }\,\mathbf{u}+{\gamma}\,\mathbf{a},\dot{\gamma }\,c), \]

where \(\dot{\gamma }={d\gamma }/{ {dt} }\ .\) An important (and easily verified) identity satisfied by \({\gamma (u)}\) is

\[\tag{29} c^{2}{d\gamma }=\gamma ^{3}{u\,du}=\gamma ^{3}\mathbf{u}{.}{d}\mathbf{u}. \]

In particular, this implies \({\dot{\gamma }=0}\) when \({u} = 0\ .\) Hence in the instantaneous rest frame of the particle (where \({u = }0\) and \({\gamma =1}\)) the components of \(\mathbf{A}\) reduce to

\[\tag{30} \mathbf{A} = (\mathbf{a}, 0). \]

So \(\mathbf{A} = 0\) if and only if the 3-acceleration in the rest frame vanishes. The 4-velocity \(\mathbf{U}\ ,\) on the other hand, never vanishes – it even has the same magnitude, \({c}\ ,\) always. (Geometrically, it represents \({c}\) times the unit tangent vector to the worldline.) In fact, we have

\[\tag{31} {\mathbf{U}^{2}=c^{2} },\;\;\;{\mathbf{A}^{2}=-\alpha ^{2} }, \]

\({\alpha }\) being the proper acceleration, namely the magnitude of the 3-acceleration in the particle's rest frame. The simplest way to get these squares (since they are invariants!) is to calculate them in the rest frame. By reference to (23), we see that \(\mathbf{U}\) is timelike, while \(\mathbf{A}\) is spacelike. Also the two are orthogonal,

\[\tag{32} \SP{U}{A} = 0 , \]

as is again most easily seen in the rest frame. Consider next the product \(\SP{U}{V}\) of the 4-velocities of two particles which either move uniformly or whose worldlines just cross. Calculating in the rest frame of the \(\mathbf{U}\)-particle, relative to which the \(\mathbf{V}\)-particle has velocity \({v}\ ,\) we find

\[\tag{33} \SP{U}{V} = {c^{2}\gamma (v)}. \]

This provides us with a very neat way to find the relative velocity \({v}\) of the two particles.
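The results (31) to (33) can be spot-checked from the component forms (27) and (28). The sketch below works in units with \(c = 1\), takes \(\dot{\gamma}\) from the identity (29), and inverts (33) to obtain a relative speed; all function names and sample values are illustrative assumptions.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor of a 3-velocity u."""
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / c ** 2)

def mdot(A, B):
    """Minkowski product (18), components ordered (A1, A2, A3, A4)."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def four_velocity(u, c=1.0):
    """Components (27)."""
    g = gamma(u, c)
    return (g * u[0], g * u[1], g * u[2], g * c)

def four_acceleration(u, a, c=1.0):
    """Components (28), with gamma-dot obtained from the identity (29)."""
    g = gamma(u, c)
    g_dot = g ** 3 * sum(ui * ai for ui, ai in zip(u, a)) / c ** 2
    spatial = tuple(g * (g_dot * ui + g * ai) for ui, ai in zip(u, a))
    return spatial + (g * g_dot * c,)

def relative_speed(u, v, c=1.0):
    """Invert (33): U.V = c^2 gamma(v_rel)."""
    g_rel = mdot(four_velocity(u, c), four_velocity(v, c)) / c ** 2
    g_rel = max(g_rel, 1.0)  # guard against rounding just below 1
    return c * math.sqrt(1.0 - 1.0 / g_rel ** 2)

u = (0.3, 0.4, 0.0)    # speed 0.5c
a = (0.2, -0.1, 0.5)   # arbitrary 3-acceleration
U, A = four_velocity(u), four_acceleration(u, a)

print(mdot(U, U))  # c^2 = 1, as in (31)
print(mdot(U, A))  # 0 up to rounding, as in (32)
print(mdot(A, A))  # negative: A is spacelike
print(relative_speed((0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)))  # 0.8 up to rounding
```

The last line reproduces the familiar collinear result \((0.5 + 0.5)/(1 + 0.25) = 0.8\) without any explicit velocity-addition formula.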

### The wave 4-vector

What is the equation of a set of moving parallel planes representing loci of wavecrests? A stationary plane with unit normal \(\mathbf{n} = ({l,m,n})\) at distance \(p\) from some arbitrary origin \(P = {(x_{0},y_{0},z_{0})}\) satisfies the equation \(\sp{n}{r} = p\) or

\[\tag{34} l\,(x-x_{0})+m\,(y-y_{0})+n\,(z-z_{0})=p. \]

If this plane propagates in the direction of its normal with velocity \({w},\,\) \({p}\) becomes \(w\,(t-t_{0})\ .\) And if, instead of a single plane, we have a set of equally spaced parallel planes a distance \({\lambda }\) apart, we add \({N}{\lambda }\) to \({p},\) \({N}\) running through the positive and negative integers. In \({\Delta }\)-notation, then,

\[\tag{35} l\,\Delta x+m\,\Delta y+n\,\Delta z-(w/c)\,c\,\Delta t=N\lambda. \]

In vector notation, this could be written as

\[\tag{36} \mathbf{L}\mathbf{.}\Delta\mathbf{s}=N, \]

where

\[\tag{37} {\mathbf{L}=\frac{1}{\lambda }\,\bigl(\mathbf{n},\frac{w}{c}\bigr)=\nu\, \bigl(\frac{\mathbf{n} }{w},\frac{1}{c}\bigr)} , \]

\({\lambda }\) being the wavelength and
\({\nu =w/\lambda }\) the frequency. But is \(\mathbf{L}\) really a
4-vector? The answer is yes. Proof: the wave will have a similar
equation in \(S'\ ,\) say \(\mathbf{M'}\mathbf{.}\Delta\mathbf{s'}=N\ .\) (Just
Lorentz-transform the \(S\)- \({\Delta }\)'s of (36) to \(S'\)- \({\Delta
}\)'s. By the linearity of the LTs, each \(S\)- \({\Delta }\) is a linear
form in the \(S'\)- \({\Delta }\)'s, so the result is *some* linear
form in the \(S'\)- \({\Delta }\)'s, which we can write as
\(\mathbf{M'}\mathbf{.}\Delta\mathbf{s'}.)\) But the Lorentz transform
\(\mathbf{L'}\) of \(\mathbf{L}\) must also satisfy this equation, by the
invariance of the product of vectors. Hence
\((\mathbf{M'}-\mathbf{L'})\mathbf{.}\Delta\mathbf{s'}=0\ .\) Since
\(\mathbf{M'}-\mathbf{L'}\) cannot be orthogonal to all the \(\Delta
{s}^{\prime}\ ,\) it follows that \(\mathbf{M'} = \mathbf{L'}\ .\)

We note that our results for a space-filling plane wave train can also be applied locally to an arbitrary wave train (e.g. to one spreading spherically), provided that a sufficiently small portion of it has the appearance of a plane wave train.

The recognition of \(\mathbf{L}\) as a 4-vector allows us to know how its various components transform under a Lorentz transformation, namely as in (13) above. This, in turn, allows us to calculate the important transformation equations for \(\mathbf{n}\ ,\) \(\lambda\ ,\) and \(w\) (see, for example, Rindler 2006).
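A minimal sketch of such a calculation: build \(\mathbf{L}\) from (37), transform it with the standard LT (13), and read off the new frequency, direction, and wave speed. Units with \(c = 1\) are assumed, and the function names and the two light-wave examples are illustrative choices rather than part of the original treatment.

```python
import math

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def transform_wave(nu, n, w, v, c=1.0):
    """Return (nu', n', w') in S' from the wave 4-vector (37) and the LT (13)."""
    L = (nu * n[0] / w, nu * n[1] / w, nu * n[2] / w, nu / c)
    Lp = boost_x(L, v, c)
    nu_p = c * Lp[3]                                      # 4th component is nu'/c
    k = math.sqrt(Lp[0] ** 2 + Lp[1] ** 2 + Lp[2] ** 2)   # spatial magnitude = 1/lambda'
    n_p = (Lp[0] / k, Lp[1] / k, Lp[2] / k)
    w_p = nu_p / k                                        # w' = nu' * lambda'
    return nu_p, n_p, w_p

# Light (w = c = 1) travelling along +x, seen from a frame moving at 0.6c along +x.
nu_long, n_long, w_long = transform_wave(1.0, (1.0, 0.0, 0.0), 1.0, 0.6)

# Light travelling along +y in S: its direction is tilted in S' (aberration).
nu_trans, n_trans, w_trans = transform_wave(1.0, (0.0, 1.0, 0.0), 1.0, 0.6)

print(nu_long, w_long)             # frequency halved, speed still c
print(nu_trans, n_trans, w_trans)  # frequency raised by gamma, direction tilted
```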

## Relativistic Mechanics

Special Relativity, first of all, is a new theory of space and time –
spacetime – and so far we have outlined this part of it, merely
elaborating the kinematic consequences of the LTs, augmented by the
speed-limit axiom. But SR also claims obedience – Lorentz invariance
– from all the laws of physics. As we shall see later, Maxwell's
theory (at least in vacuum) was already Lorentz invariant. And so
this very substantial part of classical physics was a welcome first
inhabitant of the new spacetime. Newton's theory, being Galileo
invariant, did not fit. Even though no experimental
shortcomings of it had yet been discovered, confidence in the new model
still required that a new Lorentz invariant mechanics be found. What
was needed was the judicious *invention* of new axioms.
There is no logically binding way to *derive* them. But an
obvious requirement is that for slow motions (compared to the speed of
light) the new mechanics must overlap with the old, since for over two
centuries that had held up to the most stringent tests. Somewhat
miraculously, the new "relativistic" mechanics was easily found, was
simple and elegant within the new 4-dimensional formalism, and
predicted hugely different results from Newton's theory for particle
collisions near the speed of light, all of which were eventually
confirmed.

The 4-vector calculus will be our guide. Any equation between two 4-vectors, \(\mathbf{A} = \mathbf{B}\) (equal components) is automatically Lorentz invariant, since the components of both sides transform equally. A special case of this is \(\mathbf{A} = 0\ ,\) 0 being a somewhat sloppy but accepted notation for the zero-vector \((0,0,0,0)\ .\) Four-vector equations are therefore natural candidates for relativistic laws (though not the only ones – some are tensor or even spinor equations.) In Newtonian theory, force is the primary concept. Relativistic mechanics can be approached from many angles. But since particle collisions play an important role in it, momentum is here perhaps the most convenient starting point.

We assume that
associated with every particle there is an intrinsic positive scalar,
\({m_{0} }\ ,\) namely its Newtonian or **rest-mass**. This
allows us to define the **4-momentum** \( \mathbf{P}\) of a particle in
analogy to its 3-momentum,

\[\tag{38} \mathbf{P = } {m_{0}\mathbf{U} }, \]

\(\mathbf{U}\) being its 4-velocity. Like \(\mathbf{U}, \mathbf{P}\) is timelike and future pointing. And we take as the basic axiom of collision mechanics the conservation of this 4-vector quantity: the sum of the 4-momenta of all the particles going into a point-collision equals the sum of the 4-momenta of all those coming out. (The collision may or may not be elastic, and there may be more, or fewer, or other particles coming out than going in.) We can write this conservation law in the form

\[\tag{39} \sideset{}{^\ast}\sum \mathbf{P}_{n}=0, \]

where a different value of \({n} = 1,2, {\dots}\) is assigned to every particle going in and every particle coming out (so if 3 go in and 4 come out, \({n}\) runs from 1 to 7), and \(\sideset{}{^\ast}\sum\) is a sum that counts all pre-collision terms positively and all post-collision terms negatively. The LHS of (39) is thus a 4-vector, which makes our axiom automatically Lorentz invariant. And this is the entire basis of relativistic mechanics; the rest is just working out its consequences!

Using the component form (27) of \(\mathbf{U}\ ,\) we find from (38) the following components for \( \mathbf{P}\ :\)

\[\tag{40} { \mathbf{P}=m_{0}\mathbf{U}=m_{0}\gamma (u)\,(\mathbf{u},c)=:( \mathbf{p},{m}_u{c})}, \]

where, in the last equation, we have introduced the symbols

\[\tag{41} {m}_u=\gamma (u)\,m_{0}, \]

\[\tag{42} \mathbf{p} = {m}_u\mathbf{u} . \]

The formalism thus leads us naturally to this
quantity \({m}_u\ ,\) which we shall call the **relativistic mass**, or usually just the *mass* of the particle, and
to \( \mathbf{p}\ ,\) which we shall call the **relativistic momentum**,
or usually just the *momentum* of the particle. Observe that
\({m}_u\) increases with speed; when \({u} = 0\) it is least,
namely \({m_{0} }\ ,\) which is why we call \({m_{0} }\) the rest-mass
of the particle. On the other hand, \({m}_u\) becomes infinite
as \({u}\) approaches \({c}\ ,\) which is Nature's way of
avoiding superluminal velocities.

In terms of these quantities the original conservation law (39) splits componentwise into the conservation of relativistic momentum,

\[\tag{43} \sideset{}{^\ast}\sum \mathbf{p}=0, \;\;\;\text{that is}, \;\;\; \sideset{}{^\ast}\sum {m}_u\mathbf{u}=0, \]

and the conservation of relativistic mass,

\[\tag{44} \sideset{}{^\ast}\sum {m}_u=0,\;\;\;\text{that is,}\;\;\;\sideset{}{^\ast}\sum\gamma (u)\,m_0 = 0 \]

where, for brevity, we have omitted the
summation index \({n}\ .\) Evidently, in the slow-motion limit
these are the corresponding Newtonian conservation laws, and so our
proposed relativistic law satisfies the three basic criteria: Lorentz
invariance, simplicity, and Newton-conformity. We also note from the
"zero-component lemma" discussed earlier, that the validity of
*either* (43) or (44) in all IFs implies the full conservation
law (39). Finally, in the formal limit \({c\rightarrow \infty}\)
the relativistic laws become the exact Newtonian laws,
with \(m_0\) being the Newtonian mass,
which can often serve as a check on our work.
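To see the conservation law (39) at work, consider a hypothetical head-on collision in which two particles of unit rest-mass, each moving at \(0.6c\), stick together. The sketch below (units with \(c = 1\), illustrative helper names) adds the 4-momenta, obtains the rest-mass of the composite from \(\mathbf{P}^{2}=m_{0}^{2}\mathbf{U}^{2}=m_{0}^{2}c^{2}\), which follows from (38) and (31), and checks that the same bookkeeping holds in a second frame.

```python
import math

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def mdot(A, B):
    """Minkowski product (18)."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

def four_momentum(m0, u, c=1.0):
    """P = m0 * U = m0 * gamma(u) * (u, c), cf. (38) and (40)."""
    g = 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / c ** 2)
    return (m0 * g * u[0], m0 * g * u[1], m0 * g * u[2], m0 * g * c)

def add(P, Q):
    """Component-wise sum (16)."""
    return tuple(p + q for p, q in zip(P, Q))

P1 = four_momentum(1.0, (0.6, 0.0, 0.0))
P2 = four_momentum(1.0, (-0.6, 0.0, 0.0))
P_out = add(P1, P2)                  # by (39), the 4-momentum of the composite

M0 = math.sqrt(mdot(P_out, P_out))   # rest-mass of the composite (c = 1)
print(P_out)  # zero 3-momentum, 4th component 2.5
print(M0)     # 2.5 up to rounding

# The balance also holds in a frame moving at 0.8c along x, since the LT is linear.
lhs = add(boost_x(P1, 0.8), boost_x(P2, 0.8))
rhs = boost_x(P_out, 0.8)
print(lhs, rhs)
```

The relativistic masses \(1.25 + 1.25\) add up to the mass \(2.5\) of the composite, as (44) requires, while the rest-masses \(1 + 1\) do not.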

### The equivalence of mass and energy

There is much more in
Eq.(44) than at first meets the eye. It cannot possibly say what it
seems to say: that mass in the Newtonian sense of "quantity of
matter" is conserved. By now we know empirically that matter is
*not* conserved, as when an electron and a positron annihilate
each other. Moreover, the quantity of matter cannot be velocity
dependent, as \({m}_u\) is. But there *is* a
velocity-dependent quantity in classical mechanics that is conserved:
the kinetic energy in elastic collisions. Could it be that
\({m}_u\) (or a multiple of it) is a measure of the *total* energy of the particle? And that (44) asserts the universal conservation of energy?
The answer turns out to be yes, and this was regarded by Einstein, who
found it, as one of the most significant results of SR. Yet
Einstein's assertion of the full equivalence of mass and energy
according to the famous formula

\[\tag{45} E={m}_u{c}^{2} \]

was and is in part a hypothesis, as we shall see.

Consider the following expansion for the mass as defined by (41):

\[\tag{46} {m}_u=m_{0}\bigl(1-\frac{u^{2} }{c^{2} }\bigr)^{-1/2}=m_{0}+\frac{1}{c^{2} }\bigl(\frac{1}{2}m_{0}u^{2}\bigr)+\cdots \]

This shows that the relativistic mass of a
slowly moving particle exceeds its rest mass by \(1/ {c^{2} }\) times
its kinetic energy (assuming the approximate validity of the Newtonian
expression for the latter). So kinetic energy *contributes* to
the mass in a way that is consistent with (45). In fact, it is
equation (46) that supplies the constant of proportionality in (45).
And it is the enormity of this constant that explains why the mass
increase corresponding to the easily measurable kinetic energies of
particles in classical collisions had never been observed.
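The quality of the approximation in (46) can be gauged numerically. The sketch below (units with \(c = 1\); the function name and the sample speeds are illustrative assumptions) compares the exact mass excess \(m_u - m_0\) with the Newtonian kinetic energy divided by \(c^{2}\).

```python
import math

def relativistic_mass(m0, u, c=1.0):
    """Relativistic mass (41)."""
    return m0 / math.sqrt(1.0 - (u / c) ** 2)

m0 = 1.0
ratios = {}
for u in (0.001, 0.01, 0.1, 0.5):
    excess = relativistic_mass(m0, u) - m0   # exact mass increase
    newtonian = 0.5 * m0 * u ** 2            # Newtonian kinetic energy / c^2, with c = 1
    ratios[u] = excess / newtonian
    print(u, excess, newtonian, ratios[u])
```

The ratio tends to 1 for slow motion and departs noticeably from it only at speeds that are a sizeable fraction of \(c\).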

At this stage in the argument
it is still possible to suppose that energy *contributes* to the
mass, without causing all of it. There could be a residue of
intrinsic mass that is separately conserved. To equate *all*
the mass with energy, especially in Einstein's time, required an act of
aesthetic faith very characteristic of Einstein. His hypothesis has
stood up magnificently to the test of time. Note that Einstein's
equation determines a zero-point of energy. In Newton's theory one
could, in principle, extract an infinite amount of energy from a finite
mass by letting it collapse indefinitely under its own gravity.
According to Einstein's equation, nature must find a way of
preventing this. (This requires general relativity, but the rescue
comes from the formation of a black hole.)

A **WORD OF CAUTION**: there is a school of
thought (mainly among particle physicists who, after all, are the main
consumers of collision mechanics) who reject the concept of
relativistic mass altogether. Wherever we have an \({m}_u\, ,\) they
would replace it by \({E/c^{2} }\ ;\) our \({m_{0} }\) becomes their
\(m\, ,\) simply called the mass; and our
\(E={ {m}_u{c}^{2} }\) becomes their \(E=\gamma(u)m{c}^{2}\). This has nothing to do with physics. It is simply a choice between two alternative conventions, ours being that of Einstein.

Einstein's mass-energy
equivalence allows us to include even particles of zero rest mass
(photons, \({\dots}\)) into the scheme of mechanics. If such a particle
has finite energy [all of it being *kinetic* energy,
\({({m}_u-m_{0})c^{2} }\)] it has finite mass \({m}_u=E/c^{2}\ ;\) thus,
because of (41), it *must* move at the speed of light and then
it will also have momentum \(p = E/c\ .\) Formally we can regard its mass as the limit of a product,
\(\gamma m_{0}\ ,\) whose first factor has gone to infinity, while its second factor has gone to zero. In all cases we shall have, from (40),

\[\tag{47} { \mathbf{P}^{2}=m_{0}^{2}c^{2}={m}_u^{2}c^{2}-p^{2}=E^{2}/c^{2}-p^{2} }. \]

Note that for zero-rest-mass particles, and only for those, \( \mathbf{P}\) becomes a null vector. When two particles with respective 4-momenta \({ \mathbf{P}_{1} }\) and \({ \mathbf{P}_{2} }\) are involved, and \(v_{12}\) is their relative speed, we have, by calculating in the rest frame of either,

\[\tag{48} \mathbf{P}_{1}\mathbf{.}\mathbf{P}_{2}=m_{01}E_{2}=m_{02}E_{1}=c^{2}\gamma(v_{12})\,m_{01}m_{02} , \]

where, typically, \(m_{01}\) is the rest-mass of the
first particle and \({E_{2} }\) the energy of the second
particle *in the rest-frame of the first*. Note that the
first equation holds even if the second particle is a photon. If both
are photons, the equation is inapplicable (but see Equation (53) below). In the particular case of
an *elastic* (that is, rest-mass preserving) collision of
two particles with pre-collision momenta \( \mathbf{P}\) and \(\mathbf{Q}\)
and post-collision momenta \(\mathbf{P'}\) and \(\mathbf{Q'}\ ,\) we find, on
squaring the conservation equation \( \mathbf{P} + \mathbf{Q} =
\mathbf{P'}+\mathbf{Q'}\ ,\) and using (47), that

\[\tag{49} \SP{P}{Q}=\SP{P'}{Q'}, \]

so that, by reference to (48), the relative velocity between the particles is conserved. Since this result is independent of the value of \({c}\ ,\) it must hold in Newtonian theory also!
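Equations (47) and (48) can be illustrated with a massive particle meeting a photon head-on. In the sketch below (units with \(c = 1\), illustrative values and helper names), the photon's energy in the rest frame of the particle is obtained twice: from the invariant \(\mathbf{P}_{1}\mathbf{.}\mathbf{P}_{2}/m_{01}\) and by explicitly transforming the photon's 4-momentum with (13).

```python
import math

def boost_x(A, v, c=1.0):
    """Standard LT (13) applied to A = (A1, A2, A3, A4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A1, A2, A3, A4 = A
    return (g * (A1 - (v / c) * A4), A2, A3, g * (A4 - (v / c) * A1))

def mdot(A, B):
    """Minkowski product (18)."""
    return A[3] * B[3] - A[0] * B[0] - A[1] * B[1] - A[2] * B[2]

c = 1.0
m01 = 1.0
g = 1.0 / math.sqrt(1.0 - 0.6 ** 2)
P1 = (m01 * g * 0.6, 0.0, 0.0, m01 * g * c)   # particle 1 moving at 0.6c along +x

E_photon = 1.0
P2 = (-E_photon / c, 0.0, 0.0, E_photon / c)  # photon moving along -x

E2_invariant = mdot(P1, P2) / m01      # first equality of (48)
E2_boosted = c * boost_x(P2, 0.6)[3]   # same energy via the LT to the particle's rest frame

print(mdot(P1, P1), mdot(P2, P2))  # m01^2 c^2 = 1 and 0 (null), as in (47)
print(E2_invariant, E2_boosted)    # both 2 up to rounding
```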

### Particles and Waves

As a last resort to avoid the notorious "ultraviolet catastrophe" of blackbody radiation theory, Planck in 1900 had made the suggestion that radiation of frequency \({\nu }\) might be emitted only in distinct "quanta" of energy

\[\tag{50} E={h\nu }, \]

where \(h\) is a universal constant,
now known as **Planck's constant**. And then Einstein found that
the photoelectric effect would fit the further assumption that
radiation of frequency \({\nu }\) is not only emitted in quanta of
energy \({h\nu }\ ,\) but also travels and is received as such
quanta, which were eventually called **photons**. Finally, in
1923, de Broglie – at first only as a formal possibility – showed how
waves could be associated with *all* particles according to
Planck's formula (50). This eventually led to Schrödinger's wave
mechanics and much more. In a beautiful application of SR, de
Broglie proposed the following relation between the particle's
4-momentum \( \mathbf{P}\) and the wave 4-vector of the associated wave
(see (40), (45), and (37)):

\[\tag{51} \mathbf{P} = {h}\,\mathbf{L},\;\;\;\text{that is},\;\;\; {E\,\bigl(\frac{\mathbf{u} }{c^{2} },\frac{1}{c}\bigr)={h\nu\, }\bigl(\frac{\mathbf{n} }{w},\frac{1}{c}\bigr)} . \]

In fact, if Planck's relation (50) is to be maintained for a material particle and its associated wave, then (51) is inevitable. For then the 4th components of the 4-vectors on either side of (51) are equal in every IF, so that the difference \(\mathbf{P}-h\,\mathbf{L}\) is a 4-vector whose 4th component vanishes in all IFs; by our earlier "zero-component lemma", it must vanish entirely, and the two 4-vectors are equal! From (51) it then follows that the wave travels in the direction of the particle ( \({\mathbf{n}\propto \mathbf{u} }\)), but with a larger velocity \(w\), given by de Broglie's relation

\[\tag{52} { {u\,w}=c^{2} }, \]

as can be seen by comparing the magnitudes of the leading 3-vectors. (However, the group velocity of the wave, which carries the energy, can be shown to be still \(u\).) The wave must necessarily travel at a speed other than that of the particle unless that speed is \(c\), for waves and particles aberrate differently, and a particle comoving with its wave would slide across it sideways in another frame.
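As a numerical illustration of (52) and of the remark on the group velocity, here is a minimal Python sketch. It assumes units with \(c=1\); the function names are ours, chosen for illustration. By (51), \(h\nu = E\) and \(h\nu/w = p\), so the phase velocity is \(w=E/p\) and the group velocity \(d\nu/d(\nu/w)\) equals \(dE/dp\), which is estimated here by a finite difference.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption made for brevity


def energy(p, m0):
    """Relativistic energy E = sqrt(p^2 c^2 + m0^2 c^4)."""
    return math.sqrt((p * C) ** 2 + (m0 * C ** 2) ** 2)


def particle_speed(p, m0):
    """u = p c^2 / E, from p = E u / c^2 as in (51)."""
    return p * C ** 2 / energy(p, m0)


def phase_speed(p, m0):
    """w = nu / (nu / w) = E / p, since h nu = E and h nu / w = p."""
    return energy(p, m0) / p


def group_speed(p, m0, dp=1e-6):
    """Central-difference estimate of dE/dp, the group velocity."""
    return (energy(p + dp, m0) - energy(p - dp, m0)) / (2.0 * dp)


p, m0 = 0.75, 1.0            # illustrative values
u = particle_speed(p, m0)    # particle velocity
w = phase_speed(p, m0)       # phase velocity of the associated wave
print("u =", u, " w =", w, " u*w =", u * w)
print("group velocity ~", group_speed(p, m0))
```

Planck's constant cancels from both velocities, which is why it does not appear in the sketch.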

We can now complete formula (48) for the case when both particles are photons. If their paths subtend an angle \({\theta }\ ,\) Eq.(51), with \(\mathbf{n}_{1}\mathbf{.}\mathbf{n}_{2}=\cos\theta\) and \({w_{1}=w_{2}=c}\ ,\) yields

\[\tag{53} \mathbf{P}_{1}\mathbf{.}\,\mathbf{P}_{2}=h^{2}c^{-2}\nu_{1}\nu_{2}(1-\cos\theta ). \]
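Formula (53) can be checked numerically. The Python sketch below assumes units with \(c=1\), writes each photon energy as \(E=h\nu\), and takes the scalar product with the fourth (time) component entering positively and the spatial components negatively; the helper names are hypothetical.

```python
import math


def minkowski_dot(a, b):
    """Scalar product with the time component last: a4*b4 - a1*b1 - a2*b2 - a3*b3."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]


def photon_4momentum(energy, direction, c=1.0):
    """P = (E/c)(n, 1) for a photon of energy E = h*nu moving along direction n."""
    norm = math.sqrt(sum(x * x for x in direction))
    n = [x / norm for x in direction]  # normalise to a unit vector
    return [energy / c * n[0], energy / c * n[1], energy / c * n[2], energy / c]


theta = math.radians(60.0)   # angle between the two photon paths (illustrative)
e1, e2 = 2.0, 3.0            # photon energies h*nu_1 and h*nu_2 (illustrative)
p1 = photon_4momentum(e1, (1.0, 0.0, 0.0))
p2 = photon_4momentum(e2, (math.cos(theta), math.sin(theta), 0.0))

lhs = minkowski_dot(p1, p2)               # direct evaluation of P1.P2
rhs = e1 * e2 * (1.0 - math.cos(theta))   # right-hand side of (53) with c = 1
print(lhs, rhs)
```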

### The Zero-Momentum Frame

Consider an
arbitrary inertial frame \(S\) and in it a system of occasionally colliding
particles, which could also be photons, subject to no forces other than
the very short-range forces during collisions, and thus moving
uniformly *between* collisions. We define the **total mass**
\({\bar{m} }\ ,\) the **total momentum** \({\bar{ \mathbf{p} } }\ ,\) and the **total 4-momentum**
\({\bar{ \mathbf{P} } }\) of the system in \(S\) as the
*instantaneous* sum of the respective quantities. Then (see
(40))

\[\tag{54} {\bar{ \mathbf{P} }=\sum { \mathbf{P} }=\sum \,( \mathbf{p},{m}_u{c})=(\bar{ \mathbf{p} },\bar{m}c)}. \]

Because of the conservation laws, each of the barred quantities remains constant in time.

The quantity \({\bar{ \mathbf{P} } }\ ,\) being a sum of
4-vectors, seems assured of 4-vector status itself. But there
is a problem. If all observers agreed on which \( \mathbf{P}\)s make up the sum \({\bar{ \mathbf{P} } }\ ,\) then \({\bar{ \mathbf{P} } }\) would clearly
be a vector. But in each frame the sum is taken at one
instant, which may result in different
\( \mathbf{P}\)s making up the \({\bar{ \mathbf{P} } }\) of different
observers. A spacetime diagram such as Figure 1, even an
imagined one, is
helpful. Our particle system will correspond to a lattice of straight worldline segments, meeting at various knots (collisions), where two or more segments come together and one or more emerge, or vice versa. A simultaneity in \(S\) corresponds to a "horizontal" plane in
the diagram, and a simultaneity in a second frame \(S'\) corresponds to
a "tilted" plane. In \(S\) \({\bar{ \mathbf{P} } }\) is
summed over horizontal planes, and in \(S'\) over tilted planes.
However, even in \(S\ ,\) the same \({\bar{ \mathbf{P} } }\)
results no matter
*what* plane we sum over. For imagine a continuous motion of a
horizontal
plane into a tilted
one. Each \( \mathbf{P}\) remains the same until the first collision,
since the particles move uniformly between collisions.
As the plane sweeps over that collision, the sub-sum of \({\bar{ \mathbf{P} } }\)
that enters the collision (as all other collisions) is conserved, by
4-momentum conservation. Thus, without affecting the value of
\({\bar{ \mathbf{P} } }\ ,\) all observers could sum their \( \mathbf{P}\)s over the
*same* plane, whence \({\bar{ \mathbf{P} } }\) is indeed a 4-vector.

This 4-vector \({\bar{ \mathbf{P} } }\) is timelike and future-pointing, except in the negligible special case of nothing but comoving photons, when it is obviously null. For consider the expansion

\[\tag{55} {\bar{ \mathbf{P} } }^{2}= {( \mathbf{P}_{1}+ \mathbf{P}_{2}+\cdots)^{2}= \mathbf{P}_{1}^{2} }+ \mathbf{P}_{2}^{2}+\cdots+2 \,\mathbf{P}_{1}\mathbf{.}\, \mathbf{P}_{2}+\cdots . \]

By reference to (47), (48), and (53), all terms on the RHS
are non-negative. The presence of
even a single non-zero-rest-mass particle or of a single
pair of non-comoving photons will
make the RHS positive. So \({\bar{ \mathbf{P} } }\) is timelike.
That it is also future-pointing is clear from the
positivity of the fourth components of all the summands.
Thus, by choosing an IF with time
axis along \({\bar{ \mathbf{P} } }\ ,\) we can make its
spatial components all zero: \({\bar{ \mathbf{p} }=0}\ .\) This is the
**zero-momentum frame**
\(S_{ { {ZM} } }\) for our system (analogous to the
classical center-of-mass frame). *In*
\(S_{ { {ZM} } }\) the 4-velocity
\(\mathbf{U}_{ { {ZM} } }\) of
\(S_{ { {ZM} } }\) is \((0,0,0,{c})\ ,\) so
that, by (54),

\[\tag{56} {\bar{ \mathbf{P} }=}(0,0,0, \bar{m}_{ {ZM} }\,c) = \bar{m}_{ {ZM} }\mathbf{U}_{ZM}, \]

where \({\bar{m} }_{ { {ZM} } }\) is \({\bar{m} }\) in \(S_{ { {ZM} } }\ ,\) obviously an invariant. Compare (56) with (40)(i): this shows that \({\bar{m} }_{ { {ZM} } }\) and \(\mathbf{U}_{ { {ZM} } }\) are for the system what \({m_{0} }\) and \(\mathbf{U}\) are for a particle. Note how the kinetic energy contributes to the "rest-mass" of the system.

In the general IF, relative to which \(S_{ { {ZM} } }\) has velocity \(\mathbf{u}_{ZM}\ ,\) say, Eq.(56) reads (cf.(27))

\[\tag{57} \bar{ \mathbf{P} }=(\bar{ \mathbf{p} },\bar{m}c)={\bar{m} }_{ {ZM} }\,\gamma ( u_{ZM})\,(\mathbf{u}_{ {ZM} },c) , \]

which yields

\[\tag{58} \bar{m}=\gamma(u_{ {ZM} })\,{\bar{m} }_{ { {ZM} } }, \]

\[\tag{59} \bar{ \mathbf{p} }=\bar{m}\mathbf{u}_{ { {ZM} } },\;\;\;\text{or}\;\;\; \mathbf{u}_{ {ZM} }=\bar{ \mathbf{p} }/{\bar{m} } . \]
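To make (54), (58) and (59) concrete, the following Python sketch computes \(\bar{m}\), \(\mathbf{u}_{ZM}\) and \(\bar{m}_{ZM}\) for a small system of massive particles. It assumes units with \(c=1\); the function names and the sample system are ours, chosen for illustration.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def four_momentum(m0, u):
    """P = gamma(u) * m0 * (u, c) for a particle of rest mass m0 and 3-velocity u."""
    u2 = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / C ** 2)
    return [gamma * m0 * u[0], gamma * m0 * u[1], gamma * m0 * u[2], gamma * m0 * C]


def zero_momentum_frame(momenta):
    """Return (m_bar, u_zm, m_bar_zm) for a list of 4-momenta, following (54)-(59)."""
    total = [sum(p[i] for p in momenta) for i in range(4)]   # (54)
    m_bar = total[3] / C                                     # total mass in S
    u_zm = [total[i] / m_bar for i in range(3)]              # (59)
    invariant = total[3] ** 2 - total[0] ** 2 - total[1] ** 2 - total[2] ** 2
    m_bar_zm = math.sqrt(invariant) / C                      # from P_bar^2 = m_bar_zm^2 c^2
    return m_bar, u_zm, m_bar_zm


# Illustrative system: one unit-rest-mass particle moving at 0.6c, one at rest.
system = [four_momentum(1.0, (0.6, 0.0, 0.0)), four_momentum(1.0, (0.0, 0.0, 0.0))]
m_bar, u_zm, m_bar_zm = zero_momentum_frame(system)
print("total mass:", m_bar)
print("u_ZM:", u_zm)
print("mass in the ZM frame:", m_bar_zm)
```

In this example \(\bar{m}_{ZM}\) exceeds the sum of the two rest masses, which illustrates how kinetic energy contributes to the "rest-mass" of the system.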

### Threshold Energies

An
important application of relativistic mechanics occurs in so-called
threshold problems. Suppose a stationary proton is to be struck by a
moving proton so as to create an extra pion \({(p+p\rightarrow
p+p+\pi ^{0})}\ .\) What is the minimum energy of the incoming proton
to make this reaction possible? It is *not* enough for its kinetic
energy \({({m}_u-m_{0})c^{2} }\) to merely equal the rest energy of the
pion we want to create! For, by the conservation of 3-momentum, there
must be motion and thus "waste" energy after the collision.

In
all such cases, the minimum expenditure of energy occurs when
*all* the end-products travel "as a lump". Suppose a given
stationary target particle is to be struck by a given bullet particle
and we know the rest masses of all the desired end-products. Relative
to the lab, the bullet's initial velocity determines
\(\mathbf{u}_{ { {ZM} } }\) via Eq.(59). But this is also
\(\mathbf{u}_{ { {ZM} } }\) *after* the
collision, since both \({\bar{ \mathbf{p} } }\) and \({\bar{m} }\) are
conserved. We want there to be a minimum of kinetic (waste) energy
after collision. Eq.(58) – where \({\gamma }\) is now fixed – shows
that this indeed occurs when all the end-products are at rest in
\(S_{ZM}\ .\)

However,
to get the actual threshold formula, it is easiest to proceed as
follows. Let \({ \mathbf{P}_{B} }\) and \({ \mathbf{P}_{T} }\) be the pre-collision
4-momenta of bullet and target, and \({ \mathbf{P}_{i} }\) (\({i} = 1,2,{\dots}\))
the 4-momenta of *all* post-collision particles.
Then

\[\tag{60} \mathbf{P}_{B}+ \mathbf{P}_{T}=\sum { \mathbf{P}_{i} }. \]

Squaring this equation along the lines of (55), and using once more Eqs.(47) and (48), we find, in a self-explanatory notation,

\[\tag{61} {m_{0B}^{2}+m_{0T}^{2} }+2\;c^{-2}\,m_{0T}E_{B}=\sum {m_{0i}^{2}+2\,\sum_{i<j}{m_{0i} }m_{0j}\gamma (v_{ {ij} })}. \]

The only variable on the
LHS is \({E_{B} }\ ,\) the energy of the bullet relative to the rest
frame of the target, and thus relative to the lab. Once again we see
that this will be minimum when *all* the \({\gamma}\)-factors
on the RHS are unity, that is, when there is no relative
motion between the outgoing particles. The RHS then equals
\({\bigl(\sum {m_{0i} }\bigr)^{2} }\ ,\) and so, solving for
\({E_{B} }\ ,\) now the minimum or **threshold energy**, we find

\[\tag{62} {E_{B}=\frac{c^{2} }{2\,m_{0T} }\bigl[\bigl(\sum {m_{0i} }\bigr)^{2}-m_{0B}^{2}-m_{0T}^{2}\bigr]} . \]

This formula applies even when the bullet is a photon, provided it gets absorbed in the collision (since it cannot be part of a post-collision "lump").
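As a worked instance of (62), the Python sketch below evaluates the threshold for \(p+p\rightarrow p+p+\pi^{0}\). The rest energies used (about 938.272 MeV for the proton and 134.977 MeV for the neutral pion) are approximate values supplied here for illustration and are not taken from the article; masses are expressed in MeV with \(c=1\).

```python
def threshold_energy(m0_bullet, m0_target, m0_products, c=1.0):
    """Threshold energy of the bullet in the target's rest frame, formula (62)."""
    lump = sum(m0_products)  # all end-products travel "as a lump"
    return c ** 2 / (2.0 * m0_target) * (lump ** 2 - m0_bullet ** 2 - m0_target ** 2)


# Approximate rest energies in MeV (assumed values, with c = 1).
M_PROTON = 938.272
M_PION0 = 134.977

e_b = threshold_energy(M_PROTON, M_PROTON, [M_PROTON, M_PROTON, M_PION0])
kinetic = e_b - M_PROTON  # kinetic energy of the incoming proton at threshold
print("threshold total energy (MeV):", e_b)
print("threshold kinetic energy (MeV):", kinetic)
print("pion rest energy (MeV):", M_PION0)
```

With these inputs the threshold kinetic energy comes out at roughly twice the pion's rest energy, in line with the remark that merely supplying the pion's rest energy is not enough.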

Because
of the inevitable waste kinetic energy, this method of creating new
particles is generally not very efficient. A way out is the 100%
efficient method of head-on *colliding beams*.

### The Compton Effect

An extraordinary validation of Einstein's idea that photons can behave mechanically like little billiard balls with (relativistic) mass and momentum was provided by Compton's famous scattering experiment of 1922, in which X-ray photons were the bullets and electrons in graphite surfaces the target.

Suppose a photon of frequency \({\nu }\) strikes a stationary electron of rest-mass \({m_{0} }\) and comes away with altered frequency \({\nu }^{\prime}\) at an angle \({\theta }\) with its incident direction. Let \( \mathbf{P}\) and \( \mathbf{P'}\) be the pre- and post-collision 4-momenta of the photon, and \(\mathbf{Q}\) and \(\mathbf{Q'}\) those of the electron. Then from the conservation equation \(\mathbf{P + Q = P' +Q'}\) we can separate out the unwanted vector \(\mathbf{Q'}\) and square to get rid of it:

\[\tag{63} {( \mathbf{P}+\mathbf{Q}- \mathbf{P^\prime})^{2}=\mathbf{Q}^{\prime 2} }. \]

By (47), \({\mathbf{Q}^{2}=\mathbf{Q'}^{2} }\ ,\) and \({ \mathbf{P}^{2}= \mathbf{P}^{\prime 2}=0,}\) so we are left with

\[\tag{64} \mathbf{P}\mathbf{.}\mathbf{P'}=\mathbf{Q}\mathbf{.}(\mathbf{P}-\mathbf{P'}), \]

from which, by reference to (48) and (53), we find at once the desired and experimentally confirmed relation

\[\tag{65} h\,c^{-2}\,\nu\, {\nu }^{\prime}(1-\cos\theta)=m_{0}\,(\nu -{\nu }^{\prime}). \]

In terms of the corresponding wavelengths
\({\lambda }\ ,\) \({\lambda }^{\prime}\ ,\) the half-angle \({\theta /2}\ ,\) and
the **Compton wavelength** \(l=h/(c m_0)\ ,\) this may be rewritten in the
more familiar form

\[\tag{66} { {\lambda }^{\prime}-\lambda =\frac{2h}{c m_{0} }\,\sin^{2}(\theta /2)=2l\,\sin^{2}(\theta /2)} . \]
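A short numerical check may be useful here. The Python sketch below takes the scattered wavelength to be \(\lambda^{\prime}=\lambda+2l\sin^{2}(\theta/2)\) and confirms that the corresponding frequencies satisfy (65). The SI values of \(h\), \(c\) and the electron mass are supplied by us, and the incident wavelength is an arbitrary X-ray value chosen for illustration.

```python
import math

H = 6.62607015e-34      # Planck's constant in J s
C = 299792458.0         # speed of light in m/s
M_E = 9.1093837015e-31  # electron rest mass in kg (approximate)


def compton_wavelength(m0):
    """l = h / (c m0)."""
    return H / (m0 * C)


def scattered_wavelength(lam, theta, m0=M_E):
    """Wavelength after scattering through angle theta off a stationary particle."""
    return lam + 2.0 * compton_wavelength(m0) * math.sin(theta / 2.0) ** 2


lam = 7.1e-11            # incident wavelength in metres (illustrative)
theta = math.pi / 2.0    # scattering angle of 90 degrees
lam_prime = scattered_wavelength(lam, theta)

# Check relation (65) with nu = c / lambda.
nu, nu_prime = C / lam, C / lam_prime
lhs = H * nu * nu_prime * (1.0 - math.cos(theta)) / C ** 2
rhs = M_E * (nu - nu_prime)
print("Compton wavelength (m):", compton_wavelength(M_E))
print("wavelength shift (m):", lam_prime - lam)
print("relation (65):", lhs, rhs)
```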

Scattering of photons by
stationary electrons is called **Compton scattering** and clearly
always results in an energy loss for the photon. The opposite is the
case in **inverse Compton scattering**, where a photon
collides with a fast ("relativistic") electron or other charged
particle, and often experiences a spectacular gain in energy. For
simplicity, we shall consider only the case of a head-on collision
along the \({x}\) axis. Eq.(64) is still applicable. But now

\[\tag{67} {\mathbf{Q}=\gamma (u)\,m_{0}(u,0,0,c)},\;\;\;{ \mathbf{P}=({h\nu }/c)(-1,0,0,1)}, \;\;\; \mathbf{P^\prime}=(h{\nu }^{\prime}/c)(1,0,0,1), \]

where \(u\) is the velocity of the electron. Then (64) yields

\[\tag{68} 2\,h\,c^{-2}\,{\nu}\,{\nu}^{\prime}= \nu\,{\gamma}\,m_{0}\,(1+u/c)-{\nu}^{\prime}\,{\gamma}\,{m}_{0}\,(1-u/c). \]

If we now set \({1+u/c\approx 2}\) and \({1-u/c\approx 1/(2\,\gamma ^{2})}\) (since the product is \({\gamma ^{-2} }\)), we get

\[\tag{69} \frac{ {\nu }^{\prime} }{\nu }=\frac{4\,\gamma^{2} }{1+\bigl(4\,\gamma\, h\, \nu /(m_{0}\,c^{2})\bigr)}. \]

For a low-energy photon, the second term in the denominator can be quite small, so its energy can be amplified by a factor of the order of \({\gamma ^{2} }\ .\) For example, when a photon of the cosmic microwave background (\({h\nu }\approx {10}^{-3}{eV}\)) collides with a high-energy cosmic ray proton (\(m_{0}c^{2}\approx {10}^{9}{eV}\ ,\) \(\gamma \approx {10}^{ {11} }\)), its energy could be boosted to \({ {10}^{ {19} }{ {eV} } }\ !\)
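The estimate can be reproduced with a few lines of Python. The sketch below evaluates both the approximation (69) and the result of solving (68) for \(\nu^{\prime}\) without approximation, working with energies in eV and taking the proton rest energy as roughly \(10^{9}\) eV. The function names are ours. For very large \(\gamma\) the quantity \(1-u/c\) is computed as \(1/(\gamma^{2}(1+u/c))\) to avoid floating-point cancellation.

```python
import math


def boosted_energy_exact(eps, gamma, rest_energy):
    """Photon energy after a head-on collision, from solving (68) for nu'."""
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)             # u / c
    one_minus_beta = 1.0 / (gamma ** 2 * (1.0 + beta))   # stable form of 1 - u/c
    numerator = eps * gamma * rest_energy * (1.0 + beta)
    return numerator / (2.0 * eps + gamma * rest_energy * one_minus_beta)


def boosted_energy_approx(eps, gamma, rest_energy):
    """Photon energy after the collision according to the approximation (69)."""
    return eps * 4.0 * gamma ** 2 / (1.0 + 4.0 * gamma * eps / rest_energy)


eps = 1e-3          # microwave-background photon energy in eV
gamma = 1e11        # Lorentz factor of the cosmic-ray proton
rest_energy = 1e9   # proton rest energy in eV (order of magnitude)

exact = boosted_energy_exact(eps, gamma, rest_energy)
approx = boosted_energy_approx(eps, gamma, rest_energy)
print("exact  (eV):", exact)
print("approx (eV):", approx)
```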

### Four-force and three-force

There are at hand only two reasonable definitions for the 4-force \(\mathbf{F}\) on a particle, \({\mathbf{F}=m_{0}\mathbf{A} }\) or \({\mathbf{F}=(d/{d\tau }) \mathbf{P} }\ .\) The accepted choice is the latter:

\[\tag{70} \mathbf{F}=\frac{d}{ {d\tau } } \mathbf{P}=\frac{d}{ {d\tau } }(m_{0}\mathbf{U})=m_{0}\mathbf{A}+\frac{ {dm}_{0} }{ {d\tau } }\mathbf{U}, \]

though the two coincide when \(m_{0}=\mathrm{const.}\) We then speak of a **rest-mass-preserving** force, which will be the expected norm. It particularly applies to
the Lorentz force of electrodynamics. From (70), with (26),(40), and
(45), we have

\[\tag{71} \mathbf{F}=\frac{d}{ {d\tau } } \mathbf{P}=\gamma (u)\frac{d}{ { {dt} } }( \mathbf{p},{m}_u{c})=\gamma (u)(\mathbf{f},\frac{1}{c}\frac{ {dE} }{ {dt} }) , \]

where we have introduced the
**relativistic 3-force** \(\mathbf{f}\) defined by

\[\tag{72} \mathbf{f}=\frac{ {d} \mathbf{p} }{ {dt} }=\frac{d({m}_u\mathbf{u})}{ {dt} }. \]

Note
that the **power** \({dE/dt}\) is the complement of
\(\mathbf{f}\) in the formation of \(\mathbf{F}\ ,\) just as the energy
\({E}\) is the complement of \( \mathbf{p}\) in the formation of
\( \mathbf{P}\ .\)

From (70) we find, by use of (31) and (32), the first of the following equations; the second results from forming the scalar product (cf. Equation (18)) of the right-most member of (71) with \({\mathbf{U}=\gamma\, (\mathbf{u},c)}\ :\)

\[\tag{73} \mathbf{F}\mathbf{.}\mathbf{U}=c^{2}\frac{ {dm}_{0} }{d\tau }=\gamma ^{2}(u)(\frac{ {dE} }{ {dt} }-\mathbf{f}\mathbf{.}\mathbf{u}). \]

This shows that \(\mathbf{F.U}\) is the proper
rate at which the particle's *internal* energy is being
increased. If the force is *rest-mass-preserving* – as
will be assumed from now on – it thus satisfies

\[\tag{74} \mathbf{F.U} = 0,\;\;\; \mathbf{f}\mathbf{.}\mathbf{u}=\frac{ {dE} }{ {dt} },\;\;\; \text{and}\;\;\;{\mathbf{F}=\gamma (u)(\mathbf{f},\mathbf{f}\mathbf{.}\mathbf{u}/c)}, \]

where, for the last equation, we once again used (71). In particular, multiplying the middle equation by \({dt}\) we see that \(\mathbf{f}\) satisfies the Newtonian relation

\[\tag{75} \mathbf{f}\mathbf{.}\,{d}\mathbf{r }= {dE. } \]

But not Newton's second law: for, by (72) and (74)(ii),

\[\tag{76} \mathbf{f}={m}_u\mathbf{a}+\frac{ {d}{m}_u}{ {dt} }\mathbf{u} ,\;\;\;{m}_u\mathbf{a}=\mathbf{f}-\frac{\mathbf{f}\mathbf{.}\mathbf{u} }{c^{2} }\mathbf{u}. \]

Now \(\mathbf{a}\) is necessarily coplanar with \(\mathbf{f}\) and \(\mathbf{u}\ ,\) but it is parallel to \(\mathbf{f}\) only when \(\mathbf{u}\) is either parallel or orthogonal to \(\mathbf{f}\ .\)
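The second equation of (76) is easy to explore numerically. The Python sketch below assumes units with \(c=1\) and a rest-mass-preserving force; the function names and sample values are ours. It computes \(\mathbf{a}\) from \(\mathbf{f}\) and \(\mathbf{u}\) and tests for parallelism through the cross product.

```python
import math


def acceleration(f, u, m0, c=1.0):
    """a = (f - (f.u) u / c^2) / m_u, with m_u = gamma(u) m0, from (76)."""
    u2 = sum(x * x for x in u)
    m_u = m0 / math.sqrt(1.0 - u2 / c ** 2)    # relativistic mass gamma(u) m0
    f_dot_u = sum(a * b for a, b in zip(f, u))
    return [(f[i] - f_dot_u * u[i] / c ** 2) / m_u for i in range(3)]


def cross(a, b):
    """Ordinary 3-vector cross product; it vanishes for parallel vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


u = (0.6, 0.0, 0.0)   # particle velocity (illustrative)
m0 = 1.0

a_oblique = acceleration((1.0, 1.0, 0.0), u, m0)    # f oblique to u
a_parallel = acceleration((1.0, 0.0, 0.0), u, m0)   # f parallel to u
a_orthogonal = acceleration((0.0, 1.0, 0.0), u, m0) # f orthogonal to u

print("oblique:   ", a_oblique, cross((1.0, 1.0, 0.0), a_oblique))
print("parallel:  ", a_parallel, cross((1.0, 0.0, 0.0), a_parallel))
print("orthogonal:", a_orthogonal, cross((0.0, 1.0, 0.0), a_orthogonal))
```

Only the oblique case yields a non-zero cross product between \(\mathbf{f}\) and \(\mathbf{a}\).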

The important transformation of the 3-force \(\mathbf{f}\) under a standard change of inertial frames – analogous to the transformations Equations 25 and 34 in SR:kinematics of \(\mathbf{u}\) and \(\mathbf{a}\) – is most easily obtained by applying the transformation pattern (13) to the 4-vector \(\mathbf{F }\) in (74)(iii). We shall write \({D}\) for \({1-u_{1}v/c^{2} }\) as we did in Equations 31 - 34 in SR:kinematics. We also need the formula

\[\tag{77} {\frac{\gamma ({u}^{\prime})}{\gamma (u)}=\gamma (v)D} , \]

whose derivation was outlined in the remark following (27). This is what then results:

\[\tag{78} {f}_{1}^{\prime}=\frac{f_{1}-{v}\mathbf{f}\mathbf{.}\mathbf{u}/c^{2} }{D} ,\;\;\; {f}_{2}^{\prime}=\frac{f_{2} }{ {\gamma D} } ,\;\;\; {f}_{3}^{\prime}=\frac{f_{3} }{ {\gamma D} } ,\;\;\; \gamma =\gamma (v){.} \]

Note that the transformed force in general
depends not only on the original force but also on the velocity
\(\mathbf{u}\) of the particle on which the force acts. Thus a
velocity-independent force (like Newton's gravitational force field) is
no longer a Lorentz-invariant concept. The velocity-dependent
Lorentz force of electromagnetism, on the other
hand, is a typical relativistic force. Nevertheless, Newton's relation
\(\mathbf{f'}= \mathbf{f}\) still holds in the purely one-dimensional
case: among IFs with mutual velocity along the common direction of
\(\mathbf{f}\) and \(\mathbf{u}\ .\) For, let that be the
\({x}\) direction, so that \({f_{2}=f_{3}=0}\) and consequently
\({f}_{2}^{\prime}={f}_{3}^{\prime}=0\ ;\) but \(\mathbf{f.u}\) is now
\({f_{1}u_{1} }\ ,\) whence \({f}_{1}^{\prime}=f_{1}\) also. As an
example, consider a parallel constant electric field and a charged
particle moving in the direction of the field lines. In its
rest-frame it always feels the same force, to which, by (76) with
\(\mathbf{u} = 0\) and \({ {m}_u=m_{0} }\ ,\) it responds with constant proper
acceleration, that is, with hyperbolic motion. (The
*rest-frame* is an important concept throughout SR: it is
*always* an IF, it *always* moves uniformly: an
accelerating particle co-moves with its rest-frame for only *one instant*. But at that instant the rest-frame measures the particle's
proper acceleration.)
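The transformation (78) can be cross-checked against a direct Lorentz transformation of the 4-force. The Python sketch below assumes units with \(c=1\), a rest-mass-preserving force, and the standard configuration in which \(S'\) moves with velocity \(v\) along the \(x\) axis of \(S\); all function names are hypothetical. It forms \(\mathbf{F}=\gamma(u)(\mathbf{f},\mathbf{f.u}/c)\) as in (74), transforms it, and divides by \(\gamma(u')\) to recover \(\mathbf{f'}\).

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def gamma(speed):
    return 1.0 / math.sqrt(1.0 - speed ** 2 / C ** 2)


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def transform_velocity(u, v):
    """Standard velocity transformation for a boost v along the x axis."""
    d = 1.0 - u[0] * v / C ** 2
    g = gamma(v)
    return [(u[0] - v) / d, u[1] / (g * d), u[2] / (g * d)]


def force_by_formula(f, u, v):
    """3-force in S' according to (78)."""
    d = 1.0 - u[0] * v / C ** 2
    g = gamma(v)
    f_dot_u = sum(a * b for a, b in zip(f, u))
    return [(f[0] - v * f_dot_u / C ** 2) / d, f[1] / (g * d), f[2] / (g * d)]


def force_by_4vector(f, u, v):
    """3-force in S' obtained by Lorentz-transforming F = gamma(u)(f, f.u/c)."""
    gu = gamma(norm(u))
    f_dot_u = sum(a * b for a, b in zip(f, u))
    F = [gu * f[0], gu * f[1], gu * f[2], gu * f_dot_u / C]
    g = gamma(v)
    F_prime = [g * (F[0] - v * F[3] / C), F[1], F[2], g * (F[3] - v * F[0] / C)]
    gu_prime = gamma(norm(transform_velocity(u, v)))
    return [F_prime[i] / gu_prime for i in range(3)]


f = (1.0, -2.0, 0.5)   # illustrative force
u = (0.3, 0.4, 0.1)    # illustrative particle velocity
v = 0.5                # relative velocity of the frames

print(force_by_formula(f, u, v))
print(force_by_4vector(f, u, v))
# One-dimensional case: f and u both along x, so f' should equal f.
print(force_by_formula((2.0, 0.0, 0.0), (0.3, 0.0, 0.0), v))
```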

## Acknowledgements

All the figures in this article are taken from the author's book "Relativity: Special, General, and Cosmological" (2nd ed., 2006) Oxford University Press, by kind permission of the publishers.

## References

- Einstein, A. (1905) Annalen der Physik, 17, 891

- Ignatowski, W. V. (1910) Phys. Zeits., 11, 972

- Minkowski, H. (1908) Göttinger Nachr. 53. English translation in Lorentz, Einstein, Minkowski and Weyl (1923) The Principle of Relativity, Methuen/Dover.

- Rindler, W. (1966) Special Relativity (2nd ed.), Oliver and Boyd

- Rindler, W. (1979) Essential Relativity (2nd ed.), p. 51, Springer-Verlag

- Rindler, W. (1991) Introduction to Special Relativity (2nd ed.), Oxford U. P.

- Rindler, W. (2006) Relativity: Special, General, and Cosmological (2nd ed.), Oxford U. P.

## Further reading

- "Spacetime Physics", E. F. Taylor and J. A. Wheeler (W.H.Freeman, 1992)

- "Special Relativity", A.P. French (Norton, 1968)

- "Special Theory of Relativity", C. W. Kilmister (Pergamon Press, 1970)

- "Special Relativity", W. G. Dixon (Cambridge University Press, 1978)

- "Relativity: The Special Theory", J. L. Synge (North Holland, 1956)

## See also

Special relativity: kinematics, Special relativity: electromagnetism