Special relativity: kinematics

From Scholarpedia
Wolfgang Rindler (2011), Scholarpedia, 6(2):8520. doi:10.4249/scholarpedia.8520 revision #137267

Curator: Wolfgang Rindler

Special relativity is a physical theory based on Einstein's relativity principle, which states that all laws of physics (including, for example, electromagnetism, optics, thermodynamics, etc.) should be equally valid in all inertial frames; and on Einstein's additional postulate that the speed of light should be the same in all inertial frames. The present article describes kinematics in special relativity. (For the further development of the theory, see the articles Special relativity: mechanics, and Special relativity: electromagnetism.)



All the laws of Newton's theory of classical mechanics are equally valid in all inertial reference frames, or, roughly speaking, in all non-rotating and non-accelerating laboratories, no matter how fast they move. This fact is referred to as Newtonian relativity. If true, then any mechanical experiment will proceed identically in every such reference frame. For example, no one could tell the difference between a game of billiards played on earth and one played in a smoothly flying very fast jet airplane. It was indeed this property of Newtonian mechanics that allowed Newton (and before him, Galileo) to champion Copernicus's idea of an earth flying around the sun, since terrestrial laboratories would even under these circumstances be considered good approximations to inertial frames.

Einstein, in his relativity principle of 1905 (Einstein 1905), postulated that not only the laws of mechanics but those of all physics (including, for example, electromagnetism, optics, thermodynamics, etc.) should similarly be valid in all inertial frames. Although Einstein's reasons might have been colored by the state of physics at the time, today the relativity principle seems more reasonable than ever, in light of the even more evident unity of physics. If one part of physics -- classical mechanics -- is so strikingly relativistic in the real world, it stands to reason that the rest should be so too. This puts a severe restriction on the kind of laws that are permitted. Paradoxically, Newton's laws are no longer permitted! Why? Because Einstein, to complete his special relativity, laid down a second and crucial postulate: Any given light signal has the same speed \(c\) relative to any inertial frame. This seems preposterous: fly along a beam of light in the fastest airplane imaginable, and measure the speed of that beam relative to the airplane; it will be about 300,000 km/s, exactly as on earth!

Later it was found (Ignatowski 1910 and, e.g., Rindler 1979) that Einstein's light postulate is, in fact, an almost inevitable consequence of the relativity principle. The latter, by itself, has the interesting mathematical implication of leaving exactly one velocity invariant among inertial frames, which can be chosen at will. Newton's theory is at the limiting edge of possible relativistic theories; in it, the invariant velocity is infinity. In it, therefore, a particle (or just an ideal point) moving with infinite velocity in one inertial frame moves with infinite velocity in all other inertial frames. Let us stand back for a moment to see what this implies. A particle moving along a line with infinite velocity is everywhere at once. It thus passes all the clocks along the way at the same reading, say at noon. And in Newton's theory it then does the same thing in every other inertial frame as well. So when the clocks of one inertial frame all read noon, those of another such frame will also all read the same time, which might as well be noon. This leads to Newton's 'absolute' (or universal) time. Not so in special relativity. For various reasons -- mainly because of Maxwell's theory -- the invariant speed is here chosen to be \(c\ .\) The infinite speed is no longer invariant. Time is no longer absolute.

The light postulate has the effect of radically changing the mathematical relations between the coordinates of any pair of inertial frames, from the 'Galilean transformation' valid in Newtonian mechanics to the so-called 'Lorentz transformation' valid in special relativity. The Galilean transformation is the extreme case of a Lorentz transformation with \(c\) = infinity.

Thus special relativity is in principle a theory of all physics, whose aim it is to make all physical laws invariant from one inertial frame to another under Lorentz transformations. It required a review of the existing laws of physics, and a modification of any law that failed the test of 'Lorentz-invariance'. According to this criterion, Maxwell's theory was found to be already 'relativistic' and needed no revision (in vacuum). But Newton's theory, as already noted, was not. Its relativistic modifications led to astonishing predictions, in one of the most striking instances of theory far outpacing observation.

Most famously, the new mechanics led to the equivalence of mass and energy encapsulated in the equation \(E = mc^2\ ,\) to the relativity of simultaneity, the slowing down of moving clocks (time dilation), and the shortening of moving rods (length contraction); to the increase of the mass (inertia) of a particle as its speed increases -- the mass approaching infinity as the speed approaches the speed of light; to the speed of light being an absolute upper limit to the possible speed of any particle or signal; to the recognition of the photon as a particle with mechanical properties like energy and momentum; to de Broglie's association of waves with particles, which in turn led to the electron microscope and to Schrödinger's quantum wave mechanics; and others, all of which we shall establish in this and the following two Scholarpedia articles (on relativistic mechanics and relativistic electromagnetism). But first we make a brief excursion into some history and some philosophy of science. Especially the latter is of great practical use in understanding relativity.

Historical and philosophical preliminaries

A number of both experimental and theoretical discoveries made around the turn of the previous century, from the 1870s to the 1900s, rendered the situation in physics pregnant with the urgent need for the birth of a new synthesis. Maxwell's theory of electromagnetism -- too beautiful to be wrong! -- never quite meshed with Newtonian mechanics. The famous Michelson-Morley experiment of 1887, designed to locate Maxwell's 'ether' (the only wrong element in his theory, like absolute space in Newton's), indicated that the speed of light played a very special role in nature, inasmuch as it never seemed to change, even as you chased the light while you were speeding with earth through space. Lorentz, looking for the transformation that left Maxwell's equations invariant, had discovered the eponymous equations that were eventually to serve Einstein in a different guise. And Poincaré had already expressed some basic tenets that sounded very much like relativity. But it needed the fresh and powerful intellect of Einstein to unify all these strands into a brilliantly clear new theory, his special relativity (SR) of 1905. This was to replace the non-gravitational part of Newtonian mechanics as the conceptual framework for macroscopic physics, at least in theory.

Only gravity would not yield to this approach, and it took several years to discover the reason. Already in 1907 the mathematician Hermann Minkowski had shown that, in a certain well-defined sense, SR lived in a flat four-dimensional 'space-time'. (In Newton's theory, though it also obeys the relativity principle, the four-dimensionality is destroyed by the limiting procedure of letting \(c\) tend to infinity; time peels off and becomes a dimension by itself.) Einstein had a hunch that gravity would have something to do with curvature in space-time, and so for years, with the help of his mathematician friend Marcel Grossmann, he worked his way through the intricacies of the differential geometry of higher-dimensional curved spaces. This was to result in his theory of general relativity (GR) of 1915. Here gravity (actually, the tidal part of gravity) is space-time curvature. Newton's force of gravity disappears, and gravitational orbits become inertial (straightest) paths in curved space-time. The curvature is related to the gravitating matter-energy sources by Einstein's Field Equations.

Henceforth SR became subsumed in GR, much as plane Euclidean geometry is subsumed in the Gaussian differential geometry of curved surfaces. On the one hand, we can regard SR as approximately valid when the gravitational fields are weak (i.e. when space-time is almost flat). On the other hand, just as the tangent plane is a local approximation to a curved surface, so SR can be regarded as a local approximation to GR. What is the space-time equivalent of a tangent plane? Einstein answered this in his Equivalence Principle of 1907: At any point in a gravitational field consider a freely falling non-rotating cabin. In Newton's theory, the gravitational field inside would vanish. Einstein postulated that such a cabin provides a 'local inertial reference frame', free of first-order gravity, and in which SR applies. The more divergent the field lines are (i.e. the bigger the curvature), the smaller the cabin must be chosen; the field lines must be essentially parallel within it. For similar reasons, the duration of use of such a cabin must be restricted.

These considerations determine the range of applicability of SR. Basically, SR is applicable within limited regions. Of course, even so, its riches are enormous. A further restriction, both on Newtonian mechanics and SR, comes from cosmology. The basic element those two theories share is the set of world-wide inertial frames. But in the expanding universe we actually inhabit, such a set does not exist. Different parts of the universe have their own limited inertial frames -- like huge boxes several million light years across -- in which the laws of Newton apply with sufficient accuracy; but these boxes themselves, when sufficiently far apart, accelerate relative to each other as measured by their relative Doppler shift (even though until recently they were thought to decelerate!)

It is thus important to distinguish between the inner perfection of a theory and the limited perfection of its correspondence with the real world. As a mathematical model, for example, Newton's theory is every bit as perfect, non-approximative, and inviolate as Euclidean Geometry. Yet in the real world, which might be curved, even a perfect Euclidean plane might nowhere exist. It is not that Newtonian Mechanics and Euclidean Geometry (and SR) are in themselves approximative; it is rather that their correspondence with the real world might be less than 100%.

And this is essentially true of all the great theories of physics. All are ideal mathematical models, perfect within themselves; all are human inventions (compare the Newtonian and Einsteinian approach to gravitation); none claim to be the ultimate truth; none claim to replicate nature with infinite accuracy. (Although some of the accuracies attained seem almost miraculous!) When a great theory runs into observational trouble, it is not necessarily abandoned. Like Newton's theory, which runs into huge trouble, for example, when applied to collisions between particles coming out of accelerators at nearly the speed of light -- though these are perfectly handled by SR -- the theory may simply be confined to a narrower range of applicability. Thus, because of its superior simplicity and incredible accuracy at lower velocities, Newtonian mechanics continues to guide even some of the most delicate of NASA's space shots.

As an aid to thinking through the initially often paradoxical logic of SR, it is usually more helpful to think in terms of the clean model than in terms of the messy real world. This will consistently be our approach.

Galilean and Lorentz transformations

The basic assumption in the mathematical model that is special relativity -- just as in modern Newtonian mechanics -- is the existence of an infinite family of equivalent inertial reference frames (IFs), characterized by the property that free particles move uniformly (i.e. with constant speed and along straight lines) in all of them. Consequently they must all move uniformly and without rotation relative to each other. But whereas in Newton's theory these frames have all possible finite velocities relative to each other, in SR their relative velocities will be limited by \(c\ .\) In constructing the model, we keep in mind its ideal character. All IFs are infinite in all directions and strictly Euclidean. Gravity is completely absent. Place a free particle anywhere at rest in an IF and it will remain there.

A useful intuitive view of the family of IFs is to visualize each of them as a set of three right-handed orthogonal axes of \(x,y,z\) (we can even think of these as made of wire) flying through space. Each can be further visualized as a rectangular \(x,y,z\) lattice of little cubes measuring, say, 1 cm (or some other convenient unit) on the side. This allows a Cartesian address \((x,y,z)\) to be assigned to each point of the lattice. But a reference frame must also allow a time \(t\) to be assigned to each event as it occurs. Time is best thought of concretely. Imagine a standard clock placed at each lattice point. Once these clocks are suitably synchronized, i.e. once they tick in unison, the four coordinates \((x,y,z,t)\) of any event (such as the collision of two point-particles) can be read off locally. Of course, and remember this is only a mathematical model, we must allow the infinite lattice with all its clocks of one IF to move ghost-like through the lattice and clocks of any other IF. In Newton's theory a single such set of clocks is, in fact, sufficient (absolute time!); in SR the separate sets are essential.

The basic principle of clock synchronization is to ensure that the coordinate description of physics is as symmetric as the physics itself. For example, bullets shot off by the same gun at any point and in any direction should always have the same coordinate velocity \(dr/dt\ .\) Because of the light-postulate, photons serve particularly conveniently as such bullets in SR. They also allow us to see at once that time can no longer be absolute. In an IF \(S\ ,\) let a light signal be emitted at a point A and received simultaneously at equidistant points B and C on opposite sides of A. Let \(S'\) be a second IF moving in the direction BAC. In \(S'\ ,\) the point C of \(S\) (in the forward direction) will be lit up by the signal before B, since it is closer to the emission point in \(S'\ .\) The two light-up events which are simultaneous in \(S\) are not so in \(S'\ :\) simultaneity is relative.

In SR, a single light signal away from an arbitrary emission event will serve to synchronize all the clocks in all the IFs. If the time of the emission event is zero by convention, in all IFs, then in each IF as the signal passes a clock at distance \(r\) from the emission point, that clock must simply be set to read \(t = r/c\ .\) It goes without saying that all the clocks in all the IFs must be identical. In Newton's theory, on the other hand, clock synchronization in a single IF, say \(S\ ,\) is all that is needed. It could be achieved by standard guns shooting standard bullets in all directions from a given event, or by a single sound signal, if the air in \(S\) were still and uniform. All other IFs then simply copy the time of \(S\ .\)

We should, strictly speaking, differentiate between an inertial frame and an inertial coordinate system, although in sloppy practice one usually calls both IFs. An inertial frame is simply an infinite set of point particles sitting still in space relative to each other. For stability they could be connected by a lattice of rigid rods, but free-floating particles are preferable, since keeping constant distances from each other is also a criterion of the non-rotation of the frame. A standard inertial coordinate system is any set of Cartesian \(x,y,z\) axes laid over such an inertial frame, plus synchronized clocks sitting on all the particles, as described above. Standard coordinates always use identical units, say centimeters and seconds.

The identical outcome of shots fired from identical guns that we used in clock synchronization above, is only one instance of the overall homogeneity and isotropy of inertial frames: Any physical experiment can be repeated with the same outcome anywhere and anytime (homogeneity in space and time) and in any direction (isotropy). For consider any two experiments, \(E\) and \(E'\ ,\) performed in a given inertial frame \(S\ ,\) their initial conditions differing only in overall location, orientation and time. Then by simply translating and rotating the coordinate axes, and shifting, if necessary, the time origin, we can create a second standard coordinate system \(S'\) in the same inertial frame, where \(E'\) has identical initial conditions as \(E\) has in \(S\ .\) According to the relativity principle, the outcomes will be identical, as was to be shown.

The Galilean and Lorentz transformations both relate the standard coordinates of a given event \((x,y,z,t)\) in an IF \(S\) with those, \((x', y', z',t')\ ,\) in another IF \(S'\) which moves relative to \(S\) with velocity, say, \(v\ .\) The first step in deriving both these transformations is the recognition that they must be linear. This follows from homogeneity and isotropy: the relations between the differentials \(dx, dy, dz, dt\) in \(S\) (say, for a moving particle) and the corresponding differentials \(dx', dy', dz', dt'\) in \(S'\) must be independent of position and time.

Figure 1: The 'standard configuration' of two inertial frames \(S\) and \(S'\)

There is no need to deal with the most general possible configuration of two such coordinate systems, since further translations and rotations can always be added. It suffices to lay the simplest mutually tuned coordinate systems on a given pair of IFs; they will then be said to be 'in standard configuration' (see Figure 1). One arbitrary event is chosen as the common origin event \((0,0,0,0)\) in both \(S\) and \(S'\ ;\) the line of relative motion of \(S\) and \(S'\) is chosen as the direction of the \(x\) axes in both frames; and the corresponding \(y\) and \(z\) axes are chosen to be parallel. While this configuration is eminently permissible in Newton's theory, symmetry and linearity allow it also in SR.

The Galilean transformation can be read off from Figure 1 by inspection: \(y\) and \(z\) remain unchanged; \(vt\) is the distance between the spatial origins, so it is also the difference between \(x\) and \(x'\ ;\) and \(t\) goes into itself by the Newtonian assumption of absolute time. Hence the result:

\[\tag{1} x' = x - vt,\;\;\; y' = y,\;\;\; z' = z,\;\;\; t' = t. \]

Let us switch to vector notation, and consider a particle moving arbitrarily in space. Relative to \(S\ ,\) its position vector is \(\mathbf{r} = (x,y,z)\ ,\) its velocity vector \(\mathbf{u} = d\mathbf{r}/dt\ ,\) and its acceleration vector \(\mathbf{a} = d\mathbf{u}/dt\ .\) Let the corresponding vectors in \(S'\) be \(\mathbf{r'}\ ,\) \(\mathbf{u'}\ ,\) \(\mathbf{a'}\ .\) Lastly, the (constant) vector velocity of \(S'\) relative to \(S\) is \(\mathbf{v}\) = \((v,0,0)\ .\) Then the first of the following equations is simply the spatial part of the Galilean transformation (1) in vector form:

\[\tag{2} \mathbf{r'} = \mathbf{r} - \mathbf{v}t,\;\;\; \mathbf{u'} = \mathbf{u} - \mathbf{v},\;\;\; \mathbf{a'} = \mathbf{a}. \]

The second results from the first by differentiating the left side with respect to \(t'\) and the right side with respect to \(t\) (\(t\) and \(t'\) are equal!). And one more such differentiation yields the third equation in (2), expressing the invariance of the acceleration. The middle equation is equivalent to the well-known Newtonian 'velocity addition' law. For example, if a bus (\(S'\)) rolls along a road (\(S\)) at velocity 30 mph (\(\mathbf{v}\)), and a passenger walks up the aisle at 2 mph (\(\mathbf{u'}\)), then her velocity relative to the road (\(\mathbf{u}\)) is 32 mph (\(\mathbf{u'}\) + \(\mathbf{v}\)).
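The bus example can be checked with a few lines of Python. This is only a minimal sketch of (1) and of the middle member of (2) for motion along the \(x\) axis; the function names `galilean` and `galilean_velocity` are illustrative choices, not part of the original article.

```python
def galilean(x, y, z, t, v):
    """Galilean transformation (1) for frames in standard configuration."""
    return x - v * t, y, z, t


def galilean_velocity(u, v):
    """Middle member of (2), restricted to motion along the x axis: u' = u - v."""
    return u - v


# Bus example from the text: the bus (S') moves at v = 30 mph along the road (S),
# and the passenger walks at u' = 2 mph relative to the bus.
v_bus = 30.0
u_prime = 2.0

# Solving u' = u - v for u gives the passenger's velocity relative to the road.
u_road = u_prime + v_bus
print("velocity relative to the road:", u_road, "mph")

# Consistency check: transforming back to the bus frame recovers u'.
print("velocity relative to the bus:", galilean_velocity(u_road, v_bus), "mph")
```

Because \(t' = t\ ,\) the same subtraction applies at every instant, which is why the acceleration in (2) comes out invariant.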

It is the last equation in (2) that is responsible for Newtonian relativity, but not by itself. Let us recall Newton's so-called second and third laws of motion, which form the basis of Newtonian mechanics:

\[\tag{3} \mathbf{f} = m\mathbf{a},\;\;\; \mathbf{f}(\text{action}) = -\mathbf{f}(\text{reaction}). \]

[The second law, for our purposes, includes the so-called first law: that free particles (\(\mathbf{f}\) = 0) move without acceleration.] But to establish Newtonian relativity, we need to appeal to two more axioms of Newtonian mechanics. These are, first: that the inertial mass \(m\) of a particle is invariant (i.e. the same in all inertial frames) and second: that the vector force \(\mathbf{f}\) on a particle is invariant. Add to these the invariance of acceleration, and the relativity of eqs. (3) becomes self-evident: they transform into their primed versions, or, in other words, they are equally valid in \(S\) and \(S'\ .\)

Mathematically speaking, we'll never have it that easy again! Even so, though the corresponding SR results and derivations are more complicated, they still possess what characterizes all superb physical theories: a basic beauty and even simplicity, when regarded in the right way.

The SR replacement for the Galilean transformation (GT) displayed in (1) is the Lorentz transformation (LT)

\[\tag{4} x' = \gamma (x - vt),\;\;\; y' = y,\;\;\; z' = z,\;\;\; t' = \gamma (t - vx / c^2), \]

where \(\gamma\ ,\) the so-called Lorentz factor, is defined as

\[\tag{5} \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}. \]

[We shall not here give the derivation of these formulae, which can be found in any textbook; see, for example, Rindler 2006.] Note that when we formally set \(c\) = infinity, \(\gamma\) becomes 1 and the LT reduces to the GT: the most general transformation respecting the relativity principle then reduces to its one degenerate case.
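The following Python sketch implements (4) and (5) directly and illustrates the reduction to the GT when \(c\) is made very large. The function names, the sample event and the choice of units with \(c = 1\) by default are assumptions made for illustration only.

```python
import math


def gamma(v, c=1.0):
    """Lorentz factor, eq. (5)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(x, y, z, t, v, c=1.0):
    """Standard Lorentz transformation, eq. (4)."""
    g = gamma(v, c)
    return g * (x - v * t), y, z, g * (t - v * x / c**2)


def galilean(x, y, z, t, v):
    """Galilean transformation, eq. (1), for comparison."""
    return x - v * t, y, z, t


event = (3.0, 1.0, 2.0, 5.0)   # arbitrary sample event (x, y, z, t)
v = 0.6                        # relative velocity of S', in units with c = 1

print("LT, c = 1:    ", lorentz(*event, v))
print("LT, c = 1e8:  ", lorentz(*event, v, c=1e8))   # numerically stands in for c -> infinity
print("GT:           ", galilean(*event, v))
```

With the large value of \(c\) the Lorentz result is numerically indistinguishable from the Galilean one, while with \(c = 1\) both \(x'\) and \(t'\) differ from their Galilean values.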

Figure 2: How the \(\gamma\) factor depends on the speed

In fact, \(\gamma\) is a measure of how much the LT deviates from the GT, and so it is important to familiarize oneself with its properties. The graph of \(\gamma\) is shown in Figure 2. Note its very gradual increase from its initial value 1, and its steep rise to infinity along the asymptote at \(v=c\ .\) Values \(v \ge c\) lead to unphysical transformations, a first indication of the relativistic speed limit. As long as \(v/c \lesssim 1/7\) (at which speed the earth is circled in about one second!), \(\gamma\) exceeds 1 by only about one percent; when \(v/c= \sqrt{3}/2 \approx 0.866\ ,\) \(\gamma =2\ ;\) and when \(v/c=0.99\ldots995\) (2\(n\) nines), \(\gamma\) is approximately \(10^n\ .\) \(\gamma\)-factors as high as \(10^{11}\) have been calculated for some cosmic ray protons incident on the upper atmosphere.
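The quoted values of \(\gamma\) are easy to reproduce. The sketch below is an illustration with assumed function names; it writes \(1 - v^2/c^2\) as \((1 - v/c)(1 + v/c)\) so that speeds very close to \(c\) do not lose precision in floating-point arithmetic.

```python
import math


def gamma_from_beta(beta):
    """Lorentz factor (5) as a function of beta = v/c."""
    return 1.0 / math.sqrt((1.0 - beta) * (1.0 + beta))


def gamma_near_c(n):
    """Lorentz factor for v/c = 0.99...995 with 2n nines.

    Such a speed is v/c = 1 - eps with eps = 5 * 10**-(2n + 1); working with
    eps directly avoids subtracting two nearly equal numbers.
    """
    eps = 5.0 * 10.0 ** (-(2 * n + 1))
    return 1.0 / math.sqrt(eps * (2.0 - eps))


print("v/c = 1/7       ->", gamma_from_beta(1.0 / 7.0))
print("v/c = sqrt(3)/2 ->", gamma_from_beta(math.sqrt(3.0) / 2.0))
for n in (1, 2, 3, 4):
    print(f"2n = {2 * n} nines   ->", gamma_near_c(n))
```

At exactly \(v/c = 1/7\) the factor comes out close to 1.0104, so the one-percent statement is a rounded rule of thumb rather than a sharp bound.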

The most striking member of eq. (4) is the last. It immediately shows the relativity of simultaneity, namely that two events with the same \(t\) but different \(x\) will not transform into two events with the same \(t'\ .\) Relatedly, it also shows that having speed infinity is no longer invariant. Consider an immaterial point traveling along the \(x'\) axis of \(S'\) with infinite speed, satisfying \(x'/t'= \infty\ ,\) or \(t'=0\ .\) This implies \(x/t =c^2/v\ ,\) a motion along the \(x\) axis with superluminal but finite speed.
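Both statements can be illustrated numerically. The following sketch, in assumed units with \(c = 1\) and with an illustrative function name `lorentz_xt`, transforms two events that are simultaneous in \(S\ ,\) and then a few events on the line \(x/t = c^2/v\) in \(S\ ,\) which should all have \(t' = 0\ .\)

```python
import math

C = 1.0  # units with c = 1 (an assumption made for convenience)


def lorentz_xt(x, t, v, c=C):
    """The x and t members of the standard LT (4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t), g * (t - v * x / c**2)


v = 0.5

# Two events simultaneous in S (both at t = 0), at x = -1 and x = +1.
_, t_back = lorentz_xt(-1.0, 0.0, v)
_, t_front = lorentz_xt(+1.0, 0.0, v)
print("t' of rear event: ", t_back)
print("t' of front event:", t_front)   # earlier than the rear event in S'

# A point moving in S with the finite superluminal speed c^2 / v ...
speed_in_S = C**2 / v
on_line = [lorentz_xt(speed_in_S * t, t, v) for t in (1.0, 2.0, 3.0)]
# ... occupies distinct x' positions, all at t' = 0: infinite speed in S'.
print(on_line)
```

The event lying further in the direction of motion of \(S'\) receives the earlier \(t'\ ,\) in agreement with the earlier light-signal argument in which C is lit up before B.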

If a law of physics is invariant under the standard LT (4), and under spatial rotations, spatial translations and time translations, then it is invariant between any two inertial coordinate systems, no matter how they may be mutually oriented. For the general coordinate transformation between two inertial frames \(S\) and \(S'\ ,\) whose coordinates are standard but not in standard configuration with each other, can be broken down into a product of such transformations: Rotate the \(x\) axis of \(S\) to be parallel to the velocity \(\mathbf{v}\) of \(S'\ ,\) thus arriving at a frame \(\tilde{S}\ ;\) next apply a standard LT with velocity \(v\) to \(\tilde{S}\) to arrive at \(\tilde{\tilde{S}}\ ,\) whose defining particles already coincide with those of \(S'\ ;\) a spatial rotation and a spatial and temporal translation (at most) will finally bring \(\tilde{\tilde{S}}\) into \(S'\ .\) The resultant transformation is called a general Lorentz transformation, or a Poincaré transformation (PT). It is, of course, linear, since each link in the chain is linear.

For consistency, we expect the set of PTs to form a group, as indeed it does: the inverse of a PT must be a PT ('symmetry'), and the composition of two PTs must be a PT ('transitivity'). Because then, if one IF \(S\) is related to all others by PTs, any pair of IFs \(S'\) and \(S''\) will be so related too. For, by symmetry, \(S'\) is related to \(S\) by a PT, and then, by transitivity via \(S\ ,\) \(S'\) is also related to \(S''\) by a PT.
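For the special case of two standard LTs along the same axis, the group property can be checked numerically. The sketch below represents the \(x,t\) part of (4) as a 2x2 matrix in assumed units with \(c = 1\ .\) The comparison velocity \((v_1+v_2)/(1+v_1v_2)\) is the standard relativistic composition of collinear velocities, which has not been derived in the text up to this point and is used here only as a known result; all names are illustrative.

```python
import math


def boost(v):
    """Matrix of the standard LT (4) acting on the column (t, x), with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v],
            [-g * v, g]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


v1, v2 = 0.5, 0.6

# 'Transitivity': S -> S' with v1, followed by S' -> S'' with v2.
composite = matmul(boost(v2), boost(v1))

# Standard composition of collinear velocities (assumed known, see lead-in).
v12 = (v1 + v2) / (1.0 + v1 * v2)
expected = boost(v12)

# 'Symmetry': the inverse of a standard LT is the LT with v replaced by -v.
inverse_check = matmul(boost(-v1), boost(v1))

print("composite:", composite)
print("boost(v12):", expected)
print("boost(-v1) * boost(v1):", inverse_check)
```

Note that `v12` stays below 1 even though `v1 + v2` exceeds it, so the composite is again a legitimate standard LT.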

The inverse transformation to (4) is, of course, easily obtained by algebraically solving for \(x,y,z,t\) in terms of \(x',y',z',t'\ .\) But there is a more interesting method, which is not only trivially easy to apply, but which also allows us to write down the inverse of any SR transformation, be it of coordinates, velocities, accelerations, forces, fields, etc. It arises as follows: Any transformation formula between two IFs \(S\) and \(S'\) in standard configuration (as in Figure 1) remains valid when we replace \(v\) by \(-v\) and interchange primed and unprimed symbols. We call this a \(v\)-reversal. The logic behind this method is that reversing the sense of \(v\) makes \(S\) move along the \(x'\) axis of \(S'\) with velocity \(v\ ,\) and so, whatever formula was originally true for primed in terms of unprimed quantities is now true for unprimed in terms of primed quantities. \(S'\) has become the 'first' and \(S\) the 'second' IF in the standard configuration.

Applied to the standard LT (4), a \(v\)-reversal yields the following inverse:

\[\tag{6} x = \gamma (x' + vt'),\;\;\; y = y',\;\;\; z = z',\;\;\; t = \gamma (t'+vx'/c^2), \]

\(\gamma\) being the same Lorentz factor as before, since it is invariant under a \(v\)-reversal.
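A round trip through (4) and (6) gives a quick check of the \(v\)-reversal rule. In this sketch the inverse is obtained simply by calling the forward transformation with \(-v\ ;\) the function names, the sample event and the default \(c = 1\) are illustrative assumptions.

```python
import math


def lorentz(x, y, z, t, v, c=1.0):
    """Standard Lorentz transformation, eq. (4)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t), y, z, g * (t - v * x / c**2)


def inverse_lorentz(xp, yp, zp, tp, v, c=1.0):
    """Inverse transformation, eq. (6), obtained by a v-reversal of (4)."""
    return lorentz(xp, yp, zp, tp, -v, c)


event = (3.0, 1.0, 2.0, 5.0)      # arbitrary sample event (x, y, z, t) in S
v = 0.8

primed = lorentz(*event, v)                 # coordinates in S'
round_trip = inverse_lorentz(*primed, v)    # back to S

print("in S: ", event)
print("in S':", primed)
print("back: ", round_trip)
```

The recovered coordinates agree with the original event up to floating-point rounding.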

If \(\Delta x, \Delta y\) etc., denote the finite coordinate differences \(x_2-x_1\ ,\) \(y_2-y_1\ ,\) etc., corresponding to two events \(\mathcal{P}_1\) and \(\mathcal{P}_2\ ,\) then by substituting the coordinates of \(\mathcal{P}_2\) and \(\mathcal{P}_1\) successively into (4) and subtracting, we get the following transformation between the deltas:

\[\tag{7} \Delta x' = \gamma (\Delta x - v \Delta t),\;\;\; \Delta y' = \Delta y,\;\;\; \Delta z' = \Delta z,\;\;\; \Delta t' = \gamma (\Delta t - v \Delta x / c^2). \]

If we simply take differentials in (4), we get the transformation between the differentials:

\[\tag{8} dx' = \gamma (dx - vdt),\;\;\; dy' = dy,\;\;\; dz' = dz,\;\;\; dt' = \gamma (dt - vdx/c^2). \]

Analogous formulae arise from the inverse transformation (6). Thus the finite coordinate differences as well as the differentials satisfy the same transformation equations as the coordinates themselves. This, of course, is always the case with linear homogeneous transformations. Each version has its uses, as we shall see.

The most fundamental property (and, in fact, a defining property) of the general LTs is that they preserve the value of the Minkowskian quadratic form:

\[\tag{9} c^2\Delta {t'}^2 - \Delta {x'}^2 - \Delta {y'}^2 - \Delta {z'}^2= c^2\Delta {t}^2 - \Delta {x}^2 - \Delta {y}^2 - \Delta {z}^2, \]

just as 3-rotations of the Cartesian axes preserve the Euclidean quadratic form

\[\tag{10} \Delta {x'}^2 + \Delta {y'}^2 + \Delta {z'}^2 = \Delta {x}^2 + \Delta {y}^2 + \Delta {z}^2. \]

The validity of (9) is easily established in the case of the standard LT by direct calculation, using (7). Also, (9) is clearly valid for mere space and time translations (\(x'=x+a\ ,\) etc.), which leave each term unchanged, and for space rotations, which preserve the spatial and temporal parts of (9) separately (\(t\) remains unchanged). Hence, by the decomposition of general LTs (PTs) into rotations, a standard LT and translations described above, PTs satisfy (9). Conversely, it can be shown (e.g., Rindler 1966) that (9) implies the PTs. If we accept this, then the group property of the set of PTs becomes immediate.
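The invariance (9) can also be confirmed numerically for a spatial rotation followed by a standard LT. The sketch below uses assumed units with \(c = 1\ ,\) arbitrary sample differences, and illustrative function names; it is a spot check, not a proof.

```python
import math


def interval(dt, dx, dy, dz, c=1.0):
    """Minkowskian quadratic form appearing on either side of (9)."""
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2


def boost(dt, dx, dy, dz, v, c=1.0):
    """Standard LT applied to coordinate differences, eq. (7)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (dt - v * dx / c**2), g * (dx - v * dt), dy, dz


def rotate_z(dt, dx, dy, dz, angle):
    """Spatial rotation about the z axis; the time difference is unchanged."""
    ca, sa = math.cos(angle), math.sin(angle)
    return dt, ca * dx + sa * dy, -sa * dx + ca * dy, dz


deltas = (2.0, 1.0, -0.5, 3.0)   # arbitrary (dt, dx, dy, dz) between two events

before = interval(*deltas)
after = interval(*boost(*rotate_z(*deltas, 0.7), 0.6))

print("quadratic form in S: ", before)
print("quadratic form in S':", after)
```

The individual differences change under both steps, yet the two printed values agree to rounding accuracy.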

Since, by the linearity of the PTs, the transformation of the \(\Delta\)'s is the same as that of the \(d\)'s (as in (7) and (8) above), the identity (9) must hold for the coordinate differentials as well (as, indeed, it must for the coordinates themselves, if the transformation is homogeneous, i.e. without translations). This leads to a nice kinematic result. Consider an arbitrarily moving particle -- or, better, an arbitrarily moving geometric point, whose speed \(u\) relative to the frame \(S\) is unrestricted. If we write \((dx^2+dy^2+dz^2)/dt^2 = u^2\) and the same with primes in \(S'\ ,\) we can cast the differential version of (9) into the form

\[\tag{11} {dt'}^2(c^2-{u'}^2)=dt^2(c^2-u^2), \]

which implies an important result: subluminal speeds (<\(c\)) always transform into subluminal speeds, superluminal speeds (>\(c\)) always transform into superluminal speeds, while the speed of light, of course, transforms into itself.
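A short numerical illustration of (11): the sketch below pushes the differentials of a moving point through (8), in assumed units with \(c = 1\ ,\) and reports the resulting speed \(u'\) in \(S'\ .\) The function name and the three sample velocities (one subluminal, one luminal, one superluminal) are illustrative assumptions.

```python
import math


def transformed_speed(ux, uy, uz, v, c=1.0):
    """Speed u' in S' and the ratio dt'/dt, from the differentials in (8).

    The point has velocity (ux, uy, uz) in S; we take dt = 1 so that
    dx, dy, dz equal the velocity components.
    """
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    dt = 1.0
    dt_p = g * (dt - v * ux * dt / c**2)
    dx_p = g * (ux * dt - v * dt)
    dy_p, dz_p = uy * dt, uz * dt
    u_p = math.sqrt(dx_p**2 + dy_p**2 + dz_p**2) / abs(dt_p)
    return u_p, dt_p


v = 0.8
samples = {
    "subluminal": (0.9, 0.3, 0.0),
    "light": (0.6, 0.8, 0.0),
    "superluminal": (1.5, 0.5, 0.0),
}

for name, (ux, uy, uz) in samples.items():
    u = math.sqrt(ux**2 + uy**2 + uz**2)
    u_p, dt_p = transformed_speed(ux, uy, uz, v)
    lhs = dt_p**2 * (1.0 - u_p**2)    # left side of (11) with dt = 1, c = 1
    rhs = 1.0 - u**2                  # right side of (11)
    print(f"{name}: u = {u:.4f}, u' = {u_p:.4f}, (11): {lhs:.6f} = {rhs:.6f}")
```

Since \({dt'}^2\) and \(dt^2\) are both positive, the sign of \(c^2-u^2\) cannot change under the transformation, which is exactly what the three cases display.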

One of the crucial subsidiary axioms of SR is that no information-carrying signal can travel faster than the speed of light. It turns out (see immediately below) that all superluminal signals go forward in time in some inertial frames and backward in others. The existence of such signals would therefore play havoc with physics as we know it, by denying causality, namely the principle that a given cause will have a predictable effect in the future. Backward signals would allow us to reshape the past. To avoid such a catastrophe, Nature must prohibit superluminal signals. A first consequence is that rigid bodies and incompressible fluids are no longer viable concepts, since they transmit signals instantaneously.

In detail, consider a signal along the \(x\) axis in a frame \(S\ ,\) from an event \(\mathcal{P}_1\) to a second event \(\mathcal{P}_2\) separated from the first by coordinate differences \(\Delta x > 0, \Delta t > 0\ .\) Suppose the signal travels at uniform superluminal speed \(U = \Delta x / \Delta t > c\ .\) Then, by the last of Eq. (7), we have, in the usual second frame \(S'\ ,\)

\[\tag{12} \Delta t' = \gamma \Delta t (1-vU/c^2). \]

Thus, if \(v\) is big enough to satisfy

\[\tag{13} c^2 / U < v < c \]

the signal will go back in time in \(S'\ !\)
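The sign change described by (12) and (13) can be tabulated directly. The sketch assumes units with \(c = 1\) and a hypothetical signal speed \(U = 2\ ;\) the names are illustrative.

```python
import math


def delta_t_prime(dt, U, v, c=1.0):
    """Eq. (12): time separation in S' of two events joined in S by a signal of speed U."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * dt * (1.0 - v * U / c**2)


U = 2.0               # assumed superluminal signal speed, in units with c = 1
threshold = 1.0 / U   # c^2 / U, the lower bound in condition (13)

# Frame velocities below, at and above the threshold.
results = {v: delta_t_prime(1.0, U, v) for v in (0.3, 0.5, 0.7, 0.9)}

print("threshold v =", threshold)
for v, dt_p in results.items():
    print(f"v = {v}: dt' = {dt_p:+.4f}")
```

For \(v\) below \(c^2/U\) the signal still runs forward in time in \(S'\ ,\) at \(v = c^2/U\) it is instantaneous there, and for larger \(v\) the time order of emission and reception is reversed.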

Now suppose we, in frame \(S'\ ,\) possess a gun that shoots superluminal bullets. Let us put a message in the bullet and then run the gun backwards on our \(x'\) axis with a velocity \(-v\) such that \(v\) satisfies (13). Then (a) the gun constitutes a frame \(S\) in the usual standard configuration with our frame \(S'\ ,\) and (b) the bullet it now shoots in the \(x\) direction will travel backwards in time in our frame. In our frame, therefore, our message gets to its destination before it left us. Actually it will have the appearance of traveling from the destination to us. The fact remains that it was there before we wrote it. Only the speed limit axiom can prevent such violations of causality.

Note, however, that arbitrarily large velocities are possible for moving points that carry no information. Examples are the intersection point of two rulers that move across each other at a sufficiently minute angle, or the impact spot of a laser pointer on the moon, if the pointer is rotated fast enough on earth.

Graphic representation of the standard Lorentz transformation

There is much intuitive insight and many a useful result to be gained from a pictorial view of the Lorentz transformation. (Yet the hurried reader may skip this section at a first reading.) Transformations of coordinates can always be thought of -- and visualized -- in two distinct ways. Take, for example, rotations of the Cartesian \(x\) and \(y\) axes in the Euclidean plane. After we have rotated these axes through some finite angle \(\phi\ ,\) each point \((x,y)\) gets a new 'address' \((x',y')\) relative to the rotated axes. This is the passive view. Alternatively, we can move each original point \((x,y)\) to its new position \((x',y')\ .\) In essence, we are rotating the plane through the opposite angle, \(-\phi\ .\) We now see the plane -- and whatever geometrical structures we might have drawn in it -- as it appears when we set up the new \(x'\) and \(y'\) axes in the usual horizontal and vertical directions.

Figure 3: 'Passive' view of the 2-dimensional Lorentz transformation \((x,t) \mapsto (x',t')\)

With the LTs, both the active and passive views have their uses. We discuss the passive view first, and restrict our attention to just the \(x\) and \(t\) dimensions of the standard LT, Eq. (4), which actually contain all the important physics. For convenience, we choose units in terms of which \(c=1\ ,\) such as years and light-years or seconds and light-seconds. (In 'normal' units, such as seconds and centimeters, the graphs become visually unmanageable.) Consider the \(x,t\) plane as shown in Figure 3. Drawing the \(x\) and \(t\) axes orthogonally is here just a convention without physical significance (unlike the orthogonality of the \(x,y\) axes in the Euclidean plane). Another convention is that the \(t\) axis is always drawn vertically. If we think of an inertial frame \(S\) as a wire triad of \(x,y,z\) axes flying through space, equipped with appropriate clocks reading synchronized time \(t\ ,\) Figure 3 serves as a record of all the events taking place on the spatial \(x\) axis (the \(x\) wire). This latter must be distinguished from the spacetime \(x\) axis of Figure 3, which is just the locus of events on the spatial \(x\) axis at time zero. Indeed, all horizontal lines in the diagram represent moments (or simultaneities) on the spatial \(x\) axis. For a full record of all the events in the world (or better, in the SR 'model' of the world), we would need spacetime \(y\) and \(z\) axes sticking out from the origin in Figure 3, but that would require four dimensions.

Now, whereas a line in the Euclidean plane is just a line, every monotonic line (or curve) in the 'spacetime diagram' of Figure 3 represents -- or can represent -- the history of a moving point, also called its 'worldline'. The slope of this curve with respect to the \(t\) axis, \(dx/dt\ ,\) measures the velocity of the moving point. If that slope is negative, the point moves in the negative \(x\) direction as time goes on. If the slope exceeds 1 numerically, i. e. if the curve leans away from the vertical at more than 45 degrees, the point moves superluminally. The \(\pm 45^\circ\) directions at any given point (event) are called the local 'light-cone' at that event, and they bound the possible worldlines of information-carrying signals or particles through that event; the edges correspond to light.

So much for the kind of contents of the spacetime diagram (often called a Minkowski diagram, after its first proponent). Next, let us bring in the usual second frame \(S'\ ,\) related to \(S\) via the standard LT, Eq. (4). Its \(x'\) axis satisfies \(t' = 0\ ,\) and hence, by (4)(iv), \(t = vx\) (remember, \(c=1\)). In our diagram, this is a straight line through the origin having slope \(v\ .\) Similarly, the \(t'\) axis satisfies \(x'=0\ ,\) and hence by (4)(i), \(x=vt\ ,\) which is a straight line with slope \(v\) relative to the \(t\) axis. That makes sense: \(x'=0\) is the worldline of the spatial origin of \(S'\ ,\) and thus of a point moving through \(S\) with velocity \(v\ .\) In fact, all lines parallel to the \(t'\) axis are worldlines of the particles constituting the spatial \(x'\) axis (\(x' = const\)). On the other hand, all lines parallel to the \(x'\) axis are moments (i. e. simultaneities, \(t'=const\)) in \(S'\ .\) Note how they differ from simultaneities in \(S\ !\)

The axes of \(S'\) are seen to subtend equal angles with their counterparts in \(S\ .\) But whereas in Euclidean rotations these angles have the same sense, in LTs they have opposite sense. \(S'\) can have any velocity strictly between \(-c\) and \(+c\) relative to \(S\ ;\) the corresponding \(x'\) and \(t'\) axes in the diagram are like scissors -- completely open (\(180^\circ\)) for \(v \rightarrow -c\ ,\) completely closed for \(v \rightarrow c\ .\)

We have already tacitly assumed that the \(x\) and \(t\) axes are equally calibrated, e. g. seconds and light-seconds corresponding to equal distances on the respective axes. But the calibration of the \(x'\) and \(t'\) axes is not as straightforward as in the Euclidean case. Just taking over the scales of the \(x\) and \(t\) axes won't do! We recall that for standard LTs (with \(y' = y, z' = z\ ,\) and now \(c=1\)) there is the basic identity \(t'^2 - x'^2 = t^2 - x^2\) (see text above Eq. (11)). So if we now draw the calibrating hyperbolae \(t^2 - x^2 = \pm 1\) (the heavy curves in Figure 3), they will cut all four of the axes \((x,x',t,t')\) at the relevant unit distance or unit time from the origin. These units can then be repeated along the axes equally, by linearity. The diagram shows how to read off the coordinates \((a',b')\) of a given event relative to \(S'\ :\) we simply go along lines of constant \(x'\) or \(t'\) from the event to the axes.
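The calibration rule can be verified with a few lines of Python. This is only a sketch, assuming that Eq. (4) has the usual form \(x' = \gamma(x - vt)\ ,\) \(t' = \gamma(t - vx)\) in units with \(c=1\ ;\) the helper names `lorentz` and `inverse_lorentz` and the sample value \(v = 0.6\) are illustrative choices.

```python
import math

def lorentz(x, t, v):
    """Standard LT in the x,t plane with c = 1 (assumed form of Eq. (4)): returns (x', t')."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (x - v * t), g * (t - v * x)

def inverse_lorentz(x_p, t_p, v):
    """Inverse transformation, obtained by reversing the sign of v."""
    return lorentz(x_p, t_p, -v)

v = 0.6

# Unit mark on the x' axis: the event with x' = 1, t' = 0, expressed in S.
x_unit, t_unit = inverse_lorentz(1.0, 0.0, v)

# Unit mark on the t' axis: the event with x' = 0, t' = 1, expressed in S.
x_tick, t_tick = inverse_lorentz(0.0, 1.0, v)

print(x_unit, t_unit, t_unit**2 - x_unit**2)   # lies on t^2 - x^2 = -1 and on t = v x
print(x_tick, t_tick, t_tick**2 - x_tick**2)   # lies on t^2 - x^2 = +1 and on x = v t
```

Both unit marks lie farther from the origin on the page than the unit marks of \(S\ ,\) which is why the scales of the \(x\) and \(t\) axes cannot simply be carried over.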

As an example of the use of such diagrams, look at the dotted line in Figure 3. It represents the uniform motion of a superluminally moving point P in \(S\ .\) Now imagine the \(x'\) and \(t'\) axes gradually scissoring towards each other: the frame \(S'\) chases P with ever increasing velocity. Observe how, counterintuitively, the faster you chase P, the faster it recedes from you. (Consider a coordinate parallelogram with P's worldline as diagonal, and read off \(dx'/dt'\ .\)) Until, when the \(x'\) axis coincides with P's worldline, P moves with infinite speed. If \(S'\) moves even faster (i. e. after the \(x'\) axis surpasses the dotted line), P moves in the opposite sense along the spatial \(x'\) axis, namely from greater to lesser values of \(x'\) as \(t'\) increases. If P were a bullet, it would move from the broken glass back into the gun!
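The same behavior follows from the velocity transformation, Eq. (25)(i), derived later in this article. The sketch below, in units with \(c=1\ ,\) assumes an illustrative speed \(u = 2\) for P; the function name `chased_velocity` is introduced here only for the example.

```python
def chased_velocity(u, v):
    """Eq. (25)(i) with c = 1: velocity in S' of a point that moves at u in S."""
    return (u - v) / (1.0 - u * v)

u = 2.0   # assumed superluminal speed of P in S (illustrative)

for v in (0.0, 0.25, 0.4, 0.49, 0.6):
    # u' grows without bound as v approaches 1/u = 0.5, then changes sign.
    print(f"v = {v:4.2f}  ->  u' = {chased_velocity(u, v):9.3f}")
```

The speed of P in \(S'\) increases as \(v\) approaches \(1/u\) (that is, \(c^2/u\)), where the \(x'\) axis coincides with P's worldline, and is negative beyond that value.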

Figure 4: Reading off length contraction from the Minkowski diagram

Length contraction and time dilation can also be read off, qualitatively, from the Minkowski diagram. In Figure 4 the shaded area shows the 'world-tube' (bundle of worldlines) of a unit rod at rest on the spatial \(x\) axis between 0 and 1. In \(S'\) this rod moves at velocity \(-v\ .\) At \(t' = 0\) it occupies the segment \(\mathcal{OA}\) of the \(x'\) axis, which, by reference to the calibrating hyperbola, is seen to be less than unity: the moving rod is short.

Figure 5: Reading off time dilation from the Minkowski diagram

In Figure 5 the \(t'\) axis is the worldline of a standard clock fixed at the spatial origin of \(S'\) and therefore moving with velocity \(v\) through \(S\ .\) At \(\mathcal{B}\ ,\) where its worldline intersects the calibrating hyperbola, it reads 1. However, the corresponding time \(t\) in \(S\) is evidently greater than 1: the moving clock goes slow.

Figure 6: 'Active' view of the 2-dimensional Lorentz transformation \((x,t) \mapsto (x',t')\)

These examples should convince the reader of the utility of the passive representation of the LT. We next turn to the active representation. Figure 6 tells the story. We can imagine a computer program to do the work for us. On a black screen the \(x\) and \(t\) axes (in white) are fixed. They successively become the \(x',t'\) axes, the \(x'',t''\) axes, etc., as we switch to various other frames in standard configuration with the basic frame \(S\ .\) Imagine a generous sprinkling of light-dots on the screen, like stars in the night sky. If we were to apply an active Euclidean rotation, these 'stars' would simply rotate about the origin, here functioning as polestar. What the LT does is more complicated. Suppose we gradually increase \(v\ ,\) thereby considering a whole succession of 'second' frames. A small window in our screen indicates the value of \(v\) we have chosen (between zero and the light-speed 1). Instead of moving on circles, the 'stars' (now representing events) move on the hyperbolae \(t^2-x^2 = const\ ,\) since any point satisfying one of these equations continues to do so: \(t'^2 - x'^2 = t^2 - x^2\ ,\) as we have seen. For positive constants these hyperbolae are in the up/down quadrants, for negative constants, in the left/right quadrants.

Now, any straight line of events in the diagram remains a straight line, by the linearity of the LT. Most interestingly, if we were to draw a set of equally spaced lines all parallel to the line \(t=x\) (marked \(\xi\) in Figure 6), then under an active LT this set would remain similar to itself but with the spacing expanding uniformly, as shown by the heavy arrows. A similar set of lines parallel to the line \(t=-x\) (marked \(\eta\) in Figure 6) would shrink by the same factor, so that areas are preserved. The reason for all this will appear in the next paragraph but one.

Adding and subtracting the \(x\) and \(t\) members of Eq. (4) (after multiplying the latter by \(c\)), we obtain an alternative formulation of the LT, important for many purposes:

\[\tag{14} ct' + x' = e^{-\phi}(ct + x),\;\;\; ct' - x' = e^{\phi}(ct - x) \]

where

\[\tag{15} e^\phi = \sqrt{\frac{1+v/c}{1-v/c}} \]

The \(\phi\) introduced here is a useful alternative parameter, instead of \(v\ ,\) for the Lorentz group. It is called the canonical parameter, because the parameter for the inverse transformation is \(-\phi\) and that for the resultant of successive transformations with \({\phi}_1\) and \({\phi}_2\) is clearly \({\phi}_1 + {\phi}_2\ .\)
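The additivity of \(\phi\) can be checked against the one-dimensional case of the velocity addition formula, Eq. (26). A minimal Python sketch, using the fact that Eq. (15) is equivalent to \(v = c\tanh\phi\ ;\) the function names and the sample velocities are illustrative.

```python
import math

def rapidity(v, c=1.0):
    """Canonical parameter phi of Eq. (15): e^phi = sqrt((1 + v/c) / (1 - v/c))."""
    b = v / c
    return 0.5 * math.log((1.0 + b) / (1.0 - b))

def velocity(phi, c=1.0):
    """Inverse of rapidity(): v = c tanh(phi)."""
    return c * math.tanh(phi)

v1, v2 = 0.5, 0.5   # illustrative velocities, in units of c

# Compose two LTs by adding their canonical parameters ...
v_from_phi = velocity(rapidity(v1) + rapidity(v2))

# ... and compare with the one-dimensional case of Eq. (26).
v_from_formula = (v1 + v2) / (1.0 + v1 * v2)

print(v_from_phi, v_from_formula)
```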

Returning now to Figure 6 (and to \(c=1\)), the Cartesian coordinates \(\xi,\eta\) relative to the asymptotes of the hyperbolae are given by

\[\tag{16} \xi = (t+x)/\sqrt{2},\;\;\; \eta = (t-x)/\sqrt{2}. \]

In terms of these coordinates, the LT of Eq. (14) reads simply

\[\tag{17} \xi' = e^{-\phi}\xi,\;\;\; {\eta}' = e^{\phi}\eta. \]

This shows that as \(v\) (and with it \(\phi\)) increases from zero to positive values, all \(\xi\) coordinates numerically decrease while all \(\eta\) coordinates numerically increase, both by the factor \(e^{\phi}\ ,\) thus verifying the statement two paragraphs back.

As a simple application of an active LT, consider the set of lines \(t-x = n\lambda\ ,\) \(\lambda\) being a constant and \(n\) running through the positive and negative integers. These lines can represent the crests of a train of plane light waves of wavelength \(\lambda\) traveling in the \(x\) direction. After an active LT, i. e. as viewed in the usual second frame \(S'\ ,\) the wavelength is

\[\tag{18} {\lambda}' = \sqrt{\frac{1+v/c}{1-v/c}}\lambda, \]

by (17)(ii) and (15). This is the standard formula for the (special-)relativistic Doppler Effect.
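As a quick numerical illustration of Eq. (18), the following Python sketch evaluates the Doppler factor for a few frame velocities; the function name `doppler_wavelength` is an illustrative choice.

```python
import math

def doppler_wavelength(lam, v, c=1.0):
    """Eq. (18): wavelength in S' of a light wave that has wavelength lam in S
    and travels in the +x direction; v is the velocity of S' relative to S."""
    return math.sqrt((1.0 + v / c) / (1.0 - v / c)) * lam

for v in (-0.6, 0.0, 0.6):
    # v > 0: S' moves along with the wave and sees it stretched (red shift);
    # v < 0: S' moves against the wave and sees it compressed (blue shift).
    print(f"v = {v:+.1f}  lambda'/lambda = {doppler_wavelength(1.0, v):.4f}")
```

The factors for \(v\) and \(-v\) are reciprocals of each other, as they must be, since the corresponding canonical parameters are \(\phi\) and \(-\phi\ .\)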

As another interesting application, consider any one of the hyperbolae in the right-hand quadrant of Figure 6. Its equation is

\[\tag{19} x^2 - t^2 = x_{0}^{2}, \]

\(x_0\) being its intercept with the \(x\) axis. Its slope relative to the \(t\) axis nowhere exceeds unity, so it can be considered as the worldline of a particle. This particle decelerates in from positive infinity, comes to a momentary halt at \(x = x_0\) when \(t=0\ ,\) and then accelerates back to infinity along the spatial \(x\) axis. What is special about this worldline is that the corresponding particle moves with constant proper acceleration. This means that its acceleration relative to the IF momentarily comoving with it (called its rest frame) is always the same. (In SR the adjective "proper" always refers to a measurement taken in this comoving IF.) For, by an active LT, we can bring any one of its events \(\mathcal{A}\) to the vertex, thus transferring ourselves to the particle's rest frame at \(\mathcal{A}\ .\) Its acceleration \(\alpha\) then (\(d^2 x/dt^2\) at \(t = 0\ ,\) namely \(1/x_0\)), will be its acceleration in its rest frame always. The corresponding motion is called hyperbolic motion for obvious reasons. Note from the graph how the lab velocity tends to \(c\ ,\) and the lab acceleration tends to zero. For a sequence of such worldlines, with increasing proper acceleration \(\alpha\ ,\) the \(x\) intercept \(x_0\) decreases, so that, in the limit \(\alpha \rightarrow \infty\ ,\) the motion becomes that of a photon coming in to the origin and bouncing back out. Thus photons move with infinite proper acceleration!
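The constancy of the proper acceleration along the hyperbola (19) can also be confirmed numerically, using the relation \(\alpha = \gamma^3 a\) for rectilinear motion that is derived later as Eq. (35). The Python sketch below works in units with \(c=1\) and assumes the illustrative value \(x_0 = 2\ ;\) the function names are not part of the article's notation.

```python
import math

def hyperbolic_state(t, x0):
    """Position, lab velocity and lab acceleration on the worldline x^2 - t^2 = x0^2."""
    x = math.sqrt(x0**2 + t**2)
    u = t / x             # dx/dt
    a = x0**2 / x**3      # d^2x/dt^2
    return x, u, a

def proper_acceleration(t, x0):
    """alpha = gamma(u)^3 * a, the rectilinear relation of Eq. (35)."""
    _, u, a = hyperbolic_state(t, x0)
    g = 1.0 / math.sqrt(1.0 - u * u)
    return g**3 * a

x0 = 2.0   # assumed intercept (illustrative)
for t in (0.0, 1.0, 10.0, 100.0):
    x, u, a = hyperbolic_state(t, x0)
    print(f"t = {t:6.1f}  u = {u:.6f}  a = {a:.3e}  alpha = {proper_acceleration(t, x0):.6f}")
```

The lab velocity tends to 1 and the lab acceleration to zero, while \(\alpha\) stays at \(1/x_0\) throughout.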

Special relativistic kinematics

We shall presently discuss two of the most fundamental phenomena of SR, length contraction and time dilation: moving rods shrink, and moving clocks go slow, both by a \(\gamma\) factor. Does the observer's eye see these effects by the claimed amount? No. Because of the finite speed of light, what an observer sees at one instant is a composite of events that occurred progressively earlier as they occurred farther away. No observer can get an instantaneous view of the moving rod. What he actually sees, his world-picture, is irrelevant for our present purposes.

The concept that plays a pervasive role in special relativity is that of a world-map. As the name implies, this may be thought of as a (3-dimensional) map of events, namely those constituting an observer's instantaneous 3-space \(t = t_0\ .\) The main class of observers one considers in SR are 'inertial observers' -- observers at rest in some inertial frame \((x,y,z,t)\ ,\) say at its spatial origin. A world-map could be produced by having auxiliary observers at the coordinate lattice points all map their immediate neighborhoods at a pre-determined instant \(t=t_0\ ,\) and then combining all these local maps into a single global map. Alternatively, the world-map can be regarded as a 3-dimensional life-sized photograph exposed everywhere simultaneously, or a frozen instant in the observer's inertial frame. When we loosely speak of a snapshot taken in \(S\ ,\) or of the length of a moving object in \(S\ ,\) or of the shape of an accelerating object in \(S\ ,\) we invariably think of the world-map. Again, time dilation refers to what a moving clock does between successive world-maps.

To derive the formula for length contraction, consider two IFs \(S\) and \(S'\) in standard configuration, and a rigid rod of length \(\Delta x'\) at rest on the \(x'\) axis of \(S'\ .\) We wish to find its instantaneous length in \(S\ .\) In any frame in which it moves, its end points must evidently be measured simultaneously -- only in its rest-frame \(S'\) is this precaution unnecessary. So consider two events occurring simultaneously in \(S\) at the extremities of the rod. To these events we apply Eq. (7)(i): since \(\Delta t = 0\ ,\) we find \(\Delta x' = \gamma \Delta x\ ,\) or, writing for \(\Delta x, \Delta x'\ ,\) the more specific symbols \(L\) and \(L_0\ ,\) respectively,

\[\tag{20} L = L_0 / \gamma = (1-v^2/c^2)^{1/2}L_0. \]

We can conclude, quite generally, that the length \(L\) of a body in the direction of its motion with uniform velocity \(v\) is reduced by a \(\gamma\) factor over its 'rest-length' \(L_0\ .\) This is essentially a consequence of different events being considered simultaneous in different IFs. Nevertheless the effect is physical and real, as the following little story illustrates. (It's the story of the 'length contraction paradox'.)

Consider the admittedly unrealistic situation of a runner carrying a horizontal 20-foot pole at velocity \(v = 0.866c\) (making \(\gamma = 2\)). The length of the pole to the outside world is 10 feet. He runs the pole into a 10-foot garage and a friend quickly closes the door. So far, no problem. But how does this experiment look in the IF of the runner? He stands there with his 20-foot pole and awaits the arrival of the now 5-foot garage. How can the pole possibly get in? That is where the relativistic speed limit comes in. At the time the front of the pole is hit by the (strong) back wall of the garage, the back of the pole has no reason to start moving. (SR allows no rigid bodies!) There is now an elastic wave propagating along the pole at maximally the speed of light; but by the time it can tell the back end of the pole to get moving, that end is well into the garage and the door will have been shut! Of course, the pole unavoidably will have suffered compression.
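The numbers in this story can be worked out with the \(\Delta\)-form of the Lorentz transformation. The Python sketch below uses feet as the unit of length and \(c=1\) (so that times are measured in feet of light-travel), and makes the simplifying assumption that, in the garage frame, the door closes at the very instant the front of the pole reaches the back wall. All variable names are illustrative.

```python
import math

v = math.sqrt(3.0) / 2.0            # about 0.866 c, chosen so that gamma = 2
g = 1.0 / math.sqrt(1.0 - v * v)

pole_rest, garage_rest = 20.0, 10.0

# Lengths as judged from the other frame, Eq. (20).
pole_in_garage_frame = pole_rest / g        # pole as measured in the garage frame
garage_in_runner_frame = garage_rest / g    # garage as measured in the runner's frame

# Garage frame: event A (front of pole meets back wall) and event B (door shuts
# behind the rear end) are assumed simultaneous, dt = 0, and separated by dx = 10.
dx, dt = garage_rest, 0.0

# Runner frame: t'_A - t'_B = gamma * (dt - v * dx); negative means A comes first.
dt_runner = g * (dt - v * dx)
head_start = -dt_runner                     # time by which the impact precedes the door closing

# Least time any signal needs to travel the rest length of the pole.
min_signal_time = pole_rest                 # with c = 1

print(pole_in_garage_frame, garage_in_runner_frame, head_start, min_signal_time)
```

In the runner's frame the impact at the front precedes the closing of the door by about 17.3 units, which is less than the 20 units a light-speed signal would need to reach the rear of the pole. The rear end therefore enters the garage before it can 'know' of the collision.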

No direct experimental verification of length contraction seems presently feasible. However, an indirect verification is implicit in the magnetic field \(\mathbf{B}\) around a current-carrying wire. By charge conservation along the wire, the number density of the moving charges (electrons) must equal that of the stationary charges (ions). Now consider a test charge moving with some velocity \(\mathbf{u}\) parallel to the wire. It experiences a Lorentz force proportional to \(\mathbf{u \times B}\) away from the wire. But what causes such a radial force in the rest frame of the test charge? Only the fact that, owing to length contraction, the test charge sees different line densities of positive and negative charge along the wire, and hence feels a net electric radial force.

Next, time dilation. Consider again two inertial frames \(S\) and \(S'\) in standard configuration. Let a standard clock be fixed in \(S'\) and consider two events at that clock when it reads times differing by \(\Delta t'\ .\) We inquire what time interval \(\Delta t\) is ascribed to these events in \(S\ .\)

From the \(\Delta\)-form of Eq. (6)(iv) we see at once, since \(\Delta x' = 0\ ,\) that \(\Delta t = \gamma \Delta t'\ .\) Replacing \(\Delta t\ ,\) \(\Delta t'\) by the more specific symbols \(T\) and \(T_0\) for time elapsed and proper time elapsed, we have

\[\tag{21} T = \gamma T_0 = (1-v^2/c^2)^{-1/2}T_0. \]

We can conclude from this quite generally that a clock moving uniformly with velocity \(v\) through an inertial frame \(S\) goes slow by a \(\gamma\) factor relative to the synchronized standard clocks at rest in \(S\ .\) At speeds approaching the speed of light the apparent clock rate would be close to zero.

This 'time dilation', like length contraction, is no accident of convention but a real effect. Moving clocks really do go slow. If a standard clock is taken at uniform speed \(v\) through an inertial frame \(S\) along a straight line from point A to point B and back again to A, the elapsed time \(T_0\) indicated on the moving clock will be related to the elapsed time \(T\) indicated on the clock fixed at A by Eq. (21) -- except for small unknown effects caused by the brief accelerations needed to initiate, reverse, and terminate the journey. What happens to a clock during acceleration depends on the mechanism of the clock, whereas what happens during periods of uniform motion is totally mechanism-independent, as we saw in deriving Eq. (21). However, no matter what the effect of the three accelerations might be, it would be the same for long as for short journeys, and so it can be dwarfed by simply extending the lengths of free fall. So, at least in theory, an astronaut undertaking such a journey -- and, of course, aging exactly in accordance with her clock on board -- could come back finding her stay-at-home twin to have aged more than she during their separation by a \(\gamma\)-factor (neglecting what might have happened to her own aging during lift-off, turn-around and landing). In the early days of relativity -- and far too long into its maturity -- there were always people who considered this result a paradox, the notorious 'twin-paradox'. Can't the stay-at-home claim that the situation between the two of them has been symmetric, so that he should be the younger? These people forget that inertial frames are important components of Nature. As Dennis Sciama put it nicely: relative to the inertial frames there is no symmetry between the twins, just as there is no symmetry between Newton's buckets -- one at rest, with the water flat, the other preferably above the first and rotating, with the water rising at the sides. If the buckets were the sole content of the universe, and there were no inertial frames, this would indeed be a paradox.

As we said, no universal predictions about how clocks behave under acceleration can be made. Certain natural 'clocks', like vibrating atoms or decaying muons, seem to be impervious to at least the accelerations they have so far been subjected to. We call such clocks 'ideal' clocks. Fully ideal clocks are a viable concept in SR, since in principle they can be built (using accelerometers and servomechanisms). An ideal clock, following an arbitrary path with arbitrary velocity through the synchronized clocks of an IF with coordinate time \(t\ ,\) will thus (by Eq. (21)) indicate an elapsed time

\[\tag{22} \tau = \int (1-v^2/c^2)^{1/2}dt, \]

where the integral must be taken between the appropriate limits. The quantity \(\tau\) introduced here -- the proper time on an arbitrarily moving ideal clock -- will play an important role in the sequel.
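For a given speed history \(v(t)\) the integral (22) is easily evaluated numerically. The following Python sketch uses a simple midpoint rule in units with \(c=1\ ;\) the function name `proper_time`, the step count and the two test motions are illustrative assumptions.

```python
import math

def proper_time(speed, t0, t1, n=10_000, c=1.0):
    """Midpoint-rule approximation to Eq. (22) for a speed function v(t)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t_mid = t0 + (i + 0.5) * h
        total += math.sqrt(1.0 - (speed(t_mid) / c) ** 2) * h
    return total

# Uniform motion at v = 0.6: the integral reduces to Eq. (21), tau = T / gamma.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0)

# Hyperbolic motion (19) with x0 = 1: v(t) = t / sqrt(1 + t^2),
# for which the integral has the closed form asinh(T).
tau_hyperbolic = proper_time(lambda t: t / math.sqrt(1.0 + t * t), 0.0, 3.0)

print(tau_uniform, tau_hyperbolic, math.asinh(3.0))
```

Only the speed enters the integrand, so the same routine applies to curved paths as well as to rectilinear motion.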

Time dilation is actually very well established observationally. Among the more impressive of its modern manifestations was an experiment done at CERN in 1975 (and later refined to an impressive accuracy of \(2 \times 10^{-3}\)) wherein muons circling storage rings with velocities corresponding to \(\gamma \approx 29\) were found to decay more slowly than at rest by exactly their \(\gamma\) factor.
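To get a feeling for the size of the effect, Eq. (21) can be applied to the quoted value \(\gamma \approx 29\ .\) In the Python sketch below, the mean lifetime of the muon at rest, about 2.197 microseconds, is an assumed input that is not given in this article; the variable names are illustrative.

```python
import math

gamma_ring = 29.0       # Lorentz factor quoted in the text
tau_rest_us = 2.197     # assumed mean lifetime of a muon at rest, in microseconds

# Eq. (21): lifetime as measured in the laboratory frame.
tau_lab_us = gamma_ring * tau_rest_us

# Speed corresponding to this gamma, in units of c.
beta = math.sqrt(1.0 - 1.0 / gamma_ring**2)

print(f"lab lifetime = {tau_lab_us:.1f} microseconds, v/c = {beta:.6f}")
```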

Another striking instance of time dilation is provided by the 'relativistic focusing' of beams of charged particles emerging from high-energy particle accelerators: they are surprisingly coherent. If we regard a stationary cloud of such particles, expanding under its own electrostatic repulsion, as a kind of clock, then the slow spread of beams at high velocity is an almost visible manifestation of the slowing down of a series of such clocks moving at high speed.

Transformation of velocity and acceleration

It is often of interest to 'translate' the velocity or acceleration of a moving particle (or even of just a moving geometric point) from one inertial frame to another. For example, if inside a car moving with velocity \(\mathbf{v}\) a particle has velocity \(\mathbf{u'}\ ,\) what is its velocity relative to the outside? In SR it is no longer just \(\mathbf{u'+v}\ !\) Again, if a particle moves in a circle of radius \(r\) with uniform speed \(u\) and thus with acceleration \(a=u^2/r\) towards the center, what acceleration does it feel in its rest frame -- the IF with which it is momentarily co-moving? In SR this will actually be \(\gamma^2 a\ !\)

Consider a geometric point (so as not to exclude superluminal velocities) moving with velocity

\[\tag{23} \mathbf{u} = (u_1,u_2,u_3) = (dx/dt,dy/dt,dz/dt) \]

in an IF \(S\ ,\) and with corresponding velocity

\[\tag{24} \mathbf{u'} = (u'_1,u'_2,u'_3) = (dx'/dt',dy'/dt',dz'/dt') \]

in the usual second frame \(S'\ .\) If we substitute from Eq. (8) into the right side of (24), and divide each numerator and denominator by \(dt\ ,\) and then compare with (23), we find the desired result:

\[\tag{25} u'_1 = \frac{u_1-v}{1-u_1v/c^2},\;\;\; u'_2 = \frac{u_2}{\gamma(1-u_1v/c^2)},\;\;\; u'_3 = \frac{u_3}{\gamma(1-u_1v/c^2)}. \]

No assumption as to the uniformity of \(\mathbf{u}\) was made, and these formulae apply equally to the instantaneous velocity in a non-uniform motion. Note how they reduce to the classical formulae when either \(v \ll c\ ,\) or \(c \rightarrow \infty\) formally.

To obtain the inverse transformation, we merely need to apply a '\(v\)-reversal' to (25) (see the paragraph before (6)):

\[\tag{26} u_1 = \frac{u'_1+v}{1+u'_1v/c^2},\;\;\; u_2 = \frac{u'_2}{\gamma(1+u'_1v/c^2)},\;\;\; u_3 = \frac{u'_3}{\gamma(1+u'_1v/c^2)}. \]

This last set of equations can be regarded, alternatively, as giving the resultant \(\mathbf{u}\) of first imparting to a particle a velocity \(\mathbf{v} = (v,0,0)\) and then, relative to its new rest-frame, another velocity \(\mathbf{u'}\ .\) They are therefore occasionally referred to as the relativistic velocity addition formulae, and denoted by some notation like \(\mathbf{u = v \oplus u'}\ .\) We then also have \(\mathbf{u' = -v \oplus u}\ .\) But note that \(\mathbf{v \oplus w \neq w \oplus v}\ ,\) except for one-dimensional motion (as experimentation with simple components quickly shows).
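The non-commutativity can be seen in a small numerical experiment. The Python sketch below generalizes Eq. (26) to a first velocity \(\mathbf{v}\) of arbitrary direction by splitting \(\mathbf{u'}\) into parts parallel and perpendicular to \(\mathbf{v}\ ;\) this generalization, the function name `add_velocities` and the sample vectors are assumptions made for the example, with \(c=1\ .\)

```python
import math

def add_velocities(v, u_prime, c=1.0):
    """Relativistic sum v (+) u' for 3-vectors given as tuples.

    The part of u' along v combines as u_1 in Eq. (26); the part across v
    is divided by gamma(v), as u_2 and u_3 are.
    """
    v2 = sum(vi * vi for vi in v)
    if v2 == 0.0:
        return tuple(u_prime)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)
    dot = sum(vi * ui for vi, ui in zip(v, u_prime))
    denom = 1.0 + dot / c**2
    par = tuple(dot / v2 * vi for vi in v)                   # component of u' along v
    perp = tuple(ui - pi for ui, pi in zip(u_prime, par))    # component of u' across v
    return tuple((vi + pi + qi / g) / denom for vi, pi, qi in zip(v, par, perp))

v = (0.5, 0.0, 0.0)
w = (0.0, 0.5, 0.0)

vw = add_velocities(v, w)
wv = add_velocities(w, v)

print(vw)   # first component 0.5, second component reduced by gamma
print(wv)   # the two components are interchanged
print(add_velocities(v, v))   # collinear case: the order does not matter
```

The two resultants have the same magnitude but different directions, whereas for collinear velocities the order is immaterial.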

We can now re-interpret our earlier Eq. (11), in which, necessarily, the \(v\) of the implicit LT is subluminal. It shows that \(u'< c\) implies \(u < c\ ,\) whence the result: the relativistic sum of any two velocities less than \(c\) is itself less than \(c\ .\) So, no matter how many velocity increments less than \(c\) a particle receives in its successive rest frames (that is, the sequence of inertial frames in which the particle is momentarily at rest), it can never attain the velocity of light. The 'hyperbolic motion' discussed after (19) is a case in point: the particle receives equal velocity increments at equal proper time increments, for ever!
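The approach to the limiting speed under repeated increments can be illustrated with the one-dimensional case of Eq. (26). In the Python sketch below, each increment is assumed to be \(0.5c\) in the particle's current rest frame; the function names are illustrative and \(c=1\ .\)

```python
def boost_once(u, dv, c=1.0):
    """One-dimensional case of Eq. (26): add an increment dv, given in the
    current rest frame, to a particle already moving at u."""
    return (u + dv) / (1.0 + u * dv / c**2)

def after_increments(n, dv, c=1.0):
    """Speed reached after n equal increments dv, starting from rest."""
    u = 0.0
    for _ in range(n):
        u = boost_once(u, dv, c)
    return u

for n in (1, 2, 5, 10):
    print(n, after_increments(n, 0.5))
```

For this choice of increment the speed after \(n\) steps is \((3^n - 1)/(3^n + 1)\ ,\) which increases with \(n\) but always stays below 1.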

The velocity transformation allows us to get the basic relativistic aberration formula, telling us how the elevation of an incoming ray, say from a star, varies as we move towards it across the line of sight. Consider an incoming ray whose negative direction makes angles \(\theta\) and \(\theta'\ ,\) respectively, with the \(x\) axes of the usual two frames \(S\) and \(S'\ .\) For the velocity of this ray we have

\[\tag{27} u_1 = -c \cos \theta,\;\;\; u_2 = -c \sin \theta,\;\;\; u'_1 = -c \cos \theta',\;\;\; u'_2 = -c \sin \theta'. \]

It turns out that the most useful aberration formula is in terms of \(\tan \frac{1}{2}\theta\ .\) Using the trigonometric identity

\[\tag{28} \tan \frac{1}{2}\theta' = \frac{\sin \theta'}{1+\cos \theta'}, \]

substituting from (27), and applying the transformation (25), we find, after some manipulation, the desired result:

\[\tag{29} \tan \frac{1}{2}\theta'= \sqrt{ \frac{c-v}{c+v} } \tan \frac{1}{2}\theta. \]

For outgoing rays we replace the negatives in (27) by positives, which has the effect of inverting the multiplier in (29). Thus, for outgoing rays \(\theta\) is smaller than \(\theta'\ ,\) and significantly so when \(v \approx c\ .\) A well-known consequence of this is the so-called headlight effect: a radiating and fast-moving source throws almost all of its radiation into a narrow forward cone. This effect is very pronounced, for example, in the synchrotron radiation emitted by highly accelerated charged particles in circular orbit. It should be noted that, unlike in kinematics and mechanics, 'classical' formulae in optics do not result from 'relativistic' formulae by letting \(c\) tend to infinity: classical optics already has \(c\) in its formulae! This remark applies also to our earlier Eq. (18).
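The headlight effect can be quantified with the outgoing-ray version of Eq. (29). The Python sketch below assumes that the source is at rest in \(S'\) and asks into what cone in \(S\) the forward hemisphere of \(S'\) (\(\theta' \le 90^\circ\)) is mapped; the function name and the sample speeds are illustrative.

```python
import math

def outgoing_angle_in_S(theta_prime, v, c=1.0):
    """Outgoing-ray form of Eq. (29), solved for the angle theta measured in S."""
    k = math.sqrt((c - v) / (c + v))
    return 2.0 * math.atan(k * math.tan(0.5 * theta_prime))

for v in (0.0, 0.6, 0.99):
    half_angle = math.degrees(outgoing_angle_in_S(math.pi / 2, v))
    print(f"v = {v:4.2f}: rays with theta' <= 90 deg fill a cone of half-angle {half_angle:5.1f} deg in S")
```

A ray emitted at right angles in \(S'\) has \(\cos\theta = v/c\) in \(S\ ,\) so the cone narrows rapidly as \(v\) approaches \(c\ .\)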

There is one useful velocity concept which is the same in SR and Newtonian kinematics. This is the rate of change, in one given inertial frame \(S\ ,\) of the connecting vector between two particles whose position vectors and velocities are, let us say, \(\mathbf{r_1}, \mathbf{u_1}\) and \(\mathbf{r_2}, \mathbf{u_2}\ ,\) respectively:

\[\tag{30} \frac{d}{dt}(\mathbf{r_2}-\mathbf{r_1}) = \mathbf{u_2} - \mathbf{u_1}. \]

We call this, for lack of a better name, the mutual velocity between the particles in \(S\ ,\) to distinguish it from the relative velocity, which is what one particle ascribes to the other. It can be as big as \(2c\ ,\) as when two photons collide head-on. If I wish to know how much time will elapse in my own inertial frame before two uniformly converging particles collide, I simply divide their present distance apart by their mutual velocity, in relativistic as in Newtonian kinematics.
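The distinction between mutual and relative velocity is illustrated by the following Python sketch for two particles approaching each other along the \(x\) axis. The speeds of \(\pm 0.8c\ ,\) the separation of 8 light-seconds and the function names are assumptions made for the example, with \(c=1\ .\)

```python
def time_to_collision(separation, u1, u2):
    """Coordinate time in S until two uniformly moving particles on the x axis meet.

    u1 is the velocity of the particle on the left, u2 that of the particle on
    the right; the mutual velocity of Eq. (30) is u2 - u1.
    """
    closing_speed = u1 - u2
    if closing_speed <= 0:
        raise ValueError("particles are not converging")
    return separation / closing_speed

def relative_velocity(u1, u2, c=1.0):
    """Velocity of particle 2 in the rest frame of particle 1, from Eq. (25)(i)."""
    return (u2 - u1) / (1.0 - u1 * u2 / c**2)

u1, u2 = 0.8, -0.8       # assumed velocities in S, in units of c
separation = 8.0         # assumed distance apart in S, in light-seconds

print(u2 - u1)                                # mutual velocity, magnitude 1.6
print(time_to_collision(separation, u1, u2))  # seconds of S time until collision
print(relative_velocity(u1, u2))              # magnitude below 1
```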

Lastly, we turn to the transformation of the acceleration. To this end, we begin by calculating the velocity differentials from (25). Denoting the first denominator by \(D\ ,\)

\[\tag{31} D = 1 - u_1v/c^2, \]

we find

\[\tag{32} \begin{array}{rcl} du'_1 &=& [Ddu_1 + (u_1 - v)v\,du_1 / c^2]/D^2 = du_1/\gamma^2 D^2 \\\\ du'_2 &=& [\gamma D du_2 + u_2\gamma du_1v / c^2]/\gamma^2D^2, \end{array} \]

and similarly for \(du'_3\ .\) Dividing each of these equations by

\[\tag{33} dt' = \gamma dt(1-u_1v/c^2) = \gamma Ddt, \]

then yields the following transformation for the acceleration components \(a'_1 = du'_1/dt'\ ,\) etc. :

\[\tag{34} a'_1 = a_1/\gamma^3D^3,\;\;\; a'_2 = \frac{a_2}{\gamma^2D^2}+\frac{a_1u_2v}{c^2\gamma^2D^3}, \]

and similarly for \(a'_3\ .\) In SR the acceleration is not invariant!

We shall consider two specific examples, namely finding the proper acceleration corresponding to linear and circular lab motion. The proper acceleration of a particle is that which is measured in the IF momentarily co-moving with the accelerating particle. Say a particle moves along the \(x\) axis of the usual frame \(S\) with instantaneous velocity \(u\) and acceleration \(a=a_1\ .\) Let \(S'\) be its rest frame, moving with uniform velocity \(v=u\ .\) Then \(u_1 = v\ ,\) \(D = \gamma^{-2}\) and the proper acceleration \(\alpha\) is \(a'_1\ .\) So, by (34)(i),

\[\tag{35} \alpha = \gamma^3(u)a, \]

where we have written \(\gamma(u)\) for the \(\gamma\) factor of \(u\ .\) Proper acceleration is precisely the thrust we feel when sitting in an accelerating rocket.

Next, suppose in the lab frame \(S\) a particle moves around a circle of radius \(r\) in the \(x,y\) plane, anti-clockwise and with speed \(u\ .\) Its lab acceleration \(a\) is \(u^2/r\) towards the center. At the bottom of its motion its rest frame is the usual frame \(S'\) moving uniformly with velocity \(v=u\ .\) At that point, \(u_1 = u = v\ ,\) \(a_1 = a_3 = 0\ ,\) \(a_2 = u^2 /r\ ,\) and the proper acceleration \(\alpha = a'_2\ .\) So, by (34)(ii),

\[\tag{36} \alpha = \gamma^2(u)a. \]
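Both results can be recovered numerically from the general transformation (34). The Python sketch below is restricted to motion in the \(x,y\) plane and uses \(c=1\ ;\) the function name `transform_acceleration` and the sample speed \(u = 0.6\) are illustrative assumptions.

```python
import math

def transform_acceleration(a1, a2, u1, u2, v, c=1.0):
    """Eq. (34): acceleration components (a'_1, a'_2) in S' of a particle that
    has velocity (u1, u2) and acceleration (a1, a2) in S."""
    g = 1.0 / math.sqrt(1.0 - v * v / c**2)
    D = 1.0 - u1 * v / c**2
    a1_p = a1 / (g**3 * D**3)
    a2_p = a2 / (g**2 * D**2) + a1 * u2 * v / (c**2 * g**2 * D**3)
    return a1_p, a2_p

u = 0.6
g_u = 1.0 / math.sqrt(1.0 - u * u)

# Linear motion along x: rest frame has v = u, lab acceleration a_1 = 1.
alpha_linear, _ = transform_acceleration(1.0, 0.0, u, 0.0, u)

# Circular motion, at the point where the velocity is along x: a_1 = 0, a_2 = 1.
_, alpha_circular = transform_acceleration(0.0, 1.0, u, 0.0, u)

print(alpha_linear, g_u**3)     # Eq. (35)
print(alpha_circular, g_u**2)   # Eq. (36)
```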


All the figures in this article are taken from the author's book "Relativity: Special, General, and Cosmological" (2nd ed., 2006) Oxford University Press, by kind permission of the publishers.

References

  • Einstein, A. (1905) Annalen der Physik, 17, 891
  • Ignatowski, W. v. (1910) Phys. Zeits., 11, 972
  • Rindler, W. (1966) Special Relativity (2nd. ed.), Oliver and Boyd
  • Rindler, W. (1979) Essential Relativity (2nd. ed.), p. 51, Springer-Verlag
  • Rindler, W. (1991) Introduction to Special Relativity (2nd. ed.), Oxford U. P.
  • Rindler, W. (2006) Relativity: Special, General, and Cosmological (2nd. ed.), Oxford U. P.

Further reading

  • "Spacetime Physics", E. F. Taylor and J. A. Wheeler (W.H.Freeman, 1992)
  • "Special Relativity", A.P. French (Norton, 1968)
  • "Special Theory of Relativity", C. W. Kilmister (Pergamon Press, 1970)
  • "Special Relativity", W. G. Dixon (Cambridge University Press, 1978)
  • "Relativity: The Special Theory", J. L. Synge (North Holland, 1956)

See also

Special relativity: mechanics, Special relativity: electromagnetism
