# Principle of least action

Curator: Chris G. Gray

The principle of least action is the basic variational principle of particle and continuum systems. In Hamilton's formulation, a true dynamical trajectory of a system between an initial and final configuration in a specified time is found by imagining all possible trajectories that the system could conceivably take, computing the action (a functional of the trajectory) for each of these trajectories, and selecting one that makes the action locally stationary (traditionally called "least"). True trajectories are those for which the action is stationary in this sense.

## Statements of Hamilton and Maupertuis Principles

There are two major versions of the action, due to Hamilton and Maupertuis, and two corresponding action principles. The Hamilton principle is nowadays the most used. The Hamilton action $$S$$ is defined as an integral along any actual or virtual (conceivable or trial) space-time trajectory $$q(t)$$ connecting two specified space-time events, initial event $$A \equiv(q_A,t_A=0)$$ and final event $$B \equiv (q_B,t_B=T)\ ,$$ $\tag{1} S\; =\; \int _{0}^{T}L\, \left(q\; ,\; \dot{q}\right) \; d t\quad ,$

where $$L\, \left(q\; ,\; \dot{q}\right)$$ is the Lagrangian, and $$\dot{q}\; =\; dq/d t\ .$$ In the integrand of (1), the Lagrangian function becomes time-dependent when $$q$$ assumes the values describing a particular trajectory $$q(t)$$. For most of what follows we will assume the simplest case where $$L = K - V\ ,$$ where $$K$$ and $$V$$ are the kinetic and potential energies, respectively; see Section 4 for discussion of the freedom of choice for $$L$$, and the relativistic sections 9 and 11 for cases for which $$L$$ is not equal to $$K - V$$. In general, $$q$$ stands for the complete set of independent generalized coordinates, $$q_1, q_2, \ldots, q_f\ ,$$ where $$f$$ is the number of degrees of freedom (see Section 4). Hamilton's principle states that among all conceivable trajectories $$q(t)$$ that could connect the given end points $$q_A$$ and $$q_B$$ in the given time $$T\ ,$$ the true trajectories are those that make $$S$$ stationary. As we shall see in Section 5, if the trajectory is sufficiently short, the action $$S$$ is a local minimum for a true trajectory, i.e., "least". In general, for long trajectories $$S$$ is a saddle point for a true trajectory (and is never a maximum). In Hamilton's principle the conceivable or trial trajectories are not constrained to satisfy energy conservation, unlike the case for Maupertuis' principle discussed later in this section (see also Section 7). Energy conservation results as a consequence of the Hamilton principle for time-invariant systems (Section 12), for which the Lagrangian $$L(q,\dot{q})$$ does not depend on $$t$$ explicitly, but only implicitly when $$q$$ takes values $$q(t)$$ describing a trajectory. More than one true trajectory may satisfy the given constraints of fixed end-positions and travel time (see Section 3). To emphasize a particular constraint on the varied trajectories, we write Hamilton's principle as $\tag{2} \left(\delta S\right)_{T} \; =\; 0\quad ,$

where the constraint of fixed travel time $$T$$ is written explicitly, and the constraint of fixed end-positions $$q_A$$ and $$q_B$$ is left implicit. We will consider other variational principles below, but all will have fixed $$q_A$$ and $$q_B$$ (quantities other than $$T$$ will also be constrained) so we will always leave the constraint of fixed $$q_A$$ and $$q_B$$ implicit. (In Section 7 we mention generalized action principles with relaxed end-position constraints.) Some smoothness restrictions are also often imposed on the trial trajectories. It is clear from (1) that $$S$$ is a functional of the trial trajectory $$q(t)$$, often denoted $$S[q(t)]$$, and in (2) $$\delta S$$ denotes the first-order variation in $$S$$ corresponding to the small variation $$\delta q(t)$$ in the trial trajectory: i.e., $$S[q(t) + \delta q(t)] - S[q(t)] = \delta S + \delta^2 S + \ldots$$, where $$\delta S$$ is first-order in $$\delta q(t)$$, $$\delta^2 S$$ is second-order in $$\delta q(t)$$, etc. Explicit expressions for these variations of $$S$$ in terms of $$\delta q(t)$$ are not needed here, but are given in calculus of variations and advanced mechanics texts, and in Gray and Taylor (2007), for example. The Hamilton principle means that the first-order variation of the action $$\delta S$$ vanishes for any small trajectory variation $$\delta q(t)$$ around a true trajectory consistent with the given constraints. The quantities $$\delta S$$ and $$\delta^2 S$$ are usually referred to simply as the first and second variations of $$S$$, respectively. The action $$S$$ is stationary for a true trajectory (first variation vanishes for all $$\delta q(t)$$), and whether $$S$$ is a minimum depends on whether the second variation is positive definite for all $$\delta q(t)$$ (see Section 5).
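The vanishing of the first variation on a true trajectory can be checked numerically. The following is a minimal sketch, assuming a harmonic oscillator with $$L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2$$, unit mass and force constant, a uniform time grid, and a sine-shaped variation that vanishes at both end points. The function name `hamilton_action`, the grid size and the parameter values are illustrative choices and do not come from the references cited here.

```python
import numpy as np

def hamilton_action(x, dt, m=1.0, k=1.0):
    """Discretised Hamilton action, sum of (K - V) dt, for a 1D harmonic oscillator."""
    v = np.diff(x) / dt                 # velocity on each time step
    x_mid = 0.5 * (x[1:] + x[:-1])      # midpoint position on each time step
    return float(np.sum(0.5 * m * v**2 - 0.5 * k * x_mid**2) * dt)

m, k, T = 1.0, 1.0, 1.0                 # assumed illustrative values
omega = np.sqrt(k / m)
t = np.linspace(0.0, T, 2001)
dt = t[1] - t[0]

x_true = np.sin(omega * t)              # true path with x_A = 0, x_B = sin(omega*T)
eta = np.sin(np.pi * t / T)             # trial variation, zero at both end points

S_true = hamilton_action(x_true, dt, m, k)
change = {eps: hamilton_action(x_true + eps * eta, dt, m, k) - S_true
          for eps in (1e-2, 1e-3)}

# If the first variation vanishes, the change in S scales as eps**2,
# so shrinking eps by a factor of 10 should shrink the change by about 100.
ratio = change[1e-2] / change[1e-3]
print(f"S(true) = {S_true:.6f}, ratio of action changes = {ratio:.3f}")
```

A ratio close to 100 indicates that the term linear in the variation is absent, so that the change in $$S$$ is dominated by the second variation. For this short trajectory the change is also positive, consistent with a local minimum.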

The second major version of the action is Maupertuis' action $$W\ ,$$ where $\tag{3} W\; =\; \int _{q_{A} }^{q_{B} }pdq\; =\; \int _{0}^{T}2\, K\, d t\quad ,$

where the first (time-independent) form is the general definition, with $$p\; =\; \partial L/\partial \dot{q}$$ the canonical momentum (equal to the ordinary momentum in many cases of interest), and $$pdq$$ stands for $$p_1dq_1 + p_2dq_2 + \ldots + p_fdq_f$$ in general. The second (time-dependent) form for $$W$$ in (3) is valid for normal systems in which the kinetic energy $$K$$ is quadratic in the velocity components $$\dot{q}_{1} \; ,\; \dot{q}_{2} \; ,\; \cdots \; ,\; \dot{q}_{f} \ .$$ The Maupertuis principle states that for true trajectories $$W$$ is stationary on trial trajectories with fixed end positions $$q_A$$ and $$q_B$$ and fixed energy $$E = K+V\ .$$ Following our earlier conventions, we write this principle as $\tag{4} \left(\delta W\right)_{E} \; =\; 0\quad .$

Note that $$E$$ is fixed but $$T$$ is not in Maupertuis' principle (4), the reverse of the conditions in Hamilton's principle (2).
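The equality of the two forms of $$W$$ in (3) for normal systems follows in one step. As a sketch, assume the kinetic energy is written as $$K = \frac{1}{2}\sum_{\alpha\beta} a_{\alpha\beta}(q)\,\dot{q}_{\alpha}\dot{q}_{\beta}$$ with a symmetric coefficient matrix $$a_{\alpha\beta}$$ (a symbol introduced here for illustration only), and that $$V$$ does not depend on the velocities. Then

$$p_{\alpha} \; = \; \frac{\partial L}{\partial \dot{q}_{\alpha}} \; = \; \sum_{\beta} a_{\alpha\beta}\,\dot{q}_{\beta}\ , \qquad \sum_{\alpha} p_{\alpha}\, dq_{\alpha} \; = \; \sum_{\alpha} p_{\alpha}\,\dot{q}_{\alpha}\, dt \; = \; 2K\, dt\ ,$$

which converts the time-independent form $$\int p\,dq$$ into the time-dependent form $$\int 2K\,dt$$ along any trajectory $$q(t)$$.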

Solution of the variational problem posed by Hamilton's principle (2) yields the true trajectories $$q(t)\ .$$ Solution of Maupertuis' variational equation (4) using the time-dependent (second) form of $$W$$ in (3) also yields the true trajectories, whereas using the time-independent (first) form of $$W$$ in (3) yields (in multidimensions) true orbits, i.e. spatial shape of the true paths. In the latter case, in two and more dimensions, the action $$W$$ can be rewritten as an integral involving the arc length along the orbit (Jacobi's form), and the problem then resembles a geodesic or reciprocal isoperimetric problem (Lanczos 1970). In one dimension, for simplicity assume $$q$$ is an inertial Cartesian coordinate and note that since the momentum $$p$$ is a simple function of $$q$$ in the first form for $$W$$ due to the constraint of fixed energy $$E = p^2/2m + V(q)$$, the only freedom for variation is instantaneous momentum reversals, so that the principle is essentially uninformative, i.e., provides essentially no information beyond what is contained in the assumed conservation of energy. In one dimension, one can find the one or more true trajectories $$q(t)$$ from energy conservation and the two end-position values. (The generalization of Maupertuis' principle discussed in Section 7 does not have the defect of being uninformative for one-dimensional systems, and energy conservation is not assumed but derived from the generalized principle for all time-invariant systems, just as it is for Hamilton's principle.) In all cases the solutions for the true trajectories and orbits can be obtained directly from the Hamilton and Maupertuis variational principles (see Section 8), or from the solution of the corresponding Euler-Lagrange differential equations (see Section 3) which are equivalent to the variational principles.

Hamilton's principle is applicable to both conservative systems and nonconservative systems where the Lagrangian $$L\, \left(q\; ,\; \dot{q}\; ,\; t\right)$$ is explicitly time-dependent (e.g. due to a time-dependent potential $$V(q,t)$$), whereas the form (4) of Maupertuis' principle is restricted to conservative systems (it can be generalized – see Gray et al. 2004). Systems with velocity-dependent forces require special treatment. Dissipative nonconservative systems are discussed in Section 4. Magnetic and relativistic systems are discussed by Jackson (1999), and in Section 9 below. For conservative systems the two principles (2) and (4) are related by a Legendre transformation, as discussed in Section 6.

An appealing feature of the action principles is their brevity and elegance in expressing the laws of motion. They are valid for any choice of coordinates (i.e., they are covariant), and readily yield conservation laws from symmetries of the system (Section 12). They generate covariant equations of motion (Section 3), but they also supply an alternative and direct route to finding true trajectories which bypasses equations of motion; this route can be implemented analytically as an approximation scheme (Section 8), or numerically to give essentially exact trajectories (Beck et al. 1989, Basile and Gray 1992, Marsden and West 2001). Action principles transcend classical particle and rigid body mechanics and extend naturally to other branches of physics such as continuum mechanics (Section 11), relativistic mechanics (Section 9), quantum mechanics (Section 10), and field theory (Section 11), and thus play a unifying role. Unifying the various laws of physics with the help of action principles has been an ongoing activity for centuries, not always successful, e.g., successful for particle mechanics and geometric optics by Maupertuis and Hamilton (Yourgrau and Mandelstam 1968), and not completely successful for particle mechanics and thermodynamics by Helmholtz, Boltzmann and others (Gray et al. 2004), and continues to this day with some of the modern quantum field theories. The action principles have occasionally assisted in developing new laws of physics (see comments at the end of Section 9). The Hamilton and Maupertuis principles are not applicable, however, if the system is nonholonomic, and usually not if the system is dissipative (Section 4).

## History

Various aspects of the extensive history of action principles and variational principles in general are discussed in the historical references at the end of this article.

Maupertuis' principle is older than Hamilton's principle by about a century (1744 vs 1834). The original formulation of Maupertuis was vague and it is the reformulation due to Euler and Lagrange that is described above. Maupertuis' motivation was to rationalize the laws of both ray optics (Fermat's principle of least time (1662)) and mechanics with a metaphysical teleological argument (using design or purpose to explain natural phenomena) that "nature acts as simply as possible". Today we recognize that a principle of least action is partly conventional (changing the sign in the definition of the action leaves the equations of motion intact but changes the principle to one of greatest action), that in general action is least only for sufficiently short trajectories (Section 5), that the principle is not valid for all force laws (Section 4), and, when valid, it is a mathematical consequence of the equations of motion. No a priori physical argument requires a principle of least or stationary action in classical mechanics, but the classical principle is a consequence of quantum mechanical variational principles in the classical limit (Section 10). Quantum mechanics itself can be based on postulates of action principles (Section 10), and in new fields one often simply postulates an action principle. At the end of Section 9 we briefly discuss the history of the role of action principles in establishing new laws of physics, and at the end of the preceding section we mention the long history of using action principles in attempting to unify the various laws of physics.

As with Maupertuis, unifying the treatments of geometric optics and mechanics motivated Hamilton. He used both actions, $$W$$ and $$S$$, to find paths of rays in optics and paths of particles in mechanics. Hamilton introduced the action $$S$$ and its variational principle described above, and an extension he called the law of varying action, which is closely related to a generalization of Hamilton's principle which we call the unconstrained Hamilton principle in Section 7. Just as the true paths satisfy the Euler-Lagrange differential equation discussed in the next section, from his law of varying action Hamilton showed that the action $$S$$ for true paths, when considered as a function of the final end-point variables $$q_B$$ and $$T$$, satisfies a partial differential equation, nowadays called the time-dependent Hamilton-Jacobi equation. He found the corresponding time-independent Hamilton-Jacobi equation for action $$W$$ for true paths, and the optical analogues of both Hamilton-Jacobi equations. He also reformulated the second-order Euler-Lagrange equation of motion for coordinate $$q(t)$$ as a pair of first-order differential equations for coordinate $$q(t)$$ and momentum $$p(t)$$, with the Hamiltonian $$H(q,p)$$ replacing the Lagrangian $$L(q,\dot{q})$$ (via equation (6) below), giving what are called the Hamilton or canonical equations of motion, i.e., $$\dot{q} = \partial H/\partial p$$ and $$\dot{p} = -\partial H/\partial q$$. These form the basis of modern Hamiltonian mechanics (Goldstein et al. 2002), with its large array of useful concepts and techniques such as canonical transformations, action-angle variables, integrable vs nonintegrable systems (Section 10), Poisson brackets, canonical perturbation theory, canonical or symplectic invariants, and flow in phase space and Liouville's theorem. (In general there is one pair of Hamilton canonical equations for each pair of canonical variables $$q_{\alpha},p_{\alpha}$$, with $$\alpha = 1,2,...,f$$. 
The set of canonical variables $$q_\alpha,p_\alpha$$ defines the $$2f$$-dimensional phase space of the system. Because of the uniqueness of the solution of the first-order Hamilton equations for the trajectory starting from any point $$(q_{\alpha}(0), p_{\alpha}(0))$$, a set of trajectories in phase space $$(q_{\alpha}(t), p_{\alpha}(t))$$ flows without crossing, thus behaving like the flow lines of an incompressible fluid, which is a simple version of Liouville's theorem.) Using (6) we can express the Lagrangian in terms of $$q$$ and $$p$$ and the Hamiltonian, instead of $$q$$ and $$\dot{q}$$, and the Hamilton action principle then yields directly the Hamilton equations of motion as its Euler-Lagrange equations (Goldstein et al. 2002). This last result was given implicitly by Hamilton in his papers and somewhat more explicitly by Jacobi in his lectures (Clebsch 1866). To distinguish this form of the Hamilton principle from the usual one involving the Lagrangian $$L(q,\dot{q})$$, it is sometimes called the phase space form of Hamilton's principle. The various versions of quantum mechanics all developed from corresponding classical mechanics results of Hamilton, i.e., wave mechanics from the Hamilton-Jacobi equation (Schrödinger), matrix and operator mechanics from the Hamilton equations of motion and Poisson brackets (Heisenberg and Dirac), and the path integral from Hamilton's action (Dirac and Feynman).

Over the years, and even recently, a number of reformulations and generalizations of the basic Maupertuis and Hamilton action principles have been given (see Gray et al. 1996a, 2004 for extensive discussions and references). In Section 7 we discuss several of the most recent generalizations.

## Euler-Lagrange Equations

Using standard calculus of variations techniques one can carry out the first-order variation of the action, set the result to zero as in (2) or (4), and thereby derive differential equations for the true trajectory, called the Euler-Lagrange equations, which are equivalent to the variational principles. For Hamilton's principle, the corresponding Euler-Lagrange equation of motion (often called simply Lagrange's equation) is (see, e.g., Brizard 2008, Goldstein et al. 2002) $\tag{5} \frac{d}{d t} \; \left(\frac{\partial L}{\partial \dot{q}_{\alpha}} \right)\; -\; \frac{\partial L}{\partial q_{\alpha}} \; =\; 0\quad ,$

where $$\alpha = 1,2,...,f\ .$$ As with the action principles, eqs.(5) are covariant (i.e. valid for any choice of the coordinates $$q_\alpha$$), and can be written out explicitly as coupled second-order differential equations for the $$q_\alpha$$'s. For particle systems these equations reduce to the standard Newton equations of motion if one chooses Cartesian coordinates in an inertial frame. The time-dependent version of Maupertuis' principle yields the same equation of motion for the space-time trajectories $$q(t)\ .$$ The time-independent version of Maupertuis' principle yields (Lanczos 1970, Landau and Lifshitz 1969) corresponding differential equations for the true spatial paths (orbits).

As a simple example, consider the Hamilton principle for the one-dimensional harmonic oscillator with the usual inertial frame Cartesian coordinate $$x$$. The Lagrangian is $$L \; = \; K \;- \; V \; = \; (1/2)m \dot{x}^2 \, - \; (1/2)k x^2 \ ,$$ where $$m$$ is the mass and $$k$$ is the force constant. The partial derivatives of $$L$$ are $$\partial L/ \partial \dot{x} \; = \; m \dot{x}$$ and $$\partial L/ \partial x \, = \; -kx$$ so that the Euler-Lagrange equation (5) gives $$m \ddot{x} \; + \; kx \; = \; 0 \ ,$$ which is Newton's equation of motion for this system. The well known general solution is $$x(t) \; = \; C_1 \sin \omega t \; + \; C_2 \cos \omega t \ ,$$ where $$\omega \; = \; (k/m)^{1/2}$$ is the angular frequency. The constants $$C_1$$ and $$C_2$$ are chosen to satisfy the constraints $$x \; = \; x_A$$ at $$t \; = \; 0$$ and $$x \; = \; x_B$$ at $$t \; = \; T$$ (see next paragraph).

Strictly speaking, because the action principles are formulated as boundary value problems ($$q$$ is specified at two points $$q_A$$ and $$q_B$$) and not as initial value problems ($$q$$ and $$\dot{q}$$ are specified at one point $$q_A$$), there may be more than one solution: there can in fact be zero, one, two, ..., up to an infinite number of solutions in particular problems. For example, applying the Hamilton principle to the one dimensional harmonic oscillator with coordinate $$x$$ (see preceding paragraph) and specifying $$x = 0$$ at $$t = 0$$ and $$x = 0$$ at $$t = T$$ (one period $$2\pi/\omega$$) gives an infinite number of solutions, i.e. $$x(t) = A \sin \omega t$$ with one solution for each value of the amplitude $$A$$, which is arbitrary. The same system with the constraints $$x = 0$$ at $$t = 0$$ and $$x = A$$ at $$t = T/4$$ has the unique solution $$x(t) = A \sin \omega t\ ,$$ and for the constraints $$x = 0$$ at $$t = 0$$ and $$x = C$$ at $$t = T/2$$ no solution exists for nonzero $$C$$. In practice, one usually has initial conditions in mind, where the solution is unique, and selects the appropriate solution of the corresponding boundary value problem, or imposes the initial conditions directly on the solution of the Euler-Lagrange equation of motion. Thus for the harmonic oscillator example with specified initial conditions, say $$x=0$$ and $$\dot{x}=v_0$$ at $$t=0$$, we simply choose $$C_1 = v_0/\omega$$ and $$C_2=0$$ in the general solution given in the paragraph above.
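The three outcomes described above (infinitely many, exactly one, or no solutions) follow from solving the two boundary conditions for $$C_1$$ and $$C_2$$. The sketch below is one way to make that explicit. The function name `oscillator_bvp`, the argument name `t_B` for the final time and the tolerance `tol` are assumptions made for illustration. The boundary conditions give $$C_2 = x_A$$ and $$C_1 \sin \omega t_B = x_B - x_A \cos \omega t_B$$, so the classification depends on whether $$\sin \omega t_B$$ vanishes.

```python
import numpy as np

def oscillator_bvp(x_A, x_B, t_B, omega, tol=1e-12):
    """Classify solutions x(t) = C1 sin(wt) + C2 cos(wt) with x(0) = x_A, x(t_B) = x_B.

    Returns (kind, constants) where kind is "unique", "infinite" or "none".
    """
    C2 = x_A                                   # from the condition at t = 0
    s, c = np.sin(omega * t_B), np.cos(omega * t_B)
    if abs(s) > tol:                           # generic case: C1 is determined
        return "unique", ((x_B - x_A * c) / s, C2)
    if abs(x_B - x_A * c) <= tol:              # condition at t_B holds for any C1
        return "infinite", (None, C2)
    return "none", None                        # condition at t_B cannot be met

omega = 2.0 * np.pi                            # assumed value: period equal to 1
period = 2.0 * np.pi / omega
for x_B, t_B in ((0.0, period), (2.0, period / 4), (2.0, period / 2)):
    kind, constants = oscillator_bvp(0.0, x_B, t_B, omega)
    print(f"x_B = {x_B}, t_B = {t_B:.2f}: {kind} {constants}")
```

The tolerance is needed because $$\sin \omega t_B$$ is evaluated in floating-point arithmetic and is only approximately zero at a half or full period.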

Another system exhibiting multiple solutions under space-time boundary condition constraints is the quartic oscillator, discussed in Sections 5 and 8. In Fig.1 note that two true trajectories (labelled $$1$$ and $$0$$) are shown connecting the initial space-time event P at the origin and the final space-time event denoted by a square symbol where the two trajectories intersect. The true trajectories $$1$$ and $$0$$ shown on the figure are the two of lowest energies having P and the square symbol as space-time end-events. Additional true trajectories with higher energies also satisfy the boundary conditions.

As an example giving multiple solutions with Maupertuis principle constraints (specified initial and final positions, and specified energy), consider throwing a ball in a uniform gravitational field from a specified position P and with specified energy $$E$$ (which corresponds to a specific initial speed). Ignore air friction. If we throw the ball twice, in the same vertical plane, with two different angles of elevation of the initial velocity, say one with 45 degrees and the other with 75 degrees, but the same initial speed, the two parabolic spatial paths will recross at some point in the plane, call it R. Thus specifying P and R and $$E$$ does not in general determine a unique true trajectory, as we have found two true trajectories here with the same values of P and R and $$E$$.
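The recrossing point R in this example can be computed explicitly. The sketch below assumes a launch from the origin, the standard parabolic path $$y = x\tan\theta - g x^2/(2 v_0^2 \cos^2\theta)$$, and illustrative values for the initial speed and $$g$$; the function names are hypothetical. Equating the heights of the two paths gives the horizontal position of R as $$x_R = 2v_0^2/[g(\tan\theta_1 + \tan\theta_2)]$$.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def height(x, v0, theta, g=G):
    """Height of the parabolic path at horizontal distance x from the launch point P."""
    return x * np.tan(theta) - g * x**2 / (2.0 * v0**2 * np.cos(theta)**2)

def recrossing_point(v0, theta1, theta2, g=G):
    """Point R (other than P) shared by two paths with equal speed, different elevation."""
    x = 2.0 * v0**2 / (g * (np.tan(theta1) + np.tan(theta2)))
    return x, height(x, v0, theta1, g)

def travel_time(x, v0, theta):
    """Time to reach horizontal distance x (the horizontal velocity is constant)."""
    return x / (v0 * np.cos(theta))

v0 = 10.0                                            # same initial speed, hence same E
theta1, theta2 = np.radians(45.0), np.radians(75.0)
x_R, y_R = recrossing_point(v0, theta1, theta2)
T1 = travel_time(x_R, v0, theta1)
T2 = travel_time(x_R, v0, theta2)
print(f"R = ({x_R:.3f}, {y_R:.3f}) m, travel times {T1:.3f} s and {T2:.3f} s")
```

Both paths pass through the same P and R with the same energy, but the travel times differ, which is the freedom left open by the Maupertuis constraints.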

We see in these examples of multiple solutions the roles of the differing constraints in the Hamilton and Maupertuis principles. In the Hamilton principle examples the multiple solutions have the same prescribed travel time $$T$$, and differ in energy $$E$$ which is not prescribed. In the Maupertuis principle example, the opposite is true: the multiple solutions have the same prescribed $$E$$, and differ in $$T$$ which is not prescribed. The complementarity between prescribed $$T$$ and prescribed $$E$$ is discussed in Section 6.

## Restrictions to Holonomic and Nondissipative Systems

The action principles (2) and (4) are restricted to holonomic systems, i.e. systems whose geometrical constraints (if any) involve only the coordinates and not the velocities. Simple examples of holonomic and nonholonomic systems are a particle confined to a spherical surface, and a wheel confined to rolling without slipping on a horizontal plane, respectively. Attempts to extend the usual action principles to nonholonomic systems have been controversial and ultimately unsuccessful (Papastavridis 2002). For such systems Hamilton's principle in its standard form (2) is not valid, but a more general and correct Galerkin-d'Alembert form has been derived. For a holonomic system with $$n$$ coordinates and $$c$$ constraints, the number of independent coordinates (degrees of freedom) is $$f = n-c$$. Thus for the example of the particle confined to a spherical surface we have $$n = 3$$ coordinates, $$c = 1$$ constraint, and hence $$f = 2$$ independent coordinates. These can be chosen as any two of the particle's three Cartesian coordinates with respect to axes with origin at the center of the sphere, or as latitude and longitude coordinates on the sphere surface, etc. One can implement holonomic constraints as in Sections 1 and 3 by using a Lagrangian $$L$$ with any set of $$f$$ independent coordinates $$q$$, or one can treat the $$n$$ coordinates symmetrically by expressing $$L$$ as a function of all of them and using the method of Lagrange multipliers (Lanczos 1970, Morse and Feshbach 1953, Fox 1950) to take account of the constraints. In essence, the Lagrange multipliers relax the constraints, with one multiplier for each constraint relaxed.

In the literature (e.g., Dirac 1964) a second type of velocity-dependent constraint, nongeometric and called "kinematic" in Gray et al. (2004), has been discussed. The usual action principles are valid for this type of velocity-dependent constraint. As simple examples, for conservative systems one could impose the additional constraint of fixed energy $$K(\dot{q}) \; + \; V(q)$$ on the trial trajectories in the Hamilton principle, and the fundamental constraints in the Hamilton and Maupertuis principles involve the velocities. The Dirac-type constraints are implemented by the method of Lagrange multipliers. In Section 7 we use Lagrange multipliers to relax the fundamental constraints of the Hamilton and Maupertuis principles.

In general, the action principles do not apply to dissipative systems, i.e. systems with frictional forces. However, for some dissipative systems, including all one-dimensional ones, Lagrangians have been shown to exist, and Hamilton's principle then applies (see Gray et al. 2004 for a brief review, and Chandrasekhar et al. 2007 for more recent developments).

More generally, the question of whether a Lagrangian and corresponding action principle exist for a particular dynamical system, given the equations of motion and the nature of the forces acting on the system, is referred to as the "inverse problem of the calculus of variations" (Santilli 1978). If a Lagrangian $$L$$ does exist, it will not be unique. For example, it is obvious from (1) and (2), or (5), that $$c_1L$$ and $$L + c_2$$ are equally good Lagrangians for any constants $$c_1$$ and $$c_2$$. It is also clear from (1) and (2) that we can add to $$L$$ a total time-derivative of any function of the coordinates and time (i.e. $$dF/dt$$ for arbitrary $$F(q,t)$$) to obtain another valid Lagrangian; the action integral $$S$$ will only change by the addition of constant boundary value terms $$F(q_B,T) - F(q_A,0)$$, so that the variation of the action will be unchanged. Additional freedom of choice will also often exist. For example, for a free particle in one dimension with $$q$$ the Cartesian coordinate in an inertial frame, it is easy to check that, in addition to $$L = K$$ (the traditional choice), choosing $$L$$ equal to the square of the kinetic energy $$K = \frac{1}{2}m \dot{q}^2$$ also gives the correct equation of motion $$\ddot{q} = 0$$. By putting additional conditions on the Lagrangian we can narrow down the choice. Thus for the free particle in one dimension, using an inertial frame of reference and by requiring the Lagrangian function to be invariant under Galilean transformations, i.e. have the same functional form in all inertial frames, we can rule out all but the kinetic energy, up to the free multiplicative and additive constants $$c$$ discussed above (Landau and Lifshitz 1969).
Requiring that $$S$$ be a minimum for short true trajectories and not a maximum (see next section) will fix the sign of the multiplicative free constant $$c$$ in $$cL$$, requiring that the Lagrangian have the dimension of energy will fix the multiplicative and additive free constants in the Lagrangian up to numerical factors, and requiring that the Lagrangian approach zero for zero velocity and a particular value of the coordinate $$q$$ (such as infinity or zero) will fix the additive free constant $$c$$ in $$L + c$$.
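Two of the statements above can be verified in a few lines; the following worked steps are a sketch using only the Euler-Lagrange equation (5) for a single coordinate $$q$$. For the free particle with $$L = K^2 = \frac{1}{4}m^2\dot{q}^4$$ we have $$\partial L/\partial q = 0$$ and

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) \; = \; \frac{d}{dt}\left(m^2 \dot{q}^3\right) \; = \; 3 m^2 \dot{q}^2\, \ddot{q} \; = \; 0\ ,$$

so that $$\ddot{q} = 0$$ whenever $$\dot{q} \neq 0$$. For the added total derivative, write $$dF/dt = (\partial F/\partial q)\,\dot{q} + \partial F/\partial t$$; its contribution to the left-hand side of (5) is

$$\frac{d}{dt}\left(\frac{\partial F}{\partial q}\right) - \frac{\partial}{\partial q}\left(\frac{\partial F}{\partial q}\,\dot{q} + \frac{\partial F}{\partial t}\right) \; = \; \left(\frac{\partial^2 F}{\partial q^2}\,\dot{q} + \frac{\partial^2 F}{\partial t\,\partial q}\right) - \left(\frac{\partial^2 F}{\partial q^2}\,\dot{q} + \frac{\partial^2 F}{\partial q\,\partial t}\right) \; = \; 0\ ,$$

assuming $$F(q,t)$$ is smooth enough for the mixed partial derivatives to be equal, so the equation of motion is unchanged.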

In this review we restrict ourselves to the most common case where the Lagrangian depends on the coordinates and their first derivatives, but when higher derivatives occur in the Lagrangian the Euler-Lagrange equation generalizes in a natural way (Fox 1950). As an example, in considering the vibrational motion of elastic continuum systems (Section 11) such as beams and plates, the standard Lagrangian contains spatial second derivatives, and the corresponding Euler-Lagrange equation of motion contains spatial fourth derivatives (Reddy 2002).

Figure 1: Space-time diagram for a family of true trajectories $$x(t)$$ for the quartic oscillator $$[V(x) = (1/4)Cx^4]$$ starting at $$P(0,0)$$ with $$v_0 > 0\ .$$ Kinetic foci ($$Q_i$$) of the trajectories are denoted by open circles. For this particular oscillator the kinetic focus occurs approximately at a fraction 0.646 of the half-period $$T_0/2\ ,$$ illustrated here for trajectory $$0\ .$$ The kinetic foci of all true trajectories of this family lie along the heavy gray line, the caustic, which is approximately a hyperbolic curve for this oscillator. Squares indicate recrossing events of true trajectory $$0$$ with the other two true trajectories. (From Gray and Taylor 2007.)

## When Action is a Minimum

The action $$S$$ (or $$W$$) is stationary for true trajectories, i.e., the first variation $$\delta S$$ vanishes for all small trajectory variations consistent with the given constraints. If the second variation is positive definite $$( \delta^2 S > 0 )$$ for all such trajectory variations, then $$S$$ is a local minimum; otherwise it is a saddle point, i.e., at second order the action is larger for some nearby trial trajectories and smaller for others, compared to the true trajectory action. As defined in Section 1, action is never a local maximum, as we shall discuss. (In relativistic mechanics (see Section 9) two sign conventions for the action have been employed, and whether the action is never a maximum or never a minimum depends on which convention is used. In our convention it is never a minimum.) We discuss here the case of the Hamilton action $$S$$ for one-dimensional ($$1$$D) systems, and refer to Gray and Taylor (2007) for discussions of Maupertuis' action $$W\ ,$$ and $$2$$D etc. systems. For some $$1$$D potentials $$V(x)$$ (those with $$\partial^2V/ \partial x^2 \leq 0$$ everywhere), e.g. $$V(x) = 0\ ,$$ $$V(x) = mg x\ ,$$ and $$V(x) = -Cx^2\ ,$$ all true trajectories have minimum $$S\ .$$ For most potentials, however, only sufficiently short true trajectories have minimum action; the others have an action saddle point. "Sufficiently short" means that the final space-time event occurs before the so-called kinetic focus event of the trajectory. The latter is defined as the earliest event along the trajectory, following the initial event, where the second variation $$\delta^2 S$$ ceases to be positive definite for all trajectory variations, i.e., where $$\delta^2 S = 0$$ for some trajectory variation. Establishing the existence of a kinetic focus using this criterion is discussed by Fox (1950). An equivalent and more intuitive definition of a kinetic focus can be given.
As an example, consider a family of true trajectories $$x(t,v_0)$$ for the quartic oscillator with $$V(x) = (1/4) Cx^4\ ,$$ all starting at $$P (x = 0$$ at $$t = 0)\ ,$$ and with various initial velocities $$v_0 > 0\ .$$ Three trajectories of the family, denoted $$0\ ,$$ $$1\ ,$$ and $$2\ ,$$ are shown in Figure 1. These true trajectories intersect each other – note the open squares in Figure 1 showing intersections of trajectories $$1$$ and $$2$$ with trajectory $$0\ .$$ The kinetic focus event $$Q_0$$ of the true trajectory $$0\ ,$$ with starting event $$P\ ,$$ is the event closest to $$P$$ at which a second true trajectory, with slightly different initial velocity at $$P\ ,$$ intersects trajectory $$0\ ,$$ in the limit for which the two trajectories coalesce as their initial velocities at $$P$$ are made equal. Based on this definition a simple prescription for finding the kinetic focus can be derived (Gray and Taylor 2007), i.e., $$\partial x(t,v_0)/ \partial v_0 = 0\ ,$$ and for a quartic oscillator trajectory starting at $$P(0,0)$$ the kinetic focus $$Q$$ occurs at time $$t_Q$$ given approximately by $$t_Q = 0.646(T/2)\ ,$$ where $$T$$ is the period, as shown in Fig.1 for trajectory $$0$$. This is the first kinetic focus, usually called simply the kinetic focus. Subsequent kinetic foci may exist but we will not be concerned with them.

The other trajectories shown in Figure 1 have their own kinetic foci, i.e. $$Q_1$$ for trajectory $$1$$ and $$Q_2$$ for trajectory $$2\ .$$ The locus of all the kinetic foci of the family is called the caustic (it is an envelope), and is shown as the heavy gray line in Figure 1.

Thus, for trajectory $$0$$ in Figure 1, if the trajectory terminates before kinetic focus $$Q_0\ ,$$ the action $$S$$ is a minimum; if the trajectory terminates beyond $$Q_0\ ,$$ the action is a saddle point.
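The change from minimum to saddle point at the kinetic focus can be seen in a discretized second variation. The sketch below uses the harmonic oscillator rather than the quartic oscillator of Figure 1, because its kinetic focus is known in closed form: for $$x(t,v_0) = (v_0/\omega)\sin\omega t$$ the condition $$\partial x/\partial v_0 = 0$$ is first met at the half period $$t_Q = \pi/\omega$$. The discretization (a uniform grid with $$n$$ steps), the function name `action_hessian` and the parameter values are assumptions made for illustration. For a quadratic Lagrangian the matrix of second derivatives of the discretized action does not depend on the trajectory, so its eigenvalues directly indicate the nature of the stationary point.

```python
import numpy as np

def action_hessian(T, n, m=1.0, k=1.0):
    """Second derivatives of the discretised action w.r.t. the n-1 interior points.

    Discretised action: sum of m (x[i+1] - x[i])**2 / (2 dt) over the steps,
    minus sum of k x[i]**2 dt / 2 over the interior points.
    """
    dt = T / n
    main = np.full(n - 1, 2.0 * m / dt - k * dt)
    side = np.full(n - 2, -m / dt)
    return np.diag(main) + np.diag(side, 1) + np.diag(side, -1)

m, k = 1.0, 1.0
omega = np.sqrt(k / m)
t_focus = np.pi / omega        # half period: first zero of sin(omega t) / omega

negative_counts = {}
for fraction in (0.8, 1.2):
    eigenvalues = np.linalg.eigvalsh(action_hessian(fraction * t_focus, 200, m, k))
    negative_counts[fraction] = int(np.sum(eigenvalues < 0.0))
    kind = "minimum" if negative_counts[fraction] == 0 else "saddle point"
    print(f"T = {fraction:.1f} t_Q: lowest eigenvalue {eigenvalues[0]:+.5f} -> {kind}")
```

With the travel time below $$t_Q$$ all eigenvalues are positive, and with the travel time somewhat beyond $$t_Q$$ one eigenvalue becomes negative while the rest stay positive, so the stationary point is a saddle and not a maximum.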

By an argument due originally to Jacobi, it is easy to see intuitively that action $$S$$ can never be a local maximum (Morin 2008, Gray and Taylor 2007). Note that for any true trajectory the action $$S$$ in (1) can be increased by considering a varied trajectory with wiggles added somewhere in the middle. The wiggles are to be of very high frequency and very small amplitude so that there is increased kinetic energy $$K$$ compared to the original trajectory but only a small change in potential energy $$V$$. (We also ensure the overall travel time $$T$$ is kept fixed.) The Lagrangian $$L = K - V$$ in the region of the wiggles is then larger for the varied trajectory and so is the action integral $$S$$ over the time interval $$T$$. Thus $$S$$ cannot be a maximum for the original true trajectory. A similar intuitive argument due originally to Routh shows that action $$W$$ also cannot be a local maximum for true trajectories (Gray and Taylor 2007).
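The wiggle argument can be made quantitative in a simple case. As a sketch, assume the harmonic oscillator Lagrangian $$L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2$$ and, as a simplification, a wiggle spread over the whole interval, $$\eta(t) = a \sin(n\pi t/T)$$ with integer $$n$$, which vanishes at $$t = 0$$ and $$t = T$$ so that the end positions and travel time stay fixed. The symbols $$a$$, $$n$$ and $$\eta$$ are introduced here for illustration. Because $$L$$ is quadratic and the first variation vanishes on a true trajectory $$x(t)$$, the change in action is exactly

$$S[x+\eta] - S[x] \; = \; \int_0^T \left(\tfrac{1}{2} m \dot{\eta}^2 - \tfrac{1}{2} k \eta^2\right) dt \; = \; \frac{a^2 T}{4}\left(\frac{m n^2 \pi^2}{T^2} - k\right)\ ,$$

which is positive for every $$n > \omega T/\pi$$, however small the amplitude $$a$$. The kinetic term grows as $$n^2$$ while the potential term does not depend on $$n$$, which is the content of the argument above.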

For the purpose of determining the true trajectories, the nature of the stationary action (minimum or saddle point) is usually not of interest. However, there are situations where this is of interest, such as investigating whether a trajectory is stable or unstable (Papastavridis 1986), and in semiclassical mechanics where the phase of the propagator (Section 10) depends on the true classical trajectory action and its stationary nature; the latter dependence is expressed in terms of the number of kinetic foci occurring between the end-points of the true trajectory (Schulman 1981). In general relativity kinetic foci play a key role in establishing the Hawking-Penrose singularity theorems for the gravitational field (Wald 1984). Kinetic foci are also of importance in electron and particle beam optics. Finally, in seeking stationary action trajectories numerically (Basile and Gray 1992, Beck et al. 1989, Marsden and West 2001), it is useful to know whether one is seeking a minimum or a saddle point, since the choice of algorithm often depends on the nature of the stationary point. If a minimum is being sought, comparison of the action at successive stages of the calculation gives an indication of the error in the trajectory at a given stage, since the action should approach the minimum value monotonically from above as the trajectory is refined. The error sensitivity is, unfortunately, not particularly good: because the action is stationary, the error in the action is of second order in the error of the trajectory. Thus a relatively large error in the trajectory can produce a small error in the action.
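The second-order dependence of the action error on the trajectory error can be illustrated with a short Python sketch. It assumes a unit-mass, unit-frequency harmonic oscillator, the short true trajectory $$x = \sin t$$ on $$0 \le t \le 1\ ,$$ and a trajectory error of the hypothetical form $$\epsilon \sin(\pi t/T)\ ;$$ none of these choices comes from the references cited above.

```python
import math

def action(eps, T=1.0, steps=4000):
    """Midpoint-rule action along x(t) = sin t + eps*sin(pi t/T).

    eps = 0 gives the true trajectory; the error term vanishes at both end-points.
    """
    dt = T / steps
    w = math.pi / T
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        x = math.sin(t) + eps * math.sin(w * t)
        v = math.cos(t) + eps * w * math.cos(w * t)
        total += 0.5 * (v * v - x * x) * dt
    return total

S_true = action(0.0)
err_big = action(1e-1) - S_true     # trajectory error of size 0.1
err_small = action(1e-2) - S_true   # trajectory error ten times smaller

# A tenfold reduction in trajectory error should reduce the action error ~100-fold.
ratio = err_big / err_small
print(err_big, err_small, ratio)
```

Both action errors are positive, as expected for a trajectory that terminates before its kinetic focus, where the action is a minimum.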

## Relation of Hamilton and Maupertuis Principles

For conservative (time-invariant) systems the Hamilton and Maupertuis principles are related by a Legendre transformation (Gray et al. 1996a, 2004). Recall first that the Lagrangian $$L \left(q\; ,\; \dot{q}\right)$$ and Hamiltonian $$H(q, p)$$ are so-related, i.e. $\tag{6} H \left(q\; ,\; p\right)\; =\; p \dot{q}\; -\; L \left(q\; ,\; \dot{q}\right)\quad ,$

where in general $$p \dot{q}$$ stands for $$p_1 \dot{q_1} + p_2 \dot{q_2} + \; ... + \; p_f \dot{q_f}$$. If we integrate (6) with respect to $$t$$ along an arbitrary virtual or trial trajectory between two points $$q_A$$ and $$q_B\ ,$$ and use the definitions (1) and (3) of $$S$$ and $$W$$ we get $$\bar{E}T = W - S\ ,$$ or $\tag{7} S\; =\; W\; -\; \bar{E}\; T\quad ,$

where $$\bar{E}\; \equiv \; \int _{0}^{T}d t\; H/T$$ is the mean energy along the trial trajectory. (Along a true trajectory of a conservative system, with $$\bar{E}= E =$$ const, (7) reduces to the well-known relation (Goldstein et al. 2002) $$S=W-ET\ .$$) From the Legendre transformation relation (7) between $$S$$ and $$W\ ,$$ for conservative systems one can derive Hamilton's principle from Maupertuis' principle, and vice-versa (Gray et al., 1996a, 2004). The two action principles are thus equivalent for conservative systems, and related by a Legendre transformation whereby one changes between energy and time as independent constraint parameters.
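Relation (7) holds along any trial trajectory, not only along true ones. The Python sketch below checks this by quadrature, assuming for illustration the quartic potential $$V = (1/4) C x^4$$ with $$L = K - V$$ and $$p = m \dot{x}\ ,$$ and an arbitrary trial trajectory $$x = A \sin \omega t$$ that is not a solution of the equation of motion. The function name and parameter values are hypothetical.

```python
import math

def trial_actions(A, omega, T, m=1.0, C=1.0, steps=20000):
    """Midpoint-rule values of S, W and the mean energy for x = A sin(omega t), 0 <= t <= T."""
    dt = T / steps
    S = W = H_int = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        x = A * math.sin(omega * t)
        v = A * omega * math.cos(omega * t)
        K = 0.5 * m * v * v
        V = 0.25 * C * x**4
        S += (K - V) * dt        # Hamilton action: integral of L dt
        W += m * v * v * dt      # Maupertuis action: integral of p dq = m v^2 dt
        H_int += (K + V) * dt    # time integral of H
    return S, W, H_int / T

T = 2.0
S, W, E_bar = trial_actions(A=1.3, omega=0.7, T=T)
residual = S - (W - E_bar * T)   # should vanish up to rounding, by (7)
print(S, W, E_bar, residual)
```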

The existence in mechanics of two actions and two corresponding variational principles which determine the true trajectories, with a Legendre transformation between them, is analogous to the situation in thermodynamics (Gray et al. 2004). There, as established by Gibbs, one introduces two free energies related by a Legendre transformation, i.e. the Helmholtz and Gibbs free energies, with each free energy satisfying a variational principle which determines the thermal equilibrium state of the system.

## Generalizations

We again restrict the discussion to time-invariant (conservative) systems. If we vary the trial trajectory $$q(t)$$ in (7), with no variation in end positions $$q_A$$ and $$q_B$$ but allowing a variation in end-time $$T$$, then, since $$\delta (\bar{E} T) = \bar{E}\, \delta T + T\, \delta \bar{E}\ ,$$ the corresponding variations $$\delta S\ ,$$ $$\delta W\ ,$$ $$\delta \bar{E}$$ and $$\delta T$$ for an arbitrary trial trajectory are related by $\tag{8} \delta S\; +\; \bar{E}\; \delta \; T\; =\; \delta \; W\; -\; T \; \delta \; \bar{E} \; \; .$

Next one can show (Gray et al. 1996a) that the two sides of (8) separately vanish for variations around a true trajectory. Since $$\bar{E} = E$$ (a constant) on a true trajectory of a conservative system, the left side of (8) then gives $$\delta S + E \delta T = 0\ ,$$ which is called the unconstrained Hamilton principle. This can be written in the standard form for a variational relation with a relaxed constraint, $\delta S = \lambda \delta T\ ,$ where $$\lambda$$ is a constant Lagrange multiplier, here determined as $$\lambda = -E$$ (negative of the energy of the true trajectory). If we constrain $$T$$ to be fixed for all trial trajectories, then $$\delta T = 0$$ and we have $$(\delta S)_T = 0\ ,$$ the usual Hamilton principle. If instead we constrain $$S$$ to be fixed we get $$(\delta T)_S = 0\ ,$$ the so-called reciprocal Hamilton principle.
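A simple special case of $$\delta S = -E\, \delta T$$ can be checked directly. The Python sketch below assumes a free particle travelling a fixed distance $$d$$ in time $$T\ ,$$ for which the true trajectory has constant velocity and $$S = m d^2/2T\ .$$ It varies $$T$$ within this family of true trajectories only, which is a restricted class of the variations considered above; the function name and numerical values are hypothetical.

```python
def free_particle_action(d, T, m=1.0):
    """Hamilton action of the uniform-velocity path covering distance d in time T."""
    return 0.5 * m * d * d / T

m, d, T = 2.0, 3.0, 1.5
E = 0.5 * m * (d / T) ** 2      # energy of the true trajectory

# Central finite difference for dS/dT at fixed end-positions.
h = 1e-6
dS_dT = (free_particle_action(d, T + h, m) - free_particle_action(d, T - h, m)) / (2 * h)

print(dS_dT, -E)                # the two numbers should agree
```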

The right side of (8) gives $$\delta W - T \delta \bar{E} = 0\ ,$$ which is called the unconstrained Maupertuis principle. It can also be written in the standard form of a variational principle with a relaxed constraint, i.e. $$\delta W = \lambda \delta \bar{E}\ ,$$ where $$\lambda = T$$ (duration of the true trajectory) is a constant Lagrange multiplier. If we constrain $$\bar{E}$$ to be fixed for the trial trajectories, we get $$(\delta W)_{\bar{E}} = 0\ ,$$ which is a generalization of Maupertuis' principle (4); we see that the constraint of fixed energy in (4) can be relaxed to one of fixed mean energy. If instead we constrain $$W$$ to be fixed, we get

$(\delta \bar{E})_W = 0\ ,$

which is called the reciprocal Maupertuis principle. In these various generalizations of Maupertuis' principle, conservation of energy is a consequence of the principle for time-invariant systems (just as it is for Hamilton's principle), whereas conservation of energy is an assumption of the original Maupertuis principle.

In all the variational principles discussed here, we have held the end-positions $$q_A$$ and $$q_B$$ fixed. It is possible to derive additional generalized principles (Gray et al. 2004) which allow variations in the end-positions. A word on notation may be appropriate in this regard: the quantities $$\delta S \ ,$$ $$\delta W \ ,$$ $$\delta T$$ and $$\delta \bar{E}$$ denote unambiguously the differences in the values of $$S$$ etc. between the original and varied trajectories, and $$q(t)$$ and $$q(t) + \delta q(t)$$ denote the original and varied trajectory positions at time $$t$$. In considering a generalized principle involving a trajectory variation which includes an end-position variation of, say, $$q_B \ ,$$ one needs a more elaborate notation (Whittaker 1937, Papastavridis 2002) in order to distinguish between the variation in position at the end-time $$t_B$$ of the original trajectory, i.e. $$\delta q_B \equiv \delta q(t = t_B = T) \ ,$$ and the total variation in end-position $$\Delta q_B$$ which includes the contribution due to the end-time variation $$\delta t_B \equiv \delta T$$ if it is nonzero, i.e. $$\Delta q_B = \delta q_B + \dot{q}_B \delta T \ .$$ Since we consider only variational principles with fixed end-positions in this review (i.e. $$\Delta q_B = 0$$), we do not need to pursue this issue here.

As we shall see in the next section and in Section 10, the alternative formulations of the action principles we have considered, particularly the reciprocal Maupertuis principle, have advantages when using action principles to solve practical problems, and also in making the connection to quantum variational principles. We note that reciprocal variational principles are common in geometry and in thermodynamics (see Gray et al. 2004 for discussion and references), but their use in mechanics is relatively recent.

## Practical Use of Action Principles

Just as in quantum mechanics, variational principles can be used directly to solve a dynamics problem, without employing the equations of motion. This is termed the direct variational or Rayleigh-Ritz method. The solution may be exact (in simple cases) or essentially exact (using numerical methods), or approximate and analytic (using a restricted and simple set of trial trajectories). We illustrate the approximation method with a simple example and refer the reader elsewhere for other pedagogical examples and more complicated examples dealing with research problems (Gray et al. 1996a, 1996b, 2004). Consider a one-dimensional quartic oscillator, with Hamiltonian $\tag{9} H\; =\; \frac{p^{2} }{2 m} \; +\; \frac{1}{4} \; C\; x^{4} \quad .$

Unlike for a harmonic oscillator, the frequency $$\omega$$ depends on the amplitude or energy of motion, as is evident in Figure 1. We wish to estimate this dependence. We consider a one-cycle trajectory and for simplicity we choose $$x = 0$$ at $$t = 0$$ and at $$t = T$$ (the period $$2 \pi / \omega$$). As a trial trajectory we take $\tag{10} x(t) = A \sin \omega t\ ,$

where the amplitude $$A$$ is regarded as known and where we treat $$\omega$$ as a variational parameter; we will vary $$\omega$$ such that an action principle is satisfied. For illustration, we use the reciprocal Maupertuis principle $$(\delta \bar{E})_W = 0$$ discussed in the preceding section, but the other action principles can be employed similarly. From the definitions, we find the mean energy $$\bar{E}$$ and action $$W$$ over a cycle of the trial trajectory (10) to be

$\tag{11} \bar{E}\; =\; \frac{\omega }{4 \pi} \; W\; +\; C\; \frac{3\; W^{2} }{32 \pi ^{2} m^2 \omega ^{2} } \quad ,$

$\tag{12} W\; =\; \pi \; \omega \; m\; A^{2} \quad .$

Treating $$\omega$$ as a variational parameter in (11) and applying $$\left(\partial \bar{E}/\partial \omega \right)_{W} \; =\; 0$$ gives $\tag{13} \omega \; =\; \left(\frac{3\; C\; W}{4\; \pi \; m^{2} } \right)^{1/3} \quad .$

Substituting (13) in (11) gives for $$\bar{E}$$ $\tag{14} \bar{E}\; =\; \frac{1}{2} \; \left(\frac{C}{m^{2} } \right)^{1/3} \left(\frac{3\; W}{4 \pi } \right)^{4/3} \quad .$

Eq. (13) can be combined with (12) or (14) to give $\tag{15} \omega = \; \left(\frac{3\; C\; }{4m } \right)^{1/2} A = \; \left(\frac{2\; C\; \bar{E}}{m^{2} } \right)^{1/4} \quad,$

i.e. a variational estimate of the frequency as a function of the amplitude or energy. The frequency increases with amplitude, confirming what is seen in Figure 1.

This problem is simple enough that the exact solution can be found in terms of an elliptic integral (Gray et al. 1996b), with the result $$\omega_{exact}/ \omega_{approx} = 2^{3/4} \pi \Gamma(3/4)/ [\Gamma(1/2) \Gamma(1/4)] = 1.0075\ ,$$ the two frequencies being compared at the same energy. Thus the energy form of the approximation (15) is accurate to 0.75%, and can be improved systematically by including terms $$B \sin{3\omega t}\ ,$$ $$D \sin{5\omega t}\ ,$$ etc., in the trial trajectory $$x(t)\ .$$
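The steps from (11) to (15) can be reproduced numerically. The Python sketch below minimizes the mean energy (11) over $$\omega$$ at fixed $$W$$ with a golden-section search and compares the result with the closed forms (13) to (15) and with the gamma-function ratio quoted above. The helper names are hypothetical and the values $$m = C = W = 1$$ are chosen only for illustration.

```python
import math

def mean_energy(omega, W, m=1.0, C=1.0):
    """Eq. (11): mean energy of the trial trajectory (10) at fixed action W."""
    return omega * W / (4 * math.pi) + 3 * C * W**2 / (32 * math.pi**2 * m**2 * omega**2)

def golden_min(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

m, C, W = 1.0, 1.0, 1.0
omega_num = golden_min(lambda w: mean_energy(w, W, m, C), 0.05, 5.0)

omega_13 = (3 * C * W / (4 * math.pi * m**2)) ** (1 / 3)                  # eq. (13)
E_14 = 0.5 * (C / m**2) ** (1 / 3) * (3 * W / (4 * math.pi)) ** (4 / 3)   # eq. (14)
A = math.sqrt(W / (math.pi * omega_13 * m))                               # from eq. (12)
omega_15_amp = math.sqrt(3 * C / (4 * m)) * A                             # eq. (15), amplitude form
omega_15_en = (2 * C * E_14 / m**2) ** 0.25                               # eq. (15), energy form

# Ratio of exact to approximate frequency quoted in the text.
exact_over_approx = (2**0.75 * math.pi * math.gamma(0.75)
                     / (math.gamma(0.5) * math.gamma(0.25)))
print(omega_num, omega_13, exact_over_approx)
```

A golden-section search is used here only because $$\bar{E}(\omega)$$ at fixed $$W$$ has a single minimum for $$\omega > 0\ ;$$ any one-dimensional minimizer would serve.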

Direct variational methods have been used relatively infrequently in classical mechanics (Gray et al. 2004) and in quantum field theory (Polley and Pottinger 1988). These methods are widely used in quantum mechanics (Epstein 1974, Adhikari 1998), classical continuum mechanics (Reddy 2002), and classical field theory (Milton and Schwinger 2006). They are also used in mathematics to prove the existence of solutions of differential (Euler-Lagrange) equations (Dacorogna 2008).

## Relativistic Systems

The Hamilton and Maupertuis principles, and the generalizations discussed above in Section 7, can be made relativistic and put in either Lorentz covariant or noncovariant forms (Gray et al. 2004). As an example of the relativistic Hamilton principle treated covariantly, consider a particle of mass $$m$$ and charge $$e$$ in an external electromagnetic field with a four-potential having contravariant components $$A^\alpha = (A_0, A_i) \equiv (\phi, A_i)\ ,$$ and covariant components $$A_\alpha = \left(A_{0} ,\; -\; A_{i} \right) \equiv (\phi, - A_i)\ ,$$ where $$\phi(x)$$ and $$A_i(x)$$ (for $$i = 1, 2, 3$$) are the usual scalar and vector potentials respectively. Here $$x = (x^0, x^1, x^2, x^3)$$ denotes a point in space-time. A Lorentz invariant form for the Hamilton action for this system is (Jackson 1999, Landau and Lifshitz 1962, Lanczos 1970) $\tag{16} S\; =\; m\; \int d s\; +\; e\; \int A_{\alpha } \; d x^{\alpha } \quad .$

The sign of the Lagrangian and corresponding action can be chosen arbitrarily since the action principle and equations of motion do not depend on this sign; here we choose the sign of Lanczos (1970) in (16), opposite to that of Jackson (1999). An advantage of the choice of sign of Lagrangian $$L$$ implied by (16), as discussed briefly by Gray et al. (2004) and in detail by Brizard (2009) who relates this advantage to the consistent choice of sign of the metric (given just below), is that the standard definitions of the canonical momentum and Hamiltonian can be employed; with the other choice unorthodox minus signs are required in these definitions (Jackson 1999). A disadvantage of our choice of sign is that our action is a maximum for short true trajectories, rather than the traditional minimum, and correspondingly our $$L$$ approaches the negative of the standard nonrelativistic Lagrangian in the nonrelativistic limit (Brizard 2009). The four-dimensional path in (16) runs from the initial space-time point $$x_{A}$$ to the final space-time point $$x_{B} \ ,$$ with corresponding proper times $$s_A$$ and $$s_B\ .$$ Here $$ds$$ is the infinitesimal interval of the path (or of the proper time), $$ds^2 = dx_\alpha dx^\alpha = g_{\alpha \beta} dx^\alpha dx^\beta = dx_{0}^{2} \; -\; dx_{i}^{2} \ ,$$ the metric has signature $$(+, -, -, -)\ ,$$ and we use the summation convention and take the speed of light $$c = 1\ .$$ $$S$$ itself is not gauge invariant, but a gauge transformation $$A_\alpha \rightarrow A_\alpha + \partial_{\alpha} f$$ (for arbitrary $$f(x)$$), where $$\partial_{\alpha} = \partial / \partial x^{\alpha} \ ,$$ adds only constant boundary-point terms to $$S\ ,$$ so that $$\delta S$$ is unchanged. The Hamilton principle is thus gauge invariant.

If we introduce a parameter $$\tau$$ along the four-dimensional path (a valid choice is proper time $$s$$ along the true or any virtual path), we can write $$S$$ in standard form, $$S = \int L d \tau\ ,$$ where $$L = m [v_\alpha v^\alpha ]^{1/2} + e A_\alpha v^\alpha$$ is the Lagrangian and $$v^\alpha = dx^\alpha /d \tau \ .$$ The Euler-Lagrange equation yields the covariant Lorentz equation of motion $\tag{17} m\; \frac{d\, v_{\alpha } }{d\, s} \; =\; e\; F_{\alpha \beta } \; v^{\beta } \quad ,$

where $$F_{\alpha \beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha$$ is the electromagnetic field tensor, and we have chosen the parameter $$\tau = s\ ,$$ the true path proper time. Specific examples, such as an electron in a uniform magnetic field, are discussed in the references (Gray et al. 2004, Jackson 1999). As discussed below, the equations for the field (Maxwell equations) can also be derived from an action principle.

Action principles are important also in general relativity. First note from (16) that for a special relativistic free particle the action principle $$\delta S = \delta \int ds = 0$$ can be interpreted as a "principle of stationary proper time" (Rohrlich 1965), or more colloquially as a "principle of maximal aging" (Taylor and Wheeler 1992). The proper time is stationary, here a maximum, for the true trajectory (which is straight in a Lorentz frame) compared to the proper time for all virtual trajectories. The principle of stationary proper time, or maximal aging, is also valid in general relativity for the motion of a test particle in a gravitational field (Taylor and Wheeler 2000); for "short" true trajectories the proper time is a maximum, and for "long" true trajectories ("long" and "short" trajectories are defined in Section 5) the proper time is a saddle point (Misner et al. 1973, Wald 1984, Gray and Poisson 2011). The corresponding Euler-Lagrange equation of motion is the relativistic geodesic equation. In general relativity the Einstein gravitational field equations can also be derived from an action principle, using the so-called Einstein-Hilbert action (Landau and Lifshitz 1962, Misner et al. 1973).
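The maximal-aging statement for a free particle in special relativity can be illustrated with a short Python sketch. It assumes one space dimension, units with $$c = 1\ ,$$ and piecewise-straight timelike worldlines; the function name and the sample events are hypothetical.

```python
import math

def proper_time(events):
    """Proper time along a piecewise-straight timelike worldline (c = 1).

    events: list of (t, x) pairs ordered in time; each segment must satisfy |dx| < dt.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - dx * dx)   # ds^2 = dt^2 - dx^2
    return total

# Straight (true) worldline between two events at the same place.
straight = proper_time([(0.0, 0.0), (10.0, 0.0)])

# Virtual worldline: out at speed 0.6 for half the time, then back.
kinked = proper_time([(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)])

print(straight, kinked)   # 10.0 and 8.0
```

The straight worldline between the two events accumulates more proper time than the kinked one, in line with the principle of maximal aging for short trajectories.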

General relativity is perhaps the first, and still the best, example of a field where new laws of physics were derived heuristically from action principles, since Einstein and Hilbert were both motivated by action principles, at least partly, in establishing the field equations, and the principle of stationary proper time was used to obtain the equation of motion of a test particle in a gravitational field. A second example is modern (Yang-Mills type) gauge field theory. Some of the pioneers (e.g., Weyl, Klein, Utiyama) explicitly used action principles to implement their ideas, and others, including Yang and Mills, used them implicitly by working with the Lagrangian (O'Raifeartaigh and Straumann 2000). Some of the early gauge theories were unified field theories of gravitational and electromagnetic fields interacting with matter, and other early unified field theories developed by Einstein, Hilbert and others were also based on action principles (Vizgin 1994). Modern quantum field theories under development, for gravity alone (Rovelli 2004) or unified theories (Freedman and Van Proeyen 2012, Zwiebach 2009, Weinberg 2000), are usually based on action principles. The earliest general quantum field theory (Heisenberg and Pauli 1929-30), essentially the theory used in the 1930s for quantum electrodynamics, strong, and weak interactions (Wentzel 1949), and the basis of one of the modern methods (Weinberg 1995), derives from action principles; commutation relations (or anticommutation relations for fermion fields) are applied to the field components and their conjugate momenta, with the latter being determined from the Hamilton principle and Lagrangian density for the classical fields (Section 11).

As for the role of action principles in the creation of quantum mechanics in 1925-26, in the case of wave mechanics, following hints given in de Broglie's Ph.D. thesis (Yourgrau and Mandelstam 1968), there was a near miss by Schrödinger using the Maupertuis principle, as described in the next section. Heisenberg did not use action principles in creating matrix mechanics, but his close collaborators (Born and Jordan 1925) immediately showed that the equations of motion in matrix mechanics can be derived from a matrix mechanics version of Hamilton's principle. Later, following a hint from Dirac in 1933, in his Ph.D. thesis in 1942 Feynman formulated the path integral version of quantum mechanics using the classical Hamilton action, which we discuss briefly at the end of the next section (Brown 2005, Feynman and Hibbs 1965). A very general quantum operator version of Hamilton's principle was devised by Schwinger in 1951 (Schwinger 2001, Toms 2007).

## Relation to Quantum Variational Principles

We discuss here only the Schrödinger time-independent quantum variational principle; apart from a few remarks at the end of this section, for discussion and references to the various quantum time-dependent principles, we refer to Gray et al. (2004), Feynman and Hibbs (1965), Yourgrau and Mandelstam (1968), Schwinger (2001), and Toms (2007).

As is well known (e.g. Merzbacher 1998), the time-independent Schrödinger equation $\tag{18} \hat{H}\; \psi _{n} \; =\; E_{n} \; \psi _{n}$

for the stationary states $$\psi_n\ ,$$ with energies $$E_n\ ,$$ is equivalent to the variational principle of stationary mean energy $\tag{19} \left(\delta \; \frac{\left\langle \psi \; \left|\, \hat{H}\, \right|\; \psi \right\rangle }{\left\langle \psi \; |\; \psi \right\rangle } \right)_{n} \; =\; 0\quad ,$

where $$\hat{H}$$ is the Hamiltonian operator corresponding to the classical Hamiltonian $$H(q, p)$$, $$\left\langle \psi_1 \vert \psi_2 \right\rangle$$ denotes the scalar product of two states, and trial state $$\psi$$ in (19) has no constraint on its normalization. (The word stationary is used in this section with two different meanings.) Equation (18) is the Euler-Lagrange equation for (19). The subscript in (19), quantum number $$n\ ,$$ indicates a constrained variation of $$\psi$$ such that $$\psi_n$$ is the particular stationary solution selected; for example, to obtain the ground state, for which (19) is a minimum mean energy principle, one could restrict the search to nodeless trial functions $$\psi\ .$$ As mentioned earlier, (19) is the basis of a very useful approximation scheme in quantum mechanics (Epstein 1974, Drake 2005), analogous to the direct use of classical action principles to solve approximately classical dynamics problems (see Section 8 above).

The reader will notice the striking similarity of (19) to one of the classical variational principles discussed above in Section 7, i.e. the reciprocal Maupertuis principle applied to the case of stationary (steady-state) motions: $\tag{20} \left(\delta \bar{E}\right)_{W} \; =\; 0\quad .$

Here the time average $$\bar{E} \; \equiv \; \int _{0}^{T}dt \; H/T$$ is over a period for periodic motions, and is over an infinite time interval for other stationary motions, i.e., quasiperiodic and chaotic. The classical mean energy $$\bar{E}$$ in (20) is clearly analogous to the quantum mean energy $$\left\langle \psi \; \left|\, \hat{H}\, \right|\; \psi \right\rangle /\left\langle \psi \; |\; \psi \right\rangle$$ in (19). The constraints ($$W$$ in (20), $$n$$ in (19)) are also analogous because at large quantum numbers we have for stationary bound motions $$W_n \sim nh$$ (Bohr-Sommerfeld), where $$h$$ is Planck's constant. Thus fixed $$n$$ and fixed $$W$$ are equivalent, at least for large quantum numbers.

The above heuristic arguments can be tightened up. First, (20) can be derived (in simple cases) in the classical limit $$(h \to 0)$$ from (19) (Gray et al. 1996a). Conversely, one can "derive" quantum mechanics (i.e. (19)) by applying quantization rules to (20) (Gray et al. 1999).

Schrödinger, in his first paper on wave mechanics (Schrödinger 1926a), tried to derive the quantum variational principle from a classical variational principle. Unfortunately he did not have available the formulation (20) of the classical action principle, and, in his second paper (Schrödinger 1926b), abandoned this route to quantum mechanics. In his second paper, instead of using the Maupertuis action principle directly he used the Hamilton-Jacobi equation for the action $$W$$, which is a consequence of a generalized action principle due to Hamilton, as described briefly in Section 2. This enabled him to exploit the analogy between ray and wave optics, on the one hand, and particle and wave mechanics, on the other. He showed that just as in optics, where the short-wavelength or geometric optics equation for families of rays (the optical Hamilton-Jacobi or eikonal equation) generalizes to the standard wave equation when the wavelength is not short, in mechanics the Hamilton-Jacobi equation for families of particle trajectories can be regarded as a short-wavelength wave equation and generalized to a wave equation describing particles with nonzero de Broglie wavelength (the Schrödinger equation). Schrödinger worked with the time-independent versions of the equations and thus first found the time-independent Schrödinger equation (18), from which he later found (in part IV of his series in 1926) the time-dependent Schrödinger equation. It is a bit simpler to work with the time-dependent versions of the equations, which first gives the time-dependent Schrödinger equation, from which one can then find the time-independent one in the now standard way.

A semiclassical variational principle can be based on the reciprocal Maupertuis principle (20) (Gray et al. 2004). For simplicity, consider first one-dimensional systems. For bound states, one first determines the classical energy of a periodic orbit as a function of the one-cycle action $$W$$ by solving (20) as described earlier (e.g. see eq. (14) for the quartic oscillator), and then imposes the Bohr-Sommerfeld quantization condition (or one of its refinements) on action $$W$$. This gives the allowed energies semiclassically as a function of the quantum number. Thus, from (14) and the Bohr-Sommerfeld quantization condition $$W_n=nh$$, for a quartic oscillator we find the semiclassical estimate $$E_n=(1/2)(C/m^2)^{1/3} (3n \hbar /2)^{4/3}$$, where $$\hbar = h/2 \pi$$ and $$n = 0, 1, 2, \ldots$$. A simple refinement is obtained by replacing the Bohr-Sommerfeld quantization rule $$W_n = nh$$ by the modified old quantum theory rule due to Einstein, Brillouin, and Keller (EBK), $$W_n = (n + \alpha)h$$, where $$\alpha$$ is the so-called Morse-Maslov index. The latter is most easily derived in the Wentzel-Kramers-Brillouin or WKB-like semiclassical approximations in wave mechanics or path integrals, and accounts approximately for some of the quantum effects missing in Bohr-Sommerfeld theory, such as zero-point energy, the uncertainty principle, wave function penetration beyond classical turning points and tunnelling. For example, for a harmonic oscillator we have $$\alpha = 1/2$$, and using the harmonic oscillator energy-action relation $$E = W \omega/2\pi$$ (the result corresponding to (14) for a quartic oscillator) we find $$E_n = (n + 1/2) \hbar \omega$$, which happens to be the exact quantum result for a harmonic oscillator. The effect of the Morse-Maslov index is more noticeable at smaller quantum numbers.
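The quantization step can be written out in a few lines of Python. The sketch below evaluates (14) with $$W_n = (n + \alpha) h\ ,$$ so that $$\alpha = 0$$ reproduces the Bohr-Sommerfeld estimate above, and applies the same rule to the harmonic oscillator relation $$E = W \omega/2\pi\ .$$ The function names are hypothetical, units with $$\hbar = 1$$ are assumed by default, and using $$\alpha = 1/2$$ for the quartic oscillator is an assumption made here for illustration.

```python
import math

def quartic_level(n, alpha=0.0, m=1.0, C=1.0, hbar=1.0):
    """Eq. (14) with the one-cycle action quantized as W_n = (n + alpha) h."""
    W = (n + alpha) * 2 * math.pi * hbar
    return 0.5 * (C / m**2) ** (1 / 3) * (3 * W / (4 * math.pi)) ** (4 / 3)

def harmonic_level(n, omega, alpha=0.5, hbar=1.0):
    """E = W omega / (2 pi) with W_n = (n + alpha) h."""
    W = (n + alpha) * 2 * math.pi * hbar
    return W * omega / (2 * math.pi)

for n in range(4):
    print(n, quartic_level(n), quartic_level(n, alpha=0.5), harmonic_level(n, omega=1.0))
```

With $$\alpha = 0$$ the lowest quartic level is zero, while $$\alpha = 1/2$$ gives it a nonzero zero-point energy; the relative difference between the two rules shrinks as $$n$$ grows.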

The EBK quantization rule was introduced originally to handle nonseparable, but integrable, multidimensional systems (Brack and Bhaduri 1997). An integrable system has at least $$f$$ independent constants of the motion, or "good" actions $$W_i$$, where $$f$$ is the number of degrees of freedom. The classical bound motions are all periodic or quasiperiodic, i.e., nonchaotic. The total Maupertuis action $$W$$ over a long true trajectory is a linear combination of $$f$$ partial or good actions $$W_i$$, i.e., $$W = \sum{_i} N_i W_i$$, where $$N_i$$ is the number of complete cycles with partial action $$W_i$$ in the total trajectory. For integrable systems, the energy can be expressed as a function of $$f$$ good actions, $$E = E(W_1,W_2,...,W_f)$$. Examples of applying the reciprocal Maupertuis principle to multidimensional systems to find $$E(W_1,W_2,...,W_f)$$ approximately, and then quantizing semiclassically using EBK quantization $$W_i = (n_i + \alpha_i)h$$ are reviewed in Gray et al. (2004). This semiclassical approximation method has been applied to estimate energy levels $$E_{n_1,n_2,...,n_f}$$ even for some nonintegrable systems, where strictly speaking, $$f$$ good actions $$W_i$$ and $$f$$ corresponding good quantum numbers $$n_i$$ do not exist. For example, consider the two-dimensional quartic oscillator with mass $$m$$ and potential energy $$V(x,y) = Cx^2y^2$$, a nonintegrable system with just one exact constant of the motion (the energy) and having mostly chaotic classical trajectories. As in Section 8 for the one-dimensional quartic oscillator, we start with the simplest trial trajectory $$x(t) = A_x \cos(\omega_x t), \; y(t) = A_y \cos(\omega_y t)$$. With this trial trajectory the reciprocal Maupertuis principle gives the classical energy approximately as a function of two actions, $$E(W_x,W_y) = (3/4 \pi)(C/\pi m^2)^{1/3}[W_xW_y]^{2/3}$$. 
EBK quantization (with $$\alpha = 1/2$$) then gives the energy levels semiclassically as $$E_{n_x,n_y} = (3/2)(2C \hbar^4/m^2)^{1/3}[(n_x + 1/2)(n_y + 1/2)]^{2/3}$$, which is found to be accurate to within 5% for the 50 lowest levels in comparison to a numerical calculation. The energy level degeneracies are not given correctly by this first approximation, and simple variational and perturbational improvements are also discussed in the review cited.
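As a consistency check on the two expressions above, the following Python sketch substitutes $$W_x = (n_x + 1/2) h$$ and $$W_y = (n_y + 1/2) h$$ into $$E(W_x, W_y)$$ and compares with the closed-form levels $$E_{n_x,n_y}\ .$$ The function names are hypothetical and $$m = C = \hbar = 1$$ is assumed for illustration.

```python
import math

def energy_from_actions(Wx, Wy, m=1.0, C=1.0):
    """Variational estimate E(W_x, W_y) for V = C x^2 y^2, as quoted in the text."""
    return (3 / (4 * math.pi)) * (C / (math.pi * m**2)) ** (1 / 3) * (Wx * Wy) ** (2 / 3)

def ebk_level(nx, ny, m=1.0, C=1.0, hbar=1.0):
    """Closed-form semiclassical level E_{nx,ny} quoted in the text (alpha = 1/2)."""
    return 1.5 * (2 * C * hbar**4 / m**2) ** (1 / 3) * ((nx + 0.5) * (ny + 0.5)) ** (2 / 3)

hbar = 1.0
h = 2 * math.pi * hbar
max_diff = 0.0
for nx in range(5):
    for ny in range(5):
        direct = energy_from_actions((nx + 0.5) * h, (ny + 0.5) * h)
        max_diff = max(max_diff, abs(direct - ebk_level(nx, ny)))
print(max_diff)   # agreement up to rounding
```

In this first approximation the levels depend only on the product $$(n_x + 1/2)(n_y + 1/2)\ ,$$ so that, for example, $$E_{n_x,n_y} = E_{n_y,n_x}\ .$$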

As discussed in Section 2, in classical mechanics per se there is no particular physical reason for the existence of a principle of stationary action. However, as first discussed by Dirac and Feynman, Hamilton's principle can be derived in the classical limit of the path integral formulation of quantum mechanics (Feynman and Hibbs 1965, Schulman 1981). In quantum mechanics the propagator $$G(q_A,0\; ;q_B,T)$$ gives the probability amplitude for the system to be found with configuration $$q_B$$ at time $$t = T$$, given that it starts with configuration $$q_A$$ at $$t = 0$$. Feynman's path integral expression for the propagator is $$G(q_A,0\; ;q_B,T) = \int d[q]\; \exp(i S[q]/ \hbar)$$, where $$S[q]$$ is the classical Hamilton action functional defined by equation (1) for virtual path $$q(t)$$ which starts at $$A(q_A,0)$$ and ends at $$B(q_B,T)$$, and the functional or path integral $$\int d[q]...$$ (defined precisely in the above references) is over all such virtual paths between the fixed events $$A$$ and $$B$$. In the limit $$\hbar \to 0$$ the phase factors $$\exp(i S[q] / \hbar)$$ contributed by all the virtual paths $$q(t)$$ to the propagator cancel by destructive interference, with the exception of the contributions of the one or more stationary phase paths satisfying $$\delta S = 0 \ ;$$ the latter are the classical paths. Thus, in the classical limit, the classical Hamilton principle of stationary action is a consequence of the quantum stationary phase condition for constructive interference.
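The cancellation by destructive interference can be seen in a toy version of the path integral. The Python sketch below assumes a free particle and restricts the virtual paths to the one-parameter family $$q(t) = q_{cl}(t) + a \sin(\pi t/T)\ ,$$ for which $$S(a) = S_{cl} + \kappa a^2$$ with $$\kappa = m \pi^2/4T\ .$$ It compares the contribution to $$\int da\, \exp(i \kappa a^2/\hbar)$$ from a window of paths around the classical one with that from an equally wide window far from it. The restriction to a single parameter, the window positions, and the value of $$\hbar$$ are all illustrative assumptions.

```python
import cmath
import math

def window_integral(lo, hi, kappa, hbar, steps=20000):
    """Midpoint-rule estimate of the integral of exp(i*kappa*a^2/hbar) da over [lo, hi]."""
    da = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        a = lo + (i + 0.5) * da
        total += cmath.exp(1j * kappa * a * a / hbar) * da
    return total

m, T = 1.0, 1.0
kappa = m * math.pi**2 / (4 * T)   # S(a) - S_cl = kappa * a^2 for the assumed family
hbar = 0.01                        # small compared with the action scale kappa

near = window_integral(-0.5, 0.5, kappa, hbar)   # paths close to the classical path (a = 0)
far = window_integral(1.5, 2.5, kappa, hbar)     # equally wide window far from it

print(abs(near), abs(far))
```

The window containing the stationary point $$a = 0$$ dominates; reducing $$\hbar$$ further makes the contrast stronger, since the far-window contribution falls off in proportion to $$\hbar$$ while the near-window contribution falls off only as $$\hbar^{1/2}\ .$$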

There is an extensive literature on a variety of systems (particles and fields) studied semiclassically via the Feynman path integral expression for the propagator discussed in the preceding paragraph (Feynman and Hibbs 1965, Schulman 1981, Brack and Bhaduri 1997). A celebrated result of this approach is the Gutzwiller trace formula, which relates the distribution function for the quantized energy levels of the system to the complete set of the system's classical periodic orbits.

## Continuum Mechanics and Field Theory

Action principles can be applied to field-like quantities $$\phi (x, t)\ ,$$ both classically (Goldstein et al. 2002, Landau and Lifshitz 1962, Soper 1976, Burgess 2002, Jackson 1999, Melia 2001, Morse and Feshbach 1953, Brizard 2008) and quantum-mechanically (Dyson 2007, Toms 2007). The systems can be nonrelativistic or relativistic. We have already mentioned above the application of action principles to the electromagnetic and gravitational fields, and to the Schrödinger wave function. These methods are also widely applied in classical continuum mechanics, e.g., to strings, membranes, elastic solids and fluids (Yourgrau and Mandelstam 1968, Lanczos 1970, Reddy 2002).

As our first example, we consider the classical nonrelativistic one-dimensional vibrating string with fixed ends, following Brizard (2008). Assuming small displacements from equilibrium, we find the equation of motion for the transverse displacement $$\phi(x, t)$$ is $\tag{21} \rho \; \frac{\partial ^{2} \; \phi }{\partial \; t^{2} } \; -\; \tau \; \frac{\partial ^{2} \; \phi }{\partial \; x^{2} } \; =\; 0\quad ,$

where $$\rho$$ is the density and $$\tau$$ the tension. Eq. (21) is the well known classical linear wave equation. It is assumed that $$\phi (x, t)$$ is zero at all times at the two ends, $$x = 0$$ and $$x = X\ ,$$ and that $$\phi (x, t)$$ is given for all positions at two times, $$t = 0$$ and $$t = T\ .$$ One easily verifies that the equation of motion (21) follows from the action principle $$\delta S = 0\ ,$$ with the given constraints, where $\tag{22} S\; =\; \int _{0}^{T}d t\; \int _{0}^{X}d x\; \mathcal{L}\; \left(\phi \; ,\; \partial _{t} \phi \; ,\; \partial _{x} \phi \right)\quad ,$

with $\tag{23} \mathcal{L}\; \left(\phi \; ,\; \partial _{t} \phi \; ,\; \partial _{x} \phi \right) = \frac{1}{2} \; \rho \left(\frac{\partial \phi }{\partial t} \right)^{2} \; -\; \frac{1}{2} \; \tau \left(\frac{\partial \phi }{\partial x} \right)^{2}$

the Lagrangian density (so that $$\int_0^{X} dx\; \mathcal{L}\; =\; L$$), and where $$\partial _{t} \phi = \partial \phi / \partial t$$ and $$\partial _{x} \phi = \partial \phi / \partial x\ .$$ Because of the simple quadratic Lagrangian density (23), the variation of (22) can readily be done directly; alternatively, we can use the Euler-Lagrange equation for 1D fields $$\phi(x, t)\ ,$$ a natural generalization of (5), $\tag{24} \frac{\partial }{\partial t} \; \left(\frac{\partial \mathcal{L}} {\partial \left(\partial _{t} \phi \right)} \right)\; +\; \frac{\partial }{\partial x} \; \left(\frac{\partial \mathcal{L}}{\partial \left(\partial _{x} \phi \right)} \right)\; -\; \frac{\partial \mathcal{L}}{\partial \phi } \; =\; 0\quad ,$

which also gives (21) for the Lagrangian density (23).
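The stationarity of the action (22) can also be checked numerically. The following is a minimal sketch, not part of the original treatment: it assumes illustrative values $$\rho = 1$$, $$\tau = 4$$, $$X = T = 1$$, a standing-wave trial field $$\phi = \sin(kx)\cos(\omega t)$$ with $$k = \pi/X$$, and a perturbation $$\eta = \sin(\pi x/X)\sin(\pi t/T)$$ that vanishes on the whole boundary. The function names `action` and `first_variation` are hypothetical. Because (23) is quadratic, the symmetric difference $$[S(\phi + \epsilon\eta) - S(\phi - \epsilon\eta)]/2\epsilon$$ equals the first variation up to the quadrature error of the midpoint rule.

```python
import math

def action(phi_t, phi_x, X, T, rho, tau, n=200):
    """Midpoint-rule estimate of the action (22) with the Lagrangian density (23)."""
    dx, dt = X / n, T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        for j in range(n):
            x = (j + 0.5) * dx
            total += 0.5 * rho * phi_t(x, t) ** 2 - 0.5 * tau * phi_x(x, t) ** 2
    return total * dx * dt

def first_variation(omega, X=1.0, T=1.0, rho=1.0, tau=4.0, eps=1e-3):
    """Estimate dS/d(eps) at eps = 0 for the varied field phi + eps * eta.

    phi = sin(k x) cos(omega t) is the trial field (k = pi / X, assumed form);
    eta = sin(pi x / X) sin(pi t / T) vanishes at x = 0, X and at t = 0, T.
    """
    k = math.pi / X

    def derivatives(sign):
        # Analytic time and space derivatives of phi + sign * eps * eta.
        def phi_t(x, t):
            return (-omega * math.sin(k * x) * math.sin(omega * t)
                    + sign * eps * (math.pi / T)
                    * math.sin(math.pi * x / X) * math.cos(math.pi * t / T))

        def phi_x(x, t):
            return (k * math.cos(k * x) * math.cos(omega * t)
                    + sign * eps * (math.pi / X)
                    * math.cos(math.pi * x / X) * math.sin(math.pi * t / T))

        return phi_t, phi_x

    s_plus = action(*derivatives(+1), X, T, rho, tau)
    s_minus = action(*derivatives(-1), X, T, rho, tau)
    return (s_plus - s_minus) / (2.0 * eps)

# Dispersion relation implied by (21): omega^2 = (tau / rho) k^2, with k = pi for X = 1.
omega_true = math.pi * math.sqrt(4.0 / 1.0)
dS_true = first_variation(omega_true)           # field obeying (21)
dS_wrong = first_variation(1.25 * omega_true)   # field violating (21)
print(dS_true, dS_wrong)
```

Under these assumptions the first variation is close to zero for the field that satisfies (21) and clearly nonzero for the field with the wrong frequency, which is the content of $$\delta S = 0$$ for this particular choice of $$\eta$$.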

As a second example we consider the classical relativistic description of a source-free electromagnetic field $$F_{\alpha \beta}(x)$$ enclosed in a volume $$V$$, where $$x$$ denotes a space-time point and we use covariant notation (see Section 9 above). Because of the structure of the two Maxwell equations that never have source terms (due to the absence of magnetic monopoles), the field $$F_{\alpha \beta}(x)$$ can be represented in terms of the four-potential $$A_{\alpha}(x)$$ by $$F_{\alpha \beta} = \partial_{\alpha} A_{\beta} - \partial_{\beta} A_{\alpha}$$ (Jackson 1999). As in (23) we assume that in general the Lagrangian density $$\mathcal{L}(A_{\alpha},\partial_{\beta} A_{\alpha}) \ ,$$ here a Lorentz scalar, depends at most on the potential and its first derivatives, so that the Lorentz invariant action $$S$$ is given by $\tag{25} S \; = \; \int d^4 x \; \mathcal{L}(A_{\alpha},\; \partial_{\beta} A_{\alpha}) \quad,$

where the space-time integration is over the spatial volume $$V$$ and time interval $$T$$. Assuming $$A_{\alpha}(x)$$ is fixed on the boundary of $$(V,T)$$ and setting $$\delta S = 0$$ gives the Lorentz covariant Euler-Lagrange equations $\tag{26} \partial_{\beta} \; \left( \frac{\partial{} \mathcal{L}} {\partial{} \left( \partial _{\beta} A_{\alpha} \right)} \right) \; - \; \frac{\partial \mathcal{L}}{\partial A_{\alpha}} \; = \; 0 \quad , \; \; \alpha \; = \; 0,1,2,3 \quad .$

For a source-free field the Lagrangian density is given by (Jackson 1999, Melia 2001) $\tag{27} \mathcal{L}(A_{\alpha}, \; \partial_{\beta} A_{\alpha}) \; = \; \frac{g_{\mu \mu'} g_{\nu \nu'}}{16 \pi} (\partial_{\mu} A_{\nu} \; - \; \partial_{\nu} A_{\mu}) (\partial_{\mu'} A_{\nu'} \; - \; \partial_{\nu'}A_{\mu'}) \quad ,$

where $$g_{\mu \mu'}$$ is the Lorentz metric tensor defined earlier (Section 9). Again we have a choice of sign in (27) and have chosen that of Melia (2001), opposite to that of Jackson (1999). $$\mathcal{L}$$ defined by (27) is proportional to $$F_{\mu \nu}F^{\mu \nu}$$ and is therefore gauge invariant. From (27) and (26) we find the field equations $\tag{28} \partial_{\beta} \partial^{\beta} A_{\alpha} \; - \; \partial_{\alpha} \partial^{\beta} A_{\beta} \; = \; 0 \quad ,$

which represent the source-free version of the two Maxwell equations that in general contain source terms. As mentioned above, the other two Maxwell equations are satisfied identically by the representation of the field in terms of the four-potential, i.e. $$F_{\alpha \beta} \; = \; \partial_{\alpha} A_{\beta} \; - \; \partial_{\beta} A_{\alpha} \ .$$ Eq. (28) is valid for any choice of gauge; in the Lorenz gauge ($$\partial^{\beta} A_{\beta} \; = \; 0$$), (28) reduces to the simpler form $$\partial_{\beta} \partial^{\beta} A_{\alpha} \; = \; 0 \ ,$$ which is the 3D homogeneous wave equation of type (21). (Note that, with $$c = 1$$, $$\partial_{\beta} \partial^{\beta} = \partial^2/\partial t^2 - \nabla^2 \ ,$$ where $$\nabla^2$$ is the Laplacian.)
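The gauge properties of (28) can be illustrated with plane waves. The sketch below is only a consistency check under stated assumptions: it takes $$A_{\alpha} = \epsilon_{\alpha} e^{-i k \cdot x}$$ with constant polarization $$\epsilon_{\alpha}$$, uses the metric signature $$(+,-,-,-)$$ implied by $$\partial_{\beta} \partial^{\beta} = \partial^2/\partial t^2 - \nabla^2$$, and drops the common exponential factor. The helper names `dot` and `field_equation_residual` are hypothetical.

```python
# Metric signature (+, -, -, -), with c = 1 (assumption consistent with the text).
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(a_lower, b_lower):
    """Lorentz scalar product a_mu b^mu of two four-vectors given by lower components."""
    return sum(g * a * b for g, a, b in zip(METRIC, a_lower, b_lower))

def field_equation_residual(k_lower, eps_lower):
    """Left-hand side of (28) for A_alpha = eps_alpha exp(-i k.x), exponential dropped.

    Each derivative d_alpha brings down -i k_alpha, so (28) becomes
    -(k.k) eps_alpha + k_alpha (k.eps).
    """
    k2 = dot(k_lower, k_lower)
    k_eps = dot(k_lower, eps_lower)
    return [-k2 * e + k * k_eps for k, e in zip(k_lower, eps_lower)]

# Null wave vector with transverse polarization: Lorenz gauge holds (k.eps = 0).
light_like = field_equation_residual([1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0])

# Pure-gauge potential eps_alpha = k_alpha: satisfies (28) for any k, since F vanishes.
k_any = [2.0, 0.3, -0.5, 1.0]
pure_gauge = field_equation_residual(k_any, k_any)

# Transverse polarization but k.k != 0: not a solution.
off_shell = field_equation_residual([2.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0])

print(light_like, pure_gauge, off_shell)
```

The first two residuals vanish and the third does not: (28) holds in any gauge, while in the Lorenz gauge it reduces to the condition $$k_{\beta} k^{\beta} = 0$$, the plane-wave form of the homogeneous wave equation.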

So far we have assumed a source-free field, and the Lagrangian density $$\mathcal{L}(A_{\alpha}, \; \partial_{\beta} A_{\alpha})$$ given by (27) is in fact independent of $$A_{\alpha} \ .$$ If a prescribed source four-current density $$J_{\alpha}(x)\;=\; (\rho(x),\; -J_i(x))$$ is present, where $$\rho$$ and $$J_i$$ are the charge and three-current densities, respectively, one adds to (27) the term $$J^{\mu} A_{\mu}$$ (assuming $$c \; = \; 1$$; Melia 2001). The Euler-Lagrange equation (26) then gives the inhomogeneous wave equation $$\partial^{\beta} \partial_{\beta} A_{\alpha} \; = \; 4 \pi J_{\alpha} \ ,$$ where we have again assumed the Lorenz gauge.
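For readers who want the intermediate steps, the following is a sketch of how (26) leads to the field equations, using the form $$\mathcal{L} = F_{\mu \nu}F^{\mu \nu}/16\pi + J^{\mu} A_{\mu}$$ implied by (27) and the source term above. Setting $$J^{\alpha} = 0$$ recovers (28).

```latex
\begin{align*}
% Step 1: derivatives of the Lagrangian density; the antisymmetry of F doubles the first term
\frac{\partial \mathcal{L}}{\partial(\partial_{\beta} A_{\alpha})}
  &= \frac{1}{8\pi}\left(F^{\beta\alpha} - F^{\alpha\beta}\right)
   = \frac{1}{4\pi}\,F^{\beta\alpha} ,
&
\frac{\partial \mathcal{L}}{\partial A_{\alpha}} &= J^{\alpha} , \\
% Step 2: insert into the Euler-Lagrange equations (26)
\partial_{\beta} F^{\beta\alpha} &= 4\pi J^{\alpha} , \\
% Step 3: express F through the four-potential
\partial_{\beta}\partial^{\beta} A^{\alpha}
  - \partial^{\alpha}\partial_{\beta} A^{\beta} &= 4\pi J^{\alpha} , \\
% Step 4: impose the Lorenz gauge and lower the free index
\partial^{\beta}\partial_{\beta} A_{\alpha} &= 4\pi J_{\alpha}
  \qquad (\partial_{\beta} A^{\beta} = 0) .
\end{align*}
```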

## Conservation Laws

Conservation laws are a consequence of symmetries of the Lagrangian or action. For example, conservation of energy, momentum, and angular momentum follow from invariance under time translations, space translations, and rotations, respectively. The link between symmetries and conservation laws holds for particle and continuum systems (Noether's theorem (1918)). The conservation laws can be derived either from the Lagrangian and equations of motion (Goldstein et al. 2002), or directly from the action and the variational principle (Brizard 2008, Goldstein et al. 2002, Melia 2001, Lanczos 1970, Oliver 1994, Schwinger et al. 1998). A treatment which is introductory yet reaches applications to gauge field theory, and includes historical background on Emmy Noether's work and career, is given by Neuenschwander (2011). Noether's theorem is to be discussed elsewhere in Scholarpedia, and we do not go into detail here.
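As a simple particle illustration of the link between symmetries and conservation laws, the sketch below integrates a planar particle with $$L = K - V$$ and an assumed potential $$V = \kappa r^2/2$$, which is invariant under rotations and has no explicit time dependence. The integrator (velocity Verlet), the parameter values, and the function names are illustrative choices, not taken from the references above.

```python
def acceleration(x, y, kappa, m):
    """Acceleration from the central potential V = kappa * r^2 / 2 (assumed form)."""
    return -kappa * x / m, -kappa * y / m

def energy(x, y, vx, vy, kappa, m):
    """Total energy K + V."""
    return 0.5 * m * (vx * vx + vy * vy) + 0.5 * kappa * (x * x + y * y)

def angular_momentum(x, y, vx, vy, m):
    """Angular momentum about the origin (z-component)."""
    return m * (x * vy - y * vx)

def integrate(steps=5000, dt=1e-3, kappa=1.0, m=1.0):
    """Velocity-Verlet integration; returns (E, L_z) at the start and at the end."""
    x, y, vx, vy = 1.0, 0.0, 0.3, 0.8   # arbitrary initial conditions
    start = (energy(x, y, vx, vy, kappa, m), angular_momentum(x, y, vx, vy, m))
    ax, ay = acceleration(x, y, kappa, m)
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift
        y += dt * vy
        ax, ay = acceleration(x, y, kappa, m)
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
    end = (energy(x, y, vx, vy, kappa, m), angular_momentum(x, y, vx, vy, m))
    return start, end

(E0, L0), (E1, L1) = integrate()
print(E1 - E0, L1 - L0)
```

In this sketch the angular momentum is preserved to rounding error and the energy to within the small discretization error of the integrator, consistent with rotational and time-translation invariance of the Lagrangian.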

## References (historical)

• Born, M. and P. Jordan (1925). "Zur Quantenmechanik", Zeit. f. Phys. 34, 858-888. (English translation available in Sources of Quantum Mechanics, edited by B.L. van der Waerden, Dover, New York, 1968)
• Brown, L. M. (2005), editor. Feynman's Thesis: The Principle of Least Action in Quantum Mechanics, World Scientific, Singapore.
• Clebsch, A. (1866, 2009), editor. Jacobi's Lectures on Dynamics, Hindustan Book Agency, New Delhi. (English translation in 2009 from 1866 German edition of Jacobi's Königsberg lectures from winter semester 1842-3)
• Goldstine, H.H. (1980). A History of the Calculus of Variations from the 17th Through the 19th Century, Springer, New York.
• Hankins, T.L. (1980). Sir William Rowan Hamilton, Johns Hopkins U.P., Baltimore.
• Heisenberg, W. and W. Pauli (1929). "Zur Quantendynamik der Wellenfelder", Zeit. f. Phys. 56, 1-61; (1930) "Zur Quantentheorie der Wellenfelder II", Zeit. f. Phys. 59, 168-190.
• Lanczos, C. (1970). The Variational Principles of Mechanics, 4th edition, University of Toronto Press, Toronto.
• Lützen, J. (2005). Mechanistic Images in Geometric Form: Heinrich Hertz's Principles of Mechanics, Oxford U.P., Oxford.
• O'Raifeartaigh, L. and N. Straumann (2000). "Gauge Theory: Historical Origins and Some Modern Developments", Rev. Mod. Phys. 72, 1-23.
• Schrödinger, E. (1926a). "Quantisierung als Eigenwertproblem I", Ann. Phys. 79, 361-376; (1926b). "Quantisierung als Eigenwertproblem II", Ann. Phys. 79, 489-527. (English translations available in E. Schrödinger, Collected Papers on Wave Mechanics, Chelsea, New York, 1982)
• Terrall, M. (2002). The Man who Flattened the Earth, University of Chicago Press, Chicago. (biography of Maupertuis)
• Todhunter, I. (1861). A History of the Progress of the Calculus of Variations During the Nineteenth Century, Cambridge U.P., Cambridge.
• Vizgin, V. P. (1994). Unified Field Theories in the First Third of the 20th Century, Birkhäuser, Basel.
• Wentzel, G. (1949). Quantum Theory of Fields, Interscience, New York. (translation of 1942 German edition)
• Yourgrau, W. and S. Mandelstam (1968). Variational Principles in Dynamics and Quantum Theory, 3rd edition, Saunders, Philadelphia.