Runge-Kutta Methods

Post-publication activity

Prof. John Butcher, Mathematics Department, The University of Auckland, New Zealand

Runge-Kutta methods

To obtain numerical approximations to the solution of the initial value problem \[\tag{1} y'(x) = f(x,y(x)),\quad y(x_0) = y_0, \]

at a sequence of points \(x_1\ ,\) \(x_2\ ,\) \(\dots\ ,\) the Euler method can be used. This method computes in turn \(y_1 \approx y(x_1)\ ,\) \(y_2 \approx y(x_2)\ ,\) \(\dots\) using the formula \[ y_n = y_{n-1} + (x_n-x_{n-1}) f(x_{n-1}, y_{n-1}), \quad n=1,2,\dots . \] As we shall see, the Euler method is often inefficient because of the large number of steps required to achieve a specified accuracy. Furthermore, round-off error accumulation, when many steps are used, can make the numerical results unusable. The reason for poor accuracy is that the Euler method is based on the underlying quadrature approximation \[ \int_{x_0}^{x_1} y'(x) dx \approx (x_1 - x_0) y'(x_0), \] which is accurate only when \(y(x)\) is a polynomial of degree 1. Replacing this by one of the formulae \[ \int_{x_0}^{x_1} y'(x) dx \approx (x_1 - x_0) y'((x_0+x_1)/2),\] or \( \int_{x_0}^{x_1} y'(x) dx \approx (x_1 - x_0) (y'(x_0)+y'(x_1))/2, \) or other formulae which are exact for second degree polynomials, is the essence of the methods proposed by Runge. To use these approximations, a second "stage" has to be evaluated. That is \(y'(x_0)\) is first evaluated so that \(y(x_0+h/2)\) or \(y(x_0+h)\) can then be approximated in preparation for the approximate evaluation of \(y'(x_0+h/2)\) or \(y'(x_0+h)\) respectively. Taking these ideas further, the immediate aim of later authors was to obtain methods which approximate \(y_1 \approx y(x_0+h)\) so that the error in this first and typical step can be bounded by a multiple of \(h^{p+1}\ ,\) where the integer \(p\ ,\) known as the order, takes on increasingly high values.

A method with \(s\) stages can be represented by a tableau, often called the "Butcher tableau",

0	0	0	0	\( \cdots \)	0
\( c_2 \)	\( a_{21} \)	0	0	\( \cdots \)	0
\( c_3 \)	\( a_{31} \)	\( a_{32} \)	0	\( \cdots \)	0
\( \vdots \)	\( \vdots \)	\( \vdots \)	\( \vdots \)		\( \vdots \)
\( c_s \)	\( a_{s1} \)	\( a_{s2} \)	\( a_{s3} \)	\( \cdots \)	0
	\( b_1 \)	\( b_2 \)	\( b_3 \)	\( \cdots \)	\( b_s \)

indicating that the output approximation is equal to \( y_1 = y_0 + h\sum_{i=1}^s b_i F_i \ ,\) where \(F_i\) is equal to \(f(x_0+hc_i, y_0 + h\sum_{j=1}^s a_{ij} F_j)\ .\) For example, the two methods of order 2 already introduced above from the work of Runge, have tableaux given respectively by

0
1/2	1/2
	0	1	,

0
1	1
	1/2	1/2	,

where zero values on and above the diagonal have been omitted.

The work of Runge was extended by (Heun 1900), who completed a discussion of order 3 methods and pointed the way to order 4, and by (Kutta 1901) who gave a complete classification of order 4 methods. The most well-known method, due to Runge, has order 4 and is defined by the tableau

0
1/2	1/2
1/2	0	1/2
1	0	0	1
	1/6	1/3	1/3	1/6	.

Order conditions

The standard differential equation (1), can be replaced by the autonomous system \(y'(x) = f(y(x))\ ,\) as long as it is understood that the problem is multi-dimensional. This greatly simplifies the analysis and also draws attention to a short-coming of analyses based on (1), if is assumed that \(y\) is a scalar-valued variable. This short-coming is that for \(p> 4\ ,\) it is possible to achieve order \(p\) for a single non-autonomous problem but some lower order for a general autonomous problem. A partial explanation of this can be seen from Table 1

Table 1. Number of conditions to achieve a specified order

order \(p\)	1	2	3	4	5	6
Conditions for high-dimensional problem	1	2	4	8	17	37
Conditions for one-dimensional problem	1	2	4	8	16	31

which compares the number of order conditions for each problem. The number of conditions for the high-dimensional general problem to have order \(p\) is equal to the number of rooted-trees with less than or equal to \(p\) vertices. The systematic structure of order conditions was presented in (Butcher 1963); the connection with rooted trees was known to (Gill 1951) and (Merson 1957). An example of a method which has order 5 for a scalar problem, but only order 4 for a system, is presented in (Butcher 1995).

Explicit Runge-Kutta methods

Although it is not known, for arbitrary orders, how many stages are required to achieve this order, the result is known up to order 8 and is given in Table 2. Also shown for comparison is the number of free parameters in an \(s\) stage method.

Table 2. Number of stages to achieve a specified order

order \(p\)	1	2	3	4		5	6		7		8
Conditions	1	2	4	8		17	37		85		200
Stages	1	2	3	4	5	6	7	8	9	10	11
Parameters	1	3	6	10	15	21	28	36	45	55	66

The 8 conditions for order 4 in the case of a 4 stage method are given in (2) - (9), for lower orders, where fewer stages are required, the conditions are identical in form but some terms are omitted from the left-hand sides. For example for a 3 stage method, terms with a factor \(b_4\) are omitted. We see immediately that (9) then becomes contradictory, thus confirming that 4 stages are required for order 4.

\[\tag{2} b_1 + b_2 + b_3 + b_4 = 1, \]

\[\tag{3} b_2 c_2 + b_3 c_3 + b_4 c_4 = 1/2, \]

\[\tag{4} b_2 c_2^2 + b_3 c_3^2 + b_4 c_4^2 = 1/3, \]

\[\tag{5} b_3 a_{32} c_2 + b_4 a_{42} c_2 + b_4 a_{43} c_3 = 1/6, \]

\[\tag{6} b_2 c_2^3 + b_3 c_3^3 + b_4 c_4^3 = 1/4, \]

\[\tag{7} b_3 c_3 a_{32} c_2 + b_4 c_4 a_{42} c_2 + b_4 c_4 a_{43} c_3 = 1/8, \]

\[\tag{8} b_3 a_{32} c_2^2 + b_4 a_{42} c_2^2 + b_4 a_{43} c_3^2 = 1/12, \]

\[\tag{9} b_4 a_{43} a_{32} c_2 = 1/24. \]

The actual task of solving the order conditions to derive specific methods becomes increasingly more complicated as the order increases. However, for low orders, a simple procedure exists and consists of three steps (i) choose appropriate values of \(c_2\ ,\) \(\dots\ ,\) \(c_s\ ,\) (ii) choose \(b_1\ ,\) \(\dots\ ,\) \(b_s\) as solutions to the equations, where applicable, amongst (2), (3), (4), (6), (iii) solve for the \(a_{ij}\) from the remaining equations which involve these quantities linearly. For orders less than 4 these steps are all that is required. However, for order 4, the remaining equation (9) still needs to be satisfied and this imposes a condition on the \(c\) values selected in step (i). This condition is that \(c_4=1\ .\)

Runge-Kutta pairs

In modern software for the solution of initial value problems, the stepsize is allowed to vary, taking account of estimates of the error produced in each step. To achieve this, it is standard practice to build method pairs, based on the same stages which produce an output answer of order \(p\) and a second approximation of order \(q\ .\) Assuming that \(q>p\ ,\) the difference of these two approximations will give an asymptotically correct estimate of the error in the output value. Because, for small \(h\ ,\) the actual local error is approximately proportional to \(h^{p+1}\ ,\) the stepsize in the following step can be chosen to give a value close to that specified as a user tolerance. As order increases, it becomes increasingly costly, in terms of required stages, to build pairs with \(q>p\) and the value \(q=p-1\) is often chosen. Even though the actual error is not estimated, the difference in the two approximations, which now behaves asymptotically like \(h^{q+1}\) can still be used to give satisfactory control. For examples of method pairs see Fehlberg , Verner and Dormand and Prince.

Implicit Runge-Kutta methods

In principle, it is possible to include non-zero coefficients on and above the diagonal in a tableau. For example, the following method has order 4:

\(1/2-\sqrt{3}/6\)	\(1/4\)	\(1/4-\sqrt{3}/6\)
\(1/2+\sqrt{3}/6\)	\(1/4+\sqrt{3}/6\)	\(1/4\)
	\(1/2\)	\(1/2\)

We need to ask what it means to carry out a computational step using this type of "implicit" method. Extending the formulae from explicit methods, it seems that we need to satisfy a sequence of (non-linear) equations for \(Y_i\) and \(F_i\ ,\) \(i=1,2,\dots,s\ .\) Once these have been solved, the values of the \(F_i\) are substituted into the equation \(y_n = y_{n-1} + h\sum_{i=1}^s b_i F_i\) to evaluate \(y_n\) from an input approximation \(y_{n-1}\ .\) We note that \(c_i=1/2 \mp \sqrt3/6\) are the zeros of the second degree Legendre polynomial on the interval \([0,1]\ .\) This method is actually a generalization, to differential equations, of the 2-point Gauss-Legendre quadrature formula. It is remarkable that a similar implicit Runge-Kutta formula exists with order \(2s\) for every positive integer \(s\ .\)

Other families of implicit Runge-Kutta methods exist and many of these have been the subject of systematic study. In particular, the Radau and Lobatto families of methods sacrifice one or two, in terms of order, but have other advantages. For the so-called DIRK methods, also known as SDIRK or semi-explicit or semi-implicit methods, \(A\) has a lower triangular structure with \(a_{ii}=\gamma\ ,\) \(i=1,2,\dots,s\) where the constant \(\gamma\) is chosen for stability reasons. In cases in which the output value \(y_{n-1} + h\sum_{j=1}^s b_j F_j\) is identical with the final stage \(y_{n-1} + h\sum_{j=1}^s a_{ij} F_j\ ,\) it is possible that \(a_{11}\) is equal to \(0\) rather than to \(\gamma\ ,\) without taking away from the essential nature of a DIRK method. An example of such a method is given by the tableau

**Tableau 1.**
\(0\)	\(0\)	\(0\)	\(0\)	\(0\)
\(2\gamma\)	\(\gamma\)	\(\gamma\)	\(0\)	\(0\)
\(1-2\gamma\)	\(\frac{-20\gamma^2 + 10\gamma-1}{4\gamma}\)	\(\frac{(2\gamma-1)(4\gamma-1)}{4\gamma}\)	\(\gamma\)	\(0\)
\(1\)	\(\frac{24\gamma^3-36\gamma^2+12\gamma-1}{12\gamma(1-2\gamma)}\)	\(\frac{12\gamma^2-6\gamma+1}{12\gamma(1-4\gamma)}\)	\(\frac{6\gamma^2-6\gamma+1}{3(4\gamma-1)(2\gamma-1)}\)	\(\gamma\)
	\(\frac{24\gamma^3-36\gamma^2+12\gamma-1}{12\gamma(1-2\gamma)}\)	\(\frac{12\gamma^2-6\gamma+1}{12\gamma(1-4\gamma)}\)	\(\frac{6\gamma^2-6\gamma+1}{3(4\gamma-1)(2\gamma-1)}\)	\(\gamma\)

where for reasons we will discuss later, \(\lambda\approx 0.4358665215\) is a zero of the polynomial \(6\gamma^3-18\gamma^2+9\gamma-1\ .\)

A final remark about this DIRK method. Even though formally there are 4 stages, the first of these is identical to the last stage in the previous step. Hence, in a large number of steps there are effectively only 3 stages per step.

Stability

Figure 1: Stability regions for some explicit RK methods

For many problems, the quality of local approximations can be nullified in the long term because of unstable behaviour. This is especially true of stiff problems, which are characterised by a rapid change in one or more components of the solution, at least over part of the solution interval. One way of exploring what is required for a method to solve these problems reliably is to see what happens when the simple linear equation \(y' = qy\) is solved over a single time step. The constant \(q\) can be a complex number but we focus our attention more on this quantity scaled in terms of the stepsize; that is the complex number \(z=hq\ .\) For a given Runge-Kutta method, define the stability function as the rational function (a polynomial in the case of an explicit method). given by \(R(z) = 1 +zb^T (I-zA)^{-1} e\ ,\) where \(e\in \mathbb{R}^s\) has every component equal to 1. The stability region is the set in the complex plane such that \(|R(z)|\le 1\ .\) To solve the problem \(y'=qy\ ,\) it is necessary to choose \(h\) small enough to ensure that \(z=hq\) is in the stability region, otherwise the sequence of approximations produced in many steps will be unbounded. We interpret this as unstable behaviour. For explicit method with \(s = p\leq4\ ,\) the stability regions are shown in Figure 1.

In the case of a stiff problem, even though it may not have a linear form, there will be growth factors corresponding to values of \(z\) which may lie anywhere in the left half-plane. Accordingly we are especially interested in "A-stable" methods for which all complex numbers with negative real part lie in the stability region.

The importance of stiff methods lies in the observation that explicit methods are not A-stable but many implicit methods are. For example, all the Gauss methods have this property, as does the method given by Tableau 1, if \(\lambda\) is chosen suitably, such as the recommended value given approximately by 0.4358665215.

Implementation costs

As a general rule, it is best to solve nonstiff problems using explicit methods. This should be expected to achieve acceptable accuracy with minimal costs. However, as problems become increasingly stiff, stability rather than accuracy becomes the dominant consideration, and implicit methods become the more appropriate choice.

The implementation of explicit methods is based on the computation of \( s \) evaluations of the function \( f \) per time step. In comparison, linear multistep methods, in a predictor-corrector formulation, require only one or two \( f \) evaluations per time step and this might make them seem to be more efficient than Runge-Kutta methods of the same order. However, this comparison is misleading because, in many cases, Runge-Kutta can obtain the same accuracy as predictor-corrector pairs but with greater stepsizes. Furthermore, in the case of Runge-Kutta methods, stepsize can be adjusted from step to step in a simple manner, something that is not possible for linear multistep methods. Good error estimators are known for many Runge-Kutta methods, just as for predictor-corrector pairs. Personal preferences are catered for in practice, because computer codes based on explicit Runge-Kutta pairs, and on linear multistep methods in predictor-corrector mode, are available as public domain software.

When stiff problems come into consideration, and implicit Runge-Kutta become appropriate, implementation costs are dominated by the iteration scheme required to solve the implicit stage equations. The iteration scheme is usually performed by a simplified Newton method. That is, an approximation to the Jacobian matrix for the problem is computed at the start of the numerical solution and updated whenever necessary. Denote this Jacobian approximation by \(J\) so that \(J=(\partial f/\partial y)\ ,\) evaluated at suitable values of \(y\ .\) From this, the matrix \(M=I-hA\otimes J\) is computed and LU factors are formed. Let \(Y\) be an approximation to the stage values for the method and let \(F\) be the corresponding vector of approximations to the stage derivatives so that \(Y\) is made up from \(s\) sub-vectors, \(Y_i\ ,\) \(i=1,2,\dots,s\) and \(F\) is made up from the corresponding values of \(f(Y_i)\ .\) To carry out a single simplified Newton iteration, in which \(Y\) is replaced by \(Y-\delta\ ,\) the vector of corrections is defined from \(M\delta = \epsilon\ ,\) where \(\epsilon\) is made up of the subvectors \(\epsilon_i\) where \(\epsilon_i = Y_i - h\sum_{j=1}^s a_{ij} F_j\ .\) This is usually a very expensive process and typically solving for the stage values in this way dominates the computational costs. If \(N\) is the dimension of the problem being solved, so that \(J\) is an \(N\times N\) matrix, the cost of the LU factorizations, in terms of floating-point operations, is a moderately sized number multiplied by \(s^3 N^3\) and a single iteration costs a moderately sized number multiplied by \(s^2 N^2\ .\)

These costs can be significantly reduced in the case of DIRK methods and even in cases where all the eigenvalues of \(A\) are real and equal.

Detailed studies of Runge-Kutta methods are included in (Butcher 1987) and (Butcher 2003), and in (Hairer et al. 1993) and (Hairer & Wanner 1991).

References

Alexander, R., Diagonally implicit Runge-Kutta methods for stiff ODEs, SIAM J. Numer. Anal. 14 (1977) 1006-1021.
Butcher, J. C., Coefficients for the study of Runge-Kutta integration processes, J. Austral. Math. Soc. 3 (1963) 185-201.
Butcher, J. C., The numerical analysis of ordinary differential equations, Runge-Kutta and general linear methods, Wiley, Chichester and New York (1987).
Butcher, J. C., On ﬁfth order Runge-Kutta methods, BIT 35 (1995), 202-209.
Butcher, J. C., Numerical methods for ordinary differential equations, Wiley, Chichester and New York (2003).
Dormand, J. R. and Prince, P. J., A family of embedded Runge-Kutta formulae, J. Comput. Appl. Math. 6 (1980), no. 1, 19-26.
Dormand, J. R. and Prince, P. J., High order embedded Runge-Kutta formulae, J. Comput. Appl. Math. 7 (1981), no.1, 67-75.
Fehlberg, E., Klassische Runge-Kutta-Formeln fünfter and siebenter Ordnung mit Schrittweiten-Kontrolle, Computing (Arch. Elektron. Rechnen) 4 1969 93-106.
Gill, S., A process for the step-by-step integration of differential equations in an automatic computing machine, Proc. Cambridge Philos. Soc. 47 (1951), 96-108.
Hairer, E., Nørsett, S. P. and Wanner G., Solving Ordinary Differential Equations I: Nonstiff Problems, Springer-Verlag, Berlin (1993).
Hairer, E. and Wanner, G., Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin (1991).
Heun, K., Neue Methoden zur approximativen Integration der Differentialgleichungen einer unabhängigen Veränderlichen, Z. Math. Phys., 45 (1900), 23-38.
Kutta, W., Beitrag zur näherungsweisen Integration totaler Differentialgleichungen, Z. Math. Phys. 46 (1901). 435-453.
Merson, R. H., An operational method for the study of integration processes, Proc. Symposium Data Processing, Weapons Research Establishment, Salisbury, S. Australia, (1957).
Runge, C., Über die numerische Auflösung von Differentialgleichungen, Math. Ann. 46 (1895) 167-178.
Verner, J. H., High-order explicit Runge-Kutta pairs with low stage order. Special issue celebrating the centenary of Runge-Kutta methods, Appl. Numer. Math. 22 (1996), no. 1-3, 345-357.

Internal references

Eugene M. Izhikevich (2007) Equilibrium. Scholarpedia, 2(10):2014.
Lawrence F. Shampine and Skip Thompson (2007) Initial value problems. Scholarpedia, 2(3):2861.
Kendall E. Atkinson (2007) Numerical analysis. Scholarpedia, 2(8):3163.
Philip Holmes and Eric T. Shea-Brown (2006) Stability. Scholarpedia, 1(10):1838.

Runge-Kutta Methods

Contents