
Curator: Martin Buhmann

Radial basis functions are a means of approximating multivariable (also called multivariate) functions by linear combinations of terms based on a single univariate function (the radial basis function). This function is radialised so that it can be used in more than one dimension. Radial basis functions are usually applied to approximate functions or data (Powell 1981, Cheney 1966, Davis 1975) that are known only at a finite number of points (or are too difficult to evaluate otherwise), so that evaluations of the approximating function can then take place often and efficiently.

Primarily in computational applications, functions of many variables often need to be approximated by other functions that are better understood or more readily evaluated. This may be for the purpose of displaying them frequently on a computer screen, for instance, so computer graphics are one field of practical use. Radial basis functions are an efficient, frequently used way to do this. Further applications include the important fields of neural networks and learning theory. Since radial basis functions are radially symmetric functions, shifted by points in multidimensional Euclidean space and then linearly combined, they form data-dependent approximation spaces. This data-dependence makes the spaces so formed suitable for providing approximations to large classes of given functions. It also opens the door to existence and uniqueness results for interpolating scattered data by radial basis functions in very general settings (in particular in many dimensions).

Indeed, one of the greatest advantages of this method lies in its applicability in almost any dimension (whence its versatility), because there are generally few restrictions on the way the data are prescribed. The examples below include positive definite kernels, for which there are no restrictions on the data except that the points must be distinct. This should be contrasted with, e.g., multivariable polynomial interpolation (but see de Boor and Ron 1990 for an especially flexible approach) or splines. A further advantage is the high accuracy, or fast convergence to the approximated target function, achieved in many cases as the data become dense.

For applications it is indeed desirable that there be few conditions on the geometry or the directions in which the data points are placed in space. No triangulations of the data points or the like are required for radial basis function algorithms, whereas, for instance, finite element methods (Brenner and Scott 1994, Ciarlet 1978) or multivariate spline methods (de Boor 1993, Lai and Schumaker 2007) normally need triangulations. In fact, the advance structuring of the data that some other approximation schemes depend on can be prohibitively expensive to compute in applications, especially in more than two dimensions. The approximations considered here are therefore meshfree approximations, which can also be used, for instance, to facilitate the numerical solution of partial differential equations (Fasshauer 2007). On the other hand, as will be mentioned below, advanced numerical methods are required to compute the radial basis function approximations when the data set is large, whereas more standard software suffices, for instance, for finite elements.

## Definition of the method

### The given data

The method usually works in $$n$$-dimensional Euclidean space, which we call $$R^n\ ,$$ equipped with the Euclidean norm $$\|\cdot\|\ .$$ Other choices of spaces are possible. There are $$m$$ points in this space at which the function to be approximated is known; call them $$x_1, x_2, \ldots, x_m\ .$$ These points are usually assumed to be all different from each other, since otherwise the problem becomes singular when interpolation is used. Both $$n$$ and $$m$$ are positive integers.

The given values at the points may be called $$f(x_1), f(x_2), \ldots, f(x_m),$$ the idea being that they come from a function $$f: R ^n\to R$$ which is evaluated at the respective points. This function may be completely unknown except at those $$m$$ points or, for example, too difficult or time-consuming to evaluate otherwise. Although this is a useful ansatz, other approaches without an underlying function are possible which allow coalescing points with different values, using the idea of spline smoothing (Wahba 1990).

Functions $$f: R^n\to R^k\ ,$$ where $$k$$ is a positive integer as well, are also admitted in the concept of radial basis functions; the approximation is then carried out componentwise in the $$k$$ components of $$f=(f_1,f_2,\ldots,f_k)\ ,$$ $$f_i:R^n\to R\ ,$$ $$1\leq i\leq k\ .$$ Alternative ideas, generalising the radial basis function approach to matrix-valued kernels, are also available (e.g., Lowitzsch 2005).

### The approximation scheme

Given this information, we create the sought approximant by a sum

$\tag{1} s(x) = \sum_{j=1}^m \lambda_j \phi (\| x-x_j \|),\qquad x\in R^n,$

where

• the mentioned $$x_j$$ are the data points, lying in $$R^n\ ,$$ at which we know $$f\ ,$$
• $$x$$ is a free variable at which we wish to evaluate the approximant later,
• $$\phi$$ is a univariate, normally continuous function $$\phi:R_+\to R\ ,$$ namely the radial basis function (examples are given below),
• $$\|\cdot\|$$ denotes a norm $$\|\cdot\|: R^n\to R\ ,$$ normally the Euclidean norm, although there are more general approaches, and
• the $$\lambda_j$$ are scalar parameters.

In fact, norms other than Euclidean are possible, but rarely chosen, and at any rate the individual terms would then of course no longer be radially symmetric about the $$x_j.$$

The scalar parameters are chosen so that $$s$$ approximates $$f$$ in a useful way at all desired points other than at the $$x_j\ ,$$ where it is known anyway.

Thus, in general, the unknown (or difficult to evaluate) function $$f$$ is approximated by a linear expression, which remains conceptually simple even if $$m$$ or $$n$$ become very large. The flexibility of the approach is also based on the radial symmetry of each term (although not, of course, of the whole expression (1)) since its definition essentially depends only on a univariate function.
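As a minimal illustration of the form (1), the following sketch evaluates $$s$$ at a point of $$R^2$$ using the linear radial basis function $$\phi(r)=r\ ;$$ the centres and coefficients are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical centres x_j in R^2 and coefficients lambda_j (illustration only)
centres = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
lam = np.array([1.0, -0.5, 2.0])

def phi(r):
    return r  # the linear radial basis function phi(r) = r

def s(x):
    """Evaluate s(x) = sum_j lambda_j * phi(||x - x_j||), the form (1)."""
    r = np.linalg.norm(x - centres, axis=1)  # distances ||x - x_j||
    return lam @ phi(r)

value = s(np.array([0.5, 0.5]))
```

Note that each term is radially symmetric about its centre $$x_j\ ,$$ while the sum $$s$$ is not.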

### Choice of interpolation as approximation method

Most often, radial basis function approximations are used in combination with interpolation, i.e. the scalar parameters $$\lambda_j$$ are chosen, if possible, such that $$s$$ matches $$f$$ exactly at the given $$m$$ points $$x_j\ .$$ This is called interpolation and can be defined by the conditions

$\tag{2} s(x_j) = f(x_j),\qquad j=1,2,\ldots,m.$

These conditions, in combination with the form (1), result in a square, $$m\times m$$ linear system of equations in the $$\lambda_j$$ which may or may not be uniquely solvable. Its interpolation matrix is the square symmetric matrix

$\tag{3} A=\Bigl(\phi(\| x_j-x_\ell \|)\Bigr)_{j,\ell=1}^m$

whose non-singularity guarantees the unique existence of the coefficients $$\lambda_j\ .$$ They can then be computed by standard methods for solving linear systems (using matrix decompositions) if $$m$$ is small, or by non-standard methods (see below) if it is large. When the kernel function, in the form of the radial basis function, is strictly positive definite, the interpolation matrix is a positive definite matrix and hence non-singular (positive definite functions were considered, for example, in the classical paper Schoenberg 1938). Positive definite functions, and their generalisation, conditionally positive definite functions (see below), are closely related to reproducing kernel Hilbert spaces (see the literature under further reading).

Indeed, for those radial basis functions which are most often used in practice, the linear system is often hard to solve numerically because of high condition numbers, but specialised software exists to compute the sought coefficients even for very large $$m\ .$$
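To make this concrete, here is a small sketch that assembles the matrix (3) for a Gaussian kernel, solves for the $$\lambda_j$$ by a standard direct method, and checks the interpolation conditions (2); the data points, target values and the parameter are synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 2))                  # m = 20 distinct scattered points in R^2
f = np.sin(X[:, 0]) + np.cos(X[:, 1])    # synthetic data values f(x_j)

c = 5.0                                  # kernel parameter (illustrative choice)
phi = lambda r: np.exp(-(c * r) ** 2)    # Gaussian kernel: strictly positive definite

# Interpolation matrix (3): A[j, l] = phi(||x_j - x_l||)
A = phi(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2))
lam = np.linalg.solve(A, f)              # standard direct solve (small m)

s_at_centres = A @ lam                   # s evaluated at the centres x_j
```

Since the Gaussian is strictly positive definite, $$A$$ is positive definite and the solve succeeds for any distinct centres; for large $$m$$ one would switch to the specialised methods discussed below.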

## Examples of radial basis functions

Clearly, a good choice of the $$\phi$$ is important for the quality of the approximation and for the existence of the interpolants. Many choices guarantee the unique existence of (1) satisfying (2) for all $$m$$ and $$n$$ solely under the condition that the data points are all different (Micchelli 1986). This is the case for

• linear radial basis function $$\phi(r)=r\ ,$$ so long as $$m>1\ ,$$
• (Hardy) multiquadrics radial basis function $$\phi(r)=\sqrt{r^2+c^2}\ ,$$ which contains another scalar parameter $$c$$ which may be adjusted to improve the approximation, where the choice $$c=0$$ gives the previous example,
• Gaussian kernel $$\phi(r)=\exp(-c^2 r^2) \ ,$$ which also contains another scalar parameter $$c\neq0$$ which may be adjusted to adapt the approximation, or finally
• (Hardy) inverse multiquadrics radial basis function $$\phi(r)=1/\sqrt{r^2+c^2}\ ,$$ which contains another scalar parameter $$c\neq0$$ which provides further flexibility.
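These four choices can be written down directly; a short sketch (the default values for $$c$$ are purely illustrative):

```python
import numpy as np

def linear(r):
    return r                               # phi(r) = r

def multiquadric(r, c=1.0):
    return np.sqrt(r**2 + c**2)            # phi(r) = sqrt(r^2 + c^2)

def gaussian(r, c=1.0):
    return np.exp(-(c**2) * r**2)          # phi(r) = exp(-c^2 r^2)

def inverse_multiquadric(r, c=1.0):
    return 1.0 / np.sqrt(r**2 + c**2)      # phi(r) = 1 / sqrt(r^2 + c^2)
```

As noted above, the multiquadric with $$c=0$$ reduces to the linear radial basis function.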

## Further example with linear additional terms

Sometimes, the unique existence of interpolants can be guaranteed with a small variation on the concept by adding low order polynomials to $$s$$ and some mild extra conditions. For example

$s(x) = \sum_{j=1}^m \lambda_j \phi (\| x-x_j \|)+a+b^T x,\qquad x\in R^n,$

where $$a$$ is a scalar and $$b$$ is a vector in $$R^n\ ,$$ will give unique existence of an interpolating $$s$$ using thin-plate splines $$\phi(r)=r^2\log r$$ (with its value at the origin declared to be zero), so long as the $$x_j$$ are not collinear and the extra conditions

$\tag{4} \sum_{j=1}^m\lambda_j=0,\qquad \sum_{j=1}^m\lambda_j x_j=(0,0,\ldots,0)^T,$

are fulfilled. Here, $$(0,0,\ldots,0)^T$$ denotes the zero vector in $$n$$ dimensions. Thus, under these conditions, the scalars $$\lambda_j, a$$ and the vector $$b$$ can be solved for uniquely.

These extra conditions (4) take up the new degrees of freedom that come in with $$a$$ and $$b\ .$$ Many further examples of radial basis functions $$\phi$$ exist which are equally useful and sometimes require that polynomials of degree greater than one be added, with suitable extra conditions similar to (4) imposed. The polynomials are normally not of very high degree, however; constant to cubic is typical. The side conditions (4) have to be adjusted to different order conditions when polynomials of degree other than one are used. The geometric conditions (the centres not being collinear in the case of (4)) also have to be strengthened accordingly (Duchon 1976). Such kernels are no longer positive definite, as mentioned above, but conditionally positive definite, in keeping with the aforementioned side conditions.
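The interpolation conditions (2) and the side conditions (4) couple into one symmetric block linear system for $$\lambda_1,\ldots,\lambda_m, a, b\ .$$ A sketch for thin-plate splines in $$R^2$$ (the sample points and values are synthetic, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((15, 2))                       # m = 15 points in R^2 (not collinear)
f = X[:, 0] ** 2 + X[:, 1]                    # synthetic data values

def tps(r):
    # thin-plate spline r^2 log r, with its value at r = 0 set to zero
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r**2 * np.log(r), 0.0)

m = len(X)
A = tps(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2))
P = np.hstack([np.ones((m, 1)), X])           # columns belonging to a and b

# Block system: rows 1..m are the interpolation conditions (2),
# the last n + 1 rows are the side conditions (4).
M = np.block([[A, P], [P.T, np.zeros((3, 3))]])
rhs = np.concatenate([f, np.zeros(3)])
sol = np.linalg.solve(M, rhs)
lam, a, b = sol[:m], sol[m], sol[m + 1:]
```

The solve recovers the $$\lambda_j$$ satisfying (4) together with the polynomial coefficients $$a$$ and $$b\ .$$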

## Properties of the method

### Convergence

In spite of the simplicity of the idea, even in high dimensions, good convergence properties are observed when the $$x_j$$ become dense, for example in compact subsets of the space $$R^n\ .$$ In particular, Duchon studied the thin-plate splines and related radial basis functions when the scattered data points become dense; other examples of radial basis functions included in his theory are the pseudo-cubics $$\phi(r)=r^3$$ and $$\phi(r)=r^4\log r$$ (Duchon 1976, Powell 1994). Analogous results hold for multiquadrics (Madych and Nelson 1992).

For the convergence analysis, one sometimes assumes that the data points lie on an equispaced grid in $$R^n\ ,$$ so that infinitely many data are given. Denoting the spacing by $$h\ ,$$ one then lets $$h\to0\ .$$ In cases that include most of the radial basis functions mentioned above, the uniform difference between $$s$$ and $$f$$ (the error) goes to zero at the same rate as some power of $$h$$ (Buhmann 2003, Wendland 2005). One also looks specifically for the best possible powers (the saturation orders) attained when $$f$$ satisfies suitable conditions (Johnson 2000).

An example of a uniform convergence result in $$n$$ dimensions states that multiquadric interpolation on an infinite uniform grid of spacing $$h$$ provides convergence of $$O(h^{n+1})$$ to sufficiently smooth $$f$$ with certain partial derivatives bounded, i.e., the uniform error is bounded above by a fixed multiple of $$h^{n+1}\ .$$ It should be noted here that the exponent of $$h$$ increases with the dimension.
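This convergence behaviour can be observed numerically. The following sketch is a one-dimensional experiment, with an illustrative target function and parameter choice (and a finite interval, so boundary effects modify the exact rate; it demonstrates decrease of the error, not a proof of any order): it interpolates at equispaced points with multiquadrics and records the uniform error as the spacing is halved.

```python
import numpy as np

def mq_error(m, c=0.5):
    """Interpolate f(x) = sin(2*pi*x) at m equispaced points on [0, 1]
    with multiquadrics and return the maximum error on a fine grid."""
    x = np.linspace(0.0, 1.0, m)
    f = np.sin(2 * np.pi * x)
    phi = lambda r: np.sqrt(r**2 + c**2)
    A = phi(np.abs(x[:, None] - x[None, :]))     # interpolation matrix (3) in 1D
    lam = np.linalg.solve(A, f)
    t = np.linspace(0.0, 1.0, 1001)              # fine evaluation grid
    s = phi(np.abs(t[:, None] - x[None, :])) @ lam
    return np.max(np.abs(s - np.sin(2 * np.pi * t)))

errors = [mq_error(m) for m in (5, 9, 17)]       # spacings h, h/2, h/4
```

The recorded uniform errors decrease as the spacing shrinks, in line with the theory above.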

More general convergence theory is given, for instance, in (Wu and Schaback 1993; Narcowich, Ward and Wendland 2005).

### Computation of the approximations

An unwelcome aspect appears when the linear systems are solved for large $$m$$ since for most radial basis functions the matrix is full and ill-conditioned (Narcowich and Ward 1991). An exception is provided by the radial basis functions of compact support described below.

A typical case is the multiquadric function, where the interpolation matrix has spectral properties which depend both on the parameter and on the distances between the data points. These can lead to large condition numbers, and of course the matrix is not sparse.

Preconditioning and iterative methods are therefore applied (for an early approach see Dyn and Levin 1983). The case when multiquadrics are used is very important since they are the most often used in applications; other important choices are the aforementioned thin-plate splines and exponential functions. One class of particularly successful methods for computing interpolants with many centres are Krylov space methods (Powell 1994); others include particle methods and far-field expansions (Beatson et al. 1998, see also the article of Beatson and Greengard in Levesley et al. 1997). General preconditioning methods are also useful, especially if $$m$$ is not too large (Fasshauer 1999).
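As a hedged illustration of the Krylov-space idea, here is a bare-bones conjugate gradient iteration, applicable in this instance because the Gaussian kernel below yields a positive definite matrix; the centres, data and parameter are all illustrative, and practical codes add preconditioning and fast matrix-vector products:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    """Basic conjugate gradients for a symmetric positive definite system A x = b."""
    x = np.zeros_like(b)
    r = b - A @ x               # initial residual
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# A positive definite Gaussian kernel system on a small grid of centres
g = np.linspace(0.0, 1.0, 6)
X = np.array([[u, v] for u in g for v in g])     # 36 gridded centres in R^2
A = np.exp(-16.0 * np.linalg.norm(X[:, None] - X[None, :], axis=2) ** 2)
f = np.sin(X[:, 0] + X[:, 1])
lam = conjugate_gradient(A, f)
```

Each iteration here costs a dense $$O(m^2)$$ matrix-vector product; the far-field expansions mentioned above reduce exactly this cost for large $$m\ .$$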

Other approaches which avoid the difficulty of ill-conditioned interpolation matrices include the idea of quasi-interpolation (e.g. see Buhmann 2003 for a number of useful examples of quasi-interpolation), or the aforementioned spline smoothing.
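One version of the spline-smoothing idea replaces exact interpolation by a penalised fit, which (for a positive definite kernel) leads to the regularised system $$(A+\mu I)\lambda=f$$ with a smoothing parameter $$\mu>0\ .$$ A sketch of its effect on the conditioning, with all values illustrative:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 30)[:, None]
R = np.abs(x - x.T)                        # pairwise distances of 30 points on a line
A = np.exp(-R**2)                          # a "flat", badly conditioned Gaussian matrix

mu = 1e-6                                  # illustrative smoothing parameter
cond_interpolation = np.linalg.cond(A)
cond_smoothing = np.linalg.cond(A + mu * np.eye(len(x)))
```

The price is that $$s$$ no longer matches the data exactly: $$\mu$$ trades fidelity against stability (Wahba 1990).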

### Aspects of the parameters in multiquadrics and exponentials

In applications, the parameters $$c$$ in multiquadrics, inverse multiquadrics and exponentials play an important role. The aforementioned ill-conditioning problems become very severe if the parameters go to certain limits (e.g., $$c\to\infty$$ in (inverse) multiquadrics, or $$c\to0$$ in Gaussian kernels, where the entries of the matrix asymptotically become constant). Nonetheless, methods have been developed to handle this situation, and it has even been observed that sometimes the whole approximants $$s$$ tend to polynomial limits in those cases (Larsson and Fornberg 2005).
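The dependence of the conditioning on the parameter can be observed directly. Here is a sketch for the Gaussian kernel on a few equispaced points (the values of $$c$$ are purely illustrative):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 6)[:, None]
R = np.abs(x - x.T)                        # pairwise distances of the points

def cond_gaussian(c):
    """Condition number of the Gaussian interpolation matrix for parameter c."""
    return np.linalg.cond(np.exp(-(c * R) ** 2))

# As c -> 0 all matrix entries tend to 1 (a constant, rank-one limit),
# so the condition number grows without bound.
conds = {c: cond_gaussian(c) for c in (5.0, 2.0, 0.5)}
```

The smaller (flatter) the parameter, the worse the conditioning, matching the limit behaviour described above.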

## Compactly supported radial basis functions

Compactly supported radial basis functions were invented for the purpose of obtaining finite-element type approximations (Brenner and Scott 1994). They give rise to sparse interpolation matrices and can be used to solve partial differential equations numerically (Fasshauer 1999). Some of them are piecewise polynomial as one-dimensional functions $$\phi$$ (usually with only two pieces; see Wendland 1995, where useful lists of examples are provided together with the theory). Under suitable conditions on the degree and the dimension $$n\ ,$$ they give rise to positive definite interpolation matrices $$A$$ that are banded, therefore sparse, and then of course also regular (for further choices see Buhmann 2001).

Somewhat less flexibility stems from restrictions on $$n\ ,$$ which may no longer be arbitrarily large, but there are still no upper bounds on $$m\ .$$ Also, the positive definiteness of the interpolation matrices (as with Gaussian kernels and inverse multiquadrics) makes these radial basis functions useful in statistics.
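As a hedged sketch, one of Wendland's functions, $$\phi(r)=(1-r)_+^4(4r+1)\ ,$$ which is positive definite on $$R^n$$ for $$n\le3$$ and vanishes for $$r\ge1\ ,$$ illustrates the sparsity (the centres and scaling are chosen only for illustration):

```python
import numpy as np

def wendland(r):
    """Wendland's compactly supported function (1 - r)_+^4 (4r + 1);
    positive definite on R^n for n <= 3, and identically zero for r >= 1."""
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

rng = np.random.default_rng(4)
X = 5.0 * rng.random((200, 3))               # 200 centres spread over [0, 5]^3
R = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
A = wendland(R)                               # interpolation matrix: mostly zeros

zero_fraction = np.mean(A == 0.0)             # how sparse the matrix is
```

Sparse direct or iterative solvers can then handle much larger $$m$$ than dense methods would.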

## Applications

Applications are manifold; as mentioned above, they include computer graphics, neural networks and learning theory, statistics, and the meshfree numerical solution of partial differential equations.

## Further reading

• Buhmann, Martin (2003). Radial basis functions: theory and implementations. Cambridge University Press, Cambridge. ISBN 978-0-521-10133-2.
• Fasshauer, Greg (2007). Meshfree approximation methods with Matlab. World Scientific, Singapore.
• Powell, M J D (1981). Approximation theory and methods. Cambridge University Press, Cambridge.
• Wendland, H (2005). Scattered data approximation. Cambridge University Press, Cambridge.
