Second order efficiency

From Scholarpedia
Calyampudi Radhakrishna Rao (2009), Scholarpedia, 4(3):7084. doi:10.4249/scholarpedia.7084 revision #91751
(Redirected from Fisher-Rao theorem)

The Fisher–Rao Theorem provides an asymptotic bound on the loss of information incurred in replacing the sample by an estimator of the unknown parameters. The Rao Theorem provides a lower bound on the asymptotic variance of an estimator up to terms of \(O(1/n^2)\ .\)



Denote by \(L(X,\theta)\) the likelihood based on a sample of \(n\) independent observations \(x_1,\cdots,x_n\ ,\) which we represent by \(X\ .\) Further, let

\[ Z(\theta) = \frac{\partial \log L}{\partial \theta}, \qquad ni(\theta) = V \left[Z(\theta) \right] \]

where \(ni(\theta)\) is the Fisher information in the sample. Let \(T\) be an estimator of \(\theta\) and \(M(T,\theta)\) be the likelihood based on \(T\ .\) Define:

\[ ni_T = V \left[ \frac{\partial \log M}{\partial \theta} \right] \]

as the information contained in \(T\ .\) Fisher (1925) considered

\[ E^\prime = \lim_{n \to \infty} n(i-i_T) \]

as a measure of the efficiency of \(T\) and showed that the maximum likelihood estimator has the minimum value of \(E^\prime\) in the estimation of a parameter in a \(k\)–category multinomial distribution, with cell probabilities as functions of \(\theta\ .\)
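As a concrete illustration of the quantity \(i\) in this multinomial setting, the information per observation for cell probabilities \(\pi_j(\theta)\) is \(\sum_j (\pi^\prime_j)^2/\pi_j\ .\) The following is a minimal Python sketch and not part of the original article; the four-cell linkage family and the names `fisher_information` and `linkage` are assumptions chosen only for illustration.

```python
def fisher_information(p, dp):
    """Per-observation Fisher information: sum_j p'_j**2 / p_j."""
    return sum(d * d / pj for pj, d in zip(p, dp))


def linkage(theta):
    """Hypothetical example family: a four-cell genetic linkage model.

    Returns the cell probabilities and their first derivatives in theta.
    """
    p = [(2 + theta) / 4, (1 - theta) / 4, (1 - theta) / 4, theta / 4]
    dp = [0.25, -0.25, -0.25, 0.25]
    return p, dp


theta, n = 0.6, 100  # illustrative parameter value and sample size
p, dp = linkage(theta)
i = fisher_information(p, dp)
print("i(theta) =", i)
print("n * i(theta) =", n * i)  # information in a sample of size n
```

Because the cell probabilities sum to one for every \(\theta\ ,\) their derivatives sum to zero, which is a quick sanity check on any family supplied to such a function.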

Rao (1961) defined first order efficiency of an estimator \(T\) as the property

\[ \left| n^{-1/2} Z(\theta) - \alpha - \beta n^{1/2} (T-\theta) \right| \to 0 \]

in probability as \(n \to \infty\ ,\) for an appropriate choice of \(\alpha\) and \(\beta\ ,\) which implies, as Doob (1934) showed, that \(i_T \to i\) as \(n \to \infty\ .\)

Rao (1961) defined the second order efficiency as

\[ E = \min_{\lambda} V_a \left[ Z(\theta) - n^{1/2}\alpha - n\beta(T-\theta) -n\lambda(T-\theta)^2 \right] \]

where \(V_a\) stands for asymptotic variance.

The two concepts of Fisher (1925) and of Rao (1961) are similar, but in particular cases \(E\) and \(E^\prime\) may not be the same, as pointed out by Efron (1975). However, Fisher (1925) reported \(E\) as his computation of \(E^\prime\ .\) In the case of a multinomial distribution with class probabilities \(\pi_1(\theta),\pi_2(\theta),\cdots,\pi_k(\theta)\ ,\) \(E\) has the lower bound obtained by Fisher and Rao

\[\tag{1} \frac{\mu_{02}-2\mu_{21}+\mu_{40}}{i} - i - \frac{\mu_{11}^2 + \mu_{30}^2 - 2\mu_{11}\mu_{30}}{i^2} \]


where

\[ \mu_{rs} = \sum_{j=1}^{k} \pi_j \left( \frac{\pi^\prime_j}{\pi_j} \right)^r\left( \frac{\pi^{\prime\prime}_j}{\pi_j} \right)^s \]

The bound (1) is attained by the maximum likelihood estimator. Efron (1975) called result (1) the Fisher–Rao Theorem. He extended the computations to the exponential family and identified the expression \(E\) as the curvature of the family of distributions at \(\theta\ .\) In another paper, Rao (1961) obtained the expansion of the asymptotic variance of a consistent estimator, corrected for bias of \(O(1/n)\ ,\) up to terms of \(O(1/n^2)\) as

\[\tag{2} \frac{1}{ni} + \frac{\phi}{n^2} + o(1/n^2) \]

and showed that the minimum value of \(\phi\) is

\[\tag{3} \frac{E}{i^2} + \frac{\mu_{11}^2}{2i^4} \]

which is attained by the maximum likelihood estimator (MLE). Ghosh and Subramanyam (1974) called results (2) and (3) the Rao Theorem. They clarified the computations of \(E^\prime\) and \(E\) and extended the results to the exponential family of distributions.
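Expressions (1) to (3) can be evaluated numerically for any multinomial family whose cell probabilities and their first two derivatives are available. The Python sketch below is an illustration only and not part of the original article. It assumes that \(i\) is the information per observation (equal to \(\mu_{20}\)) and that \(E\) in (3) is the bound (1); the two example families (a Hardy–Weinberg trinomial and a four-cell linkage model) and all function names are hypothetical choices.

```python
def mu(r, s, p, dp, d2p):
    """mu_rs = sum_j p_j * (p'_j / p_j)**r * (p''_j / p_j)**s."""
    return sum(pj * (d1 / pj) ** r * (d2 / pj) ** s
               for pj, d1, d2 in zip(p, dp, d2p))


def fisher_rao_bound(p, dp, d2p):
    """Expression (1), with i taken to be mu_20."""
    m = lambda r, s: mu(r, s, p, dp, d2p)
    i = m(2, 0)
    return ((m(0, 2) - 2 * m(2, 1) + m(4, 0)) / i - i
            - (m(1, 1) ** 2 + m(3, 0) ** 2 - 2 * m(1, 1) * m(3, 0)) / i ** 2)


def rao_phi_min(p, dp, d2p):
    """Expression (3), assuming E there is the bound (1)."""
    i = mu(2, 0, p, dp, d2p)
    E = fisher_rao_bound(p, dp, d2p)
    return E / i ** 2 + mu(1, 1, p, dp, d2p) ** 2 / (2 * i ** 4)


def hardy_weinberg(t):
    """Hypothetical example: cells t^2, 2t(1-t), (1-t)^2 with derivatives."""
    p = [t * t, 2 * t * (1 - t), (1 - t) ** 2]
    dp = [2 * t, 2 - 4 * t, -2 * (1 - t)]
    d2p = [2.0, -4.0, 2.0]
    return p, dp, d2p


def linkage(t):
    """Hypothetical example: four-cell linkage model, linear in t."""
    p = [(2 + t) / 4, (1 - t) / 4, (1 - t) / 4, t / 4]
    dp = [0.25, -0.25, -0.25, 0.25]
    d2p = [0.0, 0.0, 0.0, 0.0]
    return p, dp, d2p


theta, n = 0.3, 200  # illustrative parameter value and sample size
results = {}
for name, family in [("hardy_weinberg", hardy_weinberg), ("linkage", linkage)]:
    p, dp, d2p = family(theta)
    i = mu(2, 0, p, dp, d2p)
    E = fisher_rao_bound(p, dp, d2p)
    phi = rao_phi_min(p, dp, d2p)
    # Expansion (2) with the remainder term dropped.
    variance = 1 / (n * i) + phi / n ** 2
    results[name] = (i, E, phi, variance)
    print(name, "i =", i, "E =", E, "phi =", phi, "variance =", variance)
```

The Hardy–Weinberg family is a binomial with index 2 in disguise, hence a one-parameter exponential family, so both the bound (1) and the minimum \(\phi\) come out as zero up to rounding error. The linkage family is not of that form, and the bound is strictly positive.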


After Fisher (1922) introduced maximum likelihood as a general method of estimating unknown parameters, asserting that it provides estimators which are consistent and have least asymptotic variance, several papers appeared questioning Fisher’s claims. Examples were given of other methods of estimation which yield estimators with the same or better properties. This motivated the author to make a deeper investigation of the properties of estimators and methods of estimation. In a series of papers, Rao (1960, 1961, 1962, 1963) introduced the concepts of Fisher consistency (which places a restriction on the estimating function), first order efficiency, and correction for bias up to \(O(1/n)\ .\) These concepts bring out maximum likelihood estimates as having better properties than those obtained by other proposed methods.


Second Order Efficiency (SOE) provides an effective measure for choosing an estimator that gives the best possible summary of the data for drawing inference. Berkson (1955) claimed that the minimum logit chi-square estimator performs better than the maximum likelihood (ML) estimator. Ghosh and Subramanyam (1974) showed that the ML estimator corrected for bias has better performance in terms of SOE.


References

  • Berkson, J. (1955), J. Am. Statist. Assoc., 50, 130-136.
  • Clarke, B. and Ghosal, S. (2008), IMS Collections, 3, 1-18.
  • Efron, B. (1975), Ann. Statist., 3, 1189-1242.
  • Fisher, R.A. (1922), Phil. Trans. Roy. Soc. London, Series A, 222, 309-368.
  • Fisher, R.A. (1925), Proc. Camb. Phil. Soc., 22, 700-724.
  • Ghosh, J. and Subramanyam, K. (1974), Sankhya, A, 36, 325-358.
  • Ghosh, J. and Sinha, B.K. (1982), Calcutta Statist. Assoc. Bull., 31, 151-158.
  • Ghosh, J.K., Sinha, B.K. and Wieand (1980), Annals of Statistics,
  • Ghosh, J.K., Sinha, B.K. and Joshi (1982), Proc. Third Purdue Symposium,
  • Ghosh, J.K. (1994), Higher Order Asymptotics, IMS monograph, 4.
  • Rao, C.R. (1945), Bull. Calcutta Math. Soc., 37, 81-91.
  • Rao, C.R. (1960), Proc. 32nd Session of ISI.
  • Rao, C.R. (1961), Sankhya, 24, 73-102.
  • Rao, C.R. (1961), Proc. 4th Berkeley Symposium, 1, 531-546.
  • Rao, C.R. (1962), J. Roy. Statist. Soc. B, 24, 46-63.
  • Rao, C.R. (1963), Sankhya, 25, 189-206.

Further reading

There is an extensive literature arising out of the work of Fisher and Rao on SOE. The main contributors are Efron (1975), a number of authors who contributed to the discussion of Efron’s paper, and Ghosh with a number of collaborators listed in Clarke and Ghosal (2008). These papers raise a number of questions, some of which are not yet resolved. Ghosh and Sinha (1982) showed that the MLE does not have third order efficiency. See Rao (1945) for some basic results on estimation.

Rao's theorem has been extended well beyond curved exponential families, using Bayesian methods. Ghosh and Subramanyam (1974) show that Bayes estimates can be approximated up to second order by a function of the MLE alone; the derivatives at the MLE are not required. (Such results do not hold for Bayes tests.) It is conjectured there that this can be used to prove Rao's Theorem under general regularity conditions and for loss functions more general than squared error. This program is implemented in Ghosh, Sinha and Wieand (1980). A slightly different proof is offered in Ghosh, Sinha and Joshi (1982). Essentially the same result was obtained by Takeuchi and Akahira, and by Bickel, Götze and van Zwet. References to all the above papers are available in the monograph by Ghosh (1994) on higher order asymptotics.

It may be noted that Rao's second order efficiency is usually called third order efficiency by other authors. If one considers the asymptotic expansion of the expected squared error loss of an estimate up to \(O(1/n^2)\ ,\) one gets two terms in powers of \(1/n\ ,\) hence the term second order. If one approaches the problem through Edgeworth expansions, one has three terms in powers of \(1/\sqrt{n}\ ,\) hence third order. Second order efficiency in the Edgeworth sense is therefore different from Rao's.
