Talk:Dynamic causal modeling
I will make a list of suggested changes to improve the clarity of the page as I go along.
2. In the motivation, could a link be placed to an online description of martingales? The reader (particularly one with an experimental neuroscience background, but even those of a more theoretical bent!) may well be unfamiliar with, or welcome a reminder of, the theory of random processes.
3. Could something be done to improve Figures 1 and 2? For example, in Figure 2, the inclusion of the equations of motion relating to the BOLD signal change does not, to my mind, aid understanding. I think if these figures could be changed to purely schematic representations of the hierarchical structure of DCM, this would definitely help. More generally, it would be good to keep all of the technical details while at the same time providing a 'glossy' overview, as this would greatly broaden the potential readership beyond those with a thorough understanding of dynamical systems theory.
4. DCM for fMRI - could the authors consider adapting the introduction to explain why a bilinear model is appropriate, or 'the simplest' model to use, in this setting? Could they perhaps relate the choice of model to assumptions regarding the underlying neural system?
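For readers encountering the term, the bilinear form in question is the standard DCM neural state equation dx/dt = (A + Σ_j u_j B^(j)) x + C u, where A encodes fixed coupling between regions, each B^(j) encodes how input j modulates that coupling, and C encodes direct driving inputs. A minimal sketch of its dynamics (the matrix values are hypothetical and chosen purely for illustration; the haemodynamic forward model is omitted):

```python
import numpy as np

# Bilinear DCM state equation: dx/dt = (A + sum_j u_j * B_j) x + C u
# A: fixed (endogenous) connectivity; B_j: modulation of coupling by
# input j; C: direct driving influence of inputs on regions.
# All values below are hypothetical, for illustration only.
A = np.array([[-1.0, 0.2],
              [0.4, -1.0]])           # intrinsic coupling between 2 regions
B = np.array([[[0.0, 0.0],
               [0.5, 0.0]]])          # input 1 modulates the 1 -> 2 connection
C = np.array([[1.0],
              [0.0]])                 # input 1 drives region 1 directly

def dxdt(x, u):
    """Right-hand side of the bilinear neural state equation."""
    J = A + sum(u[j] * B[j] for j in range(len(u)))
    return J @ x + C @ u

# Forward Euler integration of the neural states
dt, n_steps = 0.01, 500
x = np.zeros(2)
u = np.array([1.0])                   # a constant input, for simplicity
for _ in range(n_steps):
    x = x + dt * dxdt(x, u)
```

With the input switched on, activity driven into region 1 propagates to region 2 through the modulated connection; switching `u` off recovers the purely linear dynamics governed by A alone.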
5. DCM for evoked responses - the underlying model is now a neural mass model. It would definitely help the exposition to describe why such a model is necessary for evoked responses, whereas it is not for fMRI. Perhaps an additional introductory section (or an extension of the motivation) before all of these, in which the authors explain how model choice and data type are related, would be good. As with Figures 1 and 2, Figure 3 would be much better as a schematic rather than a set of detailed equations.
6. Model evidence and selection - could the authors expand a little on how the quantities AIC, BIC and F are obtained? These are again terms one is perhaps expected to know instantly, but an explanation at this point would be very welcome. For example, in the description of the EM method, F is a function depending on q, \lambda and m - could the authors explain how these dependencies arise, in a manner that is accessible to readers from a variety of backgrounds?
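As one possible starting point for such an explanation: in their common statistical form (the DCM literature sometimes uses an equivalent log-evidence-approximation convention with opposite sign), AIC and BIC are simple functions of a fitted model's maximized log-likelihood, number of parameters, and number of observations. A sketch with hypothetical numbers:

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2k (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 ln L + k ln N (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical numbers, purely for illustration: two models fitted to
# N = 100 scans; model 2 fits slightly better but uses more parameters.
aic1, bic1 = aic(-120.0, 5), bic(-120.0, 5, 100)
aic2, bic2 = aic(-118.0, 9), bic(-118.0, 9, 100)
# BIC penalizes the four extra parameters more heavily than AIC does
# (k * ln 100 grows faster than 2k), so both criteria here favour model 1.
```

The free energy F plays an analogous role but bounds the log marginal likelihood from below, with the complexity penalty arising from the divergence between the approximate posterior q and the prior rather than from a fixed parameter count.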
7. Model evidence and selection - the authors reference the work of Raftery on interpreting Bayes factors as weak, strong, etc. Could they expand on why these particular values carry the meaning associated with them? It would be nice to keep the article self-contained where possible, and a work from the mid-1990s may be difficult to track down. For example, the authors mention that the Bayes factor is essentially the exponential of the difference in log evidences. Could they then provide a simple example of how a value of <3 corresponds to weak evidence, versus one in excess of 150 for strong?
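Such an example could be very short. Since BF_12 = p(y|m1)/p(y|m2) = exp(ln p(y|m1) − ln p(y|m2)), the thresholds translate directly into log-evidence differences: a difference of 1 gives BF = e ≈ 2.7 (weak, below 3), while a difference just above 5 exceeds 150 (very strong, since ln 150 ≈ 5.01). A sketch with hypothetical evidences:

```python
import math

def bayes_factor(log_evidence_1, log_evidence_2):
    """BF_12 = exp(difference of the two log model evidences)."""
    return math.exp(log_evidence_1 - log_evidence_2)

# Hypothetical log evidences, for illustration only.
weak = bayes_factor(-100.0, -101.0)    # difference of 1   -> BF ~ 2.7
strong = bayes_factor(-100.0, -105.1)  # difference of 5.1 -> BF ~ 164
```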
8. The application sections are a nice touch, relating nicely to the earlier theory. Could the authors, though, link the evidence for one model over the others more directly in the text to the earlier interpretation of Bayes factors?
In summary, I think this will provide a very readable introduction to DCM for fMRI, and I would hope that once live, it would be regularly extended and updated.
Overall opinion:
DCM provides a method for analyzing brain function using a unified model composed of two sub-models. The method is an important step towards a more sophisticated level of modeling of brain function than is usual in conventional methods.
The present reviewer thinks this is a very important contribution to the study of brain dynamics. Certainly the article is well qualified to be included in Scholarpedia.
Although many achievements are demonstrated in the modeling examples given, I do find a few comparatively minor points which I think would be worth the authors' consideration. Especially if the article is to remain valid for the next few decades, improving these points will surely be helpful to future readers.
The present article consists of two parts. One concerns "dynamic modeling of brain functions using unobserved latent variables, as well as observable variables", and the other concerns "the estimation of a model from partially observed time series data".
My comments for the improvement of the article mainly concern the estimation part.
The points are:
E1) Apart from the Bayesian method, another method exists for the estimation of DCM-type models: the classic method, called the prediction-error method (a typical example is seen in Valdes, P. et al. (1999), where the estimation of the Zetterberg model for neural mass dynamics is discussed using the classic approach).
Since space is obviously limited, it is not necessary for the authors to discuss and compare both methods in the present article, but I think they could at least mention this classic approach as a possible alternative for the estimation of the coupling model (DCM).
This is because the Bayesian approach is known to be computationally inferior and less accurate in estimation than the classic method, at least in simpler linear examples (Shumway and Stoffer, 1982). In future, this point (the difference between the two methods) could become a very critical issue as DCM becomes a more common method for the modeling of dynamic brain functions. (Note that, unlike the whitening-based classic method, the Bayesian method does not explicitly seek temporal independence of the prediction errors obtainable from the dynamic model. Maximization of the log-likelihood of a dynamic model is essentially equivalent to the minimization of its prediction errors. The variance of temporally dependent prediction errors can always be reduced by whitening them into temporally independent errors, and the log-likelihood could theoretically become infinite as the variance of the prediction errors approaches zero.)
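The point about whitening can be illustrated numerically with a toy example (not tied to any DCM implementation): for a stationary AR(1) process, raw deviations from the mean have variance σ²/(1 − φ²), whereas the whitened one-step prediction errors (the innovations of the dynamic model) have the smaller variance σ².

```python
import numpy as np

rng = np.random.default_rng(0)
phi, sigma, n = 0.8, 1.0, 200_000

# Simulate a stationary AR(1) process: x_t = phi * x_{t-1} + e_t
e = rng.normal(0.0, sigma, n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

# "Static" prediction errors: deviations from the mean (temporally dependent)
static_var = np.var(x)                    # ~ sigma^2 / (1 - phi^2) = 2.78
# Whitened prediction errors: one-step innovations of the dynamic model
innov_var = np.var(x[1:] - phi * x[:-1])  # ~ sigma^2 = 1.0
```

The whitened errors have markedly smaller variance, which is the sense in which a likelihood built on temporally independent prediction errors rewards the dynamic model.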
E2) In the section concerning model evidence and selection, the estimation method for the Bayesian dynamic coupling model is not well presented.
For example:
E2.1) The maximization of the marginal likelihood should be emphasized before the EM algorithm. EM is just one of the computational methods for maximizing the marginal likelihood.
The maximum marginal likelihood method is known in statistics as the type-II likelihood method (I.J. Good (1965)). Even Dempster, the originator of the EM algorithm, could not completely ignore the important original contribution of I.J. Good to Bayesian model estimation and the EM algorithm. It would be more suitable if the authors referred to the type-II likelihood method in the introduction, especially for future readers of the article.
E2.2) As is mentioned in the article, the method for selecting between several possible Bayesian models is one of the important factors when users start applying the method to the modeling and analysis of real data from various experiments.
The authors briefly explain this point by referring to the work of Raftery (1995). However, a more relevant article may be Akaike (1983)'s article on ABIC (ABIC is different from AIC), which is a generalized version of I.J. Good's type-II likelihood method. Using ABIC, several different Bayesian models (as well as non-Bayesian models) can be compared on the same log basis, as approximations to the single true model. Incidentally, the metric employed in ABIC (as well as in AIC) for measuring the deviation of the approximate models from the true model is the negative of the generalized Boltzmann entropy (see Akaike, 1983), which is equivalent to the Kullback-Leibler divergence employed by the authors for DCM.
E2.3) At the beginning of page 4, the authors briefly discuss AIC, BIC and the free energy method in relation to the computation of the marginal likelihood. This part is not very well written and is a little confusing. AIC and BIC are criteria for determining the model orders of non-Bayesian models, such as autoregressive models. It would be more relevant and suitable to compare the free energy method with model selection criteria for Bayesian models, such as the type-II likelihood method and the ABIC method, rather than with AIC and BIC.
E3) As the authors mention in the article, the search for the best model is very important. One of the points missing in the present article is the diagnostic checking of the model. Statistical modeling is an endless process of updating and improvement, in which diagnostic checking plays an important role in the further refinement of the model. Box (1980) may be a useful reference for interested readers of the present article.
I suggest the inclusion of "time series" before "data" at the beginning of the article. I come from a different background and had no previous contact with the data; this simple statement (found in Lohmann et al., 2012) made me understand how the data were acquired and the models fitted.
(1) Akaike, H. (1983), "Prediction and Entropy", in "A Celebration of Statistics", ed. A.C. Atkinson and S.E. Fienberg, New York, Springer, 1-24.
(2) Box, G.E.P. (1980), "Sampling and Bayes' Inference in Scientific Modelling and Robustness", J. Roy. Stat. Soc., Series A, Vol. 143, 383-430.
(3) Good, I.J.(1965), “The Estimation of Probabilities: An Essay on Modern Bayesian Methods”, Cambridge, Mass, MIT Press.
(4) Kass, R.E. and Raftery, A.E. (1995), "Bayes Factors", J. Amer. Statist. Assoc., Vol. 90, 773-795.
(5) Shumway, R.H. and Stoffer, D.S. (1982), "An Approach to Time Series Smoothing and Forecasting using the EM Algorithm", J. of Time Series Analysis, Vol. 3, 253-264.
(6) Valdes, P.A., Jimenez, J.C., Riera, J., Biscay, R. and Ozaki, T. (1999), "Nonlinear EEG analysis based on a neural mass model", Biol. Cybern., Vol. 81, 415-424.
(7) Lohmann, G. et al. (2012), "Critical comments on dynamic causal modelling", NeuroImage, Vol. 59, Issue 3, 2322-2329.