A comparison of hypothesis testing methods for ODE models of biochemical systems
In this talk we compare methods for testing alternative hypotheses expressed as ODE models of biochemical systems. We investigate the applicability, limitations and stability of a range of hypothesis testing methods: maximum likelihood based information criteria, local deterministic approximations around maximum a posteriori estimates (Laplace approximations) for computing marginal likelihoods, importance sampling based marginal likelihood estimators, and a path sampling estimator built on the principles of thermodynamic integration. We demonstrate that when models are linear in their parameters, Laplace approximations provide a fast and stable estimate of the marginal likelihoods required for computing Bayes factors. This estimate fails, however, when the models have non-trivial parameter posteriors. We reject common importance sampling estimators because they produce very unstable estimates in practical cases (relative errors range from 40% to 600%, depending on the example). In contrast, the annealed importance sampling estimator of the marginal likelihood and path sampling methods produce very good estimates even in non-trivial cases (relative errors of 1% to 8%). Maximum likelihood information criteria often produce the correct ordering of the hypotheses. These criteria, however, do not provide a quantitative measure of model preference (odds), and they sometimes fail outright by preferring a more complex model over the true one; there is no general method for detecting such a failure. The study is performed on realistically sized ODE models of biochemical systems using simulated data sets.
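As an illustration of the Laplace approximation in the linear case, the following minimal sketch may help. It is not the code used in the study: the one-parameter model y_i = theta * x_i with Gaussian noise, the Gaussian prior on theta, the data values, and all function names are assumptions chosen for the example. Because the log posterior of such a model is exactly quadratic in theta, the Laplace estimate of the log marginal likelihood should agree with a brute-force quadrature over theta, and the resulting log Bayes factor against a parameter-free null model (theta = 0) follows directly.

```python
import math

def log_joint(theta, x, y, sigma, tau):
    """log p(y | theta) + log p(theta) for the assumed toy model
    y_i = theta * x_i + N(0, sigma^2) noise, with prior theta ~ N(0, tau^2)."""
    n = len(x)
    rss = sum((yi - theta * xi) ** 2 for xi, yi in zip(x, y))
    log_lik = -0.5 * n * math.log(2 * math.pi * sigma**2) - 0.5 * rss / sigma**2
    log_prior = -0.5 * math.log(2 * math.pi * tau**2) - 0.5 * theta**2 / tau**2
    return log_lik + log_prior

def laplace_log_evidence(x, y, sigma, tau):
    """Laplace approximation: expand the log joint to second order around the MAP."""
    # Negative second derivative of the log joint (constant for a linear model).
    precision = sum(xi * xi for xi in x) / sigma**2 + 1.0 / tau**2
    # MAP estimate, available in closed form for this toy model.
    theta_map = sum(xi * yi for xi, yi in zip(x, y)) / sigma**2 / precision
    # log Z ~ log_joint(MAP) + (d/2) log(2 pi) - (1/2) log det H, with d = 1.
    return (log_joint(theta_map, x, y, sigma, tau)
            + 0.5 * math.log(2 * math.pi)
            - 0.5 * math.log(precision))

def quadrature_log_evidence(x, y, sigma, tau, lo=-20.0, hi=20.0, steps=40001):
    """Reference value: trapezoidal integration of exp(log_joint) over theta."""
    h = (hi - lo) / (steps - 1)
    logs = [log_joint(lo + i * h, x, y, sigma, tau) for i in range(steps)]
    m = max(logs)  # subtract the maximum before exponentiating to avoid underflow
    total = sum(math.exp(v - m) for v in logs)
    total -= 0.5 * (math.exp(logs[0] - m) + math.exp(logs[-1] - m))
    return m + math.log(total * h)

def null_log_evidence(y, sigma):
    """Evidence of the parameter-free null model theta = 0 (just its likelihood)."""
    n = len(y)
    return -0.5 * n * math.log(2 * math.pi * sigma**2) - 0.5 * sum(v * v for v in y) / sigma**2

# Illustrative data only (not taken from the study).
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [0.1, 0.9, 2.1, 2.9, 4.2]
sigma, tau = 0.5, 2.0

log_z_laplace = laplace_log_evidence(x, y, sigma, tau)
log_z_quad = quadrature_log_evidence(x, y, sigma, tau)
log_bf = log_z_laplace - null_log_evidence(y, sigma)  # log Bayes factor, linear vs. null

print("Laplace log evidence:   ", log_z_laplace)
print("Quadrature log evidence:", log_z_quad)
print("log Bayes factor (linear vs. null):", log_bf)
```

For a model whose log posterior is not quadratic, the two evidence values would generally differ, which is the situation in which the sampling based estimators discussed above become relevant.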