## Archive for 2008-08-09

### Inconsistent Maximum Likelihood Estimation: An “Ordinary” Example

The widespread use of the Maximum Likelihood Estimate (MLE) is partly based on an intuition that the value of the model parameter that best explains the observed data must be the best estimate, and partly on the fact that for a wide class of models the MLE has good asymptotic properties. These properties include “consistency” (as the amount of data increases, the estimate will, with higher and higher probability, become closer and closer to the true value) and, moreover, the property that the MLE converges to this true value as quickly as any other estimator. These asymptotic properties might be seen as validating the intuition that the MLE must be good, *except* that these good properties of the MLE *do not hold* for some models.

This is well known, but the common examples where the MLE is inconsistent aren’t too satisfying. Some involve models where the number of parameters increases with the number of data points, which I think is cheating, since these ought to be seen as “latent variables”, not parameters. Others involve singular probability densities, or cases where the MLE is at infinity or at the boundary of the parameter space. Normal (Gaussian) mixture models fall in this category — the likelihood becomes infinite as the variance of one of the mixture components goes to zero, while the mean is set to one of the data points. One might think that such examples are “pathological”, and do not really invalidate the intuition behind the MLE.
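To make the mixture-model pathology concrete, here is a minimal sketch in Python. The data values, the function names, the equal mixing weights, and the choice of a standard normal for the second component are all assumptions made for illustration. It centres one component on the first data point and shrinks its standard deviation, which should drive the log-likelihood up without bound.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_log_likelihood(data, mean1, sd1, mean2, sd2, weight=0.5):
    """Log-likelihood of a two-component normal mixture (hypothetical helper)."""
    return sum(
        math.log(weight * normal_pdf(x, mean1, sd1)
                 + (1.0 - weight) * normal_pdf(x, mean2, sd2))
        for x in data
    )

# Made-up data, for illustration only.
data = [-1.3, -0.4, 0.2, 0.9, 1.7]

# Component 1 is centred exactly on the first data point and its standard
# deviation shrinks; component 2 stays fixed at a standard normal (assumption).
sds = [1.0, 1e-2, 1e-4, 1e-8]
log_liks = [mixture_log_likelihood(data, data[0], sd, 0.0, 1.0) for sd in sds]

for sd, ll in zip(sds, log_liks):
    print(f"sd of component 1 = {sd:g}: log-likelihood = {ll:.3f}")
```

The narrow component contributes an ever larger density at the data point it sits on, while the fixed component keeps the density at every other point bounded away from zero, so the total keeps growing as the standard deviation shrinks.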

Here, I’ll present a simple “ordinary” model where the MLE is inconsistent. The probability density defined by this model is free of singularities (or any other pathologies), for any value of the parameter. The MLE is always well defined (apart from ties, which occur with probability zero), and the MLE is always in the interior of the parameter space. Moreover, the problem is one-dimensional, allowing easy visualization.