Understanding Maximum Likelihood Estimation

Maximum Likelihood Estimation (MLE) is a statistical method for estimating the parameters of a statistical model. It is based on maximizing the likelihood function: the estimate is the set of parameter values under which the observed data are most probable, given the assumed model.

The Concept of Likelihood

The likelihood of a statistical model is a function of the parameters that measures how probable the observed data are under each parameter value. For parameters $\theta$ and observations $X$, the likelihood $L(\theta | X)$ is the probability of $X$ given $\theta$ (or the probability density, when the data are continuous), viewed as a function of $\theta$ with $X$ held fixed. Mathematically, it is written as:

(1) $\begin{equation*}L(\theta | X) = P(X | \theta)\end{equation*}$
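To make the definition concrete, here is a minimal sketch that evaluates a likelihood at several parameter values while the data stay fixed. The coin-flip data and the names `flips`, `likelihood`, and `candidates` are illustrative assumptions, not part of the original text.

```python
# Hypothetical data: four coin flips, 1 = heads, 0 = tails (assumed for illustration).
flips = [1, 0, 1, 1]

def likelihood(p, data):
    """L(p | data) = P(data | p) for independent Bernoulli(p) flips."""
    value = 1.0
    for x in data:
        value *= p if x == 1 else (1.0 - p)
    return value

# Same data, different candidate parameter values.
candidates = [0.25, 0.5, 0.75]
likelihoods = {p: likelihood(p, flips) for p in candidates}
for p, value in likelihoods.items():
    print(f"p = {p:.2f}  L(p | flips) = {value:.6f}")
```

Among these three candidates, the largest likelihood belongs to the value of `p` closest to the observed share of heads, which is the intuition behind choosing parameters by maximizing the likelihood.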

Maximum Likelihood Estimation

MLE involves finding the parameter values that maximize the likelihood function. This process consists of the following steps:

Define the likelihood function for the statistical model based on the probability distribution of the data.
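Take the logarithm of the likelihood function. Because the logarithm is strictly increasing, the log-likelihood has the same maximizer as the likelihood, and it turns products into sums, which are easier to differentiate and compute: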

(2) $\begin{equation*} \ell(\theta) = \log L(\theta | X) \end{equation*}$

Differentiate the log-likelihood function with respect to the parameters. Set these derivatives equal to zero to find the maximum likelihood estimates (MLEs).
Solve the resulting equations to obtain the parameter estimates.
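The steps above can be sketched in code for a simple case. This example assumes an exponential model with an unknown rate, chosen only because its equations are short; the data and the names `waits`, `log_likelihood`, `score`, and `rate_hat` are hypothetical.

```python
import math

# Hypothetical waiting times, assumed to follow an exponential distribution.
waits = [0.5, 1.0, 1.5, 2.0]
n = len(waits)
total = sum(waits)

def log_likelihood(rate):
    """Define and take logs: log of prod(rate * exp(-rate * x)) = n*log(rate) - rate*sum(x)."""
    return n * math.log(rate) - rate * total

def score(rate):
    """Differentiate: derivative of the log-likelihood with respect to rate."""
    return n / rate - total

# Solve: setting n / rate - total = 0 gives rate = n / total.
rate_hat = n / total

print("estimate:", rate_hat)
print("score at estimate:", score(rate_hat))
```

The derivative evaluates to (numerically) zero at `rate_hat`, and the log-likelihood there is higher than at nearby rates, which is what the procedure is meant to guarantee for this model.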

Advantages and Limitations of MLE

Advantages

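Consistency: Under standard regularity conditions, the MLE converges in probability to the true parameter value as the sample size grows.
Efficiency: Under the same conditions, the MLE is asymptotically efficient: in large samples its variance approaches the Cramér-Rao lower bound, the smallest variance attainable by an unbiased estimator.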
Invariance: The MLE of a function of parameters is the same function of the MLEs of those parameters.
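Consistency can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are simulated from a normal distribution with a hypothetical true mean of 6 and standard deviation of 2, and the MLE of the mean is taken to be the sample mean, which holds for a normal model with known variance.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MU, SIGMA = 6.0, 2.0  # hypothetical "true" parameters for the simulation

def mle_mean(sample):
    """For a normal model with known variance, the MLE of the mean is the sample mean."""
    return sum(sample) / len(sample)

errors = {}
for size in (10, 1_000, 100_000):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(size)]
    errors[size] = abs(mle_mean(sample) - TRUE_MU)
    print(f"n = {size:>7}  |estimate - true mean| = {errors[size]:.4f}")
```

The error typically shrinks as the sample grows, although a single random run is not guaranteed to decrease at every step; consistency is a statement about the limit, not about each individual sample.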

Limitations

Overfitting: MLE can overfit the data, especially with small sample sizes or a large number of parameters.
Sensitivity to Model Specifications: MLE depends heavily on the assumed statistical model; if the assumed distribution does not match the data, the estimates can be biased or misleading.
Complexity in Calculation: For complex models, the maximization of the likelihood function can be computationally challenging.

Example Problem and Solution

Problem Statement

Suppose we have a dataset $\{4, 5, 6, 8, 9\}$, which represents samples drawn from a normal distribution. The task is to estimate the mean ($\mu$) of this distribution, assuming the variance ($\sigma^2$) is known and equals 4 (i.e., $\sigma = 2$).

Solution

Given the likelihood function for a normal distribution, the task is to find the value of $\mu$ that maximizes this likelihood.

The likelihood function is:

(3) $\begin{equation*}L(\mu | X) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}\end{equation*}$

For our dataset $X = \{4, 5, 6, 8, 9\}$ and $\sigma = 2$, the log-likelihood function becomes:

(4) $\begin{equation*}\ell(\mu) = -\frac{5}{2} \log(8\pi) - \frac{1}{8} \sum_{i=1}^{5} (x_i - \mu)^2\end{equation*}$
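As a worked intermediate step, taking the logarithm of (3) term by term gives, for general $n$ and $\sigma^2$:

$\begin{equation*}\ell(\mu) = \sum_{i=1}^{n} \left[ -\frac{1}{2} \log(2\pi\sigma^2) - \frac{(x_i - \mu)^2}{2\sigma^2} \right] = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2\end{equation*}$

Substituting $n = 5$ and $\sigma^2 = 4$ gives $2\pi\sigma^2 = 8\pi$ and $\frac{1}{2\sigma^2} = \frac{1}{8}$, which are the constants that appear in (4).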

Differentiating $\ell(\mu)$ with respect to $\mu$ gives:

(5) $\begin{equation*}\frac{d\ell}{d\mu} = \frac{1}{4} \sum_{i=1}^{5} (x_i - \mu)\end{equation*}$

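Setting this derivative to zero locates the stationary point. Because the second derivative, $-\frac{5}{4}$, is negative for every $\mu$, that point is a maximum: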

(6) $\begin{equation*}\frac{1}{4} \sum_{i=1}^{5} (x_i - \mu) = 0\end{equation*}$

Solving for $\mu$ gives:

(7) $\begin{equation*}\hat{\mu} = \frac{1}{5} \sum_{i=1}^{5} x_i = \frac{4 + 5 + 6 + 8 + 9}{5} = 6.4\end{equation*}$

Thus, the MLE of the mean of this normally distributed population, given the known variance of 4, is 6.4.
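The result can be checked numerically. The sketch below compares the closed-form estimate with a brute-force search over a grid of candidate means; the grid range and step size are arbitrary choices made for illustration, and the function name `log_likelihood` is an assumption.

```python
import math

data = [4, 5, 6, 8, 9]
sigma2 = 4.0  # known variance from the problem statement

def log_likelihood(mu):
    """Normal log-likelihood with known variance, as in equation (4)."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)

# Closed-form estimate from equation (7): the sample mean.
mu_hat = sum(data) / len(data)

# Brute-force check on a grid from 4.00 to 9.00 in steps of 0.01.
grid = [i / 100 for i in range(400, 901)]
mu_grid = max(grid, key=log_likelihood)

print("closed form:", mu_hat)
print("grid search:", mu_grid)
```

Both approaches should agree on the same value of $\mu$. A grid search is only practical for one or two parameters, but it is a useful sanity check on a hand derivation.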

Conclusion

MLE is a powerful and widely used method in statistical inference, providing a framework for parameter estimation by maximizing the likelihood of the observed data. It offers advantages such as consistency and asymptotic efficiency, but it is also sensitive to model specification and can be computationally demanding for complex models.