If the probability model that describes a population is completely known (including its parameters), then we can use it to obtain information about the population. In the real world, however, this is rarely the case. Instead, we have observed data. The form of the distribution from which the observed data are drawn may be known (perhaps as an assumption), but the specific values of the parameters are not. The option we then have is to use the observed data to estimate the values of the parameters.
One way to estimate the parameters is the method of moments, which is relatively easy to use (for the most part). That method is the focus of the practice problem set. In this post, we discuss the method of maximum likelihood estimation.
The idea of maximum likelihood estimation is to maximize the probability, or likelihood, of observing the data we collected. Suppose that the form of the distribution is known and its density function is $f(x;\theta)$, where the parameter $\theta$ (possibly a vector) is not known. The goal is to choose the particular member of the assumed parametric distribution family that gives the highest likelihood to the observed data. Let's consider the exponential distribution as an example.
Suppose it is known that the size of claims from a large group of insureds has an exponential distribution with unknown mean $\theta$. The density function is $\displaystyle f(x)=\frac{1}{\theta} \ e^{-x/\theta}$ where $x>0$. We observe the claims $x_1,x_2,\ldots,x_n$. The method of maximum likelihood chooses the value of $\theta$ that gives the highest likelihood of producing these observations. The likelihood of observing the data is:

$\displaystyle L(\theta)=\prod \limits_{i=1}^{n} \frac{1}{\theta} \ e^{-x_i/\theta}=\theta^{-n} \ e^{-\frac{1}{\theta} \sum \limits_{i=1}^{n} x_i}$

The goal is to choose the value of $\theta$ so that the function $L(\theta)$ is as large as possible. In other words, the goal is to maximize the function $L(\theta)$, which is called the likelihood function. In many cases, it is easier to maximize the natural log of $L(\theta)$:

$\displaystyle l(\theta)=\ln L(\theta)=-n \ \ln \theta-\frac{1}{\theta} \sum \limits_{i=1}^{n} x_i$

The function $l(\theta)$ is called the log-likelihood function. A value of $\theta$ at which $l(\theta)$ is maximized also maximizes $L(\theta)$, since the natural log is an increasing function. The following gives the first and second derivatives of $l(\theta)$.

$\displaystyle l'(\theta)=-\frac{n}{\theta}+\frac{1}{\theta^2} \sum \limits_{i=1}^{n} x_i$

$\displaystyle l''(\theta)=\frac{n}{\theta^2}-\frac{2}{\theta^3} \sum \limits_{i=1}^{n} x_i$

Setting the first derivative equal to zero and solving for $\theta$ gives

$\displaystyle \hat{\theta}=\frac{1}{n} \sum \limits_{i=1}^{n} x_i=\bar{x}$

Plugging $\hat{\theta}$ into the second derivative produces a negative value, namely $l''(\hat{\theta})=-n/\hat{\theta}^2$. Thus $\hat{\theta}$ gives the maximum log-likelihood $l(\hat{\theta})$ and thus the maximum likelihood $L(\hat{\theta})$. The value $\hat{\theta}$ is called the maximum likelihood estimate (MLE) of the parameter $\theta$. It is also called the maximum likelihood estimator of the parameter $\theta$, since $\hat{\theta}$ is a function of the observations (as the observations change, the estimate will change). Note that $\hat{\theta}$ is the mean of the sample $x_1,x_2,\ldots,x_n$. In this instance, the maximum likelihood estimate coincides with the method of moments estimate. Such agreement does not hold for every distribution, but several more examples where the MLE equals the method of moments estimate are noted below.
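The derivation above can be checked numerically. The snippet below is a minimal sketch in Python, with made-up claim amounts: it computes the closed-form MLE $\hat{\theta}=\bar{x}$ and verifies that the log-likelihood is indeed larger there than at nearby values of $\theta$.

```python
import math

# Hypothetical claim amounts, purely for illustration
claims = [12.0, 35.5, 8.2, 60.1, 24.7]
n = len(claims)

# Closed-form MLE for the exponential mean: the sample mean
theta_hat = sum(claims) / n

def log_likelihood(theta):
    # l(theta) = -n ln(theta) - (sum of x_i) / theta
    return -n * math.log(theta) - sum(claims) / theta

# The log-likelihood at the MLE beats nearby values of theta
assert log_likelihood(theta_hat) > log_likelihood(theta_hat * 0.9)
assert log_likelihood(theta_hat) > log_likelihood(theta_hat * 1.1)
```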
As the above example suggests, the first step in maximum likelihood estimation is to write down the likelihood function and then the log-likelihood function (by taking the natural log of the likelihood function). If there is only one parameter, take the derivative of the log-likelihood function, set it equal to zero and solve for the parameter. If there is more than one parameter in the log-likelihood function, take the partial derivative with respect to each parameter, set the resulting partial derivatives equal to zero and solve the resulting system of equations.
The likelihood of a data point $x$ (if its value is completely known) is simply the density function $f(x)$ evaluated at $x$ (for a continuous distribution) or the probability function evaluated at $x$ (for a discrete distribution). For a given sample $x_1,x_2,\ldots,x_n$, the likelihood function is simply the product of the likelihoods at the individual data points $x_i$.
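As a quick illustration of this recipe (a sketch with hypothetical data, reusing the exponential density from the earlier example), the product of the pointwise likelihoods agrees with the simplified formula $\theta^{-n} \ e^{-\frac{1}{\theta} \sum x_i}$:

```python
import math

def exp_density(x, theta):
    # Exponential density with mean theta
    return math.exp(-x / theta) / theta

def likelihood(data, theta):
    # Likelihood of the sample = product of pointwise likelihoods
    L = 1.0
    for x in data:
        L *= exp_density(x, theta)
    return L

# Made-up observations and parameter value
data = [3.0, 7.5, 2.2]
theta = 4.0

# Compare against the simplified closed form theta^(-n) * exp(-sum/theta)
direct = theta ** (-len(data)) * math.exp(-sum(data) / theta)
assert abs(likelihood(data, theta) - direct) < 1e-15
```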
Another point to keep in mind: when working with the likelihood function or log-likelihood function, multiplicative positive constants can be omitted, since they do not affect where the maximum occurs. This is illustrated by the example of the normal distribution.
Observations: $x_1,x_2,\ldots,x_n$. We assume that the data are drawn from a normal distribution with parameters $\mu$ and $\sigma$. The following is the density function.

$\displaystyle f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \ e^{-\frac{(x-\mu)^2}{2 \sigma^2}}$

The following is the full likelihood function.

$\displaystyle L(\mu,\sigma)=\prod \limits_{i=1}^{n} \frac{1}{\sigma \sqrt{2 \pi}} \ e^{-\frac{(x_i-\mu)^2}{2 \sigma^2}}=(2 \pi)^{-n/2} \ \sigma^{-n} \ e^{-\frac{1}{2 \sigma^2} \sum \limits_{i=1}^{n} (x_i-\mu)^2}$

The constant $(2 \pi)^{-n/2}$ in the last expression can be skipped. When taking the derivative of the log-likelihood function, the log of this constant becomes zero. Thus the essential likelihood function and the log-likelihood function are the following:

$\displaystyle L(\mu,\sigma)=\sigma^{-n} \ e^{-\frac{1}{2 \sigma^2} \sum \limits_{i=1}^{n} (x_i-\mu)^2}$

$\displaystyle l(\mu,\sigma)=-n \ \ln \sigma-\frac{1}{2 \sigma^2} \sum \limits_{i=1}^{n} (x_i-\mu)^2$

Now take partial derivatives of $l(\mu,\sigma)$, first with respect to $\mu$ and then with respect to $\sigma$.

$\displaystyle \frac{\partial l}{\partial \mu}=\frac{1}{\sigma^2} \sum \limits_{i=1}^{n} (x_i-\mu)$

$\displaystyle \frac{\partial l}{\partial \sigma}=-\frac{n}{\sigma}+\frac{1}{\sigma^3} \sum \limits_{i=1}^{n} (x_i-\mu)^2$

Setting the first partial derivative equal to zero and solving, we obtain the solution $\displaystyle \hat{\mu}=\frac{1}{n} \sum \limits_{i=1}^{n} x_i=\bar{x}$. Plug that into the second equation and we produce $\displaystyle \hat{\sigma}^2=\frac{1}{n} \sum \limits_{i=1}^{n} (x_i-\bar{x})^2$.

The MLE of the mean $\mu$ of the normal distribution is the sample mean, and the MLE of $\sigma^2$ is the biased sample variance (the version that divides by $n$ rather than $n-1$).
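As a small sanity check (with hypothetical observations), the normal MLEs can be computed directly; note the contrast between the biased variance the MLE produces and the usual unbiased sample variance:

```python
# Hypothetical observations, purely for illustration
data = [2.3, 1.9, 3.1, 2.7, 2.0, 2.5]
n = len(data)

# MLE of mu: the sample mean
mu_hat = sum(data) / n

# MLE of sigma^2: the biased sample variance (divide by n)
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n

# The usual unbiased sample variance divides by n - 1 instead
s2_unbiased = sum((x - mu_hat) ** 2 for x in data) / (n - 1)
assert sigma2_hat < s2_unbiased
```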
The MLE method does not always have a closed-form calculation. For some distributions, the only way to obtain MLE estimates is numerically, through a software package. The following lists distributions whose MLE calculation is tractable; the list is by no means exhaustive. For several of these distributions, the MLE coincides with the method of moments estimate.

- Inverse Exponential Distribution
- Negative Binomial Distribution
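When no closed form is available, the maximization can be carried out numerically. As a rough sketch (using the exponential case, where the answer can be verified against the closed form), a simple golden-section search over the log-likelihood recovers the sample mean:

```python
import math

# Hypothetical claim amounts, purely for illustration
claims = [5.0, 17.3, 9.8, 41.2, 26.7]
n = len(claims)
total = sum(claims)

def loglik(theta):
    # Exponential log-likelihood: -n ln(theta) - (sum of x_i) / theta
    return -n * math.log(theta) - total / theta

def golden_max(f, lo, hi, tol=1e-9):
    # Golden-section search for the maximum of a unimodal function
    gr = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - gr * (b - a)
        d = a + gr * (b - a)
        if f(c) < f(d):
            a = c  # maximum lies in [c, b]
        else:
            b = d  # maximum lies in [a, d]
    return (a + b) / 2.0

theta_num = golden_max(loglik, 0.1, 1000.0)
theta_closed = total / n  # the closed-form MLE (sample mean)
assert abs(theta_num - theta_closed) < 1e-4
```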
The observed data discussed in all the above examples and formulas are complete data (or individual data). In this scenario, each data point in the data set is fully known. In other words, the data are not grouped (not summarized in any way), not censored and not truncated. For claims data in the form of individual data, no deductible or other insurance coverage modification has been applied. Complete data, or individual data, is exactly as recorded. The next post discusses how to calculate the MLE for grouped data and for censored or truncated data.