If the probability model that describes a population is completely known (along with its parameters), then we can use it to obtain information about the population. However, in the real world, this is rarely the case. Instead, we have observed data. We may know (or assume) that the observed data follow a particular distribution without knowing its parameters. In other words, the form of the distribution from which the observed data are drawn is known, but the specific values of the parameters are not. The remaining option is to use the observed data to estimate the values of the parameters.
One way to estimate the parameters is the method of moments, which is relatively easy to use (for the most part). This is the focus of the practice problem set. In this post, we discuss the method of maximum likelihood estimation.
The method of maximum likelihood estimation chooses the parameter values that maximize the probability, or likelihood, of observing the data we collected. Suppose that the form of the distribution is known and its density function is $f(x; \theta)$, but the parameters $\theta$ are not known. The goal is to choose the one member of the assumed parametric distribution family that gives the highest likelihood of the observed data. Let’s consider the exponential distribution as an example.
Exponential Example
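Suppose it is known that the size of claims from a large group of insureds has an exponential distribution with unknown mean $\theta$. The density function is $f(x) = \frac{1}{\theta} e^{-x/\theta}$ where $x > 0$. We observe $n$ claims $x_1, x_2, \dots, x_n$. The method of maximum likelihood is to choose the value of $\theta$ that has the highest likelihood of observing these $n$ observations. The likelihood of observing the data is:
$$L(\theta) = \prod_{i=1}^n \frac{1}{\theta} e^{-x_i/\theta} = \theta^{-n} \, e^{-\frac{1}{\theta} \sum_{i=1}^n x_i}$$
The goal is to choose the value of $\theta$ so that the function $L(\theta)$ is as large as possible. In other words, the goal is to maximize the function $L(\theta)$, which is called the likelihood function. In many cases, it is easier to maximize the natural log of $L(\theta)$.
$$l(\theta) = \ln L(\theta) = -n \ln \theta - \frac{1}{\theta} \sum_{i=1}^n x_i$$
The function $l(\theta)$ is called the loglikelihood function. Because the natural log is an increasing function, the $\theta$ for which $l(\theta)$ is maximum is also a value for which $L(\theta)$ is maximum. The following gives the first and second derivatives of $l(\theta)$.
$$l'(\theta) = -\frac{n}{\theta} + \frac{1}{\theta^2} \sum_{i=1}^n x_i$$
$$l''(\theta) = \frac{n}{\theta^2} - \frac{2}{\theta^3} \sum_{i=1}^n x_i$$
Setting the first derivative equal to zero and solving for $\theta$ gives
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^n x_i = \bar{x}$$
Plugging $\hat{\theta}$ into the second derivative produces a negative value: $l''(\hat{\theta}) = \frac{n}{\bar{x}^2} - \frac{2 n \bar{x}}{\bar{x}^3} = -\frac{n}{\bar{x}^2} < 0$. Thus $\hat{\theta}$ gives the maximum loglikelihood $l(\hat{\theta})$ and thus the maximum likelihood $L(\hat{\theta})$. The value $\hat{\theta}$ is called the maximum likelihood estimate (MLE) of the parameter $\theta$. It is also called the maximum likelihood estimator of the parameter $\theta$, since $\hat{\theta}$ is also a function of the observations (as the observations change, the estimate will change). Note that $\hat{\theta}$ is the mean of the sample $x_1, x_2, \dots, x_n$. In this instance, the maximum likelihood estimate coincides with the method of moments estimate. Though such examples are the exception, several more examples of MLE = method of moments estimates are discussed below.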
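As a quick numerical check of the exponential example, the following is a minimal Python sketch. It assumes the loglikelihood $-n \ln \theta - \frac{1}{\theta} \sum x_i$ for an exponential distribution with mean $\theta$. The claim amounts and the function names `exp_loglik` and `exp_mle` are made up for illustration.

```python
import math

def exp_loglik(theta, claims):
    """Loglikelihood -n*ln(theta) - sum(x_i)/theta for exponential data with mean theta."""
    n = len(claims)
    return -n * math.log(theta) - sum(claims) / theta

def exp_mle(claims):
    """Closed-form MLE of the exponential mean: the sample mean."""
    return sum(claims) / len(claims)

# Hypothetical claim amounts, invented for this illustration.
claims = [120.0, 340.0, 75.0, 980.0, 410.0]
theta_hat = exp_mle(claims)

# Compare the loglikelihood at the MLE with its value at a few other candidates.
for theta in (0.5 * theta_hat, 0.9 * theta_hat, 1.1 * theta_hat, 2.0 * theta_hat):
    print(theta, exp_loglik(theta, claims), "<", exp_loglik(theta_hat, claims))
```

Every candidate other than the sample mean should produce a smaller loglikelihood, which agrees with the second derivative test.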
MLE
As the above example suggests, the first step in maximum likelihood estimation is to come up with the likelihood function and then the loglikelihood function (by taking the natural log of the likelihood function). If there is only one parameter, take the derivative of the loglikelihood function, set it equal to zero and solve for the parameter. If there is more than one parameter in the loglikelihood function, take the partial derivative with respect to each parameter. Then set the resulting partial derivatives equal to zero and solve the resulting system of equations.
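When the equation "derivative of the loglikelihood equals zero" cannot be solved by hand, it can be solved numerically. The sketch below is one possible approach and is not part of the original discussion: it approximates the derivative with a central difference and finds its root by bisection. The helper names `score` and `solve_score`, the bracketing interval and the data are all assumptions. The exponential loglikelihood is used so that the answer can be compared with the sample mean.

```python
import math

def score(loglik, theta, h=1e-6):
    """Central-difference approximation to the derivative of the loglikelihood."""
    return (loglik(theta + h) - loglik(theta - h)) / (2 * h)

def solve_score(loglik, lo, hi, tol=1e-8):
    """Bisection on the score; assumes it is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(loglik, mid) > 0:
            lo = mid   # loglikelihood still increasing, so the root is to the right
        else:
            hi = mid   # loglikelihood decreasing, so the root is to the left
    return (lo + hi) / 2

data = [2.0, 5.0, 1.5, 7.5, 4.0]  # hypothetical observations
n, total = len(data), sum(data)

def loglik(theta):
    """Exponential loglikelihood with mean theta."""
    return -n * math.log(theta) - total / theta

theta_hat = solve_score(loglik, 0.1, 100.0)
print(theta_hat, total / n)  # numerical root versus the sample mean
```

Bisection is chosen here only because it is simple and needs no second derivative. A statistical software package would typically use a more efficient optimizer.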
The likelihood of a data point $x$ (if its value is completely known) is simply the density function evaluated at $x$ (for a continuous distribution) or the probability function evaluated at $x$ (for a discrete distribution). For a given sample $x_1, x_2, \dots, x_n$, the likelihood function is simply the product of the likelihoods at the individual data points $x_i$.
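The product form of the likelihood, and the fact that its natural log is a sum, can be sketched in a few lines of Python. The function names, the Poisson mean of 2 and the claim counts below are assumptions chosen for illustration.

```python
import math

def likelihood(density, sample):
    """Product of the density (or probability function) over the data points."""
    result = 1.0
    for x in sample:
        result *= density(x)
    return result

def loglikelihood(density, sample):
    """Sum of the log densities, which equals the log of the product above."""
    return sum(math.log(density(x)) for x in sample)

def poisson_pmf(k, lam=2.0):
    """Poisson probability function with a hypothetical mean of 2."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

counts = [1, 3, 0, 2, 2]  # hypothetical claim counts
L = likelihood(poisson_pmf, counts)
l = loglikelihood(poisson_pmf, counts)
print(L, l, math.log(L))  # the last two values agree up to rounding
```

For large samples the product can underflow to zero in floating point, which is one practical reason to work with the sum of logs.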
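Another point to keep in mind: when working with the likelihood function or the loglikelihood function, positive constant factors of the likelihood (factors that do not involve the parameters) can be omitted, because they do not change where the maximum occurs. This is illustrated by the example of the normal distribution.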
Normal Example
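Observations: $x_1, x_2, \dots, x_n$. We assume that the data are drawn from a normal distribution with parameters $\mu$ and $\sigma$. The following is the density function.
$$f(x) = \frac{1}{\sqrt{2 \pi} \, \sigma} \, e^{-\frac{(x-\mu)^2}{2 \sigma^2}}$$
The following is the full likelihood function.
$$L(\mu, \sigma) = \prod_{i=1}^n \frac{1}{\sqrt{2 \pi} \, \sigma} \, e^{-\frac{(x_i-\mu)^2}{2 \sigma^2}} = \frac{1}{(2 \pi)^{n/2}} \, \sigma^{-n} \, e^{-\frac{1}{2 \sigma^2} \sum_{i=1}^n (x_i-\mu)^2}$$
The constant $\frac{1}{(2 \pi)^{n/2}}$ in the last expression can be skipped. In the loglikelihood function, the log of this constant is an additive term whose derivative is zero. Thus the essential likelihood function and the loglikelihood function are the following:
$$L(\mu, \sigma) = \sigma^{-n} \, e^{-\frac{1}{2 \sigma^2} \sum_{i=1}^n (x_i-\mu)^2}$$
$$l(\mu, \sigma) = -n \ln \sigma - \frac{1}{2 \sigma^2} \sum_{i=1}^n (x_i-\mu)^2$$
Now take partial derivatives of $l(\mu, \sigma)$, first with respect to $\mu$ and then with respect to $\sigma$, and set them equal to zero.
$$\frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^n (x_i-\mu) = 0$$
$$\frac{\partial l}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^n (x_i-\mu)^2 = 0$$
Solving the first equation, we obtain the solution $\hat{\mu} = \frac{1}{n} \sum_{i=1}^n x_i = \bar{x}$. Plug that into the second equation and we produce $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})^2$.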
The MLE of the mean of the normal distribution is the sample mean, and the MLE of the variance $\sigma^2$ is the sample variance computed with divisor $n$ (the biased version), not $n-1$.
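The normal example can be checked numerically with a short Python sketch. It assumes a normal distribution with parameters $\mu$ and $\sigma$, computes the estimates with divisor $n$, and also compares the full loglikelihood with the version that drops the constant $-\frac{n}{2} \ln(2 \pi)$. The sample values and function names are hypothetical.

```python
import math

def normal_mle(sample):
    """MLE of (mu, sigma^2): the sample mean and the variance with divisor n."""
    n = len(sample)
    mu_hat = sum(sample) / n
    var_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, var_hat

def full_loglik(mu, sigma, sample):
    """Normal loglikelihood including the constant -(n/2) ln(2 pi)."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi) - n * math.log(sigma) - ss / (2 * sigma ** 2)

def essential_loglik(mu, sigma, sample):
    """Same loglikelihood with the parameter-free constant dropped."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -n * math.log(sigma) - ss / (2 * sigma ** 2)

sample = [4.2, 5.1, 3.8, 6.0, 4.9]  # hypothetical observations
mu_hat, var_hat = normal_mle(sample)
sigma_hat = math.sqrt(var_hat)

# The two versions differ by the same constant at any parameter values,
# so they are maximized at the same (mu, sigma).
gap_at_mle = full_loglik(mu_hat, sigma_hat, sample) - essential_loglik(mu_hat, sigma_hat, sample)
gap_elsewhere = full_loglik(3.0, 2.0, sample) - essential_loglik(3.0, 2.0, sample)
print(mu_hat, var_hat, gap_at_mle, gap_elsewhere)
```

Note that Python's `statistics.pvariance` uses divisor $n$ and so matches the MLE, while `statistics.variance` uses divisor $n-1$.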
Formulas
The MLE method does not always have a closed-form calculation. For some distributions, the only way to get MLE estimates is through a software package. The following list gives several distributions whose MLE calculations are accessible. The list is by no means exhaustive. The distribution names in red are the ones whose MLE estimates coincide with the method of moments estimates.
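Exponential Distribution  
$\hat{\theta} = \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$ (mean $\theta$, as derived in the exponential example above)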

Inverse Exponential Distribution  

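Normal Distribution  
$\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (x_i-\bar{x})^2$ (as derived in the normal example above)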

Lognormal Distribution  

Pareto Distribution  

Weibull Distribution  

Uniform Distribution  

Gamma Distribution  

Binomial Distribution  

Poisson Distribution  

Negative Binomial Distribution  

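Closed-form results of this kind can be sanity-checked by brute force. The following is a minimal sketch covering two of the distributions named in the list, under assumed parametrizations that may differ from the ones intended here: a Poisson distribution with mean $\lambda$, whose MLE is the sample mean, and a uniform distribution on $(0, \theta)$, whose MLE is the largest observation. The data, the grid and the function names are hypothetical.

```python
import math

def poisson_loglik(lam, counts):
    """Poisson loglikelihood, dropping the term -sum(ln k!) that is free of lam."""
    return -len(counts) * lam + sum(counts) * math.log(lam)

def uniform_loglik(theta, sample):
    """Uniform(0, theta) loglikelihood; the likelihood is zero if a point exceeds theta."""
    if theta < max(sample):
        return -math.inf
    return -len(sample) * math.log(theta)

counts = [0, 2, 1, 3, 1, 2]     # hypothetical claim counts
losses = [1.2, 4.7, 3.3, 0.8]   # hypothetical loss amounts

# Candidate parameter values 0.01, 0.02, ..., 10.00.
grid = [k / 100 for k in range(1, 1001)]
lam_hat = max(grid, key=lambda lam: poisson_loglik(lam, counts))
theta_hat = max(grid, key=lambda theta: uniform_loglik(theta, losses))

print(lam_hat, sum(counts) / len(counts))  # grid maximizer versus the sample mean
print(theta_hat, max(losses))              # grid maximizer versus the sample maximum
```

The uniform case is a reminder that the derivative approach does not always apply: the loglikelihood there is strictly decreasing in $\theta$ wherever it is finite, so the maximum sits at the boundary rather than at a stationary point.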
Remarks
The observed data discussed in all the above examples and formulas are complete data (or individual data). In this scenario, each data point in the data set is known. In other words, the data are not grouped (not summarized in any way), not censored and not truncated. For claims data in the form of individual data, no deductible or other insurance coverage modification has been applied. So complete data or individual data are exactly as recorded. The next post discusses how to calculate MLE for grouped data and censored or truncated data.
2018 – Dan Ma