Normal and Laplace Distributions
I’m interested in studying the Laplace distribution. I was once under the impression that price fluctuations in the financial markets were normally distributed. However, as I plan to show in a later post, stock prices seem to move up and down according to a Laplace distribution instead. Before analyzing any historical price data, I first want to lay some groundwork and compare the Laplace distribution to the normal distribution.
The Normal Distribution
Suppose we have a continuous random variable whose possible values are distributed according to a normal distribution. The probability density function is:
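In standard notation, with μ as the mean and σ as the standard deviation (symbol choices assumed here), the density can be written as:

```latex
\[
f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
\]
```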
If we have some samples of a random variable that we expect to have a normal distribution, we can estimate the parameters of the density function using the maximum likelihood method described in some of my previous posts. Since it’s more convenient in this case, instead of maximizing the likelihood function, let’s maximize the logarithm of the likelihood function:
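Assuming n independent samples labeled x_1 through x_n, with μ as the mean and σ as the standard deviation (notation assumed here), the log-likelihood takes this standard form:

```latex
\[
\ln L(\mu, \sigma) = -n \ln \sigma - \frac{n}{2} \ln(2\pi)
  - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2
\]
```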
We want to know what values for the mean and standard deviation parameters have the highest possible likelihood. To do that, we can figure out where the derivative of the log-likelihood function with respect to each of the parameters is equal to zero. Here is the partial derivative of the log-likelihood function with respect to the mean:
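Assuming the standard normal log-likelihood for n independent samples x_1 through x_n, with μ as the mean and σ as the standard deviation (notation assumed here), the derivative works out to:

```latex
\[
\frac{\partial \ln L}{\partial \mu}
  = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu)
\]
```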
Setting the partial derivative to zero and solving for the mean, we arrive at the following estimated value:
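In the usual notation (assumed here), with n samples x_1 through x_n, the estimate is simply the sample mean:

```latex
\[
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
```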
Once we have the value for the mean, we can follow the same steps to solve for the standard deviation. Here is the partial derivative of the log-likelihood function with respect to the standard deviation:
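Again assuming the standard normal log-likelihood for n samples x_1 through x_n, with μ as the mean and σ as the standard deviation (notation assumed here), the derivative is:

```latex
\[
\frac{\partial \ln L}{\partial \sigma}
  = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{n} (x_i - \mu)^2
\]
```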
Setting the partial derivative to zero and solving for the standard deviation, we get this estimated value:
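Written out with the estimated mean substituted in (the hat notation and sample labels are assumptions here), the estimate is the root of the mean squared deviation:

```latex
\[
\hat{\sigma} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 }
\]
```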
If you want to see a more detailed breakdown of the steps above, you can reference my post titled Least Squares and Normal Distributions. As I mentioned in that post, the maximum likelihood estimator for the standard deviation tends to give an estimate that is too low for small sample sizes. If using a limited sample size, it might be a good idea to apply Bessel’s correction, dividing by one less than the number of samples instead of the number of samples, to reduce that bias.
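To make the difference concrete, here is a minimal sketch that computes the maximum likelihood estimates by hand and compares the standard deviation to the Bessel-corrected one. The sample values are made up for illustration, and the function name `normal_mle` is my own; only the Python standard library is used.

```python
import math
import statistics


def normal_mle(samples):
    """Maximum likelihood estimates of the mean and standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    # The MLE divides by n, not n - 1.
    sigma = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    return mean, sigma


# Made-up sample values, purely for illustration.
samples = [2.1, 1.9, 2.4, 1.6, 2.0]

mean_hat, sigma_mle = normal_mle(samples)

# statistics.stdev divides by n - 1 (Bessel's correction).
sigma_bessel = statistics.stdev(samples)

print(f"mean estimate:          {mean_hat:.4f}")
print(f"MLE standard deviation: {sigma_mle:.4f}")
print(f"Bessel-corrected:       {sigma_bessel:.4f}")
```

The corrected value is always the larger of the two, and the gap shrinks as the number of samples grows.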
The Laplace Distribution
Suppose we have a continuous random variable whose possible values are distributed according to a Laplace distribution. The probability density function is:
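In standard notation, with μ as the location parameter and b as the scale parameter (symbol choices assumed here), the density can be written as:

```latex
\[
f(x \mid \mu, b) = \frac{1}{2b} \exp\!\left( -\frac{|x - \mu|}{b} \right)
\]
```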
If we have a set of samples of a random variable that we expect to have a Laplace distribution, we can estimate the parameters with the same maximum likelihood method we used for the normal distribution. Here is the log-likelihood function we want to maximize:
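Assuming n independent samples x_1 through x_n, with μ as the location parameter and b as the scale parameter (notation assumed here), the log-likelihood takes this standard form:

```latex
\[
\ln L(\mu, b) = -n \ln(2b) - \frac{1}{b} \sum_{i=1}^{n} |x_i - \mu|
\]
```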
We want to know what values of the location and scale parameters have the greatest likelihood. The analytical approach is to take the derivative, set it to zero, and solve for the parameters. But consider the absolute value function:
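Written out piecewise, in the usual textbook form:

```latex
\[
|x| =
\begin{cases}
  x  & \text{if } x \ge 0 \\
  -x & \text{if } x < 0
\end{cases}
\]
```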
It’s a piecewise function. Taking the derivative of the log-likelihood function with respect to the location parameter can be a bit tricky because the absolute value function, although continuous, is not differentiable at all points:
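In the usual textbook form, the derivative exists on either side of zero but not at zero itself:

```latex
\[
\frac{d}{dx} |x| =
\begin{cases}
  1  & \text{if } x > 0 \\
  -1 & \text{if } x < 0 \\
  \text{undefined} & \text{if } x = 0
\end{cases}
\]
```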
To be more succinct, we can represent the derivative of the absolute value function using the sign function:
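Using sgn as the name of the sign function (a common convention, assumed here), the relationship away from zero is:

```latex
\[
\frac{d}{dx} |x| = \operatorname{sgn}(x), \qquad x \ne 0
\]
```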
The sign function simply returns the sign of a value:
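Using the common convention that the sign of zero is zero (an assumption about the definition intended here), it can be written as:

```latex
\[
\operatorname{sgn}(x) =
\begin{cases}
  1  & \text{if } x > 0 \\
  0  & \text{if } x = 0 \\
  -1 & \text{if } x < 0
\end{cases}
\]
```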
We can express the partial derivative of the log-likelihood function with respect to the location parameter as:
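Assuming the standard Laplace log-likelihood for n samples x_1 through x_n, with μ as the location parameter and b as the scale parameter (notation assumed here), the derivative is defined wherever μ differs from every sample:

```latex
\[
\frac{\partial \ln L}{\partial \mu}
  = \frac{1}{b} \sum_{i=1}^{n} \operatorname{sgn}(x_i - \mu),
  \qquad \mu \ne x_i \text{ for all } i
\]
```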
Apart from a positive factor that depends only on the scale parameter, this is really just giving us the number of samples with a value greater than the location parameter minus the number of samples with a value less than the location parameter. Note also that the derivative is undefined at points where the location parameter equals the value of one of the samples. While not adequate for an analytical solution, this does provide a clue that the best estimate is at or near the median value. Let’s rank our samples in ascending order:
Let’s also choose a middle position in the ordered set, about halfway between the first and last sample. The exact position depends on whether the total number of samples is an even number or an odd number:
We can glean some insights by looking at a plot of the likelihood value for possible values of the location parameter. When there is an even number of samples, the likelihood function looks like this:
Notice that there is a range of possible values where the likelihood is at a maximum when there is an even number of samples. For an odd number of samples, the likelihood function looks slightly different:
For an odd number of samples, there is a single point at which the likelihood is maximized. By inspection, we can conclude that the median value of our samples has the highest likelihood for the location parameter:
If we have an even number of samples, we just take the mean of the two median values. Once an estimate of the location parameter is known, solving for the scale parameter is a bit easier since there is an analytical solution. Here is the partial derivative of the log-likelihood function with respect to the scale parameter:
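Assuming the standard Laplace log-likelihood for n samples x_1 through x_n, with μ as the location parameter and b as the scale parameter (notation assumed here), the derivative is:

```latex
\[
\frac{\partial \ln L}{\partial b}
  = -\frac{n}{b} + \frac{1}{b^2} \sum_{i=1}^{n} |x_i - \mu|
\]
```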
Setting the partial derivative to zero and solving for the scale parameter, we get the following estimate:
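Written out with the estimated location substituted in (the hat notation and sample labels are assumptions here), the estimate is the mean absolute deviation from the median:

```latex
\[
\hat{b} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \hat{\mu}|
\]
```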
I think it’s worth mentioning here that this method of estimating the parameters of a Laplace distribution doesn’t sit well with me. Choosing the median value for the location parameter seems like a coarse approach. In cases where there is a range of possible values for the location, I wonder just how wide that range can be in practice. There might be other estimation techniques worth looking into, but I want to see how well this one works with real data before exploring alternatives.
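As a rough way to get a feel for that range, here is a minimal sketch of the estimation procedure that also reports the interval between the two middle samples, which is where the likelihood is flat when the sample count is even. The sample values are made up, and the function names `laplace_log_likelihood` and `laplace_mle` are my own; only the Python standard library is used.

```python
import math


def laplace_log_likelihood(samples, loc, scale):
    """Log-likelihood of the samples under a Laplace(loc, scale) model."""
    n = len(samples)
    return -n * math.log(2 * scale) - sum(abs(x - loc) for x in samples) / scale


def laplace_mle(samples):
    """Return (location, scale, (low, high)) for a Laplace fit.

    (low, high) is the interval of location values that all maximize the
    likelihood; it collapses to a single point for an odd sample count.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n % 2 == 1:
        low = high = ordered[n // 2]
    else:
        low, high = ordered[n // 2 - 1], ordered[n // 2]
    loc = (low + high) / 2  # the median
    # Mean absolute deviation from the median.
    scale = sum(abs(x - loc) for x in ordered) / n
    return loc, scale, (low, high)


# Made-up sample values, purely for illustration.
samples = [0.3, -1.2, 2.5, 0.9, -0.4, 1.1]

loc_hat, scale_hat, (low, high) = laplace_mle(samples)
print(f"location estimate: {loc_hat:.4f}")
print(f"scale estimate:    {scale_hat:.4f}")
print(f"flat range width:  {high - low:.4f}")
```

The width of the flat range is just the gap between the two middle samples, so I would expect it to shrink as more samples fill in the center of the distribution. Note that the scale estimate would be zero if every sample were identical, which this sketch does not guard against.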
Comparison
The normal distribution and the Laplace distribution are both symmetrical. The density functions of each have a similar structure. And with a small number of samples, it might be difficult to determine if a random variable has a normal distribution or a Laplace distribution. However, there are some important differences that are best shown with an illustration:
Both density functions have the same basic shape. The density plot of the Laplace distribution, however, is taller and skinnier in the middle. It also has fatter tails than the normal distribution. I think those fat tails are worth taking a closer look at. Here is the same chart with the density plotted on a logarithmic scale:
Notice the difference in magnitude for values far from the middle. The probability of observing a value of a normally distributed random variable far from the mean is quite small. The probability of observing the same value, while still small, might be orders of magnitude greater if the random variable has a Laplace distribution.
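Here is a minimal sketch that puts some numbers on that comparison. The parameters are my own assumption, not necessarily the ones used for the charts: a standard normal distribution against a Laplace distribution centered at zero with its scale chosen so that both have a variance of one. The function names are made up, and only the Python standard library is used.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of the normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, loc=0.0, scale=1.0):
    """Density of the Laplace distribution."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


# Assumed parameter choice: a Laplace distribution has variance 2 * scale**2,
# so this scale gives it the same unit variance as the standard normal.
laplace_scale = 1.0 / math.sqrt(2.0)

for x in (0.0, 1.0, 3.0, 5.0):
    normal_density = normal_pdf(x)
    laplace_density = laplace_pdf(x, scale=laplace_scale)
    ratio = laplace_density / normal_density
    print(f"x={x:.1f}  normal={normal_density:.3e}  "
          f"laplace={laplace_density:.3e}  ratio={ratio:.3g}")
```

With these assumed parameters, the Laplace density is higher at the center, lower at moderate distances, and then increasingly higher than the normal density as the value moves further out into the tail.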