The generalized hyperbolic distribution was pioneered by Ole Barndorff-Nielsen with applications related to wind-blown sand. There are several probability distributions that can be expressed as special cases of the generalized hyperbolic distribution, which is indicative of its versatility. As we shall see in this post, this distribution seems to work pretty well for modeling price fluctuations in financial markets.
In the previous post, I explored the use of the generalized normal distribution to model the price movements of financial instruments. This approach offered better-fitting distributions than the normal and Laplace distributions studied in earlier posts. But the shape of the fitted distributions still didn’t quite match the shape of the histogram. In this post, I want to explore a class of probability distributions known as Lévy alpha-stable distributions. And to explore these distributions, we need to understand characteristic functions.
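As a quick taste of the concept, a characteristic function is the expected value of e^(itX), and it can be estimated directly from sample data. Here is a minimal sketch (the function name and example data are my own, not taken from the post):

```python
import cmath
import math

def empirical_cf(data, t):
    """Empirical characteristic function: the sample average of
    e^(i*t*x), which estimates E[e^(i*t*X)]."""
    return sum(cmath.exp(1j * t * x) for x in data) / len(data)

# For a sample symmetric about zero, the characteristic function
# is purely real; for the two-point sample {-1, +1} it is cos(t).
phi = empirical_cf([1.0, -1.0], 2.0)
```

Every characteristic function equals 1 at t = 0, which makes a handy sanity check.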
The generalized normal distribution is a family of probability distributions that vary according to a shape parameter. The symmetrical variant of this distribution may go by other names such as the generalized error distribution, the generalized Gaussian distribution, etc. In this post, we will explore this probability distribution and its relationship with the normal distribution and the Laplace distribution. I’ll also show some examples illustrating the use of the maximum likelihood method to estimate the parameters of the distribution using real-life data.
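To make the maximum likelihood idea concrete, here is a minimal sketch (my own toy example, not code from the post) of the closed-form ML estimate of the generalized normal scale parameter when the location and shape are held fixed:

```python
import math

def gn_scale_mle(data, mu, beta):
    """Maximum likelihood estimate of the generalized normal scale
    parameter alpha, with location mu and shape beta held fixed.
    Setting the derivative of the log-likelihood to zero gives the
    closed form alpha = ((beta/n) * sum(|x - mu|^beta))^(1/beta)."""
    n = len(data)
    return (beta / n * sum(abs(x - mu) ** beta for x in data)) ** (1.0 / beta)

# Shape beta = 1 recovers the Laplace ML scale estimate (the mean
# absolute deviation from mu); beta = 2 corresponds to the normal case.
alpha_hat = gn_scale_mle([-1.0, 1.0], 0.0, 1.0)
```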
This is a study of the convolution operation and its applications to time series data and probability distributions. In this post, I first demonstrate the use of the convolution operation to find the first differences of some randomly generated time series data. I then show how to find the distribution of the first differences based on the distribution of the values in the original series. I also show how to work backwards using deconvolution, which is the inverse of the convolution operation.
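As a minimal illustration of the first idea (the helper below is my own sketch, not the post's code), convolving a series with the kernel [1, -1] produces the first differences in the interior of the full convolution:

```python
def convolve(signal, kernel):
    """Full discrete convolution of two sequences."""
    n = len(signal) + len(kernel) - 1
    out = [0.0] * n
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

series = [10.0, 12.0, 11.0, 15.0]
# The interior values of the result are the first differences
# (2, -1, 4); the two edge values are boundary artifacts.
diffs = convolve(series, [1.0, -1.0])
```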
Suppose a gambler is playing a game in which he has a statistical advantage. And let’s assume he can quantify his advantage with a fair amount of accuracy. If the gambler plays this game over and over again, what percentage of his bankroll should he bet on each round if he wants to maximize his winnings? In this post, I explore this question using a series of examples illustrating the application of the Kelly criterion. Many of the ideas presented here are inspired by materials written by Ed Seykota, Edward O. Thorp, and J. L. Kelly.
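For a single favorable bet, the Kelly criterion has a simple closed form, f* = (bp - q)/b, where p is the win probability, q = 1 - p, and b is the net odds received on a win. A minimal sketch (the function name is mine):

```python
def kelly_fraction(p, b):
    """Kelly-optimal fraction of bankroll to bet, given win
    probability p and net odds b (profit per unit wagered)."""
    q = 1.0 - p
    return (b * p - q) / b

# Even-money game (b = 1) with a 55% win probability:
# bet 10% of the bankroll each round.
f = kelly_fraction(0.55, 1.0)
```

Note that when the edge is zero (p = 0.5 at even money), the formula says to bet nothing.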
For the longest time, the Fourier transform remained a bit of a mystery to me. I knew it involved transforming a function in the time domain into a representation in the frequency domain. And I knew it had something to do with sinusoidal waves. But I didn’t understand what it meant to have a frequency domain representation of a function. As it turns out, it’s quite a simple thing once you realize what the frequency values represent. In this post, I explain the discrete Fourier transform by working through a set of examples.
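The definition itself fits in a few lines. Here is a naive sketch (my own, not the post's worked examples) in which bin k of the output measures the component with k cycles per frame:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform. X[k] captures the amplitude
    and phase of the component with k cycles across the n samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

# A cosine with exactly 2 cycles across 8 samples shows up
# entirely in bins k = 2 and k = n - 2.
signal = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
spectrum = dft(signal)
```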
This is an extension of one of my earlier posts on polynomial approximations. Previously, I showed how to find approximate solutions to the weighted coin toss problem using first, second, and third order polynomials to describe the weights of the biased coins. In this post, I demonstrate a generalized method for applying this technique using higher order polynomials.
This post is an extension of the previous post in which I explored some techniques for speeding up the calculations used to find approximate solutions to the coin toss problem. Here I want to examine a couple of enhancements to these ideas. First, I describe an enhanced computation method that cuts the number of floating-point operations required almost in half. Second, I introduce a progressive polynomial approximation technique that can reduce the number of iterations needed to find a solution.
I wrapped up the last post expressing a desire to study the approximation technique using larger models of the coin toss game. Until now, I was using a naive implementation of the computation method to perform the calculations, one that was too crude and too slow for larger models. In this post, I demonstrate an alternative approach that has a much better performance profile. I also describe a simple technique that can be used to reduce the number of iterations required when applying the hill climbing algorithm.
The previous post demonstrates the use of biases derived from a simple line formula to find an approximate solution to the weighted coin toss problem. In this post, I want to expand on some of these ideas using various polynomial formulas to describe the weights of the biased coins. As this experiment demonstrates, higher order polynomials do seem to yield better results.
In previous studies of the weighted coin toss game, our focus was on finding a set of weights for the biased coins that would yield a given target distribution for the expected outcome. In this post, I want to explore a different approach. Instead of finding an exact solution, I want to try finding an approximate solution using a set of weights based on a parameterized formula. This might produce an approximate solution that is good enough for practical purposes while also being easier to compute for a model with a large number of coin toss events per round.
This is a continuation of a series of posts on weighted coin toss games. In previous posts, we explored variations of the weighted coin toss game using two, three, and four flips per round. In each variation, the game was described using a Markov model with a fixed number of coin toss events. This post presents a generalized form of the Markov model that can be used to model a game with an arbitrary number of coin toss events. I also show a few examples using a model of the coin toss game with ten flips per round.
The two previous posts demonstrated how to use the method of Lagrange multipliers to find the optimum solution for a coin toss game with biased coins of unknown weight. In one case, we found the minimum of a cost function based on the Lagrangian function. In the other case, we found the saddle point of the Lagrangian function itself. The purpose of this post is to provide some visual representations of these functions.
In the last post, we explored the use of gradient descent and other optimization methods to find the root of a Lagrangian function. These optimization methods work by finding the minimum of a cost function. In this post, I want to explore the multivariate form of Newton’s method as an alternative. Unlike optimization methods such as gradient descent, Newton’s method can find solutions that lie on a saddle point, eliminating the need for a cost function. This may or may not be a better approach.
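The core of the multivariate method is to repeatedly solve the linear system J(x) dx = -F(x) and apply the update. A minimal two-variable sketch (the system below is my own toy example, not the coin toss model):

```python
def newton_2d(F, J, x0, steps=25):
    """Multivariate Newton's method for a two-equation system:
    at each step, solve J(x, y) * (dx, dy) = -F(x, y) for the
    update (here via Cramer's rule) and apply it."""
    x, y = x0
    for _ in range(steps):
        f1, f2 = F(x, y)
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    return x, y

# Toy system: x^2 + y = 3 and x + y = 3, whose Jacobian
# rows are (2x, 1) and (1, 1).
root = newton_2d(lambda x, y: (x * x + y - 3.0, x + y - 3.0),
                 lambda x, y: ((2.0 * x, 1.0), (1.0, 1.0)),
                 (2.0, 1.0))
```

Because the iteration seeks a root of the gradient rather than a minimum of a cost, it converges to saddle points just as readily as to minima, which is exactly the property needed for Lagrangian functions.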
My last few posts have centered around a weighted coin toss game in which the weights of a set of biased coins are determined based on a known target distribution. And while multiple solutions are possible, the inclusion of a scoring function allowed for a unique solution to be found. Until now, I was not sure how to include the scoring function in such a way that I could solve the problem numerically for an arbitrary number of coin tosses. In this post, I show how to use the method of Lagrange multipliers to minimize the scoring function while conforming to the constraints of the coin toss problem.
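The mechanics of the method are easiest to see on a toy problem (my own example, not the coin toss model): minimize f(x, y) = x² + y² subject to x + y = 1. Stationarity of the Lagrangian gives a small system of equations:

```python
def lagrangian_gradient(x, y, lam):
    """Gradient of the Lagrangian for the toy problem: minimize
    f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0,
    with L = x^2 + y^2 + lam * (x + y - 1)."""
    return (2 * x + lam, 2 * y + lam, x + y - 1)

# Setting the gradient to zero gives the linear system
# 2x + lam = 0, 2y + lam = 0, x + y = 1, which is solved by
# x = y = 1/2, lam = -1: the constrained minimum.
solution = (0.5, 0.5, -1.0)
```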
The previous post demonstrates the use of a hill climbing algorithm to find a set of parameters that minimize a cost function associated with a coin toss game. In this post, I want to explore the use of a gradient descent algorithm as an alternative. The two classes of algorithms are very similar in that they both iteratively update an estimated parameter set. But while the hill climbing algorithm only updates one parameter at a time, the gradient descent approach updates all parameters at once, stepping in the direction of steepest descent.
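The update rule is just x ← x − η∇f(x), applied to every parameter simultaneously. A minimal sketch on a hypothetical cost function (my own example, not the coin toss cost):

```python
def gradient_descent(grad, x0, rate=0.1, steps=200):
    """Gradient descent: step every parameter simultaneously in the
    direction opposite the gradient of the cost function."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - rate * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical cost (x - 3)^2 + (y + 1)^2 with gradient
# (2(x - 3), 2(y + 1)); the minimum sits at (3, -1).
xmin = gradient_descent(lambda v: [2 * (v[0] - 3), 2 * (v[1] + 1)],
                        [0.0, 0.0])
```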
The hill climbing algorithm described in my previous post finds the weights of biased coins for a coin toss game in which the distribution of possible outcomes is known. In the example presented, there are many possible solutions. A cost function is used to find a valid solution, and a scoring function is used to narrow down the set of valid solutions to a single result. In this post, I want to look at some visualizations to get a better feel for how the algorithm works.
If you’re climbing a hill, you know you’ve reached the top when you can’t take any further steps that lead to a higher elevation. But if the hill is actually a plateau with a flat top, the topmost point you reach can depend largely on where you started climbing. In this post, I elaborate on the topic of my previous post titled Estimating the Weights of Biased Coins. This post presents the results of an improved hill climbing algorithm and also some ideas for ranking the different solutions that fall on a plateau of valid values.
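The stepping behavior described above can be sketched in a few lines. This is a generic version on a hypothetical cost function, not the post's implementation: nudge one randomly chosen parameter at a time and keep the change only if the cost strictly improves.

```python
import random

def hill_climb(cost, x0, step=0.01, iters=10000, seed=0):
    """Basic hill climbing: perturb one randomly chosen parameter
    per iteration, keeping the move only when the cost improves."""
    rng = random.Random(seed)
    x = list(x0)
    best = cost(x)
    for _ in range(iters):
        i = rng.randrange(len(x))
        delta = step if rng.random() < 0.5 else -step
        x[i] += delta
        c = cost(x)
        if c < best:
            best = c
        else:
            x[i] -= delta  # revert; no improvement
    return x

# Hypothetical cost with a unique minimum at (3, -1).
sol = hill_climb(lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2, [0.0, 0.0])
```

On a flat plateau the `c < best` test never fires, so the climber freezes wherever it first arrives, which is exactly the starting-point sensitivity discussed above.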
Suppose we flip a coin four times. If the coin lands on heads, we win a dollar. If the coin lands on tails, we lose a dollar. After four tosses of the coin, the best possible outcome is a winning total of four dollars. The worst possible outcome is a loss of four dollars. Let’s assume the coin is a biased coin. Furthermore, let’s also assume a different biased coin is used on each flip depending on the total amount won or lost since the beginning of the game. How can we determine the bias of each coin given a probability mass function of the expected outcome?
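Determining the biases from the target distribution is the inverse problem the post tackles; the forward direction, computing the outcome distribution from a given set of per-state weights, is straightforward to sketch (the data layout below is my own assumption):

```python
def outcome_distribution(weights, flips=4):
    """Forward model: given a win probability for each running
    total (state), return the distribution of final outcomes.
    `weights` maps current total -> P(heads); missing states
    default to a fair coin."""
    dist = {0: 1.0}
    for _ in range(flips):
        nxt = {}
        for total, prob in dist.items():
            p = weights.get(total, 0.5)
            nxt[total + 1] = nxt.get(total + 1, 0.0) + prob * p
            nxt[total - 1] = nxt.get(total - 1, 0.0) + prob * (1 - p)
        dist = nxt
    return dist

# With fair coins in every state, the outcome is a symmetric
# binomial spread over {-4, -2, 0, 2, 4}.
dist = outcome_distribution({})
```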
I want to experiment with modeling price changes over time as the combination of a smooth trend component overlaid with a random noise component. My goal is to examine the statistical properties of each constituent component and compare the results to the statistical properties of the undecomposed market price.
In my previous post, I explored the distribution of price fluctuations for a variety of different markets and time frames. Across all data sets, plotting the log returns in a histogram appears to roughly approximate the density function of a Laplace distribution. The intraday prices of the Chinese yuan, however, seem to exhibit a distinctly strange phenomenon.
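For reference, the log returns mentioned here are just the logarithms of consecutive price ratios. A minimal sketch (my own helper, not the post's code):

```python
import math

def log_returns(prices):
    """Log return between consecutive prices: ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]
```

A series of n prices yields n - 1 log returns, and unlike simple percentage returns, log returns add across periods.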
Are price fluctuations in the financial markets normally distributed? If I understand history correctly, it was the French mathematician Louis Bachelier who first explored this topic over 100 years ago. While Bachelier’s work assumed that price movements were normally distributed, a mathematician named Benoit Mandelbrot made some interesting observations that suggest otherwise.
In this post, I want to explore the logarithmic analogues of the normal and Laplace distributions. We can define a log-normal probability distribution as the distribution of a random variable whose logarithm is normally distributed. Likewise, a log-Laplace distribution is the distribution of a random variable whose logarithm has a Laplace distribution. If we have a given probability density function, how can we determine its logarithmic equivalent?
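The standard answer is the change of variables formula: if X = ln(Y) has density f_X, then f_Y(y) = f_X(ln y) / y for y > 0. A minimal sketch for the Laplace case (my own helper functions, with unit parameters for illustration):

```python
import math

def laplace_pdf(x, mu=0.0, b=1.0):
    """Density of the Laplace distribution."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

def log_laplace_pdf(y, mu=0.0, b=1.0):
    """Density of Y where ln(Y) is Laplace, by change of variables:
    f_Y(y) = f_X(ln y) / y for y > 0."""
    return laplace_pdf(math.log(y), mu, b) / y
```

The 1/y factor is the Jacobian of the transformation; omitting it is a common mistake that leaves the density unnormalized.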
I’m interested in studying the Laplace distribution. I was once under the impression that price fluctuations in the financial markets were normally distributed. However, as I plan to show in a later post, stock prices seem to move up and down according to a Laplace distribution instead. Before analyzing any historical price data, I first want to lay some groundwork and compare the Laplace distribution to the normal distribution.
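The two density functions are easy to place side by side. A minimal sketch with unit scale parameters, chosen purely for illustration (note the variances then differ; these helpers are mine, not the post's code):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def laplace_pdf(x, mu=0.0, b=1.0):
    """Density of the Laplace distribution: a sharper peak at the
    center and heavier (exponential rather than Gaussian) tails."""
    return math.exp(-abs(x - mu) / b) / (2 * b)
```

With these parameters the Laplace density is higher both at the center and far out in the tails, the qualitative signature that matters when comparing histograms of returns.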
When doing a regression analysis, you might want to weight some data points more heavily than others. For example, when fitting a model to historic stock price data, you might want to assign more weight to recently observed price values. In this post, I demonstrate how to estimate the coefficients of a linear model using weighted least squares regression. As with the previous post, I also show an alternative derivation using the maximum likelihood method.
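For a simple linear model, the weighted estimates have the same closed form as the ordinary ones, with weighted means replacing plain means. A minimal sketch (my own helper, not the post's derivation):

```python
def weighted_linear_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x: minimize
    sum_i w_i * (y_i - a - b*x_i)^2 using weighted means."""
    total = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / total
    my = sum(w * y for w, y in zip(ws, ys)) / total
    b = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    a = my - b * mx
    return a, b

# On data lying exactly on y = 1 + 2x, any positive weighting
# recovers the same line, since every residual is zero.
a, b = weighted_linear_fit([0.0, 1.0, 2.0, 3.0],
                           [1.0, 3.0, 5.0, 7.0],
                           [1.0, 2.0, 3.0, 4.0])
```

For the stock price use case above, `ws` could be an exponentially increasing sequence so that recent observations dominate the fit.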
The method of least squares estimates the coefficients of a model function by minimizing the sum of the squared errors between the model and the observed values. In this post, I show the derivation of the parameter estimates for a linear model. In addition, I show that the maximum likelihood estimation is the same as the least squares estimation when we assume the errors are normally distributed.
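For the linear case, the resulting estimates have a well-known closed form. A minimal sketch (my own helper, not the post's derivation):

```python
def linear_fit(xs, ys):
    """Ordinary least squares estimates for y = a + b*x:
    slope = cov(x, y) / var(x), intercept from the means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return intercept, slope

# Data lying exactly on y = 1 + 2x is recovered exactly.
intercept, slope = linear_fit([0.0, 1.0, 2.0, 3.0],
                              [1.0, 3.0, 5.0, 7.0])
```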
Moving averages are often overlaid on stock price charts to give a smooth representation of choppy price movements. But a simple moving average can lag significantly in a trending market. In this post, I explore the use of least squares regression methods to generate more accurate moving averages.
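One such regression-based smoother fits a line to each trailing window and takes the fitted value at the window's last point. This is a generic sketch of that idea (my own code, not necessarily the post's exact construction):

```python
def regression_ma(prices, window):
    """Moving linear regression: for each trailing window, fit a
    line by least squares and evaluate it at the window's end."""
    out = []
    xs = list(range(window))
    mx = sum(xs) / window
    sxx = sum((x - mx) ** 2 for x in xs)
    for i in range(window - 1, len(prices)):
        ys = prices[i - window + 1:i + 1]
        my = sum(ys) / window
        b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        a = my - b * mx
        out.append(a + b * (window - 1))
    return out

# On a perfectly linear trend, the regression endpoint tracks the
# price with no lag, whereas a simple moving average lags behind.
trend = [float(i) for i in range(10)]
smoothed = regression_ma(trend, 4)
```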
In my previous post titled Fixed Fractions and Fair Games, I explored the properties of two different betting strategies applied to a repeated coin toss game. The focus was on the expected value for each of the two betting strategies. In this post, I take a deeper look at the distribution of possible outcomes after a large number of plays.
A gambler has a $100 bankroll. He’s feeling lucky and he wants to make some bets. But he only wants to play fair games where the expectation is breakeven for a large number of plays. If the gambler plays a fair game repeatedly using a constant bet amount, would it still be a fair game if he decides to bet a fixed fraction of his bankroll instead of betting a fixed constant amount?
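The arithmetic behind this question is worth sketching (my own toy calculation, for an even-money coin flip). The expected bankroll stays flat under fixed-fraction betting, but the median outcome does not:

```python
def expected_wealth(bankroll, f, plays):
    """Expected bankroll after repeated even-money fair flips,
    betting fraction f each time: each play multiplies wealth by
    E[0.5*(1 + f) + 0.5*(1 - f)] = 1, so the expectation is flat."""
    return bankroll * ((0.5 * (1 + f) + 0.5 * (1 - f)) ** plays)

def median_wealth(bankroll, f, plays):
    """Median outcome (equal wins and losses) over an even number
    of plays: bankroll * (1 - f^2)^(plays/2), which shrinks."""
    return bankroll * ((1 + f) * (1 - f)) ** (plays / 2)
```

So the game remains fair in expectation, yet the typical (median) result drifts downward, a tension the post explores.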
Consider an at-the-money call option with a strike price of $50. The underlying asset is currently trading at $50 per share. Assume it’s a European-style option. One trader wants to take the long side of the contract. Another trader wants to take the short side. How can they agree on a fair price?
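One naive starting point (a toy framing of my own, not necessarily the post's approach) is the expected payoff under some assumed distribution of terminal prices, since a call pays max(S - K, 0) at expiration:

```python
def expected_call_payoff(strike, outcomes):
    """Expected payoff of a European call, max(S - K, 0), averaged
    over a set of assumed equally likely terminal prices. The list
    of outcomes is a hypothetical model input, not market data."""
    return sum(max(s - strike, 0.0) for s in outcomes) / len(outcomes)

# At-the-money $50 call over three hypothetical outcomes:
# payoffs are 0, 0, and 10, so the expected payoff is 10/3.
fair_value = expected_call_payoff(50.0, [40.0, 50.0, 60.0])
```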