Visualizing Saddle Points and Minimums
The two previous posts demonstrated how to use the method of Lagrange multipliers to find the optimum solution for a coin toss game with biased coins of unknown weight. In one case, we found the minimum of a cost function based on the Lagrangian function. In the other case, we found the saddle point of the Lagrangian function itself. The purpose of this post is to provide some visual representations of these functions.
The Model
The model of the coin toss game we’ll use here is a simplified version. It uses only two flips per round instead of three flips or four flips per round as used in previous examples. Here is what the Markov model looks like:
The first coin toss is always a fair coin, by definition. Let’s assume the probability of tossing both a head and a tail, in any order, is twice that of getting either two heads in a row or two tails in a row. Let’s also assume a scoring function that gives preference to weights that are as close to that of a fair coin as possible. The Lagrangian function for this model looks like this:
Like we did previously, we can derive a cost function from the Lagrangian function by computing the magnitude of the gradient of the Lagrangian function. We’ll take the square of the magnitude to simplify the calculation. Here is the cost function:
For such a simple model, it might be obvious what the solution is. The critical points for both functions lie at the same location. But for the Lagrangian function, the critical point exists at a saddle point, whereas the critical point for the cost function lies at the minimum. The illustrations in the following sections provide the visual representations.
Lagrangian Function
The optimum value lies at the point at which the gradient is equal to the zero vector:
Here is a heatmap of the Lagrangian function:
Here is a surface plot of the function at three slightly different angles:
Here is the profile of a slice of the function taken across a diagonal line:
Here is the profile of another slice taken across a different diagonal:
The surface plot resembles the shape of a cowboy hat or a western horse saddle. The left and right sides are tilted upwards, while the front and back sides are tilted downwards. In the profile charts, one is convex while the other is concave. This type of curvature is characteristic of a function with a saddle point.
Cost Function
The optimum value lies at the point at which the gradient is equal to the zero vector:
Here is a heatmap of the cost function:
Here is a surface plot of the function at three slightly different angles:
Here is the profile of a slice of the function taken across a diagonal line:
Here is the profile of another slice taken across a different diagonal:
The surface plot resembles the shape of an elongated bowl or an enclosed valley. All sides are tilted upwards. In the profile charts, both profiles show a convex curve. As a matter of fact, all slices through the optimum point will be convex. This type of curvature is characteristic of a function with a minimum point.
Comments