Jim Killingsworth

Approximating the Target Distribution

In previous studies of the weighted coin toss game, our focus was on finding a set of weights for the biased coins that would yield a given target distribution for the expected outcome. In this post, I want to explore a different approach. Instead of finding an exact solution, I want to try finding an approximate solution using a set of weights based on a parameterized formula. This might produce an approximate solution that is good enough for practical purposes while also being easier to compute for a model with a large number of coin toss events per round.

Biases Based on a Formula

The probability mass function of the possible outcomes of the coin toss game is a function of the biases of the weighted coins. Using the generalized Markov model presented in the previous post, the probability mass function can be computed for a game with any number of coin toss events per round and any combination of weights. Consider a coin toss game with ten flips per round, and suppose the coin being tossed is always a fair coin:

Figure 1
Figure 2
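The previous post defines this Markov model precisely. As a rough, self-contained sketch of the idea, the game can be treated as a random walk in which the bias of the coin being tossed depends only on the player's distance from the initial state; the representation below is my own, and the weight indexing is an assumption:

```python
def outcome_pmf(w, flips):
    # w[d] is the assumed probability that the coin tossed at distance d
    # from the initial state moves the player further away from it;
    # w[0] = 0.5 represents the fair coin at the initial state.
    dist = {0: 1.0}
    for _ in range(flips):
        nxt = {}
        for pos, p in dist.items():
            away = w[abs(pos)]
            step = 1 if pos >= 0 else -1
            nxt[pos + step] = nxt.get(pos + step, 0.0) + p * away
            nxt[pos - step] = nxt.get(pos - step, 0.0) + p * (1.0 - away)
        dist = nxt
    return dist

# With fair coins everywhere, ten flips reproduce the binomial shape:
# the center outcome has probability C(10, 5) / 2**10 = 0.24609375.
pmf = outcome_pmf([0.5] * 10, 10)
```

The distribution this produces is symmetric about the initial state by construction, which matters later when measuring the approximation error.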

In this case, we wind up with a binomial distribution for the expected outcomes. You can think of the weighted coin toss game as a system that maps a set of weights to a discrete probability distribution. The shape of the probability mass function is determined by the weights of the biased coins. Now suppose the weights of the biased coins are determined by a formula with a single parameter:

Figure 3

Here is an example of weights defined by the formula above and the probability distribution of the possible outcomes:

Figure 4
Figure 5

Here is another example with a different parameter value:

Figure 6
Figure 7

As you can see in the examples above, a single parameter can determine the shape of the probability mass function. The shape of the distribution appears to be either a stretched or compressed form of the binomial distribution, depending on whether the parameter is larger or smaller than 50%. Now let’s consider another formula, this time with two parameters:

Figure 8

Remember that, according to our model of the coin toss game, the coin in the initial state is always a fair coin:

Figure 9
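The figures give the exact formulas. A hedged reading, consistent with the straight-line description in the final section, is that the one-parameter formula assigns the same weight to every coin away from the initial state, while the two-parameter formula interpolates linearly between the two parameter values. The endpoints of the interpolation below are my assumption:

```python
def constant_weights(p, flips):
    # One parameter: every coin except the fair one at the initial
    # state gets the same weight p (assumed reading of the formula).
    return [0.5] + [p] * (flips - 1)

def linear_weights(a, b, flips):
    # Two parameters: weights vary linearly from a to b across the
    # states to the right of the initial state, which keeps every
    # weight on or between the two parameter values.
    w = [0.5]
    for i in range(1, flips):
        t = (i - 1) / max(flips - 2, 1)
        w.append(a + (b - a) * t)
    return w

# A parameter on each end pins every other weight between them.
w = linear_weights(0.2, 0.8, 10)
```

Setting both parameters to 50% collapses the linear formula back to all fair coins, which is worth keeping in mind for the first example below.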

Here is an example of weights defined by the formula above and the probability distribution of the possible outcomes:

Figure 10
Figure 11

Here is another example with different parameter values:

Figure 12
Figure 13

Again, the shape of the probability distribution can be determined by the parameters. Using two parameters gives a greater degree of variability than using a single parameter. However, a formula with only a few parameters cannot represent all possible shapes that the probability mass function can have. If we are trying to find a set of parameters that fits a given target distribution, we might not be able to get an exact match.

Approximation Error

We want to use the two-parameter formula described in the previous section to find an approximate solution for a given target distribution. For some distributions, we might be able to find an exact solution. For others, we want to find the closest match. To do this, we need to be able to quantify the difference between the target distribution and a proposed approximation. The number of unique data points we need to compare is given by:

Figure 14

This is fewer than the total number of possible outcomes. In our model, the distribution of possible outcomes is always symmetrical about the initial state, so we only need to compare values on one side. For each element, we can compute an error term that is the difference between the target value and the approximate value. We can then take the sum of the squared errors to give us a value that quantifies the approximation error:

Figure 15
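Whatever the exact count works out to, the comparison itself can be sketched in a few lines. The function takes the probabilities for the non-negative outcomes only (the symmetric half); the names are illustrative:

```python
def sum_squared_error(target_half, approx_half):
    # Compare only the unique points: symmetry about the initial
    # state makes the value at -k mirror the value at +k.
    return sum((t - a) ** 2 for t, a in zip(target_half, approx_half))

# Identical halves give zero error; each mismatch adds its square.
# Here the two mismatches of 0.0625 contribute 0.0078125 in total.
err = sum_squared_error([0.3125, 0.3125, 0.03125],
                        [0.3125, 0.25000, 0.09375])
```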

This allows us to quantify the approximation error and determine whether one approximation is better than another. A lower value means a better approximation; an exact match would have an error of zero. In the following sections, we’ll use this as a cost function for which we can find the optimal set of parameters using a hill climbing algorithm.
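The hill climbing algorithms from earlier posts may differ in detail; as a generic sketch, a grid-based greedy descent over the two parameters looks something like this:

```python
def hill_climb(cost, a, b, step=0.01, tol=1e-12):
    # Greedy descent on a grid: keep taking any single step that
    # lowers the cost, staying inside the valid range [0, 1] for
    # each parameter, until no neighboring point is an improvement.
    best = cost(a, b)
    moved = True
    while moved:
        moved = False
        for da, db in ((step, 0), (-step, 0), (0, step), (0, -step)):
            na, nb = a + da, b + db
            if 0.0 <= na <= 1.0 and 0.0 <= nb <= 1.0:
                c = cost(na, nb)
                if c < best - tol:
                    a, b, best, moved = na, nb, c, True
    return a, b, best

# On a convex bowl, the descent finds the minimum from any start.
a, b, err = hill_climb(lambda x, y: (x - 0.3) ** 2 + (y - 0.7) ** 2,
                       0.0, 0.0, step=0.1)
```

A convex cost surface is what makes this naive method safe: there is no local minimum to get stuck in, so the starting point only affects how long the walk takes.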

Approximating a Binomial Distribution

In this first example, let’s find a suitable approximation of the binomial distribution using the two-parameter formula for the weights. You might be able to guess what the solution is for this example, but let’s go through the motions. Here is our target distribution:

Figure 16

Given the target distribution, we can compute the approximation error for all possible approximations using the two-parameter formula for the weights of the biased coins. Here is a surface plot of the error function:

Figure 17

We want to find the lowest point on the surface. To do this, we can use a simple hill climbing algorithm like the ones used in previous posts. Here is a trace of the paths taken by the hill climbing algorithm for three different starting points:

Figure 18
Figure 19
Figure 20

Since the error function is convex, the algorithm always converges to the best solution, regardless of the initial guess. Here is the optimal set of weights and corresponding probability mass function for the approximate solution found:

Figure 21
Figure 22

Once we have found the optimal solution, we can then compute the sum of squared errors to give us a measure of how well the optimal solution approximates the target distribution:

Figure 23

In this case, it’s an exact match. The error function is zero at the optimal point. The approximated distribution is identical to the target distribution. As you might have guessed, a set of weights in which all coins are fair is the solution that yields the target distribution.
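For this first example, the whole search can be sketched end to end. Everything below rests on my assumed reading of the model (a random walk whose coin bias depends on the distance from the initial state, fair at the center) and of the two-parameter formula (linear interpolation between the two parameter values); under those assumptions, the search should settle at or near the fair-coin weights:

```python
def outcome_pmf(w, flips):
    # w[d]: assumed probability that the coin at distance d from the
    # initial state moves the player further away; w[0] = 0.5 is fair.
    dist = {0: 1.0}
    for _ in range(flips):
        nxt = {}
        for pos, p in dist.items():
            away = w[abs(pos)]
            step = 1 if pos >= 0 else -1
            nxt[pos + step] = nxt.get(pos + step, 0.0) + p * away
            nxt[pos - step] = nxt.get(pos - step, 0.0) + p * (1.0 - away)
        dist = nxt
    return dist

def linear_weights(a, b, flips):
    # Assumed two-parameter formula: linear from a to b off-center.
    return [0.5] + [a + (b - a) * (i - 1) / max(flips - 2, 1)
                    for i in range(1, flips)]

def cost(a, b, target_half, flips):
    # Sum of squared errors over the non-negative half of the outcomes.
    pmf = outcome_pmf(linear_weights(a, b, flips), flips)
    return sum((t - pmf.get(k, 0.0)) ** 2
               for t, k in zip(target_half, range(0, flips + 1, 2)))

def hill_climb(f, a, b, step=0.05, tol=1e-15):
    # Greedy grid descent, as in the earlier sketch.
    best = f(a, b)
    moved = True
    while moved:
        moved = False
        for da, db in ((step, 0), (-step, 0), (0, step), (0, -step)):
            na, nb = a + da, b + db
            if 0.0 <= na <= 1.0 and 0.0 <= nb <= 1.0:
                c = f(na, nb)
                if c < best - tol:
                    a, b, best, moved = na, nb, c, True
    return a, b, best

flips = 10
binomial = outcome_pmf([0.5] * flips, flips)   # the target distribution
target_half = [binomial[k] for k in range(0, flips + 1, 2)]
a, b, err = hill_climb(lambda x, y: cost(x, y, target_half, flips),
                       0.2, 0.8)
```

At the parameters 50% and 50% the linear formula yields all fair coins, so the cost there is exactly zero, matching the exact solution described above.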

Approximating a Triangular Distribution

Let’s take a look at another example. What happens if we try to approximate a triangular distribution? The solution for this example might not be as obvious as it was for the last one. Here is what the target distribution looks like:

Figure 24

Given the target distribution, we can compute the approximation error for all possible approximations using the two-parameter formula for the weights of the biased coins. Here is a surface plot of the error function:

Figure 25

We want to find the lowest point on the surface. To do this, we can use a simple hill climbing algorithm like the ones used in previous posts. Here is a trace of the paths taken by the hill climbing algorithm for three different starting points:

Figure 26
Figure 27
Figure 28

Since the error function is convex, the algorithm always converges to the best solution, regardless of the initial guess. Here is the optimal set of weights and corresponding probability mass function for the approximate solution found:

Figure 29
Figure 30

Once we have found the optimal solution, we can then compute the sum of squared errors to give us a measure of how well the optimal solution approximates the target distribution:

Figure 31

The error is not zero, so we don’t have an exact match. But eyeballing the approximated distribution, it looks like a pretty close match to the target distribution. The weights are similar to those found for the triangular distribution in the previous post.

Approximating an Exponential Distribution

Now let’s consider a third example. In this example, we want to try to approximate an exponential distribution. How close a match do you think we can find using this estimation technique? Here is the target distribution we want to approximate:

Figure 32

Given the target distribution, we can compute the approximation error for all possible approximations using the two-parameter formula for the weights of the biased coins. Here is a surface plot of the error function:

Figure 33

We want to find the lowest point on the surface. To do this, we can use a simple hill climbing algorithm like the ones used in previous posts. Here is a trace of the paths taken by the hill climbing algorithm for three different starting points:

Figure 34
Figure 35
Figure 36

Since the error function is convex, the algorithm always converges to the best solution, regardless of the initial guess. Here is the optimal set of weights and corresponding probability mass function for the approximate solution found:

Figure 37
Figure 38

Once we have found the optimal solution, we can then compute the sum of squared errors to give us a measure of how well the optimal solution approximates the target distribution:

Figure 39

Again, the approximation error is a nonzero value, so we don’t have an exact match. Nonetheless, the error is still quite small, and the shape of the approximated distribution still looks quite similar to that of the target distribution. As in the previous example, the weights are quite similar to those found for the exponential distribution in the previous post.

Possible Improvements

In some cases, the minimum value of the error function might lie at a point outside the range of possible values for the weights. Since all of the weights of the biased coins to the right of the initial state must fall somewhere on or between the two parameter values, and since the parameters are limited to values between zero and one, we might not be able to find the best approximation using this approach. A more accurate approximation might be found by allowing the parameters to float outside the boundary. Taking this approach, we would have to cap the weights of the biased coins that would otherwise fall outside the range of valid values before computing the approximation error.
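The capping step described above can be sketched in one line. Note that it is the computed weights, not the parameters themselves, that get capped; the function name is illustrative:

```python
def cap_weights(w):
    # Let the parameters float freely, but cap each resulting weight
    # into the valid probability range [0, 1] before computing the
    # approximation error. Weights already in range pass through.
    return [min(max(x, 0.0), 1.0) for x in w]

# A parameter pair like (-0.2, 1.3) produces some invalid weights;
# capping flattens only the out-of-range ends of the line.
capped = cap_weights([-0.2, 0.5, 1.3])
```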

Another thing that might improve the accuracy of the approximation would be to use some sort of curve instead of a straight line for the formula that determines the weights of the biased coins. I think this would require more than two parameters.

Accompanying source code is available on GitHub.
