Jim Killingsworth

Weighted Linear Regression

When doing a regression analysis, you might want to weight some data points more heavily than others. For example, when fitting a model to historical stock price data, you might want to assign more weight to recently observed price values. In this post, I demonstrate how to estimate the coefficients of a linear model using weighted least squares regression. As with the previous post, I also show an alternative derivation using the maximum likelihood method.

Least Squares Estimation

Suppose we have a set of data points that we expect to fall on a line given by the following linear equation:

$$ \hat{y} = b_0 + b_1 x $$

The observed data, however, contain errors for values on the vertical axis. For each data point, we define the error as the difference between the observed value and the fitted value of the linear model:

$$ e_i = y_i - (b_0 + b_1 x_i) $$

If we were performing an ordinary least squares regression, we would want to find the coefficients for the linear model that minimize the sum of the squared errors. But in this case, we want to consider the weighted sum of squares:

$$ S = \sum_{i=1}^{n} w_i e_i^2 $$

There is a unique weight associated with the error of each observation. Some values are counted more than others, depending on the scheme used to determine the weights. Let's treat the weighted sum of squares as a function of the coefficients:

$$ S(b_0, b_1) = \sum_{i=1}^{n} w_i \left( y_i - b_0 - b_1 x_i \right)^2 $$

Following the same approach we used in the previous post, we can estimate the coefficients of the model function by finding the values that minimize the weighted sum of squares. We take the partial derivative of the weighted sum of squares function with respect to each of the coefficients, set the derivative to zero, and then solve for the coefficient. Here are the derivatives with respect to each coefficient:

$$ \frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} w_i \left( y_i - b_0 - b_1 x_i \right) $$

$$ \frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} w_i x_i \left( y_i - b_0 - b_1 x_i \right) $$

Setting the derivative with respect to the first coefficient to zero, we get the following result:

$$ \sum_{i=1}^{n} w_i \left( y_i - b_0 - b_1 x_i \right) = 0 $$

Rearranging the equation and solving for the coefficient:

$$ b_0 = \frac{\sum_{i=1}^{n} w_i y_i - b_1 \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} $$

Setting the derivative with respect to the second coefficient to zero, we get the following result:

$$ \sum_{i=1}^{n} w_i x_i \left( y_i - b_0 - b_1 x_i \right) = 0 $$

Rearranging the equation and solving for the coefficient:

$$ b_1 = \frac{\sum_{i=1}^{n} w_i x_i y_i - b_0 \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i x_i^2} $$

If you plug in the weights and the observed values, finding the coefficients is fairly straightforward. Since each expression depends on the other coefficient, the two equations are solved together as a pair of simultaneous equations. Notice that if all the weights are equal, the result is the same as the ordinary least squares method presented in the previous post.
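
To make the arithmetic concrete, here is a minimal Python sketch (not from the original post) that plugs the weighted sums into the two conditions above and solves them simultaneously; the function name and the sample data are invented for illustration:

```python
import numpy as np

def weighted_least_squares(x, y, w):
    # Estimate the intercept b0 and slope b1 that minimize the
    # weighted sum of squared errors by solving the two first-order
    # conditions derived above as a pair of simultaneous equations.
    x, y, w = np.asarray(x, float), np.asarray(y, float), np.asarray(w, float)
    sw   = w.sum()
    swx  = (w * x).sum()
    swy  = (w * y).sum()
    swxx = (w * x * x).sum()
    swxy = (w * x * y).sum()
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1

# Example: later observations carry more weight.
x = [1, 2, 3, 4, 5]
y = [1.1, 1.9, 3.2, 3.9, 5.2]
w = [1, 1, 2, 3, 5]
print(weighted_least_squares(x, y, w))
```

Setting all of the weights to one reproduces the ordinary least squares estimates.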

Maximum Likelihood Estimation

Let's assume the errors are normally distributed around the model. Recall the probability density function for the normal distribution:

$$ f(y \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left( -\frac{(y - \mu)^2}{2 \sigma^2} \right) $$

For a given set of observations, the likelihood of a particular mean and standard deviation is the product of the probability densities of the observations, evaluated at that mean and standard deviation. But how do we weight one observation differently than another? For each observation, we can raise the probability density to the power of the weight associated with that observation:

$$ L(\mu, \sigma) = \prod_{i=1}^{n} \left[ f(y_i \mid \mu, \sigma) \right]^{w_i} $$

If the weight of one observation is twice that of all the others, for example, then it is treated as if the measurement had appeared twice in the observed data set. The estimated mean and standard deviation values can be found by maximizing the likelihood function. To make things easier, we can work with the log-likelihood function instead:

$$ \ln L(\mu, \sigma) = \sum_{i=1}^{n} w_i \left[ -\ln\left( \sigma \sqrt{2 \pi} \right) - \frac{(y_i - \mu)^2}{2 \sigma^2} \right] $$
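
To see this weighting in action, here is a small Python check (my own sketch, not part of the original post) showing that giving an observation a weight of two produces the same log-likelihood as listing that observation twice with unit weights; the values used are arbitrary:

```python
import math

def normal_logpdf(y, mu, sigma):
    # Log of the normal probability density function.
    return -math.log(sigma * math.sqrt(2 * math.pi)) - (y - mu) ** 2 / (2 * sigma ** 2)

def weighted_log_likelihood(ys, ws, mu, sigma):
    # Raising each density to the power of its weight multiplies
    # each log density by that weight.
    return sum(w * normal_logpdf(y, mu, sigma) for y, w in zip(ys, ws))

# Giving the last observation a weight of two ...
a = weighted_log_likelihood([1.0, 2.0, 3.0], [1, 1, 2], mu=2.0, sigma=1.0)

# ... gives the same value as listing it twice with unit weights.
b = weighted_log_likelihood([1.0, 2.0, 3.0, 3.0], [1, 1, 1, 1], mu=2.0, sigma=1.0)

print(a, b)
```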

Let's replace the mean with the body of the model function and treat the log-likelihood function as a function of the coefficients we want to solve for:

$$ \ln L(b_0, b_1, \sigma) = \sum_{i=1}^{n} w_i \left[ -\ln\left( \sigma \sqrt{2 \pi} \right) - \frac{(y_i - b_0 - b_1 x_i)^2}{2 \sigma^2} \right] $$

Now we can find the maximum of the log-likelihood function and solve for the coefficients using the same approach as before. Here are the partial derivatives of the log-likelihood function with respect to each of the coefficients:

$$ \frac{\partial \ln L}{\partial b_0} = \frac{1}{\sigma^2} \sum_{i=1}^{n} w_i \left( y_i - b_0 - b_1 x_i \right) $$

$$ \frac{\partial \ln L}{\partial b_1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} w_i x_i \left( y_i - b_0 - b_1 x_i \right) $$

Setting the derivative with respect to the first coefficient to zero, we get the following result:

$$ \sum_{i=1}^{n} w_i \left( y_i - b_0 - b_1 x_i \right) = 0 $$

Rearranging the equation and solving for the coefficient:

$$ b_0 = \frac{\sum_{i=1}^{n} w_i y_i - b_1 \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} $$

Setting the derivative with respect to the second coefficient to zero, we get the following result:

$$ \sum_{i=1}^{n} w_i x_i \left( y_i - b_0 - b_1 x_i \right) = 0 $$

Rearranging the equation and solving for the coefficient:

$$ b_1 = \frac{\sum_{i=1}^{n} w_i x_i y_i - b_0 \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i x_i^2} $$

As expected, the weighted maximum likelihood estimation gives the same result as the weighted least squares estimation when we assume the errors are normally distributed. While I could go a step further and solve for the standard deviation, I'm going to stop here. I'd like to do a more in-depth study of variances at another time.
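
As a numerical check on this equivalence (again my own sketch, not part of the original post), the snippet below maximizes the weighted log-likelihood with scipy.optimize.minimize and compares the fitted coefficients against the closed-form weighted least squares values; the data, weights, and starting point are made up:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up sample data; later observations carry more weight.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.2])
w = np.array([1.0, 1.0, 2.0, 3.0, 5.0])

def negative_log_likelihood(params):
    # Weighted normal log-likelihood with the mean replaced by the
    # linear model b0 + b1 * x, negated so it can be minimized.
    b0, b1, sigma = params
    if sigma <= 0:
        return np.inf
    r = y - (b0 + b1 * x)
    loglik = np.sum(w * (-np.log(sigma * np.sqrt(2 * np.pi)) - r ** 2 / (2 * sigma ** 2)))
    return -loglik

result = minimize(negative_log_likelihood, x0=[0.0, 1.0, 1.0], method="Nelder-Mead")
b0_ml, b1_ml, sigma_ml = result.x

# Closed-form weighted least squares coefficients, for comparison.
sw, swx, swy = w.sum(), (w * x).sum(), (w * y).sum()
swxx, swxy = (w * x * x).sum(), (w * x * y).sum()
b1_ls = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
b0_ls = (swy - b1_ls * swx) / sw

print(b0_ml, b1_ml)  # should match the weighted least squares values
print(b0_ls, b1_ls)
```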
