The linear regression is the second step, and here are the results (with an R^2 value of about 0.99).
(Numbers derived from http://kenpom.com)
Step one is a bit harder in some ways, and should probably be done on a team-by-team basis. We'll cover that soon.
Wouldn't it be better to regress the four factors onto efficiency margin, then adjust that for pace? Possessions (and the intercept) shouldn't have a nonzero coefficient, or else the predicted MOV will change depending on which team you label as the offense.
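To make the labeling argument concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the weights, and the team numbers are made up for illustration and are not the post's fitted values. It assumes a game-level linear model in which one team's factors get the weights and the other team's factors get the mirrored (negated) weights. Predicting the game both ways and adding the two results should give zero; any intercept or possessions term breaks that.

```python
# Hypothetical illustration: names, weights and team values are NOT from the post.
FACTORS = ("efg", "to", "or", "ftr")

def predict_mov(offense, defense, possessions, weights,
                intercept=0.0, poss_weight=0.0):
    """Linear MOV model: 'offense' factors get +weights, 'defense' factors get -weights."""
    diff = sum(weights[f] * (offense[f] - defense[f]) for f in FACTORS)
    return intercept + poss_weight * possessions + diff

def label_asymmetry(team_a, team_b, possessions, weights,
                    intercept=0.0, poss_weight=0.0):
    """Sum of the predictions under both labelings; a consistent model gives 0."""
    a_as_offense = predict_mov(team_a, team_b, possessions, weights, intercept, poss_weight)
    b_as_offense = predict_mov(team_b, team_a, possessions, weights, intercept, poss_weight)
    return a_as_offense + b_as_offense

if __name__ == "__main__":
    weights = {"efg": 1.0, "to": -1.0, "or": 0.5, "ftr": 0.1}     # made-up weights
    team_a = {"efg": 52.0, "to": 18.0, "or": 35.0, "ftr": 38.0}   # made-up team
    team_b = {"efg": 48.0, "to": 21.0, "or": 31.0, "ftr": 34.0}   # made-up team
    print(label_asymmetry(team_a, team_b, 68, weights))
    print(label_asymmetry(team_a, team_b, 68, weights, intercept=-1.0, poss_weight=0.05))
```

In this sketch the leftover is always twice the intercept plus twice the possessions term, so it vanishes only when both are zero.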
I did a regression using only 2009 numbers, and got the following coefficients:
Intercept 0
Oefg 1.321490754
Oto -1.21443408
Oor 0.632136795
Oftr 0.104561921
Defg -1.321490754
Dto 1.21443408
Dor -0.632136795
Dftr -0.104561921
Those are all exactly the same as yours, but multiplied by a constant (1.350315943). My R^2 is only 0.97.
How did you end up with a negative intercept anyway? Were the losers always the offense?
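As a rough sketch of how coefficients like the 2009 ones listed above could be applied, the Python below multiplies each factor difference by its coefficient and then scales by pace. Several things here are assumptions rather than anything stated in the thread: that the factors are expressed in percentage points, that the fit targets efficiency margin per 100 possessions, and that the pace adjustment is a simple multiplication by possessions / 100. The function names and the sample inputs are hypothetical.

```python
# Offensive coefficients as quoted in the comment; the defensive ones are their negatives,
# so the model only depends on the difference (own - allowed) for each factor.
COEFS_2009 = {
    "efg": 1.321490754,
    "to": -1.21443408,
    "or": 0.632136795,
    "ftr": 0.104561921,
}

def efficiency_margin(own, allowed):
    """Predicted margin, ASSUMED to be per 100 possessions, from the four factors."""
    return sum(coef * (own[name] - allowed[name]) for name, coef in COEFS_2009.items())

def margin_of_victory(own, allowed, possessions):
    """ASSUMED pace adjustment: scale the per-100-possession margin to the game's pace."""
    return efficiency_margin(own, allowed) * possessions / 100.0

if __name__ == "__main__":
    own = {"efg": 52.0, "to": 20.0, "or": 33.0, "ftr": 36.0}       # made-up inputs
    allowed = {"efg": 48.0, "to": 20.0, "or": 33.0, "ftr": 36.0}   # made-up inputs
    print(margin_of_victory(own, allowed, possessions=68))
    # Swapping the two sides flips the sign, because there is no intercept.
    print(margin_of_victory(allowed, own, possessions=68))
```

With a zero intercept and mirrored coefficients, equal inputs on both sides give a margin of zero, which is the symmetry the earlier comment asks for.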
I forgot I had auto-calculations turned off in Excel. Those obviously aren't all multiplied by the same constant.
Oops.
Good point about the possessions - that was pretty stupid of me! Part of the problem is that I was using this on team stats, not game stats.
I suppose I should zero the intercept! I just find it interesting that I have such a high R^2 value! Definitely inflated it with the possessions coefficient.