For all your fancy-pants statistical needs.

Praise for The Basketball Distribution:

"...confusing." - CBS
"...quite the pun master." - ESPN

Estimated defensive rating formula, with Usage%!

EDIT/UPDATE: This formula, like Dean Oliver's, is based on some good theory, but the more I have examined it, the clearer it has become that it is a very poor measure of defensive success. If you need a quick fix, the following explains player defense better than the formula described below:

(Points Allowed On Court / Possessions Played) - (Points Allowed Off Court / Possessions Off Court)
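For anyone who wants to try the quick fix, here is a minimal sketch of it in Python. The function name, argument names, and sample numbers are all made up for illustration; negative values mean the team allowed fewer points per possession with the player on the court.

```python
def on_off_defense(pts_allowed_on, poss_on, pts_allowed_off, poss_off):
    """Points allowed per possession on court minus points allowed per possession off court."""
    return pts_allowed_on / poss_on - pts_allowed_off / poss_off


# Hypothetical season totals, not real data.
diff = on_off_defense(pts_allowed_on=520, poss_on=500,
                      pts_allowed_off=330, poss_off=300)
print(round(diff, 3))
```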


Woo! This one took a lot of work, but I think I have all of the theoretical errors taken care of. It's very similar to Dean Oliver's box-score formula, but with a few important adjustments:

-'Points allowed' are assigned individually based on estimated output per possession in units of 0, 1, 2, and 3, based on Ryan Parker's bachelor essay (Rather than only assigning players Stop values that add a marginal 'DefensivePointsPerScoringPossession' per stop)


-For each possession-allowed type (0, 1, 2, and 3), we estimate the effectiveness of the player's defense (via blocks, defensive rebounds, turnovers, and player fouls), and we also adjust more intuitively for our unknowns (most importantly, field-goal misses that were not forced by a block). This means we don't have to rely on shoving 100% of the Team Defensive Rating into the final step of the formula.


-Defensive possessions used are calculated from the marginal used possessions in our estimates; the base rating still lies close to 20%, but is modified only in part by blocks, steals, personal fouls, and defensive rebounds.

This is for college ball, since that's where Ryan's estimates of possession-endings come from; however, the forced free throws come from my NBA-team-estimate (which is pretty lazy currently).

Quick definitions for the uninformed:
DFG% = opponents' Field Goals Made / opponents' Field Goal Attempts
DOR% = opponents' Offensive Rebounds / (opponents' Offensive Rebounds + team Defensive Rebounds)
PF = player personal fouls
dFTA = opponents' Free Throw Attempts
Blk = player blocks
tmBlk = team blocks
DR = player defensive rebounds
dFT% = opponents' Free Throws Made / opponents' Free Throw Attempts
dFGA = opponents' Field Goal Attempts
dFGM = opponents' Field Goals Made
d3PM = opponents' made three-pointers
Stl = player steals
tmStl = team steals
dTO = opponents' turnovers
Poss = team possessions, as estimated here

Let the math begin!
tMin% (team minute %)= .2 * minutes / game minutes = minutes / team minutes
(this is our basic estimate of player defensive involvement for the whole game in places where we can't assume otherwise)

PossPI (possessions played in)= Team Possessions * tMin% * 5
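A quick sketch of these two steps in Python, in case it helps. The function names are mine, and the 40-minute default is an assumption (a regulation college game with no overtime).

```python
def t_min_pct(minutes, game_minutes=40.0):
    """tMin%: the player's share of all team minutes (five players on the floor)."""
    # game_minutes=40 is an assumption: a regulation college game, no overtime.
    return 0.2 * minutes / game_minutes


def poss_played_in(team_poss, tmin):
    """PossPI: team possessions the player was on the court for."""
    return team_poss * tmin * 5


# Hypothetical player: 30 minutes in a 70-possession game.
tmin = t_min_pct(30)
print(tmin, poss_played_in(70, tmin))
```

Note that a player who plays the whole game gets tMin% = 0.2 and PossPI equal to the team's possessions.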

FMW (forced-miss-weight) = (DFG%*(1-DOR%)) / (DFG%*(1-DOR%) + (1-DFG%)*DOR%)
(Same as Dean Oliver's formula: it splits the credit for a missed field goal between the player guarding the shooter and the player who gets the defensive rebound. The guarding man gets FMW, the defensive rebounder gets 1-FMW.)
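Here is the forced-miss weight as a small Python function, just a sketch with names of my own choosing. Both inputs are fractions between 0 and 1, and the sample values are invented.

```python
def forced_miss_weight(dfg_pct, dor_pct):
    """FMW: share of a missed field goal credited to the defender who forced it."""
    forced = dfg_pct * (1 - dor_pct)
    rebounded = (1 - dfg_pct) * dor_pct
    return forced / (forced + rebounded)


# Hypothetical opponent numbers: 45% shooting, 30% offensive rebounding.
fmw = forced_miss_weight(0.45, 0.30)
print(fmw, 1 - fmw)  # guarding man's share, defensive rebounder's share
```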

eFFTA (estimated forced free-throw-attempts) = (.6033*PF^1.2132)
(This is the basic team-level estimate I got from the NBA)

FFTA (forced free-throw-attempts) = eFFTA * (dFTA / team's sum of eFFTA)
(This scales the prior number so that the team's total forced free throw attempts equal the opponents' actual free throw attempts)
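The two free-throw steps can be sketched together in Python. The function names and the foul counts are hypothetical; the only numbers taken from the post are the .6033 and 1.2132 constants. The zero-fouls guard is my own addition to avoid dividing by zero.

```python
def estimated_ffta(pf):
    """Per-player estimate of forced free throw attempts from personal fouls."""
    return 0.6033 * pf ** 1.2132


def forced_fta(team_pf, d_fta):
    """Rescale the per-player estimates so they sum to the opponents' actual FTA."""
    raw = [estimated_ffta(pf) for pf in team_pf]
    total = sum(raw)
    if total == 0:  # nobody fouled: nothing to distribute (my own guard)
        return [0.0 for _ in raw]
    return [r * d_fta / total for r in raw]


# Hypothetical box score: personal fouls per player, opponents shot 20 free throws.
ffta = forced_fta([3, 2, 4, 1, 0, 2], d_fta=20)
print([round(x, 2) for x in ffta])
```

Because the exponent is above 1, a player with four fouls is charged with more than twice the attempts of a player with two.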

FMstops (stops from forced misses) = (Blk + tMin%*(dFGA-dFGM-tmBlk))*FMW*(1-DOR%) + DR*(1-FMW)


(Defensive rebounds are worth 1-FMW, blocks are worth 1*FMW, and we assume that all other missed field goals can be distributed equally, by tMin%. My NBA-team data showed zero correlation between Blocks and Non-Blocked-Field-Goal-Misses).
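A rough Python version of FMstops, with the unblocked misses shared out by tMin%. The argument names are mine and the example inputs are invented.

```python
def fm_stops(blk, dr, tmin, d_fga, d_fgm, tm_blk, fmw, dor_pct):
    """Stops credited to a player from forced misses and defensive rebounds."""
    unblocked_misses = d_fga - d_fgm - tm_blk   # opponent misses nobody blocked
    forced = blk + tmin * unblocked_misses      # own blocks plus a minutes share
    return forced * fmw * (1 - dor_pct) + dr * (1 - fmw)


# Hypothetical player line: 2 blocks, 6 defensive rebounds, tMin% of 0.15.
stops = fm_stops(blk=2, dr=6, tmin=0.15, d_fga=60, d_fgm=25,
                 tm_blk=5, fmw=0.65625, dor_pct=0.30)
print(round(stops, 3))
```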

0pdp (zero-point defensive possessions)
= FMstops + .27*(FFTA - FFTA*dFT%) + Stl + tMin%*(dTO - tmStl)

(Gives each player full credit for their steals, and then distributes all other turnovers equally. NBA team data also seemed to show no correlation between Steals and Non-Steal-Turnovers. This, like the rest of the 'pdp' formulas, is based on the possession-ending estimates in Parker's bachelor essay.)
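As a sketch, the zero-point possessions look like this in Python. The argument names and the sample inputs are hypothetical; the .27 weight is the one from the formula.

```python
def zero_pdp(fm_stops, ffta, dft_pct, stl, tmin, d_to, tm_stl):
    """Possessions the player is credited with ending for zero points."""
    missed_ft = ffta - ffta * dft_pct    # forced free throws that were missed
    other_to = tmin * (d_to - tm_stl)    # non-steal turnovers, shared by minutes
    return fm_stops + 0.27 * missed_ft + stl + other_to


# Hypothetical inputs: 5 forced-miss stops, 4 forced FTA, 70% opponent FT shooting.
p0 = zero_pdp(fm_stops=5.0, ffta=4.0, dft_pct=0.7, stl=2,
              tmin=0.15, d_to=14, tm_stl=6)
print(round(p0, 3))
```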

1pdp
= .35*FFTA - .25*FFTA*dFT%

2pdp
= .95*tMin%*(dFGM - d3PM + (tmBlk - Blk)) - Blk + .36*FFTA*dFT%

This spreads the made 2-pointers out among all players, but trades out the appropriate credit for blocks. This might look a little counter-intuitive, so I might talk a bit more about this in comments or a later post. Also, we assume that each player only blocks 2-pointers.

3pdp
=tMin%*(d3PM+.02*(dFGM-d3PM)) +.03*FFTA*dFT%

dPA=1pdp + 2*2pdp + 3*3pdp
(Defensive points allowed. 1 for 1-point possessions, etc)
dPOSS=0pdp + 1pdp + 2pdp + 3pdp
(Total defensive-possessions the player is credited for ending.)

DRTG=100*(dPA/dPOSS)
dUSG%=dPOSS/PossPI
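Putting the last few steps together, here is a minimal Python sketch of the scoring-possession formulas and the final rating. The function names are mine, the zero-point possessions and possessions-played-in are taken as given inputs, and the example numbers are invented.

```python
def one_pdp(ffta, dft_pct):
    return 0.35 * ffta - 0.25 * ffta * dft_pct


def two_pdp(tmin, d_fgm, d_3pm, tm_blk, blk, ffta, dft_pct):
    shared_twos = 0.95 * tmin * (d_fgm - d_3pm + (tm_blk - blk))
    return shared_twos - blk + 0.36 * ffta * dft_pct


def three_pdp(tmin, d_fgm, d_3pm, ffta, dft_pct):
    return tmin * (d_3pm + 0.02 * (d_fgm - d_3pm)) + 0.03 * ffta * dft_pct


def defensive_rating(p0, p1, p2, p3, poss_pi):
    """Return (DRTG, dUSG%) from the four possession counts and PossPI."""
    dpa = p1 + 2 * p2 + 3 * p3      # defensive points allowed
    dposs = p0 + p1 + p2 + p3       # defensive possessions ended
    return 100 * dpa / dposs, dposs / poss_pi


# Hypothetical player: tMin% 0.15, 2 blocks, 4 forced FTA, 8.524 zero-point possessions.
p1 = one_pdp(4.0, 0.7)
p2 = two_pdp(0.15, 25, 6, 5, 2, 4.0, 0.7)
p3 = three_pdp(0.15, 25, 6, 4.0, 0.7)
drtg, dusg = defensive_rating(8.524, p1, p2, p3, poss_pi=52.5)
print(round(drtg, 1), round(dusg, 3))
```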

Whew!
Here is the formula in action (from Saturday's Carolina game):




Edit: If you're wondering how effective this really is, I applied the ratings to NBA players with median minutes or more and converted them to defensive win shares. Check those out and compare them with basketball-reference's 2010-2011 season Defensive Win Shares.

6 comments:

  1. To be clear, is your FMW (forced-miss-weight) going to the offensive player's counterpart defender? If so, as your text seems to suggest, what basis did you use for the play-by-play defensive position assignments with the NBA data?

    FMW (forced-miss-weight) going to the offensive player's counterpart would be a big change from Oliver's Defensive Rating, as is varying the % of defensive possessions faced from an assumed flat 20%.

    Have you compared the values in your formula to Evan Zamir's EZPM?

    Would you be willing to lead in running correlations for your defensive measure, Defensive Win Shares, EZPM and defensive Adjusted +/-?

    Given that your data is in .pdf here, I don't immediately know how to get it into Excel without redoing the data entry. Is there a way to do that, or to get your data in an Excel file?

    Crow

  2. Hey Crow.

    I don't have the Excel file handy, but I can get it later today. FMW is spread throughout the entire team (not counterparts). I've toyed with the idea of counterparts, but have been slowed down by the difficulties in acquiring such data. I would, however, like to find a way to make this data more specific.

    My current favorite methodology of regression is to find the least squares of some statistical estimate (like WS/48 or Offensive and Defensive Ratings) in tandem with Net +/-, and regress against Adjusted +/-. This is what I plan to do next.


    I've had a hard time keeping up with Zamir's numbers (he posts too many links for my small mind!), although his metrics seem pretty theoretically sound. I was planning on running against unadjusted defensive plus-minus yesterday; however, I don't think Barzilai's APM adjusts for defense? On his site he lists Unadjusted Defensive On/Off/Net...

    I'd like to think that a non-linear metric like this tells us some things that statistical +/- metrics can't, but I'm having a hard time believing that.

  3. *by statistical +/- I meant all +/- (On-Court, Net, APM, SPM)

  4. Thanks for the clarification that your metric is not based on counterpart shot defense at this time.

    Bacn2newbelf's site
    http://stats-for-the-nba.appspot.com/
    has the only public and current offensive and defensive splits for Adjusted +/-.

    An Excel file of your data or your data side by side with defensive Adjusted +/- would be appreciated but when you get to it, no rush.

  5. I now understand what defensive possessions and dUSG% are based on here; and since counterpart shot defense is not currently used, these estimated values are rough guidelines and not as precise as they appear.

  6. I'm starting to compare my data to b2b's RAPM for total, offense, and defense, using the 3yr estimates for him and 3-yr averages for ezPM. Hopefully, something interesting comes out of it.

