vick
03-05-2013, 10:25 PM
Fair warning: this will be a stats-heavy post. If that's not your thing, that's cool, but it seems like a fair number of folks here are statistically inclined, so this is for you.
Identifying the best basketball players in the country is an extremely difficult task. With 347 teams in NCAA Division I basketball, no one could hope to watch more than a fraction of the season, so various statistical measures have been created--Kenpom's kPOY, Basketball-reference's Win Shares, etc. Happily for me, my favorite from the world of NBA stats has now been released for college basketball.
In an ideal world, you could look at every lineup and how it performs (in terms of scoring margin) against every possible opponent, and this would tell you everything you could want to know. In practice, however, you need far too many games to separate signal from noise--even in the NBA, with 82 48-minute games, a full year of plus-minus data generates results with fairly large standard errors (http://www.82games.com/comm30.htm), and so people often look at two-year plus-minus (or 164 games). Clearly that could not work for college basketball, where seasons are much shorter, and at any rate even the two-year figures are often problematic. For example, if two players almost always share the floor, how do you allocate 'credit' between them? Ken Pomeroy has also documented (http://kenpom.com/blog/index.php/weblog/entry/a_treatise_on_plus_minus) the problems with plus-minus at the college level.
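To put a rough number on the sample-size problem, here is a minimal sketch. It assumes games are independent with equal variance, so the standard error of an average shrinks with the square root of the number of games. The 30-game college season is my own round-number assumption, not a figure from the linked articles.

```python
import math


def relative_standard_error(n_games, baseline_games=82):
    """Standard error of a per-game average, relative to one 82-game NBA season.

    ASSUMPTION: games are independent and equally noisy, so SE ~ 1/sqrt(n).
    """
    return math.sqrt(baseline_games / n_games)


samples = [
    ("one NBA season (82 games)", 82),
    ("two NBA seasons (2 x 82 games)", 2 * 82),
    ("one college season (assumed 30 games)", 30),  # hypothetical season length
]
for label, n in samples:
    print(f"{label}: {relative_standard_error(n):.2f}x the one-season error")
```

Under those assumptions, doubling the sample only cuts the error by about 29%, and a college season starts out well behind a single NBA year before you even account for the shorter games.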
"Advanced Statistical Plus-Minus" is one proposed improvement. This metric, described in detail here (http://godismyjudgeok.com/DStats/aspm-and-vorp/), represents the results of a regression of a large (multi-year) plus-minus data set on the equation:
a*MPG + b*TRB% + c*BLK% + d*STL% + e*USG%*[TS%*2*(1-TO%) - f*TO% - g + h*AST% + i*USG%]
Oof, that looks rough, but it's not as bad as it looks! Essentially, what the metric's author is asking is, "what is the statistical profile of a player who generally has a high plus-minus rating?" The metrics on which he regresses are pace-adjusted (this is the "advanced" of "advanced statistical plus-minus")--so TRB% is the percentage of available rebounds the player gathers, BLK% the percentage of shots blocked, etc. The more complex term in the brackets works out to a term for points per shot, minus a term for turnovers, plus one for assists and one for usage (thus "rewarding" a player for higher usage, which generally increases teammates' efficiency), all measured against a threshold value (the g term). He then distributes any unallocated team efficiency margin to the players (which will account for differences in schedule difficulty) to get a "plus-minus" per 100 possessions. To calculate what he calls "Value Over Replacement Player," all that's necessary is to find the difference between this figure and that of a hypothetical "replacement player," and multiply by the percent of team minutes played--in other words, the "plus-minus" is more like a per-minute stat, and "VORP" is more like a "per team game played" stat.
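For anyone who would rather read code than algebra, here is a rough sketch of the three steps described above: evaluate the equation, spread the leftover team margin, then convert to VORP. Everything numeric here is a placeholder. The coefficients a through i are all set to 1.0 because the fitted values are not in this post, the -2.0 replacement level is an illustrative assumption, and I am assuming the team adjustment is a flat constant added to every player. The names (PlayerLine, raw_aspm, team_adjust, vorp) are mine, not the author's.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """One player's season line. Rates are fractions (0.25 means 25%).

    ASSUMPTION: the real spreadsheet may scale these differently.
    """
    mpg: float
    trb_pct: float
    blk_pct: float
    stl_pct: float
    usg_pct: float
    ts_pct: float
    to_pct: float
    ast_pct: float


# HYPOTHETICAL placeholder coefficients a..i, NOT the fitted regression values.
PLACEHOLDER_COEF = {k: 1.0 for k in "abcdefghi"}


def raw_aspm(p, k=PLACEHOLDER_COEF):
    """Evaluate the equation from the post, term by term."""
    bracket = (
        p.ts_pct * 2 * (1 - p.to_pct)
        - k["f"] * p.to_pct
        - k["g"]
        + k["h"] * p.ast_pct
        + k["i"] * p.usg_pct
    )
    return (
        k["a"] * p.mpg
        + k["b"] * p.trb_pct
        + k["c"] * p.blk_pct
        + k["d"] * p.stl_pct
        + k["e"] * p.usg_pct * bracket
    )


def team_adjust(raw, minute_share, team_margin):
    """Shift every player by one constant so the minutes-weighted sum of
    ratings equals the team's schedule-adjusted efficiency margin.

    ASSUMPTION: the unallocated margin is spread as a flat constant.
    """
    weighted = sum(r * m for r, m in zip(raw, minute_share))
    adjustment = (team_margin - weighted) / sum(minute_share)
    return [r + adjustment for r in raw]


def vorp(aspm, minute_share, replacement=-2.0):
    """(rating - replacement level) * share of team minutes played.

    ASSUMPTION: -2.0 is an illustrative replacement level.
    """
    return (aspm - replacement) * minute_share


# Toy two-player "team" with made-up raw ratings and minute shares.
adjusted = team_adjust([2.0, -1.0], [0.5, 0.5], team_margin=3.0)
print(adjusted)
print([vorp(a, 0.5) for a in adjusted])
```

The toy example shows why the two numbers can disagree on "best player": the rating is a rate, while VORP scales that rate by how much of the team's minutes the player actually soaks up.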
Anyway, does this procedure generate results that pass the laugh test? Well, for 2013 so far (http://godismyjudgeok.com/DStats/aspm-and-vorp/2013-aspm/), the top players by advanced statistical plus-minus (ASPM) are LeBron James, Chris Paul, and Kevin Durant. By VORP, which accounts for minutes played, James and Durant are 1-2, which I think matches general perception.
But what about this formula at the college level? In order to account for different schedule difficulty, the formula uses Kenpom's "adjusted" offensive and defensive efficiencies, so differences in schedules should be captured when making the team adjustment. The calculation was made for 2010 through 2013* (http://godismyjudgeok.com/DStats/aspm-and-vorp/ncaa-advanced-statistical-plus-minus-and-vorp/). The best player in 2013 so far? Victor Oladipo by ASPM, Trey Burke by VORP. 2012? Anthony Davis. 2011? Kyrie Irving by ASPM (sigh...) and Kemba Walker by VORP. OK, those seem pretty reasonable. And for 2010? Evan Turner by ASPM and by VORP...Jon Scheyer. For what it's worth, a similar metric, Value Add (http://valueaddbasketball.com/ballall.html), also ranked Scheyer as the best player that year (I think the spirit of that one is right, but I'm less fond of it, largely because of its position adjustment, which seems weird to me).
I remember from the jersey retirement debates here all the arguments made about intangibles and leadership. All of those are important, and sometimes "statheads" (among whom I count myself) can be too dismissive of them. But just maybe Jon was tangibly, statistically, the best player in college basketball in 2010.
* Spreadsheet available at the link if you want to follow along, and your computer can handle a gigantic Excel file (on the order of 25 MB).