Advanced Statistical Plus-Minus (or: Jon Scheyer was really good)



vick
03-05-2013, 10:25 PM
Fair warning: this will be a stats-heavy post. If that's not your thing, that's cool, but it seems like a fair number of folks here are statistically inclined, so this is for you.

Identifying the best basketball players in the country is an extremely difficult task. With 347 teams in NCAA Division I basketball, no one could hope to watch more than a fraction of the season, so various statistical measures have been created--Kenpom's kPOY, Basketball-reference's Win Shares, etc. Happily for me, my favorite from the world of NBA stats has now been released for college basketball.

In an ideal world, you could look at every lineup and how it performs (in terms of scoring margin) against every possible opponent, and this would tell you everything you could want to know. In practice, however, you need far too many contests to separate noise from signal--even in the NBA, with 82 48-minute games, a full year of plus-minus data generates results with fairly large standard errors (http://www.82games.com/comm30.htm), and so people often look at two-year plus-minus (or 164 games). Clearly that could not work for college basketball, where a season is only 30-odd 40-minute games, and at any rate even the two-year figures are often problematic. For example, what if two players generally only play together in a lineup? How do you allocate 'credit' between the two of them? Ken Pomeroy has also documented (http://kenpom.com/blog/index.php/weblog/entry/a_treatise_on_plus_minus) the problems with plus-minus at the college level.
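To make the credit-allocation problem concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the players, the three stints, and the simple "lineup margin = sum of player ratings" model are assumptions, not real data or anyone's actual method. Players A and B never take the floor without each other, and three very different ways of splitting credit between them fit the observed margins equally well.

```python
# Toy plus-minus data (hypothetical): each stint is
# (players on the floor, observed scoring margin per 100 possessions).
# A and B never appear without each other.
stints = [
    ({"A", "B", "C"}, 6.0),
    ({"A", "B", "D"}, 4.0),
    ({"C", "D"}, 0.0),
]


def predict(ratings, lineup):
    """Predicted lineup margin under a simple additive model."""
    return sum(ratings[p] for p in lineup)


def total_squared_error(ratings):
    """How badly a set of player ratings misses the observed margins."""
    return sum((predict(ratings, lineup) - margin) ** 2
               for lineup, margin in stints)


# Three different ways to split the same 5 points of credit between A and B.
split_1 = {"A": 5.0, "B": 0.0, "C": 1.0, "D": -1.0}
split_2 = {"A": 0.0, "B": 5.0, "C": 1.0, "D": -1.0}
split_3 = {"A": 2.5, "B": 2.5, "C": 1.0, "D": -1.0}

for name, ratings in [("split_1", split_1), ("split_2", split_2), ("split_3", split_3)]:
    print(name, total_squared_error(ratings))
```

All three splits fit the data perfectly, so the lineup data alone cannot say whether A or B deserves the credit. That is the gap a box-score-based metric tries to fill.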

"Advanced Statistical Plus-Minus" is one proposed improvement. This metric, described in detail here (http://godismyjudgeok.com/DStats/aspm-and-vorp/), represents the results of a regression of a large (multi-year) plus-minus data set on the equation:


a*MPG + b*TRB% + c*BLK% + d*STL% + e*USG%*[TS%*2*(1-TO%) – f*TO% – g + h*AST% + i*USG%]

Oof, that looks rough, but it's not as bad as it looks! Essentially, what he is doing is asking, "what is the statistical profile of a player who generally has a high plus-minus rating?" The metrics on which he regresses are pace-adjusted (this is the "advanced" of "advanced statistical plus-minus")--so TRB% is the percentage of available rebounds the player gathers, BLK% the percentage of shots blocked, etc. The more complex expression in the brackets works out to a term for points per shot, minus a term for turnovers, plus a term for assists, plus a term for usage (thus "rewarding" a player for higher usage, which generally increases teammates' efficiency), all measured against a threshold value (the constant g). He then distributes any unallocated efficiency margin to players (which will account for differences in schedule difficulty) to get a "plus-minus" per 100 possessions. To calculate what he calls "Value Over Replacement Player," all that's necessary is to find the difference between this figure and that of a hypothetical "replacement player," and multiply by the percent of team minutes played--in other words, the "plus-minus" is more like a per-minute stat, and "VORP" more like a "per team games played" stat.
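For anyone who would rather read the equation as code, here is a minimal Python sketch of the two steps described above. It is only an illustration and makes several assumptions: the coefficients a through i are placeholders I made up (not the fitted regression values), the percentages are expressed as fractions between 0 and 1, the replacement level of -2.0 is a hypothetical number, the stat line is invented, and the team-adjustment step is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    mpg: float      # minutes per game
    trb_pct: float  # share of available rebounds gathered (0-1)
    blk_pct: float  # share of opponent shots blocked (0-1)
    stl_pct: float  # steal rate (0-1)
    usg_pct: float  # usage rate (0-1)
    ts_pct: float   # true shooting percentage (0-1)
    to_pct: float   # turnover rate (0-1)
    ast_pct: float  # assist rate (0-1)


# Placeholder coefficients a..i: invented for illustration, NOT the fitted values.
COEF = dict(a=0.1, b=10.0, c=20.0, d=50.0, e=40.0, f=1.0, g=1.0, h=1.0, i=1.0)


def raw_aspm(p, k=COEF):
    """Evaluate the regression equation term by term (before any team adjustment)."""
    bracket = (p.ts_pct * 2 * (1 - p.to_pct)  # points per shot
               - k["f"] * p.to_pct            # turnover penalty
               - k["g"]                       # threshold
               + k["h"] * p.ast_pct           # assist credit
               + k["i"] * p.usg_pct)          # usage credit
    return (k["a"] * p.mpg + k["b"] * p.trb_pct + k["c"] * p.blk_pct
            + k["d"] * p.stl_pct + k["e"] * p.usg_pct * bracket)


def vorp(aspm, pct_team_minutes, replacement_level=-2.0):
    """Rating above a (hypothetical) replacement level, scaled by share of team minutes."""
    return (aspm - replacement_level) * pct_team_minutes


# Invented stat line, just to exercise the functions.
example = PlayerStats(mpg=30, trb_pct=0.10, blk_pct=0.02, stl_pct=0.03,
                      usg_pct=0.25, ts_pct=0.60, to_pct=0.10, ast_pct=0.20)
rating = raw_aspm(example)
print(rating, vorp(rating, pct_team_minutes=0.75))
```

The numbers this prints mean nothing, since the coefficients are fake. The point is the shape of the calculation: the rating is a per-possession rate, and VORP scales it by how much of the team's minutes the player actually logs.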

Anyway, does this procedure generate results that pass the laugh test? Well, for 2013 so far (http://godismyjudgeok.com/DStats/aspm-and-vorp/2013-aspm/), the top players by advanced statistical plus-minus (ASPM) are LeBron James, Chris Paul, and Kevin Durant. By VORP, which accounts for minutes played, James and Durant are 1-2, which I think conforms with general perception.

But what about this formula at the college level? In order to account for different schedule difficulty, the formula uses Kenpom's "adjusted" offensive and defensive efficiencies, so differences in schedules should be captured when making the team adjustment. The calculation was made for 2010 through 2013* (http://godismyjudgeok.com/DStats/aspm-and-vorp/ncaa-advanced-statistical-plus-minus-and-vorp/). The best player in 2013 so far? Victor Oladipo by ASPM, Trey Burke by VORP. 2012? Anthony Davis. 2011? Kyrie Irving by ASPM (sigh...) and Kemba Walker by VORP. OK, those seem pretty reasonable. And for 2010? Evan Turner by ASPM and by VORP...Jon Scheyer. For what it's worth, a similar metric, Value Add (http://valueaddbasketball.com/ballall.html), also ranked him as the best player that year (I'm less fond of this one, although I think the spirit is right, largely because of the position adjustment, which seems weird to me).

I remember from the jersey retirement debates here all the arguments made about intangibles and leadership, and all of those are important; sometimes "statheads" (among whom I count myself) can be too dismissive of them. But just maybe Jon was tangibly, statistically, the best player in college basketball in 2010.

* Spreadsheet available at the link if you want to follow along, and your computer can handle a gigantic Excel file (on the order of 25 MB).

Native
03-05-2013, 10:39 PM
Interesting that the past three national champions have had the best player by VORP on their team. Does this mean Michigan will win it all this year?

DukeCrow
03-05-2013, 10:43 PM
Save that as a .xlsb if you want to dramatically reduce its size. :cool:

vick
03-05-2013, 10:57 PM
Well, there's the problem that the rankings include the tournament itself, so it's slightly biased to look at it that way (although, while Scheyer had a fine tournament, I don't think it was statistically overwhelming). I could see how it would be meaningful though: you want a player who's used to both "taking over" and logging the long minutes that tournament situations sometimes demand. I wonder about a guy like Olynyk, who is crazy efficient by pretty much any measure but doesn't even average 26 mpg.


I couldn't remember if macros still worked in binary (the macro is to download updated data)...

cptnflash
03-05-2013, 11:49 PM
Other statistically inclined analysts have found similar results by other methods - namely, that Jon Scheyer's 2010 season was one of the best by any point guard in at least a decade.

It's funny... the 2010 team still seems to get very little respect, even among Duke fans, unless you look at advanced metrics. Then suddenly the truth becomes clear, as the standard media tripe of "there were no dominant teams in 2010 / Duke wasn't the best team but they got hot at the right time / they got lucky with upsets in the NCAA tournament bracket / they were handed the trophy on a silver platter by the Illuminati" recedes into the background in the face of actual data. The truth, at least according to the statistical evidence, is that the 2010 Duke team was every bit as good as the 2007 Gators, the 2009 Tar Heels, or the 2012 Wildcats.

And Jon was one of the big reasons for that. Despite his well-documented senior-year shooting slump, he still posted a 127 ORtg with a 23% usage rate while playing 92% of our minutes. Those numbers, quite frankly, are absurd. They are video game quality (or would be, if NCAA basketball were still a video game).

COYS
03-06-2013, 11:05 AM
To me, Jon and the 2010 team cannot be praised enough, simply because too many in the national media and in Duke fan circles fail to recognize that the 2010 team was a true juggernaut. I agree with everything you say. Every advanced metric points to Jon individually and the 2010 team collectively as worthy champions. As I've mentioned many times before, the 2010 KenPom stats show Duke as having a larger efficiency margin (offensive efficiency minus defensive efficiency) than the 2009 Tar Heels. That means that, relative to the competition, Duke was MORE dominant in 2010 by the end of its tourney run than the Heels were after their 2009 demolition of the tourney field.

I've also never understood the "easy" bracket stuff. Duke played a 3 seed in the Elite Eight and a 2 seed in the Final Four. That is hardly an easy path. What's more, the 3 seed was Baylor, a long, athletic team playing a few hours from home. This was supposed to be the exact kind of team that would destroy the alarmingly unathletic Blue Devils. MANY talking heads predicted our demise at the hands of Baylor in Houston. Then the 2 seed that we had to play to make it to the championship game just happened to be the Big East Champion Mountaineers. This was, if you'll recall, the team that was supposedly more deserving of our number 1 seed. So while WVU was a 2 seed in name, many believed they were the caliber of a number 1. Many also believed that they would beat us. Of course, everyone remembers the ease with which we utterly annihilated WVU. That should have put to bed any doubts as to which team was better. We lucked out a little bit by drawing a Hummel-less Purdue team in the Sweet Sixteen, but that didn't guarantee us anything past that point.

Finally, while Butler was almost an unknown on the national scene in 2010, their repeat national runner-up performance in 2011, after losing their best player from the previous team, clearly established that 2010 was not a fluke.

Anyway, Jon Scheyer, whether he actually won any official award for it, was our MOP. He took a team that had not gotten consistent play from the PG spot for the past few years and turned it into the best offense in the land, all while maintaining his own offensive efficiency. The 2010 team was worthy of the championship, and Jon is worthy of all the recognition that the advanced metrics give him.

gep
03-06-2013, 11:26 PM
Yes, Jon was the floor leader and "MOP"... but the 2010 TEAM did it :cool:

http://www.youtube.com/watch?v=J5DOt7QQV40 To me, THIS is the BEST part of 2010.