BPI > RPI, Kenpom and Sagarin? (Duke current #1)



JNort
02-10-2013, 04:36 PM
http://espn.go.com/mens-college-basketball/story/_/id/7561413/bpi-college-basketball-power-index-explained

New system that ranks teams and accounts for missing players. Goes more in depth than Kenpom. What do yall think?

Bob Green
02-10-2013, 04:46 PM
I'm sorry but all these computer based rankings don't get me excited. I'm all for the old eye test. This new BPI says Duke is #1 and Miami is #9 but we all saw Miami beat Duke by 27 points in a head-to-head match-up.

sporthenry
02-10-2013, 04:48 PM
I'm sorry but all these computer based rankings don't get me excited. I'm all for the old eye test. This new BPI says Duke is #1 and Miami is #9 but we all saw Miami beat Duke by 27 points in a head-to-head match-up.

In games with their full starting 5, Miami is actually #1 (a little bit ahead of Duke with Kelly).

Listen to Quants
02-10-2013, 04:54 PM
http://espn.go.com/mens-college-basketball/story/_/id/7561413/bpi-college-basketball-power-index-explained

New system that ranks teams and accounts for missing players. Goes more in depth than Kenpom. What do yall think?

Interesting. I like the adjustment for absent player(s). Not that it needs emphasis, but Duke's recent experience with Kelly demonstrates the importance of such an absence. I also like the 'reward' the power index gives for winning. Often a team will sacrifice points in a lead in order to run the clock out and, arguably (and argued much on this board), preserve a win. The win then contains information the final score gap does not.
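To make the 'reward for winning' idea concrete, here is a rough Python sketch. ESPN has not published the BPI formula, so everything in it is an assumption for illustration only: the function name `game_credit`, the square-root damping of the margin, and the 3-point `win_bonus` are all hypothetical.

```python
import math

def game_credit(margin, win_bonus=3.0):
    """Credit for one game from one team's point of view.

    margin: final score gap (positive for a win, negative for a loss).
    The margin is damped with a square root so each extra point in a
    blowout counts for less, and a flat bonus is added (or subtracted)
    for the result itself. Both choices are illustrative assumptions,
    not ESPN's formula.
    """
    if margin == 0:
        return 0.0
    sign = 1.0 if margin > 0 else -1.0
    return sign * (win_bonus + math.sqrt(abs(margin)))

narrow_win = game_credit(2)    # team sat on a lead and won by 2
narrow_loss = game_credit(-2)  # same gap, other side of the result
blowout = game_credit(27)      # a 27-point win
```

A 2-point win and a 2-point loss are only 4 points apart in raw margin, but about 8.8 units apart in this credit, which is one way the win itself can carry information the final score gap does not.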

Perhaps most interesting was that ESPN seems to have run their BPI off against Sagarin and RPI in the last 5 years of tourney contests. I was surprised at how well RPI did; I thought it was more flawed than that. Perhaps all of these are flawed, though, since in a seeded tournament an awful lot of the games are 'no brainers', and the performance of the indexes, just below 75% correct, is not impressive (random is, of course, 50%). Sadly, nothing was said about the Vegas line. I'd bet strongly that it did much better. Also unfortunate is that they did not include a link (or I missed the link) to the exact formula or an explanation.

vick
02-10-2013, 04:54 PM
http://espn.go.com/mens-college-basketball/story/_/id/7561413/bpi-college-basketball-power-index-explained

New system that ranks teams and accounts for missing players. Goes more in depth than Kenpom. What do yall think?

It's tough to evaluate without knowing how they account for injuries or lower the impact of blowouts, which I think are clearly the key weaknesses of Kenpom. I would note though that Dean Oliver, the author of that article, is really, really smart--his Basketball on Paper is like the Bible of the statistical analysis of basketball (most of what Pomeroy does is derived from it, as he freely acknowledges), so if he says it's worth looking at, I sure believe him.

JNort
02-10-2013, 05:02 PM
I'm sorry but all these computer based rankings don't get me excited. I'm all for the old eye test. This new BPI says Duke is #1 and Miami is #9 but we all saw Miami beat Duke by 27 points in a head-to-head match-up.

Yeah, I know what ya mean, but many people love the stat side of it and all the rankings. We also saw Florida Gulf Coast beat Miami. I think Duke wins comfortably in the match up in Durham.

Kedsy
02-10-2013, 05:10 PM
Perhaps most interesting was that ESPN seems to have run their BPI off against Sagarin and RPI in the last 5 years of tourny contests.

Since RPI was used to make the seedings, there's at least a little bit of self-fulfilling prophecy going on there. And the main question I'd ask about testing the BPI on past tournaments is did they use ratings that included the tournament games? That's what a lot of people seem to do when they backtest Pomeroy and it makes for flawed analysis.

Listen to Quants
02-10-2013, 05:27 PM
Since RPI was used to make the seedings, there's at least a little bit of self-fulfilling prophecy going on there. And the main question I'd ask about testing the BPI on past tournaments is did they use ratings that included the tournament games? That's what a lot of people seem to do when they backtest Pomeroy and it makes for flawed analysis.

Absolutely right. Testing on the data set that created the model is (or should be) forbidden. However, the average team plays about 2 tourney games and 30 regular season games, so if they are unweighted the tourney results won't affect things too much. The basic result from my perspective, though, was that all these systems are pretty bad.

Wander
02-10-2013, 05:42 PM
So, they basically stole kenpom's method and added a few minor adjustments?

I've always felt that injuries/suspensions shouldn't be taken into account at all for selection and seeding purposes, though I'm probably in the minority in that one. I guess it's OK for a computer rating system, but tough to say without knowing how they do it.

Anyway, the coolest part about kenpom isn't the straight-up rankings or using it to try and predict games (as Kedsy has argued before, it's probably not that much better than the RPI or other methods, if at all), but the detailed and objective breakdown of how good a team is at various parts of the game. And that doesn't seem to be present here.

sporthenry
02-10-2013, 06:07 PM
Absolutely right. Testing on the data set that created the model is (or should be) forbidden. However, the average team plays about 2 tourney games and 30 regular season games, so if they are unweighted the tourney results won't affect things too much. The basic result from my perspective, though, was that all these systems are pretty bad.

I think Kenpom has said as much about using pre-tourney data, but the problem with the post-tourney data is how much value he puts on it. For the team that eventually wins it, that is 6 more wins, including 3-4 against ranked teams and 2-3 games against top 10 teams. And it is unclear how much weight he puts on these top games.


So, they basically stole kenpom's method and added a few minor adjustments?

I've always felt that injuries/suspensions shouldn't be taken into account at all for selection and seeding purposes, though I'm probably in the minority in that one. I guess it's OK for a computer rating system, but tough to say without knowing how they do it.


I have no problem with taking Kenpom's system and adjusting it. Beyond the fact that this is how most things are created or made better, Kenpom has acknowledged flaws in his system but done little to address them.

As far as injuries, I have no problems accounting for it. Not only is it the committee's job to put the best teams in the tourney but a corollary to that would be to put the better team in a better seeding position. If two bubble teams have identical resumes but one missed their leading scorer, it is clearer who the better team is. Not to mention, this doesn't even address the equity or inequity that would happen without taking into account injury. Imagine Withey gets injured in the Big 12 tournament and is done for the year. But their resume commanded a 2 seed. Whatever #1 seed in their bracket would gain an unfair advantage over other #1 seeds b/c of the weakness of their 2 seed.

Kedsy
02-10-2013, 11:27 PM
Absolutely right. Testing on the data set that created the model is (or should be) forbidden. However, the average team plays about 2 tourney games and 30 regular season games, so if they are unweighted the tourney results won't affect things too much.

You'd be surprised. For example, Butler in Pomeroy's 2010 ratings went from 26th going into the tournament to 12th in his final ratings. UConn in 2011 went from 17th to 10th (their defense went from 31st to 14th). Butler in 2011 went from 54th to 41st; VCU in 2011 went from 84th to 52nd. In 2009, coming into the tournament Duke was Pomeroy's #7 and Villanova was #19. In his final rankings, Duke was #11 and Villanova #14, making it look much closer than he presumably predicted pre-tournament.

And I realize the ordinal rank isn't as important as the rating, but obviously the ratings changed too. My point is if ESPN ran a system not that different from Pomeroy's and backtested it, they can't really say their model is more predictive than anybody's unless they used pre-tournament ratings.
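Here is a minimal Python sketch of the backtesting problem, using made-up data. The teams, dates, margins, and the `CUTOFF` date are all hypothetical, and `ratings_from` is just an average scoring margin standing in for a real rating system (it is not Pomeroy's or ESPN's method). The only point is the difference between rating teams on pre-tournament games and rating them on everything.

```python
from datetime import date

# Hypothetical toy results: (game date, winner, loser, margin).
GAMES = [
    (date(2013, 1, 5), "A", "B", 4),
    (date(2013, 2, 2), "B", "C", 6),
    (date(2013, 2, 20), "A", "C", 3),
    (date(2013, 3, 21), "B", "A", 10),  # tournament game
    (date(2013, 3, 23), "B", "C", 8),   # tournament game
]
CUTOFF = date(2013, 3, 19)  # assumed first day of the tournament

def ratings_from(games):
    """Average scoring margin per team, a stand-in for a real rating."""
    totals, counts = {}, {}
    for _, winner, loser, margin in games:
        for team, m in ((winner, margin), (loser, -margin)):
            totals[team] = totals.get(team, 0) + m
            counts[team] = counts.get(team, 0) + 1
    return {team: totals[team] / counts[team] for team in totals}

def accuracy(ratings, games):
    """Fraction of games in which the higher-rated team won."""
    hits = sum(ratings[w] > ratings[l] for _, w, l, _ in games)
    return hits / len(games)

tourney = [g for g in GAMES if g[0] >= CUTOFF]
pre = ratings_from([g for g in GAMES if g[0] < CUTOFF])  # honest inputs
final = ratings_from(GAMES)                              # includes the answers

honest = accuracy(pre, tourney)
leaky = accuracy(final, tourney)
```

In this toy season the pre-tournament ratings get one of the two tournament games right, while the final ratings (which already contain those two results) get both right. That gap is the inflation a backtest picks up if it uses end-of-season ratings.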

throatybeard
02-10-2013, 11:41 PM
You'd be surprised. For example, Butler in Pomeroy's 2010 ratings went from 26th going into the tournament to 12th in his final ratings. UConn in 2011 went from 17th to 10th (their defense went from 31st to 14th). Butler in 2011 went from 54th to 41st; VCU in 2011 went from 84th to 52nd. In 2009, coming into the tournament Duke was Pomeroy's #7 and Villanova was #19. In his final rankings, Duke was #11 and Villanova #14, making it look much closer than he presumably predicted pre-tournament.

And I realize the ordinal rank isn't as important as the rating, but obviously the ratings changed too. My point is if ESPN ran a system not that different from Pomeroy's and backtested it, they can't really say their model is more predictive than anybody's unless they used pre-tournament ratings.

Well, I dunno. Six games is a lot of games out of thirty-someodd. An NCAAT run could change some things.

Kedsy
02-10-2013, 11:57 PM
Well, I dunno. Six games is a lot of games out of thirty-someodd. An NCAAT run could change some things.

I agree, but then one can't go back using the final ratings (including those last six games) and say the system correctly predicted anything, is all I'm saying.

Wander
02-11-2013, 01:57 AM
Not all games are weighted equally in kenpom - more recent games are given more value. I don't know to what degree that weighting is done, but that's part of what's going on with the NCAA tournament games affecting the ratings thing.
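As a rough illustration of what recency weighting does, here is a Python sketch with exponential decay. The actual kenpom weighting is not described in this thread, so the decay form, the `decay=0.95` value, and the function name `recency_weighted_margin` are assumptions for illustration only.

```python
def recency_weighted_margin(margins, decay=0.95):
    """Weighted mean of game margins, listed oldest first.

    The most recent game gets weight 1, the one before it gets `decay`,
    the one before that `decay ** 2`, and so on. With decay=1.0 this is
    the plain average. The decay scheme is an assumed example.
    """
    n = len(margins)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * m for w, m in zip(weights, margins)) / sum(weights)

# Hypothetical season: 30 games at an even margin, then a 6-game run
# of 10-point wins at the end (think of a deep tournament run).
season = [0] * 30 + [10] * 6
plain = sum(season) / len(season)
weighted = recency_weighted_margin(season)
```

With any decay below 1, the late run moves the weighted figure more than it moves the plain average, which is why tournament games can shift a recency-weighted rating by more than their share of the schedule suggests.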

moonpie23
02-11-2013, 07:00 AM
I think Duke wins comfortably in the match up in Durham.

These kinds of predictions seem to be falling flat.

Listen to Quants
02-11-2013, 11:25 AM
I agree, but then one can't go back using the final ratings (including those last six games) and say the system correctly predicted anything, is all I'm saying.

Technically correct. I may be beating this to death, but the two examples you give (Butler, UConn) are both 6-game runs, three times as much weight as the average team, and thus fairly extreme examples (anecdote.plural.data.not). Again, we don't know if this is an unweighted system. I don't even know if the system includes the tourney at all (?). But even if they made that mistake, the 2/35ths (or so) 'cheat' isn't massive (on average, which is how they rated their systems). Then again, the detail I'm discussing isn't that massive either, I think. :)
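For anyone who wants the arithmetic spelled out, here is a small Python sketch. It assumes every game counts equally and that a team has 33 games before the tournament (chosen only so that two tournament games give the "2/35ths" figure); both are assumptions, and `tourney_share` is a made-up helper name.

```python
def tourney_share(tourney_games, other_games=33):
    """Fraction of a team's season made up of tournament games,
    assuming every game carries equal weight."""
    return tourney_games / (tourney_games + other_games)

average_team = tourney_share(2)  # 2/35, about 5.7% of the season
six_game_run = tourney_share(6)  # 6/39, about 15.4% of the season
```

So under equal weighting a six-game run makes up a bit under three times the share of the schedule that the average two-game stay does.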

Kedsy
02-11-2013, 11:45 AM
Technically correct. I may be beating this to death, but the two examples you give (Butler, UConn) are both 6-game runs, three times as much weight as the average team, and thus fairly extreme examples (anecdote.plural.data.not). Again, we don't know if this is an unweighted system. I don't even know if the system includes the tourney at all (?). But even if they made that mistake, the 2/35ths (or so) 'cheat' isn't massive (on average, which is how they rated their systems). Then again, the detail I'm discussing isn't that massive either, I think. :)

I don't know, either. But I do know that if you apply the published Pomeroy rankings to past NCAA tournament brackets, they perform significantly better than if you use pre-tournament rankings, and that's what we're talking about for the BPI.

The "average" two game run is less relevant than it sounds, because the low-seeded first round losers would have been picked to lose anyway, but the prediction on the relatively close games has a good chance of changing if you take "future" tournament results into account. I know you know all this, but I think you're discounting the impact of using the tournament games when you backtest, at least for Pomeroy and I'd guess for BPI too.

hurleyfor3
02-11-2013, 01:16 PM
Dork polls have gotten waaaaaaaaaaaaaay too mainstream.

Indoor66
02-11-2013, 03:37 PM
Dork polls have gotten waaaaaaaaaaaaaay too mainstream.

And they are used to support each position in every discussion - as if they are Truth.

NSDukeFan
02-11-2013, 06:04 PM
And they are used to support each position in every discussion - as if they are Truth.

I am pretty sure they are only relevant when they support my position.

cptnflash
02-11-2013, 06:16 PM
Dork polls have gotten waaaaaaaaaaaaaay too mainstream.

We're certainly trying, but we're not there yet. Butler's still way ahead of Pitt in the polls, for example. So most of the mainstream media (not to mention coaches) appear to be kickin' it old school.

And regarding someone's earlier comment about the eye test... the obvious problem with it is that no one watches anything other than a tiny minority of the games. Computers watch everything.

tele
02-11-2013, 07:04 PM
I didn't think Pomeroy had ever revealed the equations for his methodology; it would be interesting to see what they are. Also, the ESPN BPI group doesn't sound like they are going to reveal those "details" either, so without seeing those we can't say much about their approach. While it sounds fine to account for injuries, this will require collecting and evaluating additional information from teams and games specifically for use in their model, something none of the other approaches do, to my knowledge. They just rely on readily available stats from stat services, don't they? Does Ken Pom collect any additional information just for his own use? It would be interesting to know what.

Collecting and managing the injury info may prove to be more problematic than the benefits in results warrant. It will have to be done on a more or less ad hoc basis since there is no reporting requirement for teams like in the NFL, not that this requirement functions very well there. And there may have to be some arbitrary cutoffs for which players on which teams you evaluate in this manner, like top 25, top 65, or top 100 etc versus trying to track all this info for all the division one teams. Maybe they will just collect the info for the games espn covers. I'm sure vegas has a "handle" on when and how to account for available injury information.

Kedsy
02-12-2013, 12:11 AM
I didn't think Pomeroy had ever revealed the equations for his methodology; it would be interesting to see what they are. Also, the ESPN BPI group doesn't sound like they are going to reveal those "details" either, so without seeing those we can't say much about their approach. While it sounds fine to account for injuries, this will require collecting and evaluating additional information from teams and games specifically for use in their model, something none of the other approaches do, to my knowledge. They just rely on readily available stats from stat services, don't they? Does Ken Pom collect any additional information just for his own use? It would be interesting to know what.

Collecting and managing the injury info may prove to be more problematic than the benefits in results warrant. It will have to be done on a more or less ad hoc basis since there is no reporting requirement for teams like in the NFL, not that this requirement functions very well there. And there may have to be some arbitrary cutoffs for which players on which teams you evaluate in this manner, like top 25, top 65, or top 100 etc versus trying to track all this info for all the division one teams. Maybe they will just collect the info for the games espn covers. I'm sure vegas has a "handle" on when and how to account for available injury information.

Just talking out loud here, the only two ways I can think of to take injuries, etc., into account would be (a) to assign a number to the player's value and deduct that from the team for the duration of the injury; or (b) track the team's rating separately while a player is injured and somehow meld that rating together with the team's rating at full strength. The problem with (a) of course is the subjective nature of the number assigned to the player's value so I wouldn't think they'd do that, although that's probably how Vegas does it. The problem with (b) is a small sample size and also the difficulty of dealing with multiple injuries on the same team and separating the effect of one of those injuries from the other(s).
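Option (b) could be sketched in Python roughly like this. It is only an illustration of one way to "meld" the two ratings, by leaning on the short-handed sample more as it grows; the function name `blended_rating` and the `prior_games=8` setting are hypothetical, and nothing here is known to match what ESPN does.

```python
def blended_rating(full_rating, short_rating, short_games, prior_games=8):
    """Estimate a team's strength without the injured player.

    full_rating: rating from games played at full strength.
    short_rating: rating from games played without the player.
    short_games: number of short-handed games (often a small sample).
    prior_games: assumed number of short-handed games at which that
        sample is trusted as much as the full-strength rating.
    """
    weight = short_games / (short_games + prior_games)
    return weight * short_rating + (1 - weight) * full_rating

# Hypothetical team: rated 20 at full strength, 10 in short-handed games.
after_two_games = blended_rating(20, 10, short_games=2)
after_eight_games = blended_rating(20, 10, short_games=8)
```

With only two short-handed games the estimate barely moves off the full-strength rating, which is the small-sample problem in action. It also handles just one absence at a time, so the multiple-injury issue is untouched.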

Also, how far down do you go in determining which injuries to account for? There's certainly a difference between a 10th man going down and a starter, but how do you figure it out? Should you even bother dealing with a 10th man's injury? Marshall Plumlee's injury was obviously a lot easier for Duke to deal with than Ryan Kelly's. On the other hand, if Marshall really was one of Duke's top six guys when he went down (personally I doubt this, but K did say it), then his injury should be accounted for. But since he's played so little since he's returned, I can't imagine any algorithm that could figure out what we missed. Which leads me to the question of how would you take into account the time after a player returns from his injury but before he's up to game speed and re-integrated with his teammates? Kyrie Irving played in three NCAAT games, but he obviously didn't help the team in the same way he did during the first 8 games of that season. Finally, what if a guy is injured but plays hurt and isn't nearly as good as he would have been at full strength? And even if you could figure that out, I imagine dealing with injuries like Seth Curry's this year would burn out a few circuit boards.

Ultimately, while I'm on the side of those who think ignoring injuries is a major flaw in computerized ratings, I think there are way too many variables to do the injury thing correctly. With all due respect to the geeks at ESPN, I have to think an attempt to do so would probably make the rating system less credible.

P.S.: While I'm here, I think another flaw in computerized rating systems is using the same home court factor for all teams. Clearly playing at home is a much bigger advantage for some teams than others. Furthermore, the home team factor used by most of these systems appears to be just a few points, while for some teams (e.g., this year's Wake Forest and BC teams, among others) the factor seems like it should be much bigger. Similarly, "neutral" courts aren't all created equally, either. If you're playing Kansas in Kansas City, is that really the same as if you're playing them in Hawaii? If you're Kentucky playing in Atlanta, or Duke playing in New Jersey, should that be considered the same as Purdue playing in Anchorage? Again, you would seem to have the same three options as in injury situations: (1) do nothing, but that necessarily taints your results; (2) do something subjective, which obviously is problematic; or (3) try to run separate ratings for each variation and somehow meld them together, but then you have to deal with unacceptably low samples and too many variables.
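Option (3) for home court might look something like the Python sketch below. It is a crude illustration, not any published system's method: `home_edge`, the 3.5-point `league_edge`, and `prior_games=20` are assumed values, and the raw estimate ignores opponent strength and mixes in the opponents' own home edges.

```python
def home_edge(home_margins, away_margins, league_edge=3.5, prior_games=20):
    """Per-team home-court estimate in points, pulled toward a league value.

    The raw estimate is half the gap between the team's average home
    margin and average away margin. Because each sample is small, it is
    shrunk toward an assumed league-wide edge; `prior_games` sets how
    many games it takes to trust the team's own numbers equally.
    """
    n = min(len(home_margins), len(away_margins))
    if n == 0:
        return league_edge
    home_avg = sum(home_margins) / len(home_margins)
    away_avg = sum(away_margins) / len(away_margins)
    raw = (home_avg - away_avg) / 2
    weight = n / (n + prior_games)
    return weight * raw + (1 - weight) * league_edge

# Hypothetical team that wins by 8 at home and loses by 8 on the road.
big_split = home_edge([8] * 8, [-8] * 8)
```

Even for a team with a 16-point home/road swing, eight games each way only moves the estimate partway from the league value toward the raw 8 points, which shows the low-sample problem with running separate ratings per team.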

Unfortunately, this sort of debate illustrates why computer rating systems aren't really as reliable as many of us believe them to be.

Wander
02-12-2013, 06:27 AM
P.S.: While I'm here, I think another flaw in computerized rating systems is using the same home court factor for all teams. Clearly playing at home is a much bigger advantage for some teams than others. Furthermore, the home team factor used by most of these systems appears to be just a few points, while for some teams (e.g., this year's Wake Forest and BC teams, among others) the factor seems like it should be much bigger. Similarly, "neutral" courts aren't all created equally, either. If you're playing Kansas in Kansas City, is that really the same as if you're playing them in Hawaii? If you're Kentucky playing in Atlanta, or Duke playing in New Jersey, should that be considered the same as Purdue playing in Anchorage? Again, you would seem to have the same three options as in injury situations: (1) do nothing, but that necessarily taints your results; (2) do something subjective, which obviously is problematic; or (3) try to run separate ratings for each variation and somehow meld them together, but then you have to deal with unacceptably low samples and too many variables.

It's subjective and imperfect, but kenpom does have "Semi-home" and "Semi-away" options for the type of scenarios you describe.