tbyers11 replied with a great link. You may also be interested in an
earlier post of mine.
It seems like I often see things differently than others, so, in case it helps, I'll take the time to attempt an overly wordy answer/viewpoint to your question. I could be horribly wrong, but here's what I think:
First, there is a concept of primary importance which must be discussed. In my understanding,
the given percentages (eg Kentucky has a 99.999% chance of winning this game)
are NOT truly predictions of Team A's chance to win the game. Rather, they are the probability that the particular model accurately predicts which team will win the game. Those are certainly related concepts, and the media and even the model makers sometimes equate the two. But, they are not equivalent. (More on this later)
How are probabilities of winning even calculated?? IDK, but here's a guess based on things I've read.
Here's a simplified path of how I *think* the probabilities are established:
1. Calculate/assign each team a rating/ranking
2. Predict the winner of the game. I suspect that, for most models, this is the same as "Predict that the team with the higher rating/ranking wins the game." However, some models could include additional tricks.
3. Open up your system's modeled or historical data. Look at the data for which the teams had similar ratings/rankings to the teams in the current game. For what percentage of those games did this model accurately predict the winner?
This step is likely decently complex. But, for illustrative purposes, here is a quick graph of one way the modeled/historical data could look. For this basic predictive model, the winner is predicted to be the team with the higher rank and the independent variable is the difference between teams' ranks. Let's say Team A is ranked #6 and Team B is ranked #36 (a difference of 30 units). We look at the compiled data to see that our model is right only about 75% of the time when we say that the higher ranked team will win.
[Attached graph: the model's historical accuracy plotted against the difference in teams' ranks]
4. Tell everyone, "In modeled/historical data similar to Team A's and Team B's rating/ranking, we are correct 75% of the time when predicting that Team A is the winner." Or, "According to our model, Team A has a 75% chance of winning." Or, just "Team A has a 75% chance of winning."
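To make those four steps concrete, here's a minimal sketch in Python of how I imagine the mechanics. Everything in it (the function names, the bin width, the toy history) is invented for illustration; real models are surely far more sophisticated.

```python
from collections import defaultdict

def build_accuracy_table(historical_games, bin_width=10):
    """Steps 1-3: bin past games by the difference in team ranks and
    record how often this model's pick (the higher-ranked team) won."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for rank_a, rank_b, higher_ranked_won in historical_games:
        bin_key = abs(rank_a - rank_b) // bin_width
        totals[bin_key] += 1
        if higher_ranked_won:
            wins[bin_key] += 1
    return {k: wins[k] / totals[k] for k in totals}

def win_probability(rank_a, rank_b, table, bin_width=10):
    """Step 4: report the model's historical accuracy in similar games
    as 'the higher-ranked team's chance of winning.'"""
    return table[abs(rank_a - rank_b) // bin_width]

# Invented history: (rank_a, rank_b, did_the_higher_ranked_team_win?)
history = [(6, 36, True), (5, 40, True), (3, 36, True), (8, 44, False)]
table = build_accuracy_table(history)
print(win_probability(6, 36, table))  # 0.75 -> "a 75% chance of winning"
```

Note that nothing in that lookup is about the two teams themselves; it's purely about how often the model's pick has panned out in similar-looking games.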
How are a model's predictions validated?
This gets more to your question about what the performance standards for a model should be. But, before talking about the tournament, let's consider the full season. How do these predictive models evaluate their "percent chance of winning" performance over the season? From what I can piece together, the evaluation is along the lines of: "For the games the model predicted as having a specific probability, what was the model's actual accuracy in determining the winner?"
For example, we could look at all the games for which we predicted the favored team had a 50-55% "chance of winning." (In the graph above, this is essentially the same as looking at all the games in which the difference in teams' rankings was somewhere between 1 and 13 units, because those are the games in which we gave the favored team a 50-55% chance based on our modeled or historical data.) So, how frequently did our model determine the correct winner? If it was correct somewhere around 50-55% of the time, we would conclude that our prediction is valid.
Another example: in games where we say we can predict the winner with 80-85% certainty, we should expect that we accurately determined the winner 80-85% of the time.
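In code, that season-long validation might look something like this sketch (again, the binning scheme and the toy data are my own inventions):

```python
def calibration_report(predictions):
    """Group games by the model's stated confidence and compare it with
    how often the model's pick actually won.

    predictions: list of (stated_probability, pick_was_correct) pairs.
    """
    bins = {}  # bin lower edge -> [correct_count, total_count]
    for prob, correct in predictions:
        edge = int(prob * 20) / 20  # 5%-wide bins: 0.50, 0.55, 0.60, ...
        entry = bins.setdefault(edge, [0, 0])
        entry[0] += correct
        entry[1] += 1
    for edge in sorted(bins):
        correct_count, total = bins[edge]
        print(f"stated {edge:.0%}-{edge + 0.05:.0%}: "
              f"actually right {correct_count / total:.0%} of {total} games")

# Hypothetical season: near toss-up calls right about half the time,
# confident calls right about 80% of the time -- "validated".
games = [(0.52, True), (0.53, False), (0.54, True), (0.51, False),
         (0.82, True), (0.83, True), (0.84, True), (0.81, False),
         (0.83, True)]
calibration_report(games)
```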
Sooooo, the models are not faulty for only predicting the winner 80-85% of the time in those games. Indeed, the models have done exactly what they predicted!...just not what we
wanted them to do
What should the performance standard be for predicting the winner of the tournament?
Finally, on to your question:
Well,
these models were NOT built to predict the winner of the NCAA tournament. As mentioned above, they are indeed doing what they are intended to do...just not what we want them to do or what is being pushed upon them. The models predict individual games, not the tournament, and not the Champion. The probabilities of the individual games are compounded (multiplied together) to predict who the champion will be. Thus, the test of whether the model does what it is intended to do should not be how frequently the model predicts the Champion, but, rather, how well the model predicts individual games. It does not necessarily matter that 2014 UConn kept winning its way to a Championship despite having, say, a "15% chance" each game. What matters to the model is: in all the games where the model gave the higher ranked team an 85% chance of winning, did the higher ranked team win 85% of the time?
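To put numbers on that compounding (taking my illustrative 15% at face value; UConn's actual per-game odds varied):

```python
p_game = 0.15            # the illustrative per-game chance from above
p_title = p_game ** 6    # a champion has to win six straight games
print(f"{p_title:.4%}")  # ~0.0011% -- roughly 1 in 88,000
```

Even a team given 85% in every game would come out to only about 0.85**6, roughly 38%, to win all six. So no team should ever be "predicted" to win the tournament in any strong sense.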
I *think* the models are doing what they are designed to do and are probably meeting that performance standard - they do not treat the Championship game as distinct from all the other games with similar opponents.
Could a model be designed specifically for the tournament? IDK. Would it have to take into account the specific rounds of the tournament? Maybe not. Maybe the current models work fine for the tournament (ie even in the tournament, they do what they say they are capable of doing) but just need refinement to increase their capability for tournament-type games. For instance, I would think a tournament model would have to be based on how teams play against Top 50 or so opponents, rather than comparing teams based on how well they would do against the NCAA-average opponent. I mean, shouldn't a tournament model attempt to tease apart what separates a #2 team from a #12 team, rather than declaring the game a toss-up? As it is, the model IS correct in that the model is saying, "I can't predict who will win this game," and, sure enough, it does a bad job of predicting such a game, lol.
That's the rub. These aren't really predictions about a team's chances of winning. They are predictions about how well the model can predict who is going to win!!
My severely extreme analogy:
In reality, Duke has a 100% chance of beating East Chapel Hill High School and Kentucky has a 100% chance of beating Jumbo's Allstars (that's our DBR team!)
However, a model (we'll call it 'Mopnek') uses the following criteria to predict winners: there is a 50% likelihood of a team beating an opponent whose name starts with the next letter of the alphabet.
When the games D vs E and K vs J are played out, the winners are (D)uke and (K)entucky.
Mopnek predicted the winner 50% of the time, just as Mopnek said it would!
The fact that the Mopnek prediction was equal to the outcome in the sample is used to validate the system - the system predicts as accurately as it says it predicts.
BUT, that does not mean that a specific team's chances against another team are the same as the chances of the system predicting that game correctly.
The real meaning of that 50% prediction is "In those games, the model has a 50% chance of accurate prediction when choosing its team." It does NOT mean that the team actually has a 50% chance of winning. Put another way, a game predicted by Mopnek as a toss-up does not mean that the game could go either way; it just means that Mopnek doesn't know which way the game will go.
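You can see this effect in a quick simulation. Here's a hedged sketch (the setup is invented): every game has a true 100% favorite, but a Mopnek-style model flips a coin to pick a side and reports "50%". Its picks are right 50% of the time, exactly as advertised, so it validates perfectly while telling you nothing about the teams' real chances.

```python
import random

random.seed(1)  # reproducible illustration

n_games = 10_000
correct_picks = 0
for _ in range(n_games):
    true_winner = "favorite"  # in reality, the favorite always wins
    mopnek_pick = random.choice(["favorite", "underdog"])  # coin-flip model
    if mopnek_pick == true_winner:
        correct_picks += 1

# Mopnek claims 50% accuracy and delivers ~50% accuracy: "validated".
# But the favorite's true chance of winning was 100% all along.
print(f"Mopnek accuracy: {correct_picks / n_games:.1%}")
```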
Does it matter? Are there cases in which a team actually has a good chance to win a game but the models don't know that (ie declare it a toss-up game)?
The misinterpretations in the crazy analogy probably apply to real world scenarios, too. I agree with Wander in saying that Utah was overrated (because I desperately want to use KenPom to tell me who can beat whom). In the Dork Polls thread, I tried to complain that KenPom wasn't good at predicting "who is the best team" the way that I view "best team."
http://forums.dukebasketballreport.c...309#post784309
Before Duke's 2nd win over UNC and Utah's loss to Washington, KenPom had Utah ranked #6 and Duke ranked #8. Yet, here were their average unadjusted efficiency margins versus the top KenPom teams (efficiency margin is offensive efficiency minus defensive efficiency...like, do you score more points than your opponent).
Avg Per Game Efficiency Margin Against Top Teams in KenPom Ratings

| Team | vs Top 10 | vs Top 25 | vs Top 50 | vs Top 100 | vs Top 150 | vs Top 200 | All Games |
| UTAH |   -13.276 |   -11.167 |     1.768 |      9.047 |     15.823 |     18.913 |    25.917 |
| DUKE |    13.369 |    15.211 |    11.975 |     15.876 |     16.058 |     15.709 |    22.523 |
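As a sketch of how columns like those get built (the game-log format and numbers below are placeholders I made up, not real Utah data):

```python
def avg_margin_vs_top(games, cutoff):
    """Average per-game efficiency margin (offensive efficiency minus
    defensive efficiency) against opponents ranked at or above the cutoff."""
    margins = [off_eff - def_eff
               for opp_rank, off_eff, def_eff in games
               if opp_rank <= cutoff]
    return sum(margins) / len(margins)

# Placeholder game log: (opponent_rank, offensive_eff, defensive_eff)
games = [(4, 95.0, 110.0), (18, 100.0, 109.0), (45, 112.0, 108.0)]
for cutoff in (10, 25, 50):
    print(f"vs Top {cutoff}: {avg_margin_vs_top(games, cutoff):+.3f}")
```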
I actually held off on posting that data at the time, in part because I feared Utah would prove me wrong (and 'cause the story was more complicated than this chart, with blowouts, recency effects, etc). Well, it turns out that we *played* Utah and beat them. Now, I look at that chart and am certain who I would pick in a
battle between two Top 10 teams!
The rating of Utah (and Texas) made me consider that, while KenPom may do a good job of rating which teams are good according to certain criteria, it might not do the best job at deciding which top teams will beat other top teams. Most of the time we don't notice this because
1. Good teams, according to many different criteria, tend to win
2. In games between two good teams, the predictions state that the game could go either way. So, the model looks correct when winning or losing.
Actually, in reality, the models ARE correct; we are just misinterpreting them. Sure enough, the models aren't good at predicting the games they say they aren't good at predicting (ie they predict the winner only 55% of the time in games where the model believes it has a 55% chance of predicting the winner).
Again, it does not mean that the team actually had a 55% chance of winning. Notably, Duke was 11-2 vs KenPom Top 25 teams (final ratings). Wisconsin was 10-2. Maybe it's just "luck" that KenPom can't predict their wins. Or, maybe, just maybe, there actually
is an uncaptured something about certain teams that makes them winners.
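A hedged back-of-the-envelope check on the "luck" idea (assuming, purely for illustration, that every one of those 13 games were a true 55% proposition for Duke, which borrows the toss-up figure from above rather than KenPom's actual game-by-game odds):

```python
from math import comb

p, n = 0.55, 13  # assumed per-game chance and number of Top 25 games
prob_11_wins_or_more = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                           for k in range(11, n + 1))
print(f"{prob_11_wins_or_more:.1%}")  # ~2.7% -- going 11-2 would be rare luck
```

If those games really were near toss-ups, records like 11-2 should be rare; seeing them from certain programs is at least consistent with the "uncaptured something" idea.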
Anyway, I'm really,
really sorry for the long post. And, again, I could totally be wrong, but that's how I see the "Chance of winning" topic.