Judging teams based on weakness of losses: now with a quantitative analysis



scottdude8
02-20-2018, 08:55 PM
So, I did a bit of a half-hearted analysis about a week-and-a-half back when we were in the doldrums following the UNC loss comparing the top teams based on their losses as opposed to their wins. (The original thread is here (http://forums.dukebasketballreport.com/forums/showthread.php?41384-Analyzing-the-top-teams-by-comparing-their-losses-or-things-aren-t-so-bad!)... also, Mods, I've made a new thread here since I'm going to do a completely different, more detailed analysis here. If it was inappropriate to make a new thread I apologize.) I've been motivated to update this analysis and do so in a more quantitative fashion given the fact that Bracketologists have Kansas as a No. 1 seed, primarily based upon the strength of their wins while ignoring the weakness of their losses. I think that's ridiculous based on how bad some of their losses have been.

By no means am I saying that this type of analysis should be the only, or even the first, thing used to analyze teams when it comes to seeding in the tourney; however, I strongly believe that this needs to be more of a factor in the decision making than it appears to be, especially when comparing teams of a similar caliber.

I'll do this analysis utilizing the Daily RPI Team Sheets on ESPN.com (you get to them by clicking a team's name on the Bracketology page). Utilizing some of the same ideas from my previous post, I'm going to make the analysis quantitative, where we'll get a definite score for each team. Here's the formula I'm going to use:

Each loss, to start with, will count as 1 point.
If the loss was to a team with an RPI between 50-100, that base score will be 2.
If the loss occurred to a team outside of the RPI Top 100, the base score will be 3.
If the loss was by double digits, the base score will get a +1. (Here is where the "base score" calculation ends).
If the loss occurred at home, the base score will be doubled.
If the loss occurred a long time ago (I'll say before January 1st, so "last year"), I'll multiply the score by .75. I'll round the final score to the nearest whole number (rounding .5's down).

So the higher the final score, the worse a team's losses were, more or less. Obviously we can quibble about the measure, but I think this takes into account a lot of the feedback I got from my first try at this (thanks for all of that guys!).
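For anyone who wants to play with the formula, here is a minimal Python sketch of the rules above. The `Loss` record, its field names, and the helper names are my own invention for illustration, and two details are assumptions: rank 50 is treated as a top-50 opponent, and rounding is applied once to the team total rather than to each loss.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Loss:
    opponent_rpi: int    # opponent's RPI rank on the team sheet
    margin: int          # points the game was lost by
    at_home: bool        # True if the loss came at home
    before_jan_1: bool   # True if the loss came before January 1st


def loss_points(loss: Loss) -> float:
    """Points for a single loss, before the final rounding."""
    # Tier boundaries are an assumption: the post says "between 50-100",
    # so rank 50 is treated here as a top-50 opponent.
    if loss.opponent_rpi > 100:
        base = 3
    elif loss.opponent_rpi > 50:
        base = 2
    else:
        base = 1
    if loss.margin >= 10:      # double-digit loss
        base += 1
    points = float(base)
    if loss.at_home:           # home losses count double
        points *= 2
    if loss.before_jan_1:      # older losses are discounted
        points *= 0.75
    return points


def round_half_down(x: float) -> int:
    """Round to the nearest whole number, sending exact .5's down."""
    return ceil(x - 0.5)


def lws(losses: list[Loss]) -> int:
    """Loss Weakness Score: sum the per-loss points, then round once."""
    return round_half_down(sum(loss_points(l) for l in losses))


# Made-up example: an old road loss to a top-50 team, plus a recent
# home loss to a team ranked 51-100.
example = [
    Loss(opponent_rpi=30, margin=5, at_home=False, before_jan_1=True),
    Loss(opponent_rpi=75, margin=4, at_home=True, before_jan_1=False),
]
print(lws(example))  # 0.75 + 4 = 4.75, which rounds to 5
```

Every per-loss value is a multiple of 0.25, so the floating-point sums are exact and the half-down rounding behaves as described.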

Now, I'm going to go through Lunardi's current Top 8 seeds (the teams that, barring a huge run from someone else, will probably be competing for the No. 1 seeds) and calculate the "Loss Weakness Score (LWS)"... if I'm doing all this work I get to make up an acronym, lol. I'll be showing my work as best I can (if you see (1+1) that means that a top 50 loss was by double digits, for example... and you'll see where I multiply by 2 or .75).
Current No. 1 Seeds:

Virginia:1*.75+2*2=5
Villanova:1*.75+1+2*2=6
Xavier: (1+1)*.75+1+(1+1)+(1+1)*2=8
Kansas: 1*2*.75+(1+1)*2*.75+1*2+1+(1+1)+1*2=11

Current No. 2 Seeds:

Michigan State:1*.75+(1+1)+(1+1)*2=7
Duke:1*2+1+2*.75+2+2=8
Purdue: 1*.75+1*2+1+2*.75+3=8
Auburn: (1+1)*.75+1+1*2+2=6
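As a sanity check on the arithmetic, the per-loss terms listed above can be typed straight into Python and summed. This is only a sketch: the dictionary just transcribes the terms as posted, and it assumes the rounding (with .5's going down) is applied once to each team's total.

```python
from math import ceil


def round_half_down(x):
    # Nearest whole number, with exact .5's rounded down.
    return ceil(x - 0.5)


# Per-loss terms exactly as written in the post.
terms = {
    "Virginia":       [1*.75, 2*2],
    "Villanova":      [1*.75, 1, 2*2],
    "Xavier":         [(1+1)*.75, 1, (1+1), (1+1)*2],
    "Kansas":         [1*2*.75, (1+1)*2*.75, 1*2, 1, (1+1), 1*2],
    "Michigan State": [1*.75, (1+1), (1+1)*2],
    "Duke":           [1*2, 1, 2*.75, 2, 2],
    "Purdue":         [1*.75, 1*2, 1, 2*.75, 3],
    "Auburn":         [(1+1)*.75, 1, 1*2, 2],
}

scores = {team: round_half_down(sum(t)) for team, t in terms.items()}
for team, score in scores.items():
    print(f"{team}: raw {sum(terms[team])}, LWS {score}")
```

The raw totals for Xavier (8.5), Kansas (11.5), Duke (8.5) and Auburn (6.5) all land exactly on a half, so the "round .5's down" rule matters for half of these eight teams.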


What can we learn from this?

Virginia looks the best in this analysis, but that isn't much of a surprise.
Every other team except Kansas has a score between 6 and 8. The quality of wins helps to differentiate these teams a lot, particularly weakening Auburn.
Kansas not only has the worst score, but is a major outlier, with a score of 11.


I'm planning on going back through this and continuing down the seeding lines, probably through the four seeds, later, perhaps as soon as tomorrow morning. I think that will provide some more context, particularly for Kansas (we'll see how far down we go until we get teams with a LWS of 11). However, from this we can see that Kansas' missteps are clearly worse than any other contender for a No. 1 seed. If that's the case, I'd argue that for them to claim a top seed, the quality of their wins would have to be way better than the competition. And I just don't think that's the case, even if at the moment they have a solid "Q1" record. (I think the RPI has inflated the Big 12 and some of Kansas' non-conference competition (particularly Kentucky and Syracuse), and if that equilibrates, Kansas' merit in the eyes of the committee should come back down to earth as well.)

Again, I'm curious to see the discussion this analysis raises on the board. My thesis is that, when you use not just wins, but losses, to compare the top teams, Kansas is a clear outlier and that should eliminate them from contention for the No. 1 seed barring some unusual developments. Whether or not the committee will consider that is a whole different argument, though!

CDu
02-20-2018, 09:16 PM
You shouldn’t look at just wins or just losses. You need to look at the entire body of work. So this analysis seems incomplete by like half.

Also, not sure why you are doubling the value of a home loss (I suspect because you were looking for a way to ding Kansas). But again, if you are going to do that, you should also look at wins, and double the value of meaningful road wins.

MrPoon
02-20-2018, 09:48 PM
This is a fun way of looking at it but I felt the same way CDu did, that the analysis looked incomplete.
I agree, without any quantitative research but just watching, that Kansas is overrated this year. I believe they have 4 home losses and they have been verrrrrry fortunate to win several others. It’s all in the eye of the beholder (the same could be said about Duke being fortunate) and maybe I only saw the wrong games, but I’ll probably be having them out in the first weekend unless they get really favorable matchups.

Ultimately I’ll lean on KenPom for an agreed source of analytics and he has Kansas as #11. That seems about right to my untrained eye. I’ve heard the committee will be using outside sources like KenPom this year. Doesn’t seem as though the “pundits” are doing the same with KU as a #1 seed.

rsvman
02-20-2018, 10:30 PM
Interesting, but in my opinion double digits is arbitrary. A team that plays a fast pace losing by double digits is not equivalent to a slow-paced team losing by double digits. This overvalues, for example, a team like UVa that plays slowly.
A more equitable way would be to use the margin divided by the losing team's total score in the game, perhaps, or some other measure to get rid of pace as a major factor. 90-80 is not the same as 50-40. Maybe 90-81 is similar to 50-45?
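A quick sketch of that idea in Python, using the scores from the examples above. The function name `relative_margin` is made up for illustration, and this is just one possible way to normalize for pace.

```python
from fractions import Fraction


def relative_margin(winner_pts, loser_pts):
    # Margin of defeat as a fraction of the losing team's own score.
    return Fraction(winner_pts - loser_pts, loser_pts)


for winner, loser in [(90, 80), (50, 40), (90, 81), (50, 45)]:
    r = relative_margin(winner, loser)
    print(f"{winner}-{loser}: {r} ({float(r):.3f})")
```

By this measure 90-80 works out to 1/8 while 50-40 is 1/4, so the slow-paced ten-point loss is twice as bad. 90-81 and 50-45 both come out to exactly 1/9.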

scottdude8
02-20-2018, 10:38 PM
Seems like my purpose here was misunderstood. I completely acknowledge that by not including wins I’m not giving the whole picture. That was sort of the point, because I feel that most analysis of tourney resumes has seemed to only focus on WINS. This was meant to provide a complement to that already existing analysis and show that the teams that may shine from a wins point of view don’t from a losses point of view.

To another point that was raised, I do think doubling home losses is reasonable considering how big home court advantage is in college basketball. I talked about this in more detail in my previous thread, but I think bad losses at home are the loudest alarm bell that can be rung about a team. I can think of a handful of pieces of anecdotal evidence even from just Duke’s recent past that show this. But in general, the idea is that a road loss to a conference foe is almost always a possibility and can be written off. Consistently losing at home when you’re a team that is favored in those matchups is indicative of problems IMHO.

I also (jokingly) take offense to the accusation I spun this to ding Kansas. I made the measure before looking at any of the details of the teams. I’m also a mathematician so I would never do anything like that to begin with, it’s against my training, lol.

Troublemaker
02-20-2018, 11:27 PM
[QUOTE=MrPoon]
This is a fun way of looking at it but I felt the same way CDu did, that the analysis looked incomplete.
I agree, without any quantitative research but just watching, that Kansas is over rated this year. I believe they have 4 home losses and they have been verrrrrry fortunate to win several others. It’s all in the beholder (the same could be said about Duke being fortunate) and maybe I only saw the wrong games but I’ll be having them probably out in the first weekend unless they get really favorable matchups.

Ultimately I’ll lean on KenPom for an agreed source of analytics and he has Kansas as #11. That seems about right to my untrained eye. I’ve heard the committee will be using outside sources like KenPom this year. Doesn’t seem as though the “pundits” are doing the same with KU as a #1 seed.
[/QUOTE]

Actually, the Selection Committee is still heavily using RPI, based on the recent bracket reveal (https://www.washingtonpost.com/news/fancy-stats/wp/2018/02/12/rpi-appears-to-have-too-much-influence-in-the-ncaa-tournament-seeding-process/?utm_term=.734db8a0553b). That's why the amateur bracketeers, also using RPI, did a good job predicting what the bracket reveal would look like.

The extent of KenPom being used this season seems to be that a team's kenpom rank is listed on the Team Sheets.

But, baby steps. 10 years from now, I'd be surprised if "advanced metrics" weren't the number one tool of the Selection Committee. But, as of now, I guess I'm just happy that kenpom is even listed on the Team Sheets.

(Let me add also that I would NOT want the seeding to be merely a reflection of kenpom's rankings. That's a whole other discussion, though...)

scottdude8
02-21-2018, 11:54 AM
OK, so I have some more time (i.e. I'm putting off doing some annoying busy work on my dissertation, haha), so let's continue this analysis on a few more seed lines.

Current No. 3 Seeds:

Texas Tech: (1+1)*.75+(1+1)+2+(2+1)+2=10
Cincinnati: (1+1)*.75+1+1*2+2*.75=6
North Carolina: (1+1)*.75+1+(1+1)+1+(2+1)+2*2+3*2*.75=17
Clemson: 1*.75+1+(1+1)+1+1*2+1=8


Current No. 4 Seeds:

Wichita State: 1*2*.75+(1+1)+1+2*.75+2*2=10
Ohio State: (1+1)*.75+1*.75+(1+1)*2*.75+(1+1)*.75+(1+1)+2*2+(2+1)*2=19
Arizona: (1+1)*.75+1+1*2+2*.75+2*.75+2=9
Tennessee: 1*.75+1*2*.75+1*.75+(1+1)*2+1+(1+1)+(1+1)=12


New conclusions:

North Carolina has a horrible LWS, skewed heavily by the horrible home loss to Wofford.
Ohio State also looks bad in this analysis. Two losses to Penn State, including one at home, hurt a ton. But people have forgotten that Ohio State was BAD during the non-conference, as emphasized by all their older losses. I wouldn't be surprised to see their seed fall if they don't have a great B1G tourney.
Putting the top seeds in further context by looking two more seed lines down, there are only 3 teams that have worse LWS scores than Kansas, only two of which (North Carolina and Ohio State) are significantly worse.
Only three of the eight currently projected No. 3/4 seeds have single digit LWS scores (and one of those is artificial, in my opinion, since Cincinnati doesn't have many losses just because of their conference). 7 of the 8 currently projected No. 1/2 seeds have single digit LWS scores. So this type of analysis does provide a differentiator between the teams currently competing for No. 1 seeds and those who are clearly in the next tier.
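The counts in these conclusions are easy to double-check from the posted scores. A small Python sketch, where the dictionary simply copies the LWS totals from the seed lists in this thread:

```python
# Final LWS totals as posted, grouped by projected seed line.
lws_by_seed = {
    1: {"Virginia": 5, "Villanova": 6, "Xavier": 8, "Kansas": 11},
    2: {"Michigan State": 7, "Duke": 8, "Purdue": 8, "Auburn": 6},
    3: {"Texas Tech": 10, "Cincinnati": 6, "North Carolina": 17, "Clemson": 8},
    4: {"Wichita State": 10, "Ohio State": 19, "Arizona": 9, "Tennessee": 12},
}


def single_digit(seed_lines):
    # Teams on the given seed lines with an LWS below 10.
    return [team for seed in seed_lines
            for team, score in lws_by_seed[seed].items() if score < 10]


kansas = lws_by_seed[1]["Kansas"]
worse_than_kansas = [team for line in lws_by_seed.values()
                     for team, score in line.items() if score > kansas]

print(worse_than_kansas)          # teams with a higher (worse) LWS than Kansas
print(len(single_digit([1, 2])))  # single-digit teams on the 1/2 lines
print(len(single_digit([3, 4])))  # single-digit teams on the 3/4 lines
```

This gives three teams worse than Kansas (North Carolina, Ohio State and Tennessee), seven single-digit scores on the top two lines, and three on the next two.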


Again, the important disclaimer: by no means am I saying that analyzing a team's losses via something like my LWS is the only important metric. I'm focusing on losses as a counterpoint to current Bracketology which largely seems to focus on quality of wins while ignoring bad losses. But I think that going down two more seed lines lends credence to my argument that Kansas is a major outlier amongst the No. 1 seed competitors when it comes to how bad their losses were, and that ideally would be taken into account more seriously by the committee come Selection Sunday.

MrPoon
02-21-2018, 01:40 PM
Math or not, I am very confident that come actual seeding time, Kansas and UNC will be ranked higher than their home losses ought to allow for.
I am hoping that the benefit of the doubt applies to Duke as well (despite no embarrassing home losses).
While BC and St. John's are losses that hang over the resume of Duke, their worst loss was by 11 points to NCST on the road. Even in the losses this team has never been blown out, which is something few teams over the last five or six years can say.

CDu
02-21-2018, 01:47 PM
Okay, having read it more thoroughly, I have a few issues:

1. I think your weighting system is a bit off. Specifically:
- Home losses should not be worth double a road/neutral loss. Most simple metrics I've seen give a slight bump in the home/road aspect, but I've never seen more than a 30% relative weighting previously.
- Nor should losses to the top 50-100 be worth double losses to a top-50 team (especially if you are ignoring home/road in applying it). By your metric, a home loss to the team ranked #51 is equal to four road losses to the team ranked #49. If that home loss was by 10, it would be worth SIX road losses by as much as 9 to that #49 team. That just doesn't make sense.

2. The committee didn't put Kansas on the 1 line "primarily based upon the strength of their wins while ignoring the weakness of their losses." In fact, they've laid out the metric that they have primarily used (the quadrant system), which considers both strength of wins and weakness of losses. They just don't share the same definition of what are bad losses.

3. As such, why not just use the metric that the committee has said they are using?
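The equivalences in point 1 follow directly from the LWS rules, and a few lines of Python make them concrete. This is only a sketch: `points` is a made-up helper, the recency discount is left out because it doesn't affect the comparison, and the 1.3 home multiplier at the end is purely a hypothetical reading of the "30%" figure, not a value anyone in the thread proposed.

```python
def points(base, double_digit=False, at_home=False, home_multiplier=2.0):
    # base: 1 for a top-50 opponent, 2 for 51-100, 3 for outside the top 100.
    score = base + (1 if double_digit else 0)
    return score * home_multiplier if at_home else float(score)


road_loss_to_49 = points(base=1)                        # close road loss
home_loss_to_51 = points(base=2, at_home=True)          # close home loss
home_loss_to_51_by_10 = points(base=2, double_digit=True, at_home=True)

print(home_loss_to_51 / road_loss_to_49)        # 4.0 road losses
print(home_loss_to_51_by_10 / road_loss_to_49)  # 6.0 road losses

# Hypothetical alternative: a 1.3x bump for home losses instead of 2x.
print(points(base=2, at_home=True, home_multiplier=1.3))
```

With the softer multiplier the same close home loss to #51 is worth 2.6 points instead of 4, which shows how much the home doubling drives the gap.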

Using the NCAA's quadrant system, Kansas has a 10-4 record in quadrant 1 games, a 6-1 record in quadrant 2 games, and a single bad (quadrant 3+) loss. By comparison, Duke has a 3-4 record in quadrant 1 games, a 7-1 record in quadrant 2 games, and no bad losses. By that metric (the metric the committee has openly stated they are relying heavily on), it seems pretty clear why Kansas was slotted above Duke, no?

Compare that to the other teams on the 2, 3, and 4 lines:
Michigan St: 3-3, 5-0, no bad losses
Purdue: 6-3, 4-2, no bad losses
Auburn: 7-3, 5-1, no bad losses
Texas Tech: 5-4, 6-1, no bad losses
Cincy: 5-3, 7-1, no bad losses
UNC: 9-5, 3-1, 1 bad loss
Clemson: 3-6, 7-0, no bad losses
Wichita St: 3-2, 9-2, 1 bad loss
Ohio St: 3-5, 6-1, 1 bad loss
Arizona: 4-3, 6-3, no bad losses
Tennessee: 4-7, 6-0, no bad losses

As you can see, it very much depends on how one defines their ranking priorities. Clearly, the committee doesn't view Kansas's losses in the same way you do, as they only list one of them as bad (quadrant 3+) and only one of them as even sort of bad (quadrant 2). And when you compare their total resume to those of the other contenders, it seems pretty clear why they got on the 1 line: they have a VERY good record (third-best in the nation) in quadrant 1 and quadrant 2 games, with only the 1 bad loss. Compared to Duke, where we have a losing record in quadrant 1 games and our combined quadrant 1-2 record is still slightly worse than Kansas' record in quadrant 1 games.
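For anyone who wants to compare the quadrant records side by side, here is a rough Python sketch. The tuples are just the (wins, losses) figures quoted above for quadrant 1 and quadrant 2; the helper names are mine.

```python
from fractions import Fraction

# (wins, losses) in quadrant 1 and quadrant 2 games, as quoted above.
records = {
    "Kansas":      {"q1": (10, 4), "q2": (6, 1)},
    "Duke":        {"q1": (3, 4),  "q2": (7, 1)},
    "Michigan St": {"q1": (3, 3),  "q2": (5, 0)},
    "Purdue":      {"q1": (6, 3),  "q2": (4, 2)},
    "Auburn":      {"q1": (7, 3),  "q2": (5, 1)},
    "UNC":         {"q1": (9, 5),  "q2": (3, 1)},
}


def combined(team):
    # Total wins and losses across quadrant 1 and quadrant 2 games.
    (w1, l1), (w2, l2) = records[team]["q1"], records[team]["q2"]
    return (w1 + w2, l1 + l2)


def win_pct(record):
    wins, losses = record
    return Fraction(wins, wins + losses)


for team in records:
    w, l = combined(team)
    print(f"{team}: Q1+Q2 {w}-{l} ({float(win_pct((w, l))):.3f})")

# Duke's combined Q1+Q2 record against Kansas' Q1 record alone.
print(combined("Duke"), records["Kansas"]["q1"])
```

Duke's combined quadrant 1-2 record comes out to 10-5, against 10-4 for Kansas in quadrant 1 games alone, which is the comparison made above.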

CDu
02-21-2018, 02:19 PM
Just to be clear (and sorry for the multiple posts), I'm not voicing that I think Kansas is better than Duke. Just that Kansas has a clearly more impressive resume as of now based on the quadrant system that the NCAA is using.

We have a chance to change that for the good (and they do for the bad) over the next few weeks. But right now, it seems pretty clear and reasonable why Kansas is above us in the committee's head-start rankings.

scottdude8
02-21-2018, 03:30 PM
Thanks for your detailed comments CDu! The major thing I'd like to say in response is this: my intention wasn't to create a metric that "matched" the existing metrics for wins, but for losses. My intention was to create a metric that highlighted some critical aspects of what I consider "bad losses" that aren't highlighted in existing metrics. While I agree that my weighting system was probably heavy (I did things the way I did, in part, so I could just do the calculations in my head while I took a break from putting the finishing touches on my dissertation, haha), I think that it is reasonable for the top-tier teams competing for No. 1 seeds. I wouldn't argue for using this metric for, say, bubble teams. However, as I've mentioned before, I think that home losses or "bad" losses to really bad RPI teams are perhaps the easiest to see, and easiest to quantify, signs that a highly ranked team is overrated (if I was still working at The Chronicle or, better yet, someone was paying me for this, I'd definitely go through a few years of data and see how my metric, or an improved one, predicted NCAA Tournament success or failure).

So just to summarize:

I agree that this metric doesn't match existing metrics or the new "quadrant" system. I didn't intend it to.
Instead, I meant to create a measure that highlights features of a team's resume that aren't accounted for by many of those metrics.
I would only ever use this type of analysis to compare top-tier teams, or at least teams within the same general quality level, as I agree that not looking at the wins takes out a lot (I tried to highlight a couple points about that, particularly with Cincinnati and Auburn).


I'm glad to see that my work is at least generating some discussion on the board, which was my intention! I really wish I could take the time to do a rigorous analysis looking not only at this year but past years, because I think with some tweaks there could be something really interesting in a metric like this (I just think about recent Duke teams that have lost early in the tourney and their "bad" losses, for instance). Alas, I left journalism behind, haha.

CDu
02-21-2018, 04:45 PM
I guess my overarching point is that with a more accurate weighting scheme, you likely will see Kansas’ loss profile look only a little worse than the 2/3/4-seeds. And then when you look at the (currently hypothetical) complementary analysis of quality wins, Kansas will come out WELL ahead of those 2/3/4-seeds. And thus, in aggregation, you would probably get something like what we have seen with the committee’s list.

I don’t think anyone would disagree that Kansas has the worst loss profile of the top-10 teams. But outside of UVa and Nova, they also have by far the best win profile. So even if the math was a little more robust, looking only at losses is incomplete. And it is more incomplete than the metrics that the committee and others are using, which account for both quality wins AND losses.