Overfitting and How I Think About the NHL Draft
A statistical concept that might help scouts, no math required
For the past year or so, I have been researching the NHL draft. While I’ve learned a lot, I am sure there are many things I’m wrong about. Maybe this will be one of those things, but for now, let’s talk about overfitting.
Overfitting is a statistical concept that heavily informs the way I think about many things in life, including NHL prospect evaluation. I think it is the main reason analytical models can add value even in a world filled with fantastic scouts, and why I am immediately skeptical of a lot of NHL draft analysis. So, let’s dive into what overfitting is and how I think it can help you think about prospects, or hockey analysis in general. Without any math!
Data Detour: What is Overfitting?
Have you ever wondered why “analytics people” are seemingly obsessed with simple, often straight lines in such a complex world? It’s not that we believe weird and very precise non-linear relationships can’t exist. It’s that humans tend not to be very good at finding them (especially without incomprehensible amounts of data). As a result, less specific models can often generate more accurate predictions. Here is a classic example of the problem.
In the first example, we captured a general trend in the data, which will likely result in decent predictions. It’s very simple, but that’s the point.
In the second example, we did something that, on the surface, seemed smart. We tried to learn as much as we possibly could from the available data. Since we learned so much from this information, our second model of reality aligns much more closely with what has already happened than the first one did. Again, this sounds like a good thing. We have captured more of what has happened.
The problem is that the kinds of highly specific rules we learned in the second example are often unlikely to persist in the future. As a result, the first, simpler, and, at the moment, “worse” model is likely to generate more accurate predictions.
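If you prefer toy numbers to pictures, here is a minimal sketch of the same idea. It uses made-up data (not prospect data) to fit both a straight line and a very flexible curve to the same noisy points, then checks how each does on data it has never seen:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Past" observations: a simple underlying trend plus noise.
x_train = np.linspace(0, 10, 20)
y_train = 2.0 * x_train + rng.normal(0, 4, size=x_train.size)

# "Future" observations drawn from the same underlying trend.
x_test = np.linspace(0.25, 9.75, 200)
y_test = 2.0 * x_test + rng.normal(0, 4, size=x_test.size)

# Model 1: a simple straight line.
simple = np.polyfit(x_train, y_train, deg=1)
# Model 2: a very flexible curve that chases every bump in the past data.
# (numpy may warn that this fit is poorly conditioned, which is itself a hint.)
flexible = np.polyfit(x_train, y_train, deg=12)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

print("error on past data    simple:", round(mse(simple, x_train, y_train), 1),
      " flexible:", round(mse(flexible, x_train, y_train), 1))
print("error on future data  simple:", round(mse(simple, x_test, y_test), 1),
      " flexible:", round(mse(flexible, x_test, y_test), 1))
# Typically the flexible model wins on the data it has already seen and loses
# on the data it has not, which is the textbook signature of overfitting.
```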
WTF Does This Have to Do With Scouting
That’s great, Chace, but what the hell does that have to do with scouting? Well, let’s start with an illustration of what statistical models currently do with NHL prospects, and then we will illustrate two types of pushback against these models and how they may result in overfitting. To begin, here is my beautiful illustration of what a prospect model will do at the moment. Assume we have adjusted for things like age in this image.
So someone uses their model and releases rankings that look something like this, using production as a proxy for how good a prospect is (I understand that’s not a perfect proxy, but go with me on the logic here). Now, there are two types of criticisms of these models.
Criticism One
The first criticism is essential and, I believe, the primary point of scouting. These are critical contextual factors the model is missing. So, if a player looks good but is propped up by a high shooting percentage, far superior teammates, or more time on ice, or if one player is way better defensively, then you will want to adjust away from these rankings. These kinds of criticisms, when done by good scouts, are probably the most valuable things in draft analysis.
So, don’t think I am suggesting any deviation from a model is dumb. These kinds of criticisms are exactly what I want to hear when I post a chart and you disagree; everyone should want to learn stuff like this and become as smart as they can! These kinds of scouting inputs help inform that first model and make it as accurate as it can be by providing additional insights.
Criticism Two and Overfitting
The problem is that there’s a second kind of criticism of these models, one you will all be familiar with, and I think this is where the overfitting comes in. Imagine we are talking about a hypothetical prospect named Dordan Jumias. Jumias is crushing the QMJHL, scoring near the top of his entire draft class, so models love him. Yet many perceive him as, say, a third-round talent. When asked why, you hear that, sure, he’s great right now, but he is short (which models can account for, btw), and he’s a perimeter player. This is supposedly a problem because perimeter players are unlikely to translate to the next level.
It’s statements like these that immediately make me skeptical. Or when hearing about someone like Andrew Cristall: I was told that for somebody his size (five foot ten), he lacks the footspeed to be a great NHL player. I don’t know whether these general evaluations will hold up, but you know the kind of criticism I am talking about.
Now, we aren’t talking about how good the player is at scoring more goals than they allow. Instead, we’re discussing very specific ways they achieve that goal. Since everyone prefers different tendencies and has different views on how they translate, the implicit model people present when incorporating factors like this now includes a lot of really, really specific rules in the predictions. Once we apply information like this to the entire draft class, the new model people are implicitly presenting will end up looking like this:
Look familiar? The inner data scientist in me can’t help but look at something like this and think the modeller (the scout, in this case) has massively overfit their forecasts. People may or may not know this is what they are doing, but in the most basic sense, I think it is. When you go beyond “are they good?” and venture into specific rules like “are they good and tall, but not a lanky tall where they get pushed off the puck?” or “are they good, but lacking sufficient footspeed given how tall they are?”, I think you’re making a mistake, at least on average. All of these are really, really specific rules, and they scream overfitting to me.
Again, it’s not that I don’t believe any of these relationships exist. It’s that I know humans, even really smart ones, tend to be bad at finding them (Note: finding weird correlations like these is a promise of AI, so maybe I’ll have to circle back in a few years). But for now, I think these kinds of scouting evaluations may be actively bad because they are overfitting the model, even though they aren’t explicitly doing it with math. So the result will be worse predictions despite having more data.
That is my theory as to why I am so skeptical of more stylistic concerns, anyways.
This Logic Applies Elsewhere Too
I think this logic generally applies across many disciplines, and across most areas of hockey too. For example, remember that the goal of hockey is to score more goals than you allow. And yet, how many times have you heard that to win the Stanley Cup, you have to do “______,” and then they list things like having at least three defenders with x amount of value, two centers with some amount of value, and a goalie with x amount of value? You all know the ideas I’m talking about.
Any formula like this for winning the Cup that goes beyond assembling the best collection of players is likely to run into a similar problem. Think you need specific roles filled? Maybe you really do need a plethora of highly specific boxes checked because the past three Cup winners did, rather than simply an overall collection of immense talent. Still, I am willing to bet that any rule you come up with to win the Cup has either already been broken at some point in the recent past or will be in the near future. That is, unless your rule is collecting the best players possible. That simple rule has stood the test of time, and I am willing to bet it will continue to hold into the future. It’s the simplicity that ensures its longevity.
So overfitting may be a problem anywhere, but I think it’s likely to be a bigger problem in scouting than in many other fields because of the learning environment in which the NHL draft exists.
Why This Overfitting is Especially Likely at the NHL Draft
So even though my thought process is generally applicable in many areas, I think it’s even harder to learn specifics in the NHL draft than in other areas of hockey (or life) for a few reasons.
The first and most important is the delayed feedback in the NHL draft. I published something on draft-eligible skaters for the first time in 2022. How long will it take me to get feedback on those predictions? Well, we can do our best to adjust on the fly, but if we are being honest, it’s going to take me a half-decade, or even more, before I get to know how any of my 2022 NHL draft predictions turned out. Who knows what I’ll be doing by 2027?
Worse yet, each NHL draft gives you a sample size of a few hundred predictions at best. (This ignores that many, maybe most, people scout regionally, making the sample size much, much smaller each season.) Plus, because of the right-skewed nature of NHL outcomes, only 30-ish prospects per draft ever hit in a meaningful way. And that’s just decent NHL players. Want to learn to identify “stars”? You get maybe 5-10 per draft. So not only are you getting slow feedback, you are barely getting any feedback each year.
As a result, you will need tons of NHL drafts before you can confidently infer whether your process is “right” or not. This kind of learning environment is really, really punishing, and it makes learning specific forecasting rules very, very hard, because you can be right by pure chance for half a decade pretty easily. Think of how difficult school would be if your math tests were only seven questions long and you had to wait seven years to find out whether your answers were correct.
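To get a feel for how punishing that is, here is a rough toy simulation. Every number in it is made up for illustration (the hit rates especially); the point is simply how long a process with no real edge can keep pace with a genuinely better one when each draft class hands out so few graded answers:

```python
import numpy as np

rng = np.random.default_rng(1)

calls_per_draft = 7    # each scout flags a handful of potential stars per class
p_skilled = 0.45       # hypothetical hit rate of a genuinely good process
p_no_edge = 0.30       # hypothetical hit rate of a process with no real edge
n_sims = 100_000       # number of simulated "worlds"

def no_edge_keeps_pace(n_drafts):
    """Share of simulated worlds where the no-edge scout looks at least as good."""
    calls = calls_per_draft * n_drafts  # star calls graded after n_drafts classes
    skilled = rng.binomial(calls, p_skilled, n_sims)
    no_edge = rng.binomial(calls, p_no_edge, n_sims)
    return float(np.mean(no_edge >= skilled))

for n in (1, 5, 10, 20):
    print(f"after {n:>2} draft class(es): the no-edge scout looks at least as good "
          f"about {no_edge_keeps_pace(n):.0%} of the time")
# Even with a real 15-point gap in hit rate, it can take many draft classes,
# each of which takes years to grade, before luck stops masquerading as skill.
```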
Compounding this problem is the fact that so few prospects hit: if you’re scouting and giving reasons why a prospect won’t turn out, you are almost certainly going to be correct, whether your logic was sound or not. Max, Josh and I are collecting elite prospects data to make a prospect website. Our incomplete dataset contains something like 200,000 skaters. Of all the draft-eligible skaters, fewer than half of one percent have become star players in the NHL. So if you find an oddly specific reason why prospects won’t become good NHL players, you will almost certainly have a correct prediction, even if your logic for getting there is horrible.
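To put rough numbers on that, here is a back-of-the-envelope version using the figures above (which are themselves approximate; the star rate is rounded up to exactly half a percent for simplicity):

```python
n_skaters = 200_000    # draft-eligible skaters in the (incomplete) dataset
star_rate = 0.005      # "less than half of one percent" become NHL stars

n_stars = n_skaters * star_rate
always_bust_accuracy = 1 - star_rate

print(f"roughly {n_stars:,.0f} stars out of {n_skaters:,} skaters")
print(f"calling every single prospect a bust is 'right' about "
      f"{always_bust_accuracy:.1%} of the time")
# Around 1,000 stars; the blanket "bust" call is correct ~99.5% of the time,
# which is why false signals of failure are so easy to learn by accident.
```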
I think this is the reason why most general rules you hear about prospects relate to them not becoming good NHL players rather than the other way around. Don’t like prospects with skating concerns? Pick them all to bust, and you will be right far, far more often than you’re wrong. Don’t like perimeter players in junior? Same story. Claiming that a prospect will fail is almost certainly going to be true, even if the logic that got you there is flawed, because the NHL is so damn hard to make, even among the highly preselected group of prospects we actually observe.
Notice how far fewer of these rules relate to a prospect succeeding? That’s probably because it’s exponentially easier to learn a false signal of failure than a false signal of success in drafting.
On top of all this, the NHL is constantly changing, which compounds the problem: in a changing environment, even if you have identified a specific rule that held up for a while, it may not be true in today’s NHL.
Conclusion
Altogether, my logic is incredibly simple. Hockey games are won by scoring more goals than you allow. So how should you go about finding players who will eventually score more goals than they allow for your team? I think the smartest way to do that is to find the best hockey players you can and pick them. When I hear specific criticism about a player’s tendencies despite his amazing results, the inner data scientist in me immediately becomes skeptical. Filtering out players because of rules about how they achieve their results sounds like overfitting to me. It’s not that I don’t think these rules exist; some of the ones out there right now are probably true. I am just skeptical of our ability to find them, especially given the cruel and constantly changing learning environment that is the NHL draft.
Maybe I’ll look back on this in a few years and say, man, I was an idiot. Or, more likely, as I learn more, there will be a few of these rules that do hold up and help inform my decision-making, even though that flies in the face of this post. But I think the logic here is sound. Picking people who excel at something to be the best at that thing in the future seems like a solid long-term strategy to me.
Thanks for making it to the end, and if you got this far, let me know what you think.