The Stanley Cup Playoffs magnify all the frustrations people have with the goaltending position. Performances are tough to predict yet have an outsized impact on game outcomes, so what can you expect over a playoff series?
Playoffs, Pads, and Positional Paradoxes
Goaltending is a volatile position. With so much out of the goaltender’s explicit control, it’s extremely difficult to consistently deliver positive results. This can be true over the course of a season, but it is especially true over the course of a playoff series. Generally, starting goaltenders in the playoffs have had good seasons, so at the margins there isn’t much separation between most starters, and usually not enough to predictably manifest itself over 4 to 7 games.
But this helps frame the paradox of goaltending: tough to project, yet few positions have more control over the outcome of a game. Looking at 2017-18 Wins Above Replacement, thanks to Corsica Hockey, 11 of the top 30 contributors were goaltenders. That’s in aggregate, too, and goaltenders don’t play every game of the regular season like they normally do in the playoffs. When normalizing by games played, goalies make up the majority of the top 30 (17), and 10 rank above the most impactful skater, Connor McDavid.
So to the cynical and casual fan alike, the playoffs can simply appear to be a competition in waiting to see which goaltender gets hot at the right time. And endless frustration and soul-searching when the opposite happens.
What’s the best (healthiest) way to think about what you’ll get from your goalie in the playoffs?
Taking It One Game At a Time (TIOGAT)
Most starting goalies in the playoffs have a pretty good body of work during the regular season. Some good games, some bad, but probably more good than bad if their team made the playoffs. We can calculate game-level performance by taking the difference between actual goals against and expected goals, the number of goals an average goalie historically would concede given what we know about those shots against (adjusting for rebounds, which goalies have some control over), and normalizing by total shots. This percentage difference between actual and expected is sometimes referred to as deltaSv% or Save % Lift Over Expected. So, if a goalie faced 50 shots against totalling 4 expected goals, but only gave up 3, that game would be 1 goal prevented on 50 shots, or 2 per 100 shots (2% better than expected). It’s also important to note that, unlike save %, expected goals attempts to weight shots by situation, so a 5v5 shot and a 5v3 shot can each be compared to their relative, historical probabilities of being a goal (though not perfectly). Using all-situation results, as opposed to just even-strength, creates a more reliable metric.
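As a back-of-the-envelope sketch, the per-game metric works out to goals prevented relative to expected, scaled per 100 shots (the function name here is mine, not from any published model):

```python
def sv_lift_over_expected(goals_against, expected_goals, shots):
    """Goals prevented vs. an average goalie, per 100 shots (deltaSv%)."""
    return 100 * (expected_goals - goals_against) / shots

# The example from the text: 50 shots, 4 expected goals, 3 actual goals against
lift = sv_lift_over_expected(goals_against=3, expected_goals=4.0, shots=50)
# → 2.0, i.e. 2% better than expected
```

The same formula gives Gibson’s roughly 1.7% figure below: (3.96 − 3) / 57 × 100 ≈ 1.68.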
We can use a histogram to visualize the distribution of John Gibson’s 2017-18 performances, where each game is placed into a bin. I highlighted his first 2 playoff performances (as of 4/15/2018). About 63% of the time he made more saves than an average goalie would (a positive Sv% Lift Over Expected). In his 2 playoff games, he had 1 game where his Sv% was the same as we’d expect from an average goalie, given what we can quantify about shot quality against, and another about 1.7% better than expected ((3.96 xG – 3 GA) / 57 shot attempts).
However, having to bin each game is a little awkward, and to compare across multiple goalies the y-axis might need some scaling. Since most playoff goalies have 50-plus games this season, we can smooth and scale the distribution using a density curve, showing the probability of each outcome, without losing too much information. Doing this smooths over Gibson’s lack of games with only slightly (~1-2%) better than expected results, which is partially strangeness and partially a result of binning, since he has plenty of games just to the left and right.
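The smoothing step amounts to a kernel density estimate: place a small Gaussian on each game and average them. A minimal numpy sketch, using made-up game results rather than Gibson’s actual numbers:

```python
import numpy as np

def gaussian_kde_curve(samples, grid, bandwidth=1.0):
    """Smooth game-level results into a density curve by placing a
    Gaussian kernel on each game and averaging over all games."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

# Hypothetical season of 60 per-game Sv% lift values (stand-in data)
rng = np.random.default_rng(42)
game_lifts = rng.normal(loc=0.5, scale=2.5, size=60)

grid = np.linspace(-12, 12, 241)
curve = gaussian_kde_curve(game_lifts, grid, bandwidth=1.5)
# the curve integrates to ~1, so it is comparable across goalies
# regardless of games played
```

Because every density integrates to 1, curves for goalies with different game counts sit on the same scale, which is the point of moving away from raw histogram counts.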
Shuffling the Deck
Armed with game-level performances for each goalie, we turn to the playoffs, where each game is critical. We can use each goalie’s regular season results (partially attributable to team defensive performance) as a template of what to expect in the playoffs. Think of it as a deck of cards: we draw one card from a deck of that goalie’s performances and place it on the table. Do that again for the opposing goalie and their results. What does that look like after 4 to 7 games? What is the probability your goalie outplays the opposing goalie in a series?
With game-level performances, we can attempt to answer that. Below are Connor Hellebuyck’s and Devan Dubnyk’s regular season performances. When Dubnyk was good he was about as good as Hellebuyck, but when he was bad he was worse. In sum, Hellebuyck was better than Dubnyk on the year.
We can play this like you would the card game ‘War’ (with replacement, meaning a card, or game, goes back into the pile and can be randomly picked again). Tracking who ‘wins’ and by about how much, a few thousand times over, we can figure out what percentage of the time we might expect Hellebuyck to outplay Dubnyk, or vice versa.
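That resampling game can be sketched in a few lines of pure Python; the game values below are toy numbers standing in for the real game-level results:

```python
import random

def prob_outplays(goalie_a, goalie_b, n_draws=10_000, seed=1):
    """Draw one game (with replacement) from each goalie's season,
    a few thousand times, and count how often A beats B."""
    rng = random.Random(seed)
    a_wins = sum(
        rng.choice(goalie_a) > rng.choice(goalie_b)
        for _ in range(n_draws)
    )
    return a_wins / n_draws

# Toy example: A's good games match B's, but B's bad games are worse
goalie_a = [2.0, 1.0, 0.5, -0.5, -1.0]
goalie_b = [2.0, 1.0, 0.0, -1.5, -3.0]
edge = prob_outplays(goalie_a, goalie_b)
```

This is just the bootstrap: sampling with replacement from each goalie’s empirical distribution, so no assumption is made about the shape of either goalie’s season.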
Using Hellebuyck’s and Dubnyk’s results, Hellebuyck outplays Dubnyk about 57% of the time. In a short series we likely wouldn’t notice the difference, and it’s entirely possible Dubnyk outplays Hellebuyck (as I’m writing this, that appears to be the case in Game 3). But in a sport where the marginal probability of winning is small, and prediction accuracy possibly has an upper bound of about 62%, this is probably a welcome advantage to Winnipeg. I’m assuming management and others with skin in the game would be interested in that edge.
Looking across all series we can calculate the same probability. We can also overlay the first 1 or 2 playoff performances on the distribution of season results. We can see Matt Murray and Brian Elliott oscillate between their season’s best and worst, and Frederik Andersen pull a card he didn’t even know he had.
What’s likely and what actually happens are 2 different things. But it helps to understand how likely something is, which can give some important context to the results of a game or two, even if those results might put your season in peril.
Note: I will be updating this plot during the playoffs on my twitter account (and not in low-quality gif format).
There are a few assumptions to address with this analysis:
- The Black Swan Game – Just because something isn’t in the data doesn’t mean it can’t happen. In game 2 against Boston, Frederik Andersen was pulled 12 minutes into the game, posting a save % 40% below expected, which he hadn’t done during the season. Part of this is artificial; he likely would have worked himself back into something less extreme by finishing the game. However, other games during the regular season where Andersen or any other goalie was pulled would functionally look the same, whether it was -40% or -20%: a loss is quite likely. We’re more interested in how often they shit the bed.
- Independence of Sampling – This also assumes goalies compartmentalize game performances, as opposed to some sort of lagged effect where a bad game leads to a higher probability of a bad game the next time out. In the playoffs it certainly feels this way, because if you draw 2 or 3 bad games in a row, that’s usually the end of the season. However, in aggregate the last game has little effect on the current game. Even controlling for workload, a simple linear model found no effect from one game to the next. Each season for the 16 starters looks pretty flat.
However, this is still a little naive. Confidence, health, and team play (with or without your best players) might mean some stretches are more favourable than others and have less relevance to the series or game at hand. Additionally, matching up against a single team might result in some otherwise minor details being exploited, perhaps creating an even wider distribution of outcomes (to the frustration of all).
- The Playoffs and Regular Season are Comparable – Do teams really tighten up defensively in the playoffs? Do goalies generally step up and play better? Maybe, and if so it would be unwise to sample from regular season games, where shots were more likely to be dangerous due to pre-shot passing plays or screens, and the goalie hadn’t really locked in yet. While goals actually do come a little harder in the playoffs (about 1 less goal than expected per 380 shots, or about 12 games), some of this is because the remaining goalies are, almost by definition, getting good results. Comparing goalies’ regular season results to their playoff results, there’s generally no lift from regular season performance to playoffs, but stronger goalies are likely to make it a little further.
On the other side of the ledger, for every goalie that is perceived to raise their game in the playoffs, another will struggle, due to some combination of luck, health, and psychology, but they don’t last long in the sample.
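The independence assumption above can be sanity-checked with a quick lag-1 regression: regress each game’s result on the previous game’s result and look at the slope. A sketch on simulated independent games (real goalie data would replace the random draw):

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical season: 60 game-level Sv% lift values, drawn independently
lifts = rng.normal(loc=0.0, scale=2.0, size=60)

# Regress each game's result on the previous game's result (lag-1)
prev, curr = lifts[:-1], lifts[1:]
slope, intercept = np.polyfit(prev, curr, 1)
# a slope near zero is consistent with games being independent draws,
# which is what the flat per-season plots for the 16 starters show
```

A meaningfully positive slope would mean bad games beget bad games, and the with-replacement card draw would understate streakiness; the flat result is what justifies shuffling the deck.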
Like most problems, this one won’t go away if ignored. If we create a distribution of game-level results from non-rookie goalies with less than 1,000 career shots (a replacement-level type of goalie) and compare it to a goalie with very average results (Devan Dubnyk this season), Dubnyk’s edge is about 60%. Against a good season of results, like Jonathan Quick’s this season, it increases to about 75%. Not impossible odds, but it seems unlikely that the $4M saved could be put to better use and make up that margin.
Goaltending will continue to frustrate and mystify. But goaltenders and their teams’ playoff fates (and possibly reactionary franchise decisions) will always be linked. With so few games, and the marginal difference between high-end goaltenders’ true ability being so small, differences in results will rarely manifest themselves in a series like they do on paper. However, most winning in hockey comes on the margins, so these edges are important.
And they are highly visible. Everyone in the rink knows when a star goalie with a big contract has a bad game in the playoffs; it’s not as clear when a 2nd-line center has a net negative game. Goalies deal in the currency of goals directly, while for skaters, measuring goals over a small sample is best avoided.
I also think this is a good opportunity to re-frame how we think about goaltending results. In the past, I’ve erred and misled by posting projections as some sort of point estimate (i.e. based on past results, my model expects a 1% save % lift over expected), but it’s fairer to frame projections as a distribution of possible outcomes. Carey Price is a supremely talented goalie, and there were no on-ice results to suggest a poor year, but it was possible. I don’t have the data to interact physiological factors with Price’s age, but that would have helped. Understanding and framing this uncertainty would be helpful when locking someone into an 8-year deal.
Goaltending analysis often gets ignored because of these projection issues, but if we can properly quantify and convey uncertainty, it would be a helpful step forward. Skater projections might offer some more certainty (though they’re rarely presented with uncertainty bounds), but their impact is generally one degree removed from actual goals. Watching the playoffs, it will be clear goalies often have actual goals, wins, and losses hanging around their necks, so understanding these edges and how they might manifest themselves in a playoff series seems warranted.
Thanks for reading! I update goalie-season data using expected goals, it can be downloaded or viewed in my goalie compare app. Any custom requests ping me at @crowdscoutsprts or firstname.lastname@example.org.