Tuesday, May 04, 2010

Blocked Shots: Luck or Skill?

Team A is on the attack in Team B's end of the ice. Team A fires a whole bunch of pucks at Team B's net. Some of these shots hit the goaltender, some go wide, and some get blocked. Does either team have control over those three outcomes? In other words, once Team A is in the offensive zone shooting pucks at Team B, are the missed and blocked shots the result of an actual skill (by either team), or are they simply the product of randomness? I set out to answer that question.


DATA

To limit scorer bias as well as playing-to-the-score bias (both of which have been well demonstrated on this blog), I looked at only "even strength on the road with the score tied" numbers. I used data from the '08-'09 season (because that's the only season I happened to have handy).

Let's establish some basic terminology. "Shots on goal" consists of Goals + Saved Shots. To differentiate between the attacking and defending team, we append a "For" or an "Against." Goals For is abbreviated GF, and Goals Against is abbreviated GA. So if Team A is in Team B's zone and shoots the puck off the goaltender's pad, Team A gets an SSF (saved shot for) and Team B gets an SSA (saved shot against). Missed shots for and against are MSF and MSA, and blocked shots for and against are BSF and BSA. (To clarify the latter, if Team A shoots a puck that gets blocked, Team A gets a "BSF" and Team B gets a "BSA".) Cool?


MISSED SHOTS


Let's first look at missed shots. If teams have no offensive ability to control how many of their shots go wide, we'd expect the distribution of MSF% (i.e., [Missed Shots For] divided by [Goals For + Saved Shots For + Missed Shots For]) to look completely random (i.e., no different than if teams were flipping a fair coin and their differing results were due to pure luck, or "binomial chance variation" as statisticians call it). To test this, I ran 10,000 simulated seasons in which I gave each team the same "coin" weighted to the league average MSF% of 26.92%, and I gave each team the number of "coin flips" equal to the actual shot attempts (i.e. GF + SSF + MSF) they took in real life. So if the Oilers had 502 shot attempts in real life in '08-'09 on the road with the score tied, they got 502 flips of the coin in each simulated season.

I examined the MSF% spread (i.e., the results of the coin flips) of each simulated season by calculating the standard deviation every time. In 10,000 simulated seasons, the average sd was 2.08%, and the maximum sd was 3.25%.

The actual observed sd of MSF% in '08-'09 was 1.92%. In other words, the distribution of teams' MSF% looks no different than what we'd expect if they were all flipping the same coin. To put it technically, there is no evidence for offensive skill in Missed Shots beyond what we'd expect from binomial chance variation.
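For readers who want to try the coin-flip test themselves, here is a minimal Python sketch of the procedure described above. It is not the code used for this post. The per-team attempt counts are hypothetical placeholders (the real counts are not listed here), the run uses 200 seasons rather than 10,000 to keep it quick, and the choice of the sample standard deviation is an assumption.

```python
import random
import statistics


def simulate_season_sds(attempts, p, n_seasons, seed=0):
    """For each simulated season, give every team the same weighted coin
    (probability p of a miss), flip it once per shot attempt, and record
    the standard deviation of the resulting team percentages."""
    rng = random.Random(seed)
    sds = []
    for _season in range(n_seasons):
        team_pcts = []
        for n in attempts:
            misses = sum(1 for _flip in range(n) if rng.random() < p)
            team_pcts.append(misses / n)
        # Assumption: sample sd across teams; the post does not say which sd was used.
        sds.append(statistics.stdev(team_pcts))
    return sds


LEAGUE_MSF_PCT = 0.2692  # league average MSF% quoted in the post

# Hypothetical attempt counts for 30 teams; substitute each team's real
# GF + SSF + MSF total (road, even strength, score tied).
attempts = list(range(400, 550, 5))

sim_sds = simulate_season_sds(attempts, LEAGUE_MSF_PCT, n_seasons=200)
print(f"mean simulated sd: {statistics.mean(sim_sds):.4f}")
print(f"max simulated sd:  {max(sim_sds):.4f}")
```

Raising `n_seasons` to 10,000 mirrors the setup in the post, at the cost of a longer run in pure Python.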

What about on the defensive side? Do teams have the ability to induce missed shots?

I ran the same simulations using MSA% and found similar results as above. The league average MSA% in '08-'09 was 27.69% with a standard deviation of 2.05%. The average sd in the 10,000 simulated seasons was 2.01%, and the maximum sd was 3.11%.


BLOCKED SHOTS

Now let's add blocked shots into the mix. Team A shoots a puck that's blocked by Team B. Was it just randomness, or is Team A doing something wrong that persistently allows Team B to block shots?

I examined the distribution of BSF% (i.e., [Blocked Shots For] divided by [Goals For + Saved Shots For + Missed Shots For + Blocked Shots For]). Once again, if the percentage of blocked shots amongst total offensive shot attempts differs widely by team, the distribution of BSF% should look a lot different than it would if teams were just flipping coins. But it doesn't.

League average BSF% in '08-'09 was 24.36% with a standard deviation of 1.99%. The average sd in 10,000 simulated seasons was 1.75%, and the maximum sd was 2.96%.
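The missed-shot and blocked-shot percentages use different denominators, which is easy to lose track of. A small sketch of both definitions follows, with made-up counts for a single hypothetical team; the function names are illustrative only.

```python
def msf_pct(gf, ssf, msf):
    """Missed Shots For as a share of unblocked attempts (GF + SSF + MSF)."""
    return msf / (gf + ssf + msf)


def bsf_pct(gf, ssf, msf, bsf):
    """Blocked Shots For as a share of all attempts (GF + SSF + MSF + BSF)."""
    return bsf / (gf + ssf + msf + bsf)


# Hypothetical counts for one team, for illustration only.
gf, ssf, msf, bsf = 20, 250, 100, 120

print(f"MSF% = {msf_pct(gf, ssf, msf):.4f}")
print(f"BSF% = {bsf_pct(gf, ssf, msf, bsf):.4f}")
```

The "Against" versions (MSA%, BSA%) follow the same pattern with GA, SSA, MSA, and BSA.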

What about on the defensive end? Do teams have the ability to block a persistently higher number of opposing shot attempts than other teams?

I again ran 10,000 simulated seasons using the league average BSA% of 24.32%. The average sd was 1.68%, and the maximum sd was 2.74%.

The actual observed standard deviation of BSA% in '08-'09 was 2.77%.

Whoa. See the difference? The actual spread of BSA% in '08-'09 was wider than even the widest of spreads in 10,000 simulated seasons! That's completely different than what we saw with MSF%, MSA%, and BSF%.
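To put the four road results side by side, the sketch below takes the numbers reported in this post and checks which observed spread exceeds the widest simulated spread. It also backs out a rough "non-luck" spread, assuming that observed variance is approximately luck variance plus everything else, and using the mean simulated sd as a stand-in for the luck sd. Both are simplifying assumptions, not part of the original analysis, and small positive values can still be noise.

```python
import math


def implied_nonluck_sd(observed_sd, luck_sd):
    """Rough non-luck spread, assuming observed^2 ~= luck^2 + nonluck^2.
    Returns 0.0 when the observed spread is no wider than luck alone."""
    excess = observed_sd ** 2 - luck_sd ** 2
    return math.sqrt(excess) if excess > 0 else 0.0


# Road, even strength, score tied, '08-'09:
# (observed sd, mean simulated sd, max simulated sd), as reported in the post.
road = {
    "MSF%": (0.0192, 0.0208, 0.0325),
    "MSA%": (0.0205, 0.0201, 0.0311),
    "BSF%": (0.0199, 0.0175, 0.0296),
    "BSA%": (0.0277, 0.0168, 0.0274),
}

for name, (observed, sim_mean, sim_max) in road.items():
    beyond_max = observed > sim_max
    nonluck = implied_nonluck_sd(observed, sim_mean)
    print(f"{name}: observed {observed:.4f}, sim max {sim_max:.4f}, "
          f"beyond max: {beyond_max}, implied non-luck sd: {nonluck:.4f}")
```

Only BSA% lands beyond the simulated maximum, and its implied non-luck spread comes out at roughly 2.2 percentage points under these assumptions.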


CONCLUSION

Teams appear to have the ability to block shots, beyond what we'd expect from chance alone. While many of us hockey stat nerds have used Corsi (i.e., total shot attempts including blocks), a couple of bloggers like Matt Fenwick and Gabe Desjardins have often stated an intuitive preference for excluding blocked shots. Though I disagreed in the past, at this point I'm inclined to think that if we are using shot attempts as a proxy for meaningful territorial advantage, we should exclude blocked shots.

31 Comments:

Blogger Vic Ferrari said...

Just terrific stuff, Sunny.

So, as a best guess, the ability distribution of teams, re blocking shots ... has a std dev'n of about 2.2% around a mean of about 24%. That's still not a shitload, but it's obviously real.

Out of curiosity have you checked the same for all home games with the score tied?

5/04/2010 8:06 pm  
Blogger Hawerchuk said...

Sunny,

Great analysis. My preference for excluding blocked shots was because of home rink bias, but you have given us another good reason to not use them.

5/05/2010 10:49 am  
Blogger Kent W. said...

So we can reasonably assume that having a higher percentage of your shot attempts blocked is evidence of a deficiency, right?

Robert Cleave and I discussed this about the Flames this year and he did a little investigation in January that showed the Flames had the highest frequency of shots blocked versus attempts in the western conference at the time. I haven't re-checked the numbers since, but I doubt much changed in the interim.

I also haven't looked at their Fenwick vs corsi, but obviously the effect will show up there as well.

5/05/2010 10:52 am  
Blogger Sunny Mehta said...

Vic,

I haven't looked at the home games, but when I get some time I'll run the sims for those games and post the results.

5/05/2010 11:04 am  
Blogger Bruce said...

Outstanding post, Sunny.

Missed shots for and against are MSF and MSA, and blocked shots for and against are BSF and BSA. Cool?

Just to be crystal clear, a BSF is credited to the team taking the shot, and the BSA is credited to the team that blocks it? Terminology is a bit dodgy here cuz blocked shots are traditionally credited to the defensive team, which one might normally interpret as a "blocked shot For".

5/05/2010 11:10 am  
Blogger Sunny Mehta said...

Kent,

No, from an offensive standpoint, directing shots at net that get blocked appears to be random.

On the defensive side, the ability to block the other teams' shots appears to be a skill.

I think this probably jibes with most of our intuitions. I'm kind of curious what the driving force is. I.e., teams that are good at blocking shots - are they good because they have players who are good at blocking shots (or generally "good defensively"), or are they good because it's something implemented in the team system by the coach?

5/05/2010 11:12 am  
Blogger Sunny Mehta said...

Thanks Bruce, and yes, you have it right on the terminology.

I agree too about how my terminology differs (and is perhaps directly at odds) with traditional vernacular, but imo it makes more sense to have all offensive shot stats be "For" and all defensive shot stats be "Against."

I think the NHL uses "A/B" (Attempt/Blocked) for what I'm calling "BSF". If a ton of you guys think my way is inferior/confusing, I can always go back and edit my original post.

5/05/2010 11:18 am  
Blogger Kent W. said...

No, from an offensive standpoint, directing shots at net that get blocked appears to be random.

Damn, that's right.

Interesting to see what happens for CGY next year then.

5/05/2010 11:35 am  
Blogger ranford said...

Data like this sheds so much light on the elements of hockey that:
a. Are Controllable
b. Contribute to winning

One question on the methodology (bear in mind that it has been a few seasons since I took a Stats class). Would increasing your sample size further refine your numbers? What happens if you include more team-seasons than just the 30 from 2008-09?

Thanks again, and keep up the tremendous work.

5/05/2010 12:15 pm  
Blogger Zack said...

Is there a difference in BSF among forwards and defensemen? My guess is that defensemen would tend to have more blocked shots (or at least, a greater percentage of their shots blocked) because they are more prone to taking long slapshots. I can also think of a few players in particular who seem to stubbornly complete shots that are almost certain to be blocked, but that could wash out at the macro level, or I could just be biased.

5/05/2010 12:44 pm  
Blogger David Johnson said...

I disagreed in the past, at this point I'm inclined to think that if we are using shot attempts as a proxy for meaningful territorial advantage, we should exclude blocked shots.

Maybe I am missing something but I don't understand this conclusion. If blocking shots is a skill, and not random, then to me it seems that it is evidence that we should include blocked shots rather than exclude them. If we don't, a good shot blocking team will have a lower Corsi number than a bad shot blocking team, even though the territorial play might in fact be the same. How is a slap shot from the point that the goalie saves any different in terms of territorial play than a slap shot from the point that Hal Gill blocks? The only difference is who made the 'save', which has nothing to do with a difference in territorial play.

5/05/2010 2:33 pm  
Blogger Scott Reynolds said...

This is really interesting stuff Sunny. Thanks. I do think it's important to remember that what you've looked at here is data on the team level and not data on the individual level. I suspect that there may be individual players who get more than their share of shots blocked and others who get less. I looked at just the Oiler forwards using your criteria (min. 20 shot attempts) and found that shots from Moreau and O'Sullivan were blocked much less than average (they were also generally from the furthest away... hmmm...) with about 13% of total shot attempts (on goal, misses and blocks) blocked, whereas shot attempts from Cogliano and Penner were blocked far more often (29% for Penner and 31% for Cogliano). These are very small samples so it might not persist but it's something I may follow up on. It seems like this could be similar to shooting percentage where it doesn't really persist on the team level but at the individual level there are some shooters who are better than others.

5/05/2010 3:05 pm  
Blogger dawgbone said...

I think another thing to consider is the style of the offensive team taking the shot.

Kent brings up the point about Calgary having a lot of shots blocked. Could that be because they tend to drive the net with 2 and sometimes 3 players, which also probably sees 2 or 3 opposition players heading to the front as well, which results in a lot of bodies in front of the net to block a shot (intentional or otherwise)?

I too think that blocked shots should be included when determining territorial advantage. I mean you have the puck and you took a shot. How is it different from missing the net in terms of determining territorial dominance?

5/05/2010 3:26 pm  
Blogger SBurtch said...

If you are using blocked shots to account for territorial advantage, home bias should be completely irrelevant, with respect to possession. If a shot is attempted, whether or not a statistician records the shot as blocked doesn't affect the point that it was taken by the offensive team.

Similarly, despite a home bias to register shots as blocked in favour of the home team, that would only impact on which proportion are registered as missed vs. blocked. It doesn't affect the total number of attempts.

The only way home bias would affect anything here is by recording excess blocked shots in place of missed shots to individual skaters.

Even if it has been shown that home rink statisticians are somehow recording a distorted ratio of missed vs. blocked shots for teams in their given rink, I'm not sure what the point in this discussion is... or why it would have any impact upon the Corsi number.

5/05/2010 6:47 pm  
Blogger Sunny Mehta said...

Vic,

LOL, I just ran the home numbers you asked about. Check this shit out:


league average home BSA%: .2436
mean simmed sd of BSA%: .0175 (max was .0276)
observed sd is .0379

league average home MSA%: .2692
mean simmed sd of MSA%: .0208 (max was .0326)
observed sd is .0401

league average home BSF%: .2428
mean simmed sd of BSF%: .0168 (max was .0265)
observed sd is .0334

league average home MSF%: .2773
mean simmed sd of MSF%: .0201 (max was .0301)
observed sd is .0345


Um, WTF?! Holy fucking recording bias, batman? Those numbers are so screwy compared to the road numbers.


(For ease of comparison, here are the road numbers again...)


league average road BSA%: .2432
mean simmed sd of BSA%: .0168 (max was .0274)
observed sd is .0277

league average road MSA%: .2769
mean simmed sd of MSA%: .0201 (max was .0311)
observed sd is .0205

league average road BSF%: .2436
mean simmed sd of BSF%: .0175 (max was .0296)
observed sd is .0199

league average road MSF%: .2692
mean simmed sd of MSF%: .0208 (max was .0325)
observed sd is .0192

5/06/2010 10:26 am  
Blogger Sunny Mehta said...

ranford,

Yes, ideally it'd be great to have bigger samples. The only reason I was hesitant to combine consecutive seasons for something like blocked shots (as opposed to, say, team save percentage) is because I thought coaching/personnel changes could add significant bias.

5/06/2010 10:30 am  
Blogger Sunny Mehta said...

Zack,

I only looked at it at the team level, so I can't comment on the difference between F and D. Though my intuition agrees with yours that D are likely to have more BSF.

5/06/2010 10:32 am  
Blogger Sunny Mehta said...

David,

Say two teams have the same Fenwick +/- but one of the teams has a lower Corsi +/- due to having more BSA (i.e. blocking more shots in their own defensive zone). If BSA is truly a skill and the team is really good at it, they can get away with letting up a few more blockable shots. So looking at the Fenwick number will have slightly more predictive value about MEANINGFUL territorial play. As such, I think Vic mentioned somewhere recently that Fenwick did in fact correlate slightly higher to scoring chances than Corsi did.

Having said that, I think it's probably half dozen of one, 6.1 of the other. I.e. they're so close it makes little difference whether you use corsi or fenwick.

5/06/2010 10:45 am  
Blogger Sunny Mehta said...

SBurtch,

As shown in several posts on this blog, JLikens' blog, Tom Awad's articles on PP, etc, the home recording bias applies to all of the things you question - under/overcounting of shots on goal, misses, blocks, total shots, ratio of misses and blocks to total shots, etc.

See my above comment to Vic with the actual home/road numbers.

5/06/2010 10:48 am  
Blogger Vic Ferrari said...

Terrific stuff on the home data. Of course these numbers seem huge because the road non-luck distributions are almost too small to be seen. I think David and SBurtch have a point.

At the risk of being a complete pain in the ass; as you have the numbers handy, for the home stuff could you run the same model with (BSA+MSA)% and the opposite?

We have some empirical evidence that the N.J. Devils (who will almost certainly read this article of yours as well as the comments, btw) have a shot recorder that flat out fails to count blocked shots and missed shots. We do not know that this is common in the league.

Intuitively it strikes me as a hard thing to screw up. While clearly the NJD counter manages to mess it up, I've not seen anything else that convinces me others do the same. I mean coaches that run power vs power with their top line should expect to see higher corsi totals in home games, the opposite for guys who run old school checking lines. How do we address that? I would bet real money that some teams make a genuine effort to play a more entertaining style at home, which inevitably leads to more total shots and scoring chances in those games. What else are we missing at the strategic level in that regard, re total shots and scoring chances in home and road games by team?

My point is that using totals, it will be a very tough nut to crack.

However, we now have a lot of good reasons to believe that this model is solid, given the freakishly close pegging with road-EV-tied results and the wide gap in home numbers. And we know the non-luck distribution is comprised of scorer bias + team ability ... this is going to work out conveniently, as scorer bias is going to be the only horse left on the racetrack for 4 of the 6 things we can measure.

Makes sense, no?

I think we're getting close here, we may have to ruminate on this for a bit.

5/06/2010 10:17 pm  
Blogger JLikens said...

Vic:

The Chicago shot recorder is just as bad at undercounting blocked and missed shots. Especially missed shots.

At least, that was the case in 07-08 and 08-09. I haven't looked at 09-10 yet.

If you (or anyone) want more specific information, I'm happy to oblige.

5/06/2010 11:50 pm  
Blogger JLikens said...

One more point.

While the recording of shots on goal may not be perfect, the recorders are much, much worse when it comes to accurately recording blocks and misses.

Looking at EV data from 08-09, I get the following standard deviations in terms of shots (both for and against) at the team level.

Road shots on goal - 84
Home shots on goal - 102

Road missed shots - 35
Home missed shots - 104

Road blocked shots - 69
Home blocked shots - 155

5/06/2010 11:56 pm  
Blogger RiversQ said...

Great post Sunny. I'm going to think about it a little more and comment again later.

So looking at the Fenwick number will have slightly more predictive value about MEANINGFUL territorial play.

Personally, I still think having my shot blocked is inherently better than blocking the opposition's shot because it means two things:

a) I had the puck.
b) I had it in the offensive zone.

The point isn't that blocking shots is a bad thing, which Matt has always fixated on. In fact it's one of the best possible outcomes when you don't have the puck. However, it seems clear to me that it's worse than the alternative, which is to have the puck and have it in the offensive zone.

Trust me, it pains me to say this because I have always been a defender in every sport I've played - I'd love to say that blocking shots was wonderful. There is a lot to be said for taking away the middle of the ice/field and pushing shots to the outside, but it's really just making the best of a bad situation. I readily recognize that I'd much rather have the puck or ball myself than block a damn shot.

As such, I think Vic mentioned somewhere recently that Fenwick did in fact correlate slightly higher to scoring chances than Corsi did.

Yes, but if I recall correctly the difference is almost nothing at all.

Anyway, I'll comment more later. Again, really good post.

5/07/2010 12:08 am  
Blogger Sunny Mehta said...

Vic,

(home numbers...)

league average home (MSF+BSF)%: .4528
mean simmed sd: .0195 (max was .0327)
observed sd is .0371

league average home (MSA+BSA)%: .4472
mean simmed sd: .0203 (max was .0314)
observed sd is .0464


(road numbers...)

league average road (MSF+BSF)%: .4472
mean simmed sd: .0202 (max was .0307)
observed sd is .0208

league average road (MSA+BSA)%: .4528
mean simmed sd: .0195 (max was .0302)
observed sd is .0299

5/07/2010 10:04 am  
Blogger Sunny Mehta said...

Vic,

Clarify a couple things for me.

Even if you're right that certain teams play a more up-tempo style at home, how does that affect what I'm looking at here? All I'm analyzing here is, "Of the total shots that a team attempts (and has attempted against them), what percent are missed/blocked shots?"

I can't see why teams would have no ability to control their missed/blocked shot percentages on the road (except for BSA%) but suddenly have that ability at home. Doesn't it make more sense that the answer is simply: every home scorer has a different definition of what constitutes a "saved shot", "missed shot", "blocked shot", and even "shot attempt"?

5/07/2010 10:25 am  
Blogger Sunny Mehta said...

RiversQ,

I totally agree with you that a BSF is preferable to a BSA, any way you cut it. My point is that a BSA is not as bad as any of the other shots against, and more importantly, it appears predictive of future BSA.

Everyone can agree that having the puck in the other guy's end is better than the puck being in our own end. And everyone agrees that WHEN the puck is in our end, blocking the shot is a better result than the puck going on goal. However, the sharp cats say, "well, if a blocked shot is not predictive of future blocked shots, and is in fact only predictive of future shots against, it's a bad thing." And I would agree with that except for the fact that it appears blocked shots ARE predictive of future blocked shots, so we might get a slightly better metric if we leave them out.

Having said that, as I said in my response to David, there is probably very little difference (in terms of predictive value) whether you leave them in or out. If the correlation of Fenwick to scoring chances is juuuust a teensy bit better than Corsi is, that's about the magnitude I'd expect it to be.

5/07/2010 10:50 am  
Blogger David Johnson said...

Having said that, as I said in my response to David, there is probably very little difference (in terms of predictive value) whether you leave them in or out. If the correlation of Fenwick to scoring chances is juuuust a teensy bit better than Corsi is, that's about the magnitude I'd expect it to be.

But what are we trying to use Corsi for? Originally it was developed to evaluate how difficult a goalie's night was, and then that was extended as an indication of territorial advantage/control. Now we are talking about using it as a predictive tool for scoring chances? That might be a reasonable goal, but then it opens up a whole set of questions.

When it comes to trying to determine territorial advantage leaving blocked shots in makes a ton of sense as I explained above. Now, when it comes to using it as a predictive tool for scoring chances, then leaving blocked shots out makes a lot of sense too because it is probably pretty rare, if ever, that blocked shots are considered scoring chances considering most of them would be associated with shots from the perimeter and don't even reach the goal.

So, as a predictive tool it makes sense to me that shots + missed shots is a better predictive tool than shots + missed shots + blocked shots. My question is whether anyone has looked at whether shots alone is a better predictive tool of scoring chances than shots + missed shots? Does adding missed shots tell us anything more about scoring chances? I don't know, but I could see that going either way.

Now if we are trying to find the best predictive tool for scoring chances we need to look at whether some kind of shot type/distance factoring would be even better. Shot type/distance has been used as a proxy for shot quality in the past, has it been proven to be less useful than any of these new methods?

So many questions, so few answers.

5/07/2010 1:25 pm  
Blogger BenHasna said...

Very interesting, sunny!

@Zack:
I've looked at some numbers of the New York Islanders lately. And as you guessed, the difference between defensemen and forwards in terms of getting shots through is pretty big. Players with fewer than 100 Corsi total events are not included.
NYI team 09-10 BSF%: 27.27%
NYI defensemen 09-10 BSF%: 34.91%
NYI forwards 09-10 BSF%: 22.01%

In terms of missed shots, the difference is very small, though.
NYI team 09-10 MSF%: 20.11%
NYI defensemen 09-10 MSF%: 19.33%
NYI forwards 09-10 MSF%: 20.64%

5/09/2010 10:28 am  
Blogger Vic Ferrari said...

Sunny

Thanks for the info, terrific stuff. I'll get back to it when time permits.

5/12/2010 6:48 am  
Blogger Vic Ferrari said...

Sunny said:

Even if you're right that certain teams play a more up-tempo style at home, how does that affect what I'm looking at here?


I'm looking at ways to determine the errors in shot counting at each rink. We need that to get a narrower likelihood distribution for individual teams' PK and PP ability.

5/12/2010 6:53 am  
Blogger Vic Ferrari said...

David Johnson:

Where you been, brother. Damn, the bus left the station on those topics two years ago. You really don't know.

I like the way you started showing your winnings against the spread, back when you first started blogging. I know that you've deleted it since, and were losing south of the hold (i.e. a randomly guessing monkey would have outperformed you). Not to be a dink, that's actually a sign that you're close.

A recent review of 30 academically published English Premier League forecasting methodologies showed that all would have outperformed a randomly guessing monkey, but that all 30 would have been outperformed by the same monkey against oddsmakers' numbers.

By the same token, if you took the top 20 sabermetricians and had them wager on baseball gamelines, at oddsmaker prices, betting against Joe Morgan ... who do you think would win?

The correct answer is Joe Morgan of course. I would love that wager btw, could anything be more decadent than wagering on someone else's wagering behavior? It would be a beautiful thing.

Point is, being that bad actually means you're fairly close, David. Seriously.

5/12/2010 6:09 pm  
