
Arguments Against Advanced Stats?


RiskyBryzness



lol, didn't mean to.  thought this could be interesting, there's this whole wave of advanced stats people, and so much of it seems like useless fiddling with spreadsheets, wanted to see if there was something i was missing.

 

There are a lot of advanced stats bean counters all of a sudden. Probably here to stay but right now it's the flavor of the day and many are lined up buying it.

 

You have a better set of stats they should be following though. It would be interesting to see some of that some day...

 

One last thing about Corsi stats: you don't see a coach on the bench with a laptop figuring out the possible trend lines for the rest of the game. And they don't make lineup changes from game to game or in-game adjustments based on what Corsi says they should do. Why do we hear the word chemistry so often? Because it's true (and an intangible).

Link to comment
Share on other sites


Advanced stats in hockey is a relatively new thing...but I think it is something that is here to stay, and the media and of course, coaching staffs and managements will reflect that more and more sooner or later.

 

Nothing really wrong with all that though.

Baseball has been doing it for a while now with varying uses for the myriad numbers.

 

Problem is, along with that, you will get the 'know it all fans' who eventually think they know more than everyone else because they can manipulate numbers better than some other people.

That's the unfortunate side effect to the whole 'Advanced Stats' movement.

 

But those of us with open minds, level heads, and common sense will survive it all.

 

Advanced stats or not, there will ALWAYS be know-it-alls.

In baseball, you have the people who never picked up a baseball bat, in hockey you will have the people who never laced up a pair of skates, in football, the people who never put on a set of pads...etc, etc...and ALL those people, because they know advanced stats, will profess themselves to be superior somehow to those who don't....or even superior to those who actually played their respective sports.

 

It's not the advanced stats themselves...we can take'm or leave'm as we see fit....but rather the sideshow of comically self important people that will follow that may bring some grief to tried and true sports fans.

 

I know some saber-nerd type fans in both hockey and baseball, and many of them are pretty personable.

They will gladly sit there and explain to you all day about any given stat, situation, etc, but not be an ass about it.

But just like your standard fans, you take the good and bad with it all.

 

I say bring on the advanced stat movement...I will become a bit more knowledgeable myself in them, but first and foremost, will always be that wonder eyed little girl rooting for her favorites.

Link to comment
Share on other sites


Why do we hear the word chemistry so often? Because it's true (and an intangible).

 

and ultimately, that is the goal of this whole "advanced stats" thing, find some measure of effectiveness outside of the pure points race.  they are trying to find something that can reflect the contribution a guy makes to the success of his team without referring directly to goals.  i'm a fan of that effort, it'd be cool to have something you could hang your hat on and say, "this guy does all The Things right, whether or not it shows up on the scoreboard at a particular moment isn't relevant because other things come into play, but he's doing The Things right."  

 

we haven't found that metric, though.  again, absolute possession numbers seem like they'd be a step in that direction...though that would still have a lot to do with linemates and not just the single player himself.  still, factor in a "relative" concept to it (line possesses the puck X% of the time with this guy, Y% without him, the difference being his contribution), and maybe you have something.  'course, then you are tracking specific line combinations.....  i dunno.  maybe it is a windmill, but i'm ok with people tilting at it.  as long as they are thinking about it.  and the corsi people aren't.
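A rough sketch of that "relative" idea, for anyone who wants to play with it. The function names and the counts are made up for illustration; the for/against tallies could be seconds of possession or shot attempts, whichever you happen to be tracking.

```python
def possession_share(tally_for, tally_against):
    """Fraction of the total that went the team's way (0.0 to 1.0)."""
    total = tally_for + tally_against
    if total == 0:
        raise ValueError("nothing recorded, share is undefined")
    return tally_for / total


def relative_share(on_for, on_against, off_for, off_against):
    """Share with the player on the ice minus share without him, in percentage points."""
    on_ice = possession_share(on_for, on_against)
    off_ice = possession_share(off_for, off_against)
    return 100.0 * (on_ice - off_ice)


# Hypothetical counts, not real data: X = 52% with the player, Y = 45% without.
rel = relative_share(on_for=520, on_against=480, off_for=900, off_against=1100)
print(f"{rel:+.1f} percentage points")
```

This says nothing about who he was on the ice with, which is exactly the line-combination problem mentioned above.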

Link to comment
Share on other sites

ok, a post entitled "Corsi correlates well to scoring chances, as well as goals", which then goes on to demonstrate how taking shots relates reasonably well to getting scoring chances, which then in turn relates reasonably well to actually scoring goals.  with graphs.

 

now, imagine a post entitled, "+/- correlates well to scoring chances, as well as goals".  there'd be a graph of +/- events (i.e., goals) on one axis, with scoring chances on the other axis, and there would be an all-but-straight line, indicating an almost-direct relationship between the two.  then, there would be a graph that showed +/- events (i.e., goals) on one axis, and goals on another axis, and it would be a straight line at 45º, a one-for-one relationship.

 

Do you honestly believe you've just defeated the last decade of hockey analytics research with a completely hypothetical post that doesn't exist? You don't counter actual research that says shot attempts are important with a hypothetical you thought up in the shower. 

When you only look at goals, the statistic only captures 2 or 3 minutes of a game, maybe a handful of shifts. If you expand that sample size with scoring chances or shots, suddenly you are evaluating 30, 40, 50 minutes of the game, and hundreds of in-game events. Traditional +/- or a stat like goals-for% takes seasons' worth of data to be an effective tool because small samples are swamped by bad luck, goaltending, and other factors. Corsi takes just a dozen or so games before the influence of luck starts to diminish.
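To put a rough number on the sample-size point, here is a back-of-the-envelope sketch. It assumes each event is an independent coin flip between two evenly matched teams, which is a simplification, and the per-game event counts are round figures picked for illustration, not measured values.

```python
import math


def share_standard_error(n_events, true_share=0.5):
    """Standard error of an observed share after n_events binomial trials."""
    return math.sqrt(true_share * (1.0 - true_share) / n_events)


GAMES = 12               # "a dozen or so games"
GOALS_PER_GAME = 5       # assumed combined goals per game
ATTEMPTS_PER_GAME = 100  # assumed combined shot attempts per game

goal_se = share_standard_error(GAMES * GOALS_PER_GAME)
attempt_se = share_standard_error(GAMES * ATTEMPTS_PER_GAME)

print(f"goal share noise:    +/- {100 * goal_se:.1f} points")
print(f"attempt share noise: +/- {100 * attempt_se:.1f} points")
```

Under those assumptions the goal-based share is several times noisier than the attempt-based share over the same stretch of games, simply because there are far fewer goals to count.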

Regarding the idea that the other 9 players on the ice impact shot differentials. Duh? Nobody who takes these stats seriously believes that a player is 100% responsible for every shot attempt for which they get credited a + or -. This is why metrics like Quality of Competition, Zone Starts, Quality of Teammate, Team Effects, etc. exist. And the impacts of those 9 other players have been studied and quantified, and can be taken into consideration when judging a player.

 

Corsi is a broad tool for measuring on-ice impact. Under that umbrella falls a laundry list of skills and qualities that make a player valuable, including intangibles. Anything that impacts performance on the ice will influence these numbers in the positive or negative. It's up to GMs and coaching to iron out the details, but shot differentials are a check on your assumptions.

Example of how corsi would be used in a front office or video room:  Grossmann is physical and blocks shots, we think he's valuable. But wait, his shot metrics are in the tank? I wonder why that is. His usage isn't extreme enough to justify a major adjustment to those numbers. So we check the tape, and you'll see Grossmann's speed is a major liability defending through the neutral zone, and his inconsistency in the puck-moving department makes it difficult for his team to exit the defensive zone. This is part of the reason why the Flyers are mired in the defensive zone so often this season.

 

Some reading material that addresses a lot of the common criticisms I've seen in this thread. 

 

http://www.broadstreethockey.com/2013/3/14/4104952/hockey-advanced-stats-corsi-intro-primer

http://www.sbnation.com/nhl/2013/11/21/5096220/nhl-stats-advanced-idiots-criticism

Also one final note: To the guy who thinks "stat nerds" have never played the game. I grew up playing hockey, I love this game, and just because I enjoy the numbers side of the game, that doesn't make me less of a hockey fan.

Thanks.  

 

Why should we prefer using shot attempts, when goals are what we really care about? Like we talked about yesterday, shooting and save percentages vary quite a bit over small sample sizes, and the ~1000 minutes a player is on the ice for in a season aren't nearly enough for that randomness to even out.

For example, last year Sean Couturier was +18 and Brayden Schenn was -7 in traditional plus/minus. But the Flyers actually got slightly more shots with Schenn on the ice -- the difference between them was all in the shooting percentages. Those percentages are prone to large fluctuations and aren't very predictive.

Link to comment
Share on other sites

I hope I didn't come across as hating advanced stats. Just use it as a tool and not an absolute. I love stats too but you gotta watch the games in front of you also. I think everyone but a few are echoing more or less the same points of view as each other. Use it to good effect but with a grain of salt.

Link to comment
Share on other sites

@BillyInFourC

 

How cute, you came here to defend your buddy.

 

http://www.reddit.com/r/Flyers/comments/2lhrlx/vandevelde_industries_chris_vandevelde_moves_to/clv3bhn

 

(underline and bold was my edit):

 

I'm usually a big fan of Ryan's stuff but implying that 7 minutes is enough time to anoint VV, G, Voracek with chemistry and "puck possession" is way way way pushing the bounds of what this type of data can actually tell us.



Even if you're just using it to say "hey this line wasn't a disaster" the numbers in that small of a sample size are a roll of the dice.

 

PART OF THE PROBLEM WITH CORSI IN THIS THREAD CENTERED AROUND THE SMALL SAMPLE SIZE AND THE FACT THAT THE STAT IS NOT THE END ALL BE ALL.

 

BTW, if you don't know, Ryan Gilbert is RiskyBryzness of Flyerdelphia.

 

Also, feel free to see him fight the good fight for Corsi on reddit:

http://www.reddit.com/user/BillyInFourC

Link to comment
Share on other sites

"You don't counter actual research that says shot attempts are important with a hypothetical you thought up in the shower."

 

and that's an order Aziz, you won't!

 

Yeah. All *good* actual research starts with a thought on the toilet.

Link to comment
Share on other sites

@BillyInFourC

 

How cute, you came here to defend your buddy.

 

http://www.reddit.com/r/Flyers/comments/2lhrlx/vandevelde_industries_chris_vandevelde_moves_to/clv3bhn

 

(underline and bold was my edit):

 

 

PART OF THE PROBLEM WITH CORSI IN THIS THREAD CENTERED AROUND THE SMALL SAMPLE SIZE AND THE FACT THAT THE STAT IS NOT THE END ALL BE ALL.

 

BTW, if you don't know, Ryan Gilbert is RiskyBryzness of Flyerdelphia.

 

Also, feel free to see him fight the good fight for Corsi on reddit:

http://www.reddit.com/user/BillyInFourC

 

I didn't see one proponent of the stats say "this is the end all be all, you may not use any other tool to evaluate players" 

The stats are a tool to assist video scouting and point out broad trends. The Flyers have only played 26 games but we're seeing the continuation of the same trends from last season. Namely that Grossmann, MacDonald, Lecavalier, and Umberger (going back to CBJ numbers) have had dramatic negative impacts on their team's on-ice results.

 

Link to comment
Share on other sites

I didn't see one proponent of the stats say "this is the end all be all, you may not use any other tool to evaluate players" 

The stats are a tool to assist video scouting and point out broad trends. The Flyers have only played 26 games but we're seeing the continuation of the same trends from last season. Namely that Grossmann, MacDonald, Lecavalier, and Umberger (going back to CBJ numbers) have had dramatic negative impacts on their team's on-ice results.

 

 

By refusing to acknowledge situations and roles, stats guys are implicitly ignoring everything but their numbers - "end all be all".

 

Grossmann, VLC, Umby were all injured last season, some significantly. MacDonald was on a ravaged NYI team. Deviations from the norm on the first three, although it does appear Umby is toast.

 

I agree that stats can help tell a story, and as long as we maintain that context, we can agree in principle.

Link to comment
Share on other sites

Corsi in particular has no means of measuring the quality of the shot at net (it doesn't even have to be on net does it?)

Some teams' systems rely on allowing 40 shots that are low % clean shots from outside the "umbrella" and then capitalizing on mistakes and turning them into high % shots the other way.

Their possession time is low. Their attack time is low. Their corsi is low. Their SOG are low... But they often still win.

It really depends on the analytic and how it relates to the system you're playing.

In and of themselves they mean little without anything to apply to.

Yes, and since every player on the ice gets a plus or a minus after each shot-at-net, Corsi awards plus marks to players even if they did nothing to help create the shot-at-net, and the system assigns minus marks to players who made no mistake on the shot-at-net against.
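To make that bookkeeping concrete, here is a minimal sketch of the tally as described in this thread: every skater on the ice gets credited on every attempt, whether or not he touched the puck. The event format, team labels, and player names are hypothetical, invented for the example.

```python
from collections import defaultdict


def tally_on_ice(events):
    """events: iterable of (shooting_team, {team: [players on the ice]}).

    Every player on the shooting team gets a "for", and every player
    on the other team gets an "against", regardless of involvement.
    """
    tally = defaultdict(lambda: {"for": 0, "against": 0})
    for shooting_team, on_ice in events:
        for team, players in on_ice.items():
            key = "for" if team == shooting_team else "against"
            for player in players:
                tally[player][key] += 1
    return dict(tally)


# Three made-up shot attempts
events = [
    ("PHI", {"PHI": ["A", "B"], "OPP": ["X", "Y"]}),
    ("OPP", {"PHI": ["A", "C"], "OPP": ["X", "Z"]}),
    ("PHI", {"PHI": ["A", "B"], "OPP": ["X", "Z"]}),
]
totals = tally_on_ice(events)
for player, counts in sorted(totals.items()):
    print(player, counts)
```

Player C picks up an "against" here just for being on the ice, which is the complaint in a nutshell.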

Link to comment
Share on other sites


Do you honestly believe you've just defeated the last decade of hockey analytics research with a completely hypothetical post that doesn't exist? You don't counter actual research that says shot attempts are important with a hypothetical you thought up in the shower. 

 

um, ok.  you realize the post to which i was referred was created by the guy referring me, right?  and he made the graphs?  do you want a graph from me to somehow legitimize my statement?  ok.

 

[image: Vir4Bu.jpg]


Regarding the idea that the other 9 players on the ice impact shot differentials. Duh? Nobody who takes these stats seriously believes that a player is 100% responsible for every shot attempt for which they get credited a + or -.

 

so...have we just defeated the people who say +/- is irrelevant because it relies too heavily on the other 9 guys?  +/- FTW?

 


This is why metrics like Quality of Competition, Zone Starts, Quality of Teammate, Team Effects, etc exist. And the impacts of those 9 other players have been studied, and quantified and can be taken into consideration when judging a player.

 

secondary metrics which themselves are based entirely on corsi.  see how the snake eats its tail on this?  could do every bit of the same qualifying with +/-, if you wanted to.  but you don't.

 


Corsi is a broad tool for measuring on ice impact.

 

no, it isn't.  corsi is a tool for measuring how often a goalie has to prepare for a shot against, under the assumption that he has to come ready for every shot that leaves a skater's stick, whether or not it actually results in a shot on goal that the goalie has to save.  it is a broad tool for measuring goaltender workload more accurately than simple shots-on-goal totals, as the goalie has to prepare to deal with far more than the ~30 shots he actually ends up having to save.  it can be further stretched to measure the tendencies of a particular player -shoot or hold- for the benefit of the goaltender facing him.  it has nothing to say about the nature of the shot or the effectiveness of the attempt.  just that it was tried.  which can mean a lot to a goalie who has to be prepared for each of them.  means less to everyone else.

 

your "decade of hockey analytics research" is a concept appropriated from a goalie coach who wanted a measure of how much work his goalies had to do against different teams and different lines.  the thought that it has something to say about the ability or contribution of a particular skater is NOT what it was meant to do, and it has only been used as such for a few years.  and poorly, at that.

 


Example of how corsi would be used in a front office or video room:  Grossmann is physical and blocks shots, we think he's valuable. But wait, his shot metrics are in the tank? I wonder why that is. His usage isn't extreme enough to justify a major adjustment to those numbers. So we check the tape, and you'll see Grossmann's speed is a major liability defending through the neutral zone, and his inconsistency in the puck-moving department makes it difficult for his team to exit the defensive zone. This is part of the reason why the Flyers are mired in the defensive zone so often this season.

 

corsi has nothing to do with this.  corsi says nothing about usage, speed, inconsistency in puck movement.  corsi says the other team manages more shots against grossmann than his extended line manages against the opposition.  43.2% corsi.  explain to me what exactly 43.2% corsi means?  unless i'm wrong, it means for every 10 shots the flyers attempt while grossmann is on the ice, the bad guys attempt 13.  how much insight do you have because of that number?  10:13.  i guess that isn't good, but what does 10:13 mean, specifically?
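For anyone checking the 10:13 arithmetic, a quick sketch, assuming the percentage is attempts for divided by attempts for plus attempts against (the function name is made up):

```python
def against_per_ten_for(corsi_for_pct):
    """Opponent shot attempts for every 10 attempts by the player's team."""
    share = corsi_for_pct / 100.0
    if not 0.0 < share <= 1.0:
        raise ValueError("percentage must be above 0 and at most 100")
    return 10.0 * (1.0 - share) / share


print(round(against_per_ten_for(43.2), 1))  # about 13 against per 10 for
```

So 43.2% does work out to roughly 10:13.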

 

the rest, the actual "here's what's wrong", you are bringing that to the table by "conventional", non-advanced means.  in your example, a poor corsi was the warning sign that sent you to look at other, more meaningful things.  i guess maybe it is useful like that, as a canary in the coal mine for teams whose management is paying zero attention but has an analytics department.  see how you didn't actually get any particular meaning out of it, though?  see how the actual interpretation came from something very other than shot attempt ratios?  see how shot attempt ratios probably aren't actually required for you to think that maybe something isn't great with the flyers' defensive puck movement, grossmann in particular?  see how the corsi numbers effectively add nothing to the conversation, other than geek cred?

Link to comment
Share on other sites

Corsi in particular has no means of measuring the quality of the shot at net (it doesn't even have to be on net does it?)

 

 

no.  the shot could hit shinpads 3 feet in front of the shooter, zero threat and in no way a good shot to take...but corsi counts it as a positive event.  it measures pure shot "attempts".  things a goalie has to get on his toes for, but are otherwise not always of any relevance to the game.

Link to comment
Share on other sites

By refusing to acknowledge situations and roles, stats guys are implicitly ignoring everything but their numbers - "end all be all".

 

Grossmann, VLC, Umby were all injured last season, some significantly. MacDonald was on a ravaged NYI team. Deviations from the norm on the first three, although it does appear Umby is toast.

 

I agree that stats can help tell a story, and as long as we maintain that context, we can agree in principle.

The influence of contextual factors has been widely studied and good analysis of a player will at the very least take team context into account, which is why raw corsi is almost never used in player evaluation. In the online stats community, usage is attached to almost any in depth discussion. As a prime example, look at any conversation about Sean Couturier and you'll see stat nerds marveling at the insanely difficult usage he gets from the coaching staff. (It's almost unparalleled in the NHL) 

Here is a rough list of contextual factors in order of importance.

1. Team - Team/system effects are large, but only bear consideration when you are comparing players across teams. 

2. Linemates - By far the most influential individual component to consider. Ex. Brayden Schenn has thrived away from Lecavalier. 

3. Zone Starts - Extreme zone deployments, like Couturier's, can have an impact so it's something to keep in mind.

4. Competition - usually disregarded because the impact is so tiny over a full season. 

 

Link to comment
Share on other sites

Clarke

Barber

Holmgren

Berube....

I wouldn't mind some nerdy guys mixed in with the Jocks without a college degree. Just a few.

I love this post.

Most advanced stats guys are nerds who never got on the ice or held a stick.

Link to comment
Share on other sites

Corsi in particular has no means of measuring the quality of the shot at net (it doesn't even have to be on net does it?)

Some teams' systems rely on allowing 40 shots that are low % clean shots from outside the "umbrella" and then capitalizing on mistakes and turning them into high % shots the other way.

Their possession time is low. Their attack time is low. Their corsi is low. Their SOG are low... But they often still win.

It really depends on the analytic and how it relates to the system you're playing.

In and of themselves they mean little without anything to apply to.

 

The correlation between Corsi/Fenwick and scoring chances (high quality shots) is extremely strong over the course of a full season. If you are interested in this sort of thing, I tracked on-ice scoring chances for the Flyers last season.  http://www.broadstreethockey.com/2014/4/16/5620450/scoring-chances-a-season-in-review-part-1

Incorporating shot quality into the analysis doesn't add much extra information over large samples. For a single game, I prefer looking at scoring chances and the Flyers internally track chances in a similar manner as I have. If I remember correctly, Laviolette was a big proponent of this. 

Link to comment
Share on other sites

The other problem is how to implement the information.

Even if you accepted that Corsi is helpful, that doesn't inform how you implement it.

The lines you put together, the way you play defense, and the way you transition all affect how effective "throwing it on net" will be.

Essentially you can't just put a hodgepodge together, tell them to throw it on net, and expect that to result in wins.

How it's interpreted and implemented is as important as the information itself.

no. the shot could hit shinpads 3 feet in front of the shooter, zero threat and in no way a good shot to take...but corsi counts it as a positive event. it measures pure shot "attempts". things a goalie has to get on his toes for, but are otherwise not always of any relevance to the game.

Link to comment
Share on other sites

only difference is one counts scoring attempts and one counts scoring, period.  how the former can be valuable while the latter is not is beyond me.

 

I interpret the difference as one being a predictive indicator (corsi) and the other a lagging indicator (+/-). Put another way, I would venture a guess that it would be an outlier to find a team that consistently outshoots (shot attempts) its opponent yet has an overall negative goal differential and most of its players in the negative +/-. 

 

And I would take it further by positing that a team that consistently outshoots its opponents is more likely to win a higher percentage of the next 10 games than a team that is consistently outshot. 

 

Now here's where statistics are less useful (at least in the hockey world) - one might not be wrong in saying that it is more likely that a high corsi team (say, the Hawks) will beat a low corsi team (say, the Flyers) most of the time when they meet, but when it comes to one particular matchup between them, it's not as predictive. This is where the stories behind the statistics are important - on the game by game picture. 
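A small sketch of why an edge shows up over a long run but not in one matchup. It assumes every game is an independent trial with the same win probability, which real hockey games are not, and the 55% figure is invented for illustration rather than measured from anything.

```python
from math import comb


def prob_at_least(wins_needed, games, p_win):
    """Chance of winning at least wins_needed of games independent games."""
    return sum(
        comb(games, k) * p_win**k * (1.0 - p_win) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


P_WIN = 0.55  # assumed single-game edge for the stronger team

print(f"wins one given game:        {prob_at_least(1, 1, P_WIN):.2f}")
print(f"wins at least 6 of 10:      {prob_at_least(6, 10, P_WIN):.2f}")
print(f"wins at least 42 of 82:     {prob_at_least(42, 82, P_WIN):.2f}")
```

The single game stays close to a coin flip, while the chance of finishing ahead over a full schedule climbs well above the per-game edge. That is all a probability can promise.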

 

The advanced analysis that @RiskyBryzness has put forward and the links he's posted offer a lot of solid evidence that advanced statistics have value - they can serve as decent predictors of future events. The important thing is to remember that statistics only offer probabilities. Of course they can be wrong, but in the long run, with large sample sizes, strong statistical models are valuable and have merit.

 

Let me offer just an analogy - I make a decent amount of money in Forex trading. Many multiples more than any typical investment. And I do it consistently because I use a model that has a statistical theory behind it. It's not always right, but it is right more than it is wrong. That's really the best we can expect with statistics that have humans as their subject.

Link to comment
Share on other sites

The influence of contextual factors has been widely studied and good analysis of a player will at the very least take team context into account, which is why raw corsi is almost never used in player evaluation. In the online stats community, usage is attached to almost any in depth discussion. As a prime example, look at any conversation about Sean Couturier and you'll see stat nerds marveling at the insanely difficult usage he gets from the coaching staff. (It's almost unparalleled in the NHL)

Here is a rough list of contextual factors in order of importance.

1. Team - Team/system effects are large, but only bear consideration when you are comparing players across teams.

2. Linemates - By far the most influential individual component to consider. Ex. Brayden Schenn has thrived away from Lecavalier.

3. Zone Starts - Extreme zone deployments, like Couturier's, can have an impact so it's something to keep in mind.

4. Competition - usually disregarded because the impact is so tiny over a full season.

Solid post. Thanks for not being a dick.

It doesn't take into account the difference between a puck-moving defenseman, a shot-blocking stay-at-home defenseman, and an offensive defenseman though. Role matters. That was part of my argument, but zone starts are highly relevant to paint a more complete picture. Factor in the PP and PK TOI and it'll go further. Maybe one day I'll actually come up with an advanced stat anchored on as many real NHL statistics as possible.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
