
Arguments Against Advanced Stats?


RiskyBryzness


One other comment- I've seen some people claim that Corsi is a better stat than +/- because Corsi makes use of a greater sample size, since there are so many more shots than goals. If people mean that literally, they don't understand statistics. Sample size is in fact the same (more or less) for both Corsi and +/-: it's playing time. For a game the sample size is 60 minutes (team) or whatever the player's TOI is. The "more or less" caveat is because +/- obviously doesn't include all ice time.

 

The "sample size" is a game, or a season, or a career. Goals and shots are the events that you are measuring.

 

Years back I had to take a statistics course when I was a grad student at Jefferson. The professor introduced the first class by talking about a friend of his, known around Philadelphia as "Super Stat". Philly sports fans of a certain age will know Super Stat is a guy named Harvey Pollack: http://en.wikipedia.org/wiki/Harvey_Pollack.

 

Pollack has probably spent more time compiling and analyzing statistics (NBA) than maybe anyone alive. My stats professor (in a friendly way) pointed out to us that what Pollack did (and still does I believe) is not "real" statistics. There are no control groups. No regression analysis, analysis of variance/covariance, etc.

 

Personally, I hate statistics but they did help me get to where I am today- I decided it would be "fun" to learn programming by writing a C program to do analysis of variance for my dissertation research (which I never did). I ended up tossing my graduate and undergraduate degrees in the waste bin and became a computer geek. But I still hate statistics- the "real" kind anyway.

 

Point being, any stats (especially sports "statistics") need to be put in the correct context. In hockey, I personally think that is extremely difficult to do.


The fact that his Corsi is "bad" but his +/- is "good" suggests to me that he's doing his job pretty well.

 

Grossmann's traditional +/- is high because his on ice save percentage is extremely high. OISVP is subject to a lot of random fluctuations and is generally not considered to be a repeatable talent from season to season. http://www.broadstreethockey.com/2013/7/4/4487304/save-percentage-variability-regression-defense

 

http://goo.gl/nBMY9H ( I tried directly posting the images instead of the links, but apparently I'm not allowed?) 

 

This is a prime example of one of the ways that looking at individual possession metrics/scoring chances can help coaching staffs avoid recency bias.

 

http://i.imgur.com/bjIgesM.png

 

Grossmann's problems aren't in the defensive zone. He gives up scoring chances against at about the same rates as his teammates. The problem is, the team is still getting heavily out-shot and out-chanced with him on the ice. 

While he might be a serviceable player in his own zone, the Flyers offense completely dies with him on the ice because he can't move the puck. Even partnering him with Mark Streit isn't enough to buoy him offensively.



Point being, any stats (especially sports "statistics") need to be put in the correct context. In hockey, I personally think that is extremely difficult to do.

 

And the correct context is usually in combination with other factors.

 

People in general have a tendency to want statistics "to tell the story". They don't. You tell a story that includes statistics.

 

People in general also tend to have a conclusion and back fill with whatever evidence supports that conclusion instead of reaching a conclusion by following the evidence.

 

The Grossmann example is a good one. He has a good +/- but bad Corsi. Neither of those statistics by itself tells the story about the player, but both of those statistics in combination with others (zone time, blocked shots, etc.) can begin to paint a picture about a player.

 

John LeClair and Eric Lindros leading the league in +/- didn't indicate that they were the two best defensive players, for example. I'll wager if you "Corsi'd" both of them, you'd find that the reason their +/- was so high is a large preponderance of time spent possessing the puck in the offensive end.


The game against the Kings is a good example. The game against the Islanders a while back too.

 

We lost in a shootout to the Isles, but beat the Kings. In both games, an overwhelming shot advantage didn't put the Flyers away.

This is a plan of a lot of teams.  Not saying it's the Flyers plan or that they're doing it effectively even if it was, but it's out there.

 


Yes, games are won by goals, not shots. However, wouldn't it be agreed that usually more shots equals a better chance of winning? Usually, obviously not all the time.


And it does seem to break down that in general a team that possesses the puck more and takes more shots has a better chance of winning more games than those with bad puck possession and fewer shots.

 

I'm trying to stay out of this but really!? A team that has the puck more and shoots more has a better chance of scoring and winning??? I seriously hope no one needs stats to figure that out.


@idahophilly

 

To take it even further...the team that scores more than the other usually wins.

 

LOL... and therein lies the most important stat and hell, we even know when we can predict it. It's called the final score, and when? At the final buzzer/shot during shootouts.

 

I guess where people have problems with the adv stats is when teams base trading/demoting/buying out players on stats. Stats are a good tool and should most definitely be used but it will always come down to the "human touch, the gut and simple observation" when all is said and done. And I'm willing to bet the coaches and GM's have a better grip on that than the fans. Armchair quarterbacking is so easy and it's even easier to let stats dictate what your decision is so you don't have to make it yourself. Anyway, you get what I'm saying whether anyone agrees or not. Peace!


@idahophilly

 

To take it even further...the team that scores more than the other usually wins.

 

unless the other team gives the officials a box of Cubans and a roll of 50's to disallow all the opponent's goals. Oops, see, anything can happen that stats don't show (and yes, that wasn't a serious scenario)


@idahophilly

 


Stats are a good tool and should most definitely be used but it will always come down to the "human touch, the gut and simple observation" when all is said and done.

 

Why would you say that? I think you might mean that *your* decision-making always comes down to "the human touch, the gut and simple observation" because I don't think you can claim that to be some sort of universally accepted way of making decisions.

 

Knowing their recent history and how they have used their 'gut' or 'simple observation' to make decisions, I think the Flyers offer a pretty good example of what happens when you don't use statistics and evidence as the basis of pretty significant decisions - Bryzgalov, MacDonald, VLC, Schenn. All of those could and should have been avoided by simply digging a little deeper. 


@idahophilly

To take it even further...the team that scores more than the other usually wins.

This only holds when the teams are actually playing each other. For example, if the Flyers beat LA 2-1, and the Rangers lose to Detroit 6-5, the Rangers have scored more than the Flyers, but they didn't win.

Statistics can get VERY complicated.


I think it's a really safe bet that GM's don't bow down to the stats, that's all. You can't prove it the other way either. My real question really doesn't have to do with the stats. 

 

My real question is why the stats supporters are so rabid that anyone who disagrees with the way it's applied is wrong no matter what, while the folks on the other side admit stats must be used, but logically. Seems the adv stats folks are all or nothing in their defense of it while the other folks take a measured approach to it. It SEEMS to me, and I'm only speaking for me, that the adv stats folks are defensive or threatened about anyone who disagrees. See, looking at multiple arguments here and elsewhere it's apparent which side is reasonable.

 

Now, Brelic, you are good at twisting words or flat out re-wording and most definitely missing the point. I didn't say the Flyers used their gut only. I believe I said stats are important so pls get it right in the future. Now, perhaps you should LISTEN to what Berube and virtually every other coach says in their answers/statements. Based on grade school level analysis you will see them make decisions and use words that describe things like chemistry and will to win and what they are seeing and so on. You could use stats to throw together a paper roster that Corsi says should win and wind up with a horrible team. You could use zero stats and end up with a horrible team. Hence a near 50-50 approach seems logical. If you indeed do believe coaches and GM's make their decisions on stats alone then I'm worried. It must be a generous mix of both stats and intangibles, as I have been stating from the start.

 

And when I said always, this is what I meant. The stats can lead you in a direction but it's a human that makes the final decision and once that element is involved it comes down to weighing the data and what you see, flawed or otherwise, sometimes at that moment and sometimes historically. Woe the day when charts and graphs replace the reasoning of the GM or coach, whether they get the decisions all right or all wrong. Be a boring game when you could just have a computer for a GM and coach... Like I said, they have a better grip on stats and gut feelings and how to use them than we do. All these discussions have borne that out for sure.

 

And I assume you have a report filed to you from the Flyers on all the steps the Flyers made on bringing in those players? I don't. They didn't work out so far but I'm not going to pretend I know what they did or didn't do...

 

Just my opinion on all that...


This only holds when the teams are actually playing each other. For example, if the Flyers beat LA 2-1, and the Rangers lose to Detroit 6-5, the Rangers have scored more than the Flyers, but they didn't win.

Statistics can get VERY complicated.

lol...


This only holds when the teams are actually playing each other. For example, if the Flyers beat LA 2-1, and the Rangers lose to Detroit 6-5, the Rangers have scored more than the Flyers, but they didn't win.

Statistics can get VERY complicated.

 

Damn you Jackstraw!


So then, the Flyers should have gotten 2 pts for every game that the other teams lost, and of course in every game there is a loser, so the Flyers get all those points no matter what! Now if that isn't an argument for bringing back ties then nothing is...


I'm trying to stay out of this but really!? A team that has the puck more and shoots more has a better chance of scoring and winning??? I seriously hope no one needs stats to figure that out.

 

It's not whether one "needs stats to figure that out" but rather whether the stats being discussed display what we know/think to be true.

 

IF puck possession and shots ARE a good determination of who has a better chance of winning AND there is a statistical analysis which can be used to show that, THEN the stat is a useful metric in terms of judging performance. That SEEMS to be the case with the Advanced Stats we have been discussing.

 

IMO, the stats we're talking about here have more value over the course of a season or a team than for an individual game ("using a computer to determine line combinations") or player ("his Corsi sucks, but his +/- is good"). And, just using last season's playoffs, there seems to be about a 75% correlation between teams with good advanced stat numbers (Corsi, for example) and both making the playoffs and winning in the playoffs.

 

That could indicate some degree of causation along the lines you have identified that puck possession and shots tend to have a significant effect on the outcome of games.

 

That said, there have been a number of people who have pointed out individual games where teams get outshot and yet win ("LA outshot the Flyers and the Flyers won!") as a means of discrediting the statistics which show that puck possession and shots are good determinants of outcome so, yes, it appears there are some who question whether puck possession and shots have a direct bearing on who has a better chance of scoring and winning.



but rather whether the stats being discussed display what we know/think to be true.

 

Thanks for actually proving my point. You look at the stats, which you should, but then it comes down to what we think. Do the stats add up to what we think? In the end it's a human who judges not just the stats but what they see, and I defy anyone to tell me that coaches/GM's only consider stats and don't have a "what if" lurking in the back of their mind whenever they make a tough decision. Has to be that way or it wouldn't be a tough decision.

 

Again, why is it so hard for the adv stats crowd to get what the other side is saying? Use the stats, just don't rely on them as the gospel. That's all I'm saying, yet I keep getting arguments that "this" is 75% correct and playoffs show this and that guy had 38.234175 and a half % of possession time so he sucks and if you don't believe it you're an idiot. Maybe some fellow fans get tired of being told they are wrong for their opinion and the other side is always right no matter what. No wonder this is a hot button topic league wide.

 

I love it when the graph thing gets thrown up. I've been in charge of bays worth 120 million in equipment in R & D. You want to see graphs!? I've seen more graphs than I care to ever admit. Made a lot of them. And guess what, you could make them say just about anything you wanted. I even had my team make their own stats and graphs on the same data a few years back. Out of 21 people I got about 14 different interpretations of the same data. Funny thing was, they all could make a case for it. So, Corsi just doesn't impress me. Use it to good ends every time possible but don't let it dictate things.

 

Besides, that would be like Berube saying "hmmm, I got a good feeling about this guy. He is having a bad run right now but his compete level is there. Good in the locker room too. Unfortunate I had to switch him to his off wing but he really is a team player. Hasn't complained. Opened up more room for that other line to get that new chemistry going since we moved so and so into his spot. Oh well, too bad his Corsi rating says 72% chance he will suck in the future. I'm benching him!".



it appears there are some who question whether puck possession and shots have a direct bearing on who has a better chance of scoring and winning.

 

It appears they may be right. Not all the time but we will talk of this again later in the year as the realities get fleshed out. Later on...


It will be interesting to see down the road what role advanced stats play in scouting.

I'm thinking specifically on how teams may come to rely on them as an alternative to traditional scouting (direct viewing), either as a cost-saving measure or because it is seen as more objective.

Will it matter?


@brelic 

@radoran

 

Really quickly, if possible, how can Corsi be used to "evaluate" an individual player? There seem to be a lot of macro factors to drill through. I can see where team Corsi can tell a guy that can't watch all the games which teams seem to have the puck more; I don't see where Corsi can be used to point to an individual player's weaknesses or strengths.


@brelic 

@radoran

 

Really quickly, if possible, how can Corsi be used to "evaluate" an individual player? There seem to be a lot of macro factors to drill through. I can see where team Corsi can tell a guy that can't watch all the games which teams seem to have the puck more; I don't see where Corsi can be used to point to an individual player's weaknesses or strengths.

 

Mojo, you know more %'s and graphs that are subjective are gonna get thrown at you right? Just an FYI but you know that anyway. Enjoy them though! :ph34r:


It will be interesting to see down the road what role advanced stats play in scouting.

I'm thinking specifically on how teams may come to rely on them as an alternative to traditional scouting (direct viewing), either as a cost-saving measure or because it is seen as more objective.

Will it matter?

 

It is already being used in scouting...but not just the advanced stats alone...it is being used alongside the traditional scouting methods, and that is fine.

 

If I were an owner or GM, and a scout came up to me with some prospects possibilities, and I asked him "How did he look"?  "Did you talk with him?" "What do his current coaches think"  "Is he coachable?".....and that scout told me, "Well, I didn't actually see, talk, or watch him play, but I got a GREAT set of numbers that says he will bring us a Stanley Cup sooner than later".....then I am firing his arse quicker than you can say "Saber Nerd!"

 

Seriously though.....the advanced stats ARE great tools...ESPECIALLY for scouts, GM's and other people whose jobs are to find the best players, fix issues within an organization's competitiveness, and just generally ice a winner.

But those guys should also be smart enough to know, that while numbers are nice, sports, hockey especially, are still a heart, determination, and talent driven thing and the players on the ice are real humans with human strengths and weaknesses, not computer generated EA Sports video game figures that are completely ruled by numbers.

 

And just like anything else in pro sports, all it takes is for one or two franchises to be successful in using advanced stats to consistently build winners (along with, I am sure, the aforementioned traditional methods), and others who haven't already, will take harder looks at the value of them and try to incorporate them even more into their scouting routines.



Now, Brelic, you are good at twisting words or flat out re-wording and most definitely missing the point. I didn't say the Flyers used their gut only. I believe I said stats are important so pls get it right in the future.

 

If I twisted your words, I apologize. 

 

Here's what I read:

@idahophilly : Stats are a good tool and should most definitely be used but it will always come down to the "human touch, the gut and simple observation" when all is said and done.

 

I interpreted that as meaning that all decisions, regardless of what the stats say, will come down to "human touch" (not sure what that means), gut, and observation. Yes, humans make the decisions in the end, but I interpreted what you said as meaning a person will use his gut. 

 

That's where I disagree. Sometimes, when stats and all factors considered don't really paint a clear picture, a person might use his gut. 

 

But when you have a strong indication through stats and other information, I'm arguing that high-level decision makers like GMs will not go with their gut. Like I said, maybe in some cases they will, but it's usually when it's just not really clear. Successful organizations aren't usually run by making gut decisions. If they do go with their gut more than anything else, chances are they will not be successful.

 

Did I miss the point again?


+/- is irrelevant because it's dramatically swamped by noise and has no predictive value from season to season. As the post by Eric T. explains. 

Shot attempt +/- side steps these issues by increasing the sample size of events, giving it predictive value over much smaller samples. 
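
For anyone who wants to see the "more events" claim in numbers, here is a minimal sketch. It assumes on-ice events behave like a Poisson count, which is only a rough model, and the totals below are hypothetical, not taken from any real player. It speaks only to random fluctuation in the count, not to how much of the count is the player's own doing.

```python
import math


def relative_noise(expected_events: float) -> float:
    """Relative standard error of a Poisson count: sqrt(n) / n = 1 / sqrt(n)."""
    return 1.0 / math.sqrt(expected_events)


# Hypothetical on-ice totals for one player over the same ice time.
goals_on_ice = 60        # assumed: goals for + against
attempts_on_ice = 1200   # assumed: shot attempts for + against

print(f"goals:    {relative_noise(goals_on_ice):.1%} relative noise")
print(f"attempts: {relative_noise(attempts_on_ice):.1%} relative noise")
```

Under that assumption, the count with more events has proportionally less random wobble over the same minutes.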

 

+/- is dramatically swamped by noise due to the manner of the data collection.  it takes data generated by 12 players and attempts to derive meaning for an individual.  the noise of +/- are the goals scored for and against that have nothing at all to do with the individual player.  a guy can be on the ice for 4 even strength goals in a game and not actually be in any way responsible for any of them.  or maybe all of them.  either way, +/- counts them the same.  because of this extreme outside interference, it is all but impossible to come away from the +/- statistic with anything more than a very rough idea about a given player.

 

the identical weaknesses exist with corsi and fenwick.  they, too, take data generated by a number of people (10, this time) and try to expose some essential truth about one of those people in particular.  a guy can be on the ice for 30 shot attempts in a game, and only have had any real connection to 6 of them.  corsi, fenwick and plus minus all record events that a majority of the time are not causally related to a specific player being studied.  they all have the same amount of noise, because they all use the same methodology for collection:  add up everything that happens when this guy is playing and publish the total.

 

BUT SAMPLE SIZE, you scream.  no.  increasing sample size helps when you are talking about occasional soft goals or weird-and-short-lived scoring streaks.  increasing sample size dilutes statistical outliers, a big enough sample size effectively makes them go away.  when the problem is statistical outliers.

that isn't what is going on with +/- or corsi/fenwick, though.  the "noise" with +/- isn't noise.  "noise" is unexplained oddities in data collection.  henrik lundqvist giving up a really soft goal, for example.  it happens, but it doesn't happen often, and a full season's sample will drown out the noise, leaving you with a fairly solid statistic of how often king henrik stops the puck.  what you call "noise" in +/-, though, isn't unexplained oddities or statistical outliers.  it is the reality that the statistic captures the results of 12 individuals' efforts and choices.  the same is true of the shot-attempt-based stats.

a bigger sample size doesn't lessen the effect of the other 9 skaters, because the effect of the other 9 skaters is literally what you are measuring, in addition to the small overall contribution of the player being studied.  as you increase your corsi sample size, you increase the number of times someone other than the person you are interested in does something that causes an event.  after 5 games, a player could be on the ice for 150 shot attempts total, but have only played part in the generation or allowance of 30.  you have 120 other shots in your data collection with no way of distinguishing them.  in the same way a player could be on the ice for 15 goals over 5 games, but only have had anything to do with 3 of them.  the other 12 goals are not noise, the statistic is just flawed in that the subject of interest is a minor participant in the generation of the data being collected.  looking at more data collected by the same flawed means doesn't solve anything.

 

 

QoC, OoT, etc can all be measured with or without corsi. This is why TOI% metrics exists, to remove that sampling bias. 

 

 

ok, true.  the other computation of QoC uses +/-.  so how do you feel about that?  by OoT, i assume you mean QoT, which i suppose could be measured with +/- just as QoC is...but i've only ever seen it derived from corsi.  in either case, we are back to corsi and +/- and nothing else.  as for TOI% removing sampling bias........i mean, that's such a huge stretch.  there is just SO MUCH variance in the relevance of the data.  i guess relative TOI goes some distance, but when you are talking about a base number to which Player X contributes an unpredictable and unmeasured amount, sampling bias isn't the issue.  essential relevance of the data collected is.  i mean, Player X, who was on the ice for those 150 shot attempts total over 5 games, changes lines.  he is on the ice for 146 shot attempts over the next 5 games.  changes lines again, is on the ice for 161 the next five.  he was involved in 30 for the first 5 games, 32 the next 5, and 38 the last 5.  457 events contributed to his corsi totals, he had influence over an even hundred of them.  22% of the data has something to do with that one guy.  78% is completely irrelevant, but we have no way of removing it, other than to run through a jungle of comparative filters that are themselves derived from this same incredibly vague collection approach.
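
The Player X arithmetic above can be checked with a few lines. This is only a sketch: the per-stretch numbers are the hypothetical ones from the post, and the variable names are made up for illustration.

```python
# Shot attempts Player X was on the ice for, per 5-game stretch (from the post).
on_ice_attempts = [150, 146, 161]
# Attempts he actually had a hand in over the same stretches (from the post).
involved_attempts = [30, 32, 38]

total = sum(on_ice_attempts)
involved = sum(involved_attempts)
share = involved / total

print(f"{involved} of {total} on-ice attempts involved the player ({share:.0%})")
print(f"{total - involved} attempts ({1 - share:.0%}) came from everyone else")
```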

 

 

 

This is a complete non sequitur. What it was developed for and how it is used now are completely separate applications with entirely different justifications.

 

 

it was not a complete non sequitur.  corsi is a very clever and a really very accurate measure when used for its intended purpose: evaluation of goaltender workloads per opposing team and opposing line.  it measures very well how often a team tries to shoot, and -if lines stay stable- what the tendencies of individual lines are.  this is an approach that asks about everyone on the ice over a given period of time, and the statistic reports on that exactly.  when you take that and try to use it to ask about 1/10th of the skaters on the ice, it is simply the wrong use for the tool.  i get that there are tens of thousands of people lining up to be big fans, and teams are now paying people to help them guess the lotto numbers two years from now, but that doesn't make corsi and fenwick less of a mess to use.

 

ok, dinner now.


Just as blocked and missed shots should not be judged the same as those put on the goalkeeper, shots on net are not created equally.

 

A shot from outside the zone or a soft wrist-shot from the blue will not have the same quality as a one timer from the slot or a breakaway shot.

 

Typically during a thirty shot performance only about ten of those shots are even considered scoring chances, or quality shots.

I get where you are coming from.

 

However, Corsi is usually used in conjunction with Fenwick (Fenwick is the same thing, but does not count blocked shots, which is meant to bring it closer to quality scoring chances), and some sites also try to track stats for "quality scoring chances" directly.

 

As well as zone starts (more offensive zone starts than defensive zone, etc.) and quality of competition, to figure out who is being sheltered and who is getting the tough competition and being relied on for defensive zone draws.

 

So you can somewhat weed in and out the weak wristers from the point from the breakaway shots, etc and start to paint a picture over many games.
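
To make the Corsi/Fenwick difference concrete, here is a rough sketch using the usual definitions (Corsi counts shots on goal, missed shots and blocked shots; Fenwick leaves out the blocked ones). The `ShotCounts` type, the `share` helper and every number are hypothetical, just for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Shot attempts recorded while a given player is on the ice."""
    on_goal: int   # attempts that reached the goalie (saves + goals)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked before reaching the net


def corsi(c: ShotCounts) -> int:
    # Corsi counts every shot attempt.
    return c.on_goal + c.missed + c.blocked


def fenwick(c: ShotCounts) -> int:
    # Fenwick is Corsi without the blocked attempts.
    return c.on_goal + c.missed


def share(metric, team: ShotCounts, opp: ShotCounts) -> float:
    """Fraction of all on-ice attempts (by the chosen metric) taken by the player's team."""
    made = metric(team)
    allowed = metric(opp)
    return made / (made + allowed)


# Hypothetical season totals with one defenseman on the ice.
attempts_for = ShotCounts(on_goal=240, missed=110, blocked=90)
attempts_against = ShotCounts(on_goal=300, missed=120, blocked=140)

print(f"Corsi for %:   {share(corsi, attempts_for, attempts_against):.1%}")
print(f"Fenwick for %: {share(fenwick, attempts_for, attempts_against):.1%}")
```

Neither number says anything about zone starts or quality of competition; those would have to be layered on top.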

 

I can tell you that the Corsi in regards to Shark players and showing who has better Corsi with certain linemates and defensive pairings is usually damn close to bang on.


Archived

This topic is now archived and is closed to further replies.
