Sabermetrics and Language: This is where we are wrong.

There are two types of statistics that we can generate: those that are good at telling a story and those that are good at making predictions. Some statistics that originated with sabermetrics have started making their way into the mainstream--OPS is the most popular--but there is only so much room for sabermetric statistics in the vernacular. The reason is that the true goal of sabermetric analysis has been to move away from the existing paradigm--what happened in a game--to a new paradigm--what might have happened in a fantasy world.

Our stats don't work well for stories for three reasons:

1. Statistically improbable outcomes that actually happen are just as important to the emotional experience of the game.

2. Old, crappy statistical approaches work just as well to tell us that Roy Halladay is a fantastic pitcher.

3. The power of our statistical approaches is in discerning marginal differences between marginal players.

The existing paradigm generates statistics based on events. This is why you have big, fat, discrete outcomes: W-L, HR, RBI, K, and the best ever, the single most important statistic for telling how clutchy someone is: gwRBI. These are counting stats, and they do a terrible job of predicting, but they capture the imagination--and more importantly, the emotion--in a real, almost physical way. I still remember watching Dr. K's K Corner growing up: with every strikeout, you got to see another fan hang a K on the wall. The number of wins is the number of times you go "oh, thank God" and turn off the radio and go to bed. This matters.

Because event-based statistics are so terrible for prediction, some were converted to rates--ERA and AVG--in sort of an ad hoc way. Those rates were the only even halfway decent predictors that existed, so when I was a kid, a low ERA and a high batting average were the apex of good team construction.
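To make the counting-to-rate conversion concrete, here is a minimal sketch of the two rates named above. The function names and the sample season totals are mine, made up for illustration; the formulas are the standard ones (hits per at-bat, and earned runs per nine innings).

```python
def batting_average(hits, at_bats):
    """AVG: hits divided by at-bats."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


def earned_run_average(earned_runs, innings_pitched):
    """ERA: earned runs allowed per nine innings.

    innings_pitched is a true decimal here (200 1/3 innings is 200.333...),
    not the box-score shorthand "200.1".
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


if __name__ == "__main__":
    # Hypothetical season totals, not any real player's line.
    print(f"AVG: {batting_average(150, 500):.3f}")
    print(f"ERA: {earned_run_average(70, 210):.2f}")
```

Both are still built entirely out of events; they just divide by opportunity, which is why they predict a little better than the raw counts.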

This paradigm works great for telling stories. Good stories are discrete. The difference between a one-hitter and a no-hitter is real and significant in the mind. A grand slam is different in some important way from other home runs, and TWO GRAND SLAMS IN ONE INNING? Fernando Tatis is immortal.

The perfect example of this type of statistic is the Error. The guy gets to the ball and screws it up. That. Looks. Stupid. And we all think he's stupid.

The new paradigm tries to generate statistics based on latent variables. This is a paradigm that asserts the following: the events we see are all based on some vague, invisible quality called "ability". We can't observe ability, so we try to tease it out by looking between events, by using difference-in-difference methods. Fielding Independent Pitching is a perfect example--we don't want to know what a pitcher did, we want to know what he would have done under tightly controlled conditions.
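For anyone who hasn't seen it written out, here is a rough sketch of Fielding Independent Pitching using the commonly published weights. The function name and the sample line are hypothetical, and the additive constant is an assumption: it is recalculated every season to put FIP on the league ERA scale, and 3.10 is only a typical ballpark figure.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched,
        league_constant=3.10):
    """Fielding Independent Pitching.

    Only home runs, walks, hit batters, and strikeouts go in: the events
    a pitcher controls without help from his fielders. league_constant
    is an assumed placeholder; the real value changes year to year.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted_events = (13 * home_runs
                       + 3 * (walks + hit_by_pitch)
                       - 2 * strikeouts)
    return weighted_events / innings_pitched + league_constant


if __name__ == "__main__":
    # Hypothetical pitcher: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
    print(f"FIP: {fip(20, 50, 5, 200, 200):.2f}")
```

Notice what is missing: hits on balls in play, runs, wins. Everything that depends on a fielder or a bad hop has been thrown out on purpose.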

This paradigm relies on assumptions of continuity. Under this paradigm, there is essentially no difference between a one-hitter and a perfect game, because much of the time the difference comes down to whether or not a second baseman got eaten up by a grounder, and when we're evaluating pitchers, do we really care about bad hops? We should not.

Unless you can establish that a hitter hits significantly differently with the bases loaded than with them empty, the difference between a solo home run and a grand slam is unimportant in assessing the player's underlying ability.

The counterpart to the Error is sabermetric fielding statistics. Sure, the guy flubbed a few that he got to, but he got there, and that's because he's super fast, and that matters way more than the fact that he looks like a moron sometimes. Look at how many "out of range" plays he made!

You can try to tell that story, and we should, but the people who are most likely to listen are the people with money on the line, not the people who want things to make narrative sense. And even those who are making decisions won't necessarily listen. Hell, the American Economic Review published a paper laying out pretty clearly that NFL coaches should go for it on fourth down far more often than they do. There has been an increase in fourth down attempts, but they still are really, really low relative to the optimal level (~10% vs. ~40%). Why? Who knows. My best guess is risk aversion, and in a non-competitive industry, that can last a while.


So what to do? It's tough to say. We have a few options, I guess. One, we can acknowledge that our statistics are limited to analytical and predictive circumstances, and be fine with that. I think at some level, we have to be satisfied there, because if we can't sell ISO as a measure of power or BABIP as a measure of luck, we certainly won't be able to sell the more nuanced stats or the stats of the future.

Two, we can try to nudge the better storytelling stats into place. OPS can and should replace AVG as the go-to stat for hitters. To a large extent, it has--my father-in-law knows what OPS is, and while I still can't resist the urge to think "is he a .300 hitter?", I have hope that my kids will think "does he have an .875 OPS?" Similarly, it might be possible for FIP or WHIP to replace ERA in the public mind, or for some kind of +/- system to replace fielding percentage (seriously, fielding percentage? damn.).
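Part of why OPS is sellable is that it is just two familiar rates added together. Here is a minimal sketch under the standard definitions of on-base percentage and slugging; the function names and the stat line are made up for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by the plate appearances that count."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator <= 0:
        raise ValueError("no qualifying plate appearances")
    return (hits + walks + hit_by_pitch) / denominator


def slugging_percentage(hits, doubles, triples, home_runs, at_bats):
    """SLG: total bases per at-bat."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(hits, doubles, triples, home_runs, walks, hit_by_pitch,
        at_bats, sac_flies):
    """OPS: on-base percentage plus slugging percentage."""
    return (on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
            + slugging_percentage(hits, doubles, triples, home_runs, at_bats))


if __name__ == "__main__":
    # Hypothetical hitter: 150 H (30 2B, 5 3B, 25 HR), 60 BB, 5 HBP,
    # 500 AB, 5 SF.
    print(f"OPS: {ops(150, 30, 5, 25, 60, 5, 500, 5):.3f}")
```

It is a crude sum (the two pieces don't even share a denominator), but it tells a story a .300 average can't: did he get on, and did he hit it hard?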

Three, we could use our super statistical powers to come up with storytelling stats, hermetically sealed off from the predictive claims of our other stats, but that do a good job--better than those out there--of telling the story. I still remember the first time I saw a Win Probability graph. Win Probability captures very little in the way of the sabermetric paradigm I talked about--but you can see every fan's heart break in a line graph in a way that captures the feeling of a game. There have to be other possibilities here--the streakiness of a hitter (easy enough to model as a time series), or the variance of a pitcher's starts as a way to gauge how dangerous he is for your blood pressure.
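The blood-pressure stat is easy to sketch. What follows is one possible version, not an established metric: take runs allowed in each start and report the spread alongside the average. The function name and the two pitchers' game logs are invented for illustration.

```python
from statistics import mean, pstdev


def start_profile(runs_by_start):
    """Return (average runs allowed, standard deviation) across starts.

    The standard deviation is the "blood pressure" number: two pitchers
    with the same average can feel completely different to watch.
    """
    if not runs_by_start:
        raise ValueError("need at least one start")
    return mean(runs_by_start), pstdev(runs_by_start)


if __name__ == "__main__":
    # Made-up game logs: runs allowed in six starts each.
    steady = [3, 3, 3, 3, 3, 3]
    wild = [0, 6, 0, 6, 0, 6]

    for name, log in (("steady", steady), ("wild", wild)):
        average, spread = start_profile(log)
        print(f"{name}: {average:.2f} runs per start, spread {spread:.2f}")
```

Both of these guys allow the same number of runs per start, so a predictive stat treats them about the same. The fan sitting through the second one's starts does not.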

I think a combination of all three is best. Acknowledge that not everybody is going to take the time to figure out what the hell these mean:

rSB    rGDP    rARM    rHR    rPM    DRS

Try to pick the stats that make good storytelling sense and are also predictive, and implant those in the minds of your friends, family members, and local high school sports broadcasters. Finally, figure out where the story lies and tell that by the numbers.

This FanPost was contributed by a member of the community and was not subject to any vetting or approval process. It does not necessarily reflect the opinions, reasoning skills, or attention to grammar and usage rules held by the editors of this site.
