Sabermetrics: A Preface


Baseball, happily, is a game of numbers, mainly 1, 2, 3, and 9. It isn't football; it isn't soccer. That is, it isn't a game where men run every which way doing all kinds of things. It involves intelligible, swift plays and it leaves breathing space between them. For this reason, what happens on the diamond can be made into numbers, which tug on our imagination.

True, there is not much numerical about the swing of a bat or a diving stop. That stuff is the heart of baseball. But where would this athleticism lead without a 1st base, or a 2nd, or a 3rd? It would be stuck in a confused field house, where there are no three outs, no nine batters each in his slot, and no way to build up and organize our passions into a contest of nine numbered innings. This stuff is baseball's whole nervous system.


Imagine that Major League Baseball did not exist, and that television each night showed a different pick-up game from somewhere in the country. The game would be just as beautiful. The game outside the game would be completely lost.

I refer to "following baseball." One has his or her team, and, more than that, one talks and talks about players. This isn't football, though; this isn't soccer. The follower of baseball says things like .300 and 40 HRs, and those numbers feel like adjectives. He can do this, first, because of the above. The game is tidy, clean. Second, because there are a lot of baseball games played. Third, because a hitter is alone up there, and a pitcher is alone up there. It isn't perfect, but a lot of things that happen end up saying a lot about one or two men.


There are 2,430 regular season games each year. There have been almost 350,000 since 1900. Seeing as every game produces a lot of nifty, clear data, you'd imagine that if you looked into it you could find out interesting things. Sabermetrics is neither more nor less than this.

There is a lot that we can't know, of course, but no road leads to sure knowledge. Looking at the past can be a helpful guide, especially in something as comparatively simple as baseball. Statistics are the past.

Imagine you were playing poker with loose cards you found in a drawer. There could be 9 jacks in there or there could be none. Imagine furthermore that you were not allowed to inspect the deck, but you were able to make a note of every card dealt. With all the shuffling, the deck would reveal its secrets slowly, but after a certain very large number of hands, with the help of your notes, you could gain a sense of how the irregular deck worked. Then, you could calculate odds and play smart poker.

Baseball is a bit like this. No one believes that a player or a team is a standard deck. But, given enough time, one can draw inferences from past performance, and thus learn something about the player or team. This is what sabermetrics people work at. They discover ways to get a read on the odds.


What makes sabermetrics exciting -- and it is exciting -- is the idea of chasing down the truth. This is the drama. This is what has people riled up. It promotes both good and bad behavior, as you would probably guess. What's most fun for me is the managerial side.

When is it smart to bunt? Are there really "clutch" hitters? Should your best reliever only pitch in the ninth? When should you intentionally walk a guy? What's an acceptable stolen base percentage in an x or y situation? How important is lineup construction? And so on.

I love this, because it teaches me more about the 1, 2, 3, and 9 side of baseball. Are there simple, definite answers to all these questions? No, of course not. But for the life of me I don't see what's wrong with receiving patient, nuanced answers to life's questions.

There are three guys named Tango, Lichtman, and Dolphin, for instance, who wrote a book called The Book. It uses statistics -- records of the past -- to explore a lot of baseball issues, and you'd be amazed by the sophistication and subtlety of their methods. They can say things like, "Since 1999, this has happened 10,000 times and it's come out like X and it's come out like Y."

Where the authors are comfortable laying out a general guideline or principle, they'll tell it to you. But we all know that everything is adaptable to circumstances.


I'll use math just once, because it's the bit of math that opened things up for me. It's simple and it uses statistics you know.

1. Take a team's hits + walks.

2. Multiply that by their total bases. (In total bases, you get 0 for a walk, 1 for a single, 2 for a double, etc.)

3. Divide that by the team's (at bats + walks).

4. You're done. You've just calculated a team's expected runs -- the number you'd guess they'd score over the course of a season.
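If it helps to see the four steps as code, here is a minimal Python sketch. The function name `expected_runs` and the team totals fed into it are my own inventions for illustration; the totals are round numbers picked to keep the arithmetic easy, not any real team's season. For what it's worth, this recipe looks like what is usually called the basic Runs Created formula, credited to Bill James.

```python
def expected_runs(hits, walks, total_bases, at_bats):
    """Estimate a team's season runs using the four steps above."""
    times_on_base = hits + walks        # step 1: hits + walks
    opportunities = at_bats + walks     # the divisor from step 3
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    # step 2 (multiply by total bases) and step 3 (divide) in one go
    return times_on_base * total_bases / opportunities


# Hypothetical, made-up team totals (NOT a real season).
print(expected_runs(hits=1500, walks=500, total_bases=2400, at_bats=5500))
# prints 800.0
```

Notice that walks show up twice: once on top, as times on base, and once on the bottom, as chances at the plate.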


Makes sense, kinda, but does it work? Well, it works well enough to teach us something.

Let's try it out with the 2006 Mets. The formula guesses 819 runs. The actual total is 834 runs. That's a 1.8% error, which is extremely close. How about a bad year, 2009? The formula says 718 runs; the actual was 671. That's a 7% error, and misses don't get much bigger than that. The expected runs will almost always be within 5% of the actual runs. In fact, statisticians have other, better formulas that come closer still.
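The error percentages above can be checked with a few lines of Python. This is only a sketch: the helper name `percent_error` is mine, and I am assuming the error is measured against the actual run total, which is the reading that reproduces the 1.8% and 7% figures.

```python
def percent_error(estimate, actual):
    """Size of the miss as a percentage of the actual total (assumed convention)."""
    return abs(estimate - actual) / actual * 100


# (formula's guess, actual runs) as quoted in the text above
seasons = {"2006 Mets": (819, 834), "2009 Mets": (718, 671)}

for label, (estimate, actual) in seasons.items():
    print(f"{label}: {percent_error(estimate, actual):.1f}% off")
# prints:
# 2006 Mets: 1.8% off
# 2009 Mets: 7.0% off
```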

This is interesting in itself, but it's a great innovation in thinking about players. If we can express runs as a function of hits, walks, and power, we can analyze one player's hits, walks, and power, and guess at his contribution to those runs. In 2006, David Wright contributed roughly 120 runs, or 14.4% of his team's total -- about the same as Beltran, about double Paul LoDuca. It isn't clear to the naked eye, but on a season-long level this is pretty close to how baseball works.
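One simple way to picture the player-level idea: run the same recipe on a single hitter's line, then divide by the team's figure. The sketch below does exactly that. The names `expected_runs` and `share_of_team` are mine, and both stat lines are hypothetical round numbers rather than any real player or team. Real player-level methods are more careful than this, so treat it as a rough illustration only.

```python
def expected_runs(hits, walks, total_bases, at_bats):
    """(hits + walks) * total bases / (at bats + walks)."""
    return (hits + walks) * total_bases / (at_bats + walks)


def share_of_team(player_runs, team_runs):
    """Player's estimated runs as a percentage of the team's estimate."""
    return player_runs / team_runs * 100


# Hypothetical stat lines, invented for illustration only.
team = expected_runs(hits=1500, walks=500, total_bases=2400, at_bats=5500)
player = expected_runs(hits=150, walks=50, total_bases=260, at_bats=600)

print(player)                       # prints 80.0
print(share_of_team(player, team))  # prints 10.0
```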

There are steps beyond this, and if you're interested there are many places that explain them. Statisticians have since built formulas that work out the components of "Wins" over a long season, factoring in aspects of hitting, defense, and pitching. Similar to the above, "Wins" can then be attributed to individual players. Similar to the above, it all seems to work extremely well.


Because you are not a manager or GM, there is no reason you have to care about these things, but chances are that you do care. Most baseball fans like to talk about strategy, and you'll find great discussions about that. Most baseball fans like to talk players -- who's good, who's bad, who's similar to which old-timer. Well, some of these tools are amazing ways to learn.

Maybe you like the old time lingo, though, and you don't like alien and technical language. That's valid, but know that it's not all like that. There are warm, funny people among us, and there are great graphics and write-ups.  There are also small-minded cads -- just like anywhere.  What I'm trying to say is we're inviting you in.  Ignore anyone who shouts you down.  I'm sorry when it happens.  When we talk baseball, we use language, and language needs diversity. Help us build on the old time lingo, and we'll all talk better baseball.

This FanPost was contributed by a member of the community and was not subject to any vetting or approval process.