One thing we can all agree on is that plate discipline is pretty important in being good at baseball, even if we don't really have a good definition of it. Most people reference BB% to indicate that a player has good plate discipline -- "a good eye." To be fair, a high walk rate does show that a player is good at drawing walks, which suggests he's also good at not swinging at balls.
But BB% seems a bit clumsy in the modern Pitch/FX era. While walks may indicate a good amount of plate discipline, pitch-level data gives us much better ways to measure it.
This post is designed to really look at plate discipline using public Pitch/FX data and is heavily inspired by this post at Hardball Times. Some of the stuff here is going to get a bit technical, but hopefully you'll stick around for the ride.
What is plate discipline anyways?
Some guys walk because they get pitched around, some guys walk because they never swing. Others might have such good bat control, and put the ball in play so often, that they never really need to walk. In general, we'd like a number or statistic that says player A has better plate discipline than player B, but we're going to need to define it first.
The basic idea behind what follows is that plate discipline means command of the strike zone. A player has good plate discipline if he swings at strikes and takes balls.
Plate discipline index
With this definition in hand and a little probability, I can come up with a metric that comes close to what I want. Swing data is available from Fangraphs. I'm going to use Juan Lagares as an example, since anyone who knows him as a hitter knows Juan Lagares has almost no plate discipline whatsoever. Thus, he should show up poorly on my metric. Sorry Juan, but I'm calling you out.
Basically what I want is a number that reflects a player's natural agreement with the strike-zone. Let's pretend Pitch/FX is a baseball player, who is watching Juan hit. Pitch/FX swings at only strikes and takes only balls.
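Of the 542 pitches Juan saw, he swung at 162 strikes and 97 balls, and took 124 strikes and 159 balls. So, 162 and 159 times Juan agreed with the strike zone by swinging at strikes and taking balls. The probability of agreement is thus the sum of the diagonal boxes divided by all the pitches, i.e., (162 + 159) / 542 = 59%.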
So Juan generally agrees with the strike zone at a 59% rate. To put this into perspective, David Wright comes in at 71%, so this seems like a legitimate score (probability of strike zone agreement) to work with.
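The agreement rate is simple enough to sketch in a few lines of Python. This is only an illustration using the four Lagares counts quoted in this post (162, 97, 124, 159); the function and variable names are made up here and are not part of any Fangraphs or Pitch/FX tooling.

```python
# Juan Lagares' pitch counts as quoted in the post (542 pitches total).
swing_strike = 162  # swung at a pitch in the zone (agrees with the zone)
swing_ball = 97     # swung at a pitch out of the zone
take_strike = 124   # took a pitch in the zone
take_ball = 159     # took a pitch out of the zone (agrees with the zone)


def agreement_rate(swing_strike, swing_ball, take_strike, take_ball):
    """Share of pitches where the hitter's decision matches the strike zone."""
    total = swing_strike + swing_ball + take_strike + take_ball
    return (swing_strike + take_ball) / total


p_agree = agreement_rate(swing_strike, swing_ball, take_strike, take_ball)
print(f"agreement: {p_agree:.1%}")  # agreement: 59.2%
```

Any hitter's four counts can be dropped into the same function to get a comparable number.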
However, the probability of agreement does not adjust for the probability of agreeing by chance. If I closed my eyes and swung randomly, a certain amount of agreement would still occur, and this number alone would give us no way of knowing it was by chance. Likewise, if Juan got a ton more strikes in general and swung at everything, we'd say he had good plate discipline by this measure. But he wouldn't.
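To see how misleading raw agreement can be, here is a rough Python sketch with two made-up hitters. Neither is real data: the 800/200 strike-ball split, the coin-flip swing and strike rates, and the random seed are all assumptions chosen purely for illustration.

```python
import random


def agreement_rate(swing_strike, swing_ball, take_strike, take_ball):
    """Share of pitches where the hitter's decision matches the strike zone."""
    total = swing_strike + swing_ball + take_strike + take_ball
    return (swing_strike + take_ball) / total


# Hypothetical hitter A: sees 800 strikes and 200 balls and swings at all of them.
free_swinger = agreement_rate(swing_strike=800, swing_ball=200,
                              take_strike=0, take_ball=0)
print(f"swings at everything: {free_swinger:.0%}")  # swings at everything: 80%

# Hypothetical hitter B: eyes closed, coin-flip swings against coin-flip pitches.
rng = random.Random(2013)  # fixed seed so the run is repeatable
n_pitches = 100_000
matches = 0
for _ in range(n_pitches):
    is_strike = rng.random() < 0.5
    swings = rng.random() < 0.5
    matches += (is_strike == swings)  # True counts as 1
blind_swinger = matches / n_pitches
print(f"eyes closed: {blind_swinger:.1%}")  # lands close to 50%
```

Hitter A "beats" David Wright's 71% without ever making a decision, and hitter B gets roughly half the pitches right with no skill at all.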
To rectify this situation, we'll use a score called Cohen's Kappa.
Cohen's Kappa
Ok, so named numbers are a little obfuscating. Cohen's Kappa, or usually just Kappa, is a "statistical measure of inter-rater agreement," according to Wikipedia. However, it's easier to think of it by introducing the probability of agreeing by pure chance.
By pure chance, Juan could close his eyes and swing randomly at the plate. We'd like to avoid measuring this as some sort of skill, so we have to quantify it. Looking at the columns of the table above, we see that Juan swings at (162+97)/542 = 48% of pitches. He takes (124+159)/542 = 52% of his pitches.
Meanwhile, Pitch/FX calls (162+124)/542 = 53% of his pitches strikes, and (97+159)/542 = 47% of his pitches balls.
If swinging and pitch location were independent, then the probability of agreeing by pure chance would be the probability that Juan swings and the pitch is a strike (48%*53% = 25%), plus the probability that Juan takes and the pitch is a ball (52%*47% = 25%). Thus the probability of agreeing by pure chance is 50%.
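The chance-agreement arithmetic can be checked with a short Python sketch. It uses only the four Lagares counts from this post; the variable names are illustrative assumptions.

```python
# Juan Lagares' counts as quoted in the post.
swing_strike, swing_ball = 162, 97
take_strike, take_ball = 124, 159
total = swing_strike + swing_ball + take_strike + take_ball  # 542

# Marginal rates: how often Juan swings, and how often the pitch is a strike.
p_swing = (swing_strike + swing_ball) / total
p_take = (take_strike + take_ball) / total
p_strike = (swing_strike + take_strike) / total
p_ball = (swing_ball + take_ball) / total

# If swinging were independent of location, agreement would happen this often.
p_chance = p_swing * p_strike + p_take * p_ball

print(f"swing {p_swing:.0%}, take {p_take:.0%}")    # swing 48%, take 52%
print(f"strike {p_strike:.0%}, ball {p_ball:.0%}")  # strike 53%, ball 47%
print(f"chance agreement: {p_chance:.0%}")          # chance agreement: 50%
```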
So the adjusted probability (after removing the chance that Juan occasionally guesses and is right) is the probability of agreeing (59%) minus the probability of pure chance agreement (50%), or 9%. David Wright is at 21%!
Of course, for different players, the probability of agreeing by chance is different (because they might get more balls or more strikes). To get a normalized score for comparison, Cohen's Kappa is defined as the adjusted probability divided by the probability of disagreeing by pure chance.
OK, I bet everyone is confused by my writing. To back up, 1 minus the pure chance agreement is the probability of not agreeing by pure chance. This is always larger than or equal to the adjusted probability, so Kappa can never exceed 1. A Kappa of 0 means no better than chance, and it can even go negative for a hitter who agrees with the zone less often than chance would. Kappa just makes sure that we can compare.
Juan's Kappa with the strike-zone is then (59% - 50%) / (100% - 50%), which works out to about .19 from the unrounded counts.
To put that into perspective, David Wright's Kappa is .41!
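Putting the pieces together, the whole calculation fits in one small Python function. This is a sketch under the definitions given in this post, not the code used to build the tables here; `cohen_kappa` and its argument names are made up for illustration, and the 800/200 hitter is hypothetical.

```python
def cohen_kappa(swing_strike, swing_ball, take_strike, take_ball):
    """Cohen's Kappa between a hitter's swing decisions and the strike zone.

    Raises ZeroDivisionError when chance agreement is already 100%
    (e.g. every pitch is a strike and the hitter swings at all of them).
    """
    total = swing_strike + swing_ball + take_strike + take_ball
    p_agree = (swing_strike + take_ball) / total
    p_swing = (swing_strike + swing_ball) / total
    p_strike = (swing_strike + take_strike) / total
    # Agreement expected if swinging were independent of pitch location.
    p_chance = p_swing * p_strike + (1 - p_swing) * (1 - p_strike)
    return (p_agree - p_chance) / (1 - p_chance)


# Juan Lagares' counts as quoted in the post.
juan_kappa = cohen_kappa(162, 97, 124, 159)
print(f"Juan Lagares: {juan_kappa:.2f}")  # Juan Lagares: 0.19

# Hypothetical hitter who sees 800 strikes, 200 balls and swings at everything.
free_swinger_kappa = cohen_kappa(800, 200, 0, 0)
print(f"swings at everything: {free_swinger_kappa:.2f}")  # swings at everything: 0.00
```

The swing-at-everything hitter, who looked great on raw agreement, scores exactly zero here, which is the whole point of the chance adjustment.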
Results
Indeed, Juan Lagares has the worst plate discipline on the team among those with over 50 PA, as measured by Pitch/FX Kappa, as shown in the table below.
Rick Ankiel! Why is he so high? Well, the simple explanation is that he never saw a lot of pitches in the zone. He did take quite a few of them, at least in raw totals. Mostly, what he did do was swing at an absurd number of the strikes he did get -- 96 out of 126. Unfortunately, his contact rate was equally absurdly bad.
Turner and Buck might seem a bit high, although .38 is a fairly average strike-zone Kappa, based on my quick look around the league. Regardless, they are surprising, considering their low walk rates. Turner gets a ton of pitches in the zone and swings at a lot of them. Buck has a similar story to Ankiel -- extremely aggressive in the zone, very low contact. It's important to note that Kappa accounts for the chances of doing this randomly, so these three guys seem to be fairly good at distinguishing between strikes and balls.
For Buck and Ankiel, the problem is that they can't do anything with their major league plate discipline, as they're both abysmal at making contact. Buck's 71% Contact% is bad, and Ankiel's 62% is even worse. Turner, on the other hand, sees a lot of strikes and makes excellent contact, but doesn't have enough power to make pitchers pay.
On the other end of things, Duda walks a lot but he doesn't swing often enough at strikes. He and Ike Davis, to a degree, are simply too patient for Kappa's taste. Satin's otherworldly ball-taking ability is hampered by his unwillingness to swing, like, ever. But Lagares is the total opposite. He's basically swinging randomly (9% adjusted strike-zone agreement). Neither Lagares nor Valdespin has any kind of plate discipline according to this metric; they are both not too far from swinging randomly at balls and strikes.
Conclusion
Kappa is flawed, I accept that. Perfect agreement means that every single strike is swung at and every ball is taken. The first part seems to be a bit of an issue, as not all hitters can hit every strike. That 2-strike curveball to a certain Met seems to be one of those cases. You just have to take what you can't hit, even if it's a strike.
Still, I think it's a bit better than O-swing% or BB% because it accounts for players making the most of the pitches they do get. Anyways, food for thought and discussion.
As a fun thing, I've added a list of players with 500 pitches this year (2013) [download]. Have fun, and don't be too mean to me or each other in the comments. I know it's not a perfect stat -- yet.