Instant replay has already provided a host of confounding decisions. Turns out the human eye watching video, while more reliable than the human eye watching live action, still stumbles now and again.
But at least we’ve got PITCHf/x to tell us when our pitchers are getting robbed by the ump, right?
Not so fast.
I’ve always been curious about the reliability of the system and have never really gotten answers to my questions in AA threads, so I turned to Google and found this Baseball Prospectus article by Mike Fast from 2011.
It’s the first of a series. This one focuses on the error margins for the horizontal axis, meaning the inner and outer corners of the plate. The most important takeaway, which really shouldn’t be surprising, is that there are indeed error margins. Cameras shift over time, and getting them properly calibrated requires you to make assumptions about what constitutes proper positioning. The article is pretty math-heavy, and while I followed most of it, I won’t pretend I grasp the methodology fully.
But the fact that they’ve got to test multiple methods to establish a range of likely errors confirms what shouldn’t be a shock to anyone: the system is not completely precise, though it does seem to be pretty close on the horizontal axis.
Since that article didn’t address my biggest question, I e-mailed the author, Mike Fast, who was gracious enough to reply promptly with a link to another piece in his series, found here.
My biggest question has always been how, or whether, the system accounts for the different heights and stances of players, which affect the top and bottom boundaries of an individual’s strike zone.
I urge you to read this article if you have any interest in this question.
Here’s an excerpt:
The top of the zone (sz_top) and bottom of the zone (sz_bot) in the PITCHf/x data come from measurements made from video by the Sportvision PITCHf/x operator. Just before each pitch, as the batter takes his stance, the operator marks lines on the center field camera video corresponding to the height of the hollow of the batter’s back knee and to the batter’s belt. The line at the batter’s knee is reported in the data as sz_bot, and the system adds four inches to the height of the batter’s belt and reports that value as sz_top.
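For the code-inclined, the rule in that excerpt boils down to a couple of lines. This is just my own rough sketch in Python, not anything from Sportvision or from the article: the function name, the choice of feet as the unit, and the sample heights are all assumptions made up for illustration.

```python
# The four inches the excerpt says get added to the belt height, in feet.
BELT_OFFSET_FT = 4 / 12


def zone_from_operator_marks(knee_ft, belt_ft):
    """Return (sz_bot, sz_top) in feet from the operator's two marked lines.

    Hypothetical helper: the name and the feet-based units are assumptions.
    """
    sz_bot = knee_ft                   # hollow of the back knee, reported as-is
    sz_top = belt_ft + BELT_OFFSET_FT  # belt height plus four inches
    return sz_bot, sz_top


# Made-up marks for illustration only, not real player measurements.
sz_bot, sz_top = zone_from_operator_marks(knee_ft=1.5, belt_ft=3.0)
print(f"sz_bot={sz_bot:.2f} ft, sz_top={sz_top:.2f} ft")
```

Notice that the offset is the same constant no matter who is standing in the box.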
The key takeaway here is that the top boundary is established by adding 4 inches to the height of the batter’s belt. That may work for lots of players, but I’m guessing that the actual distance between 6’4" Lucas Duda’s belt and the rule book top of his zone might differ from that of 5’10" Eric Young. Both saw considerable time in left field for the Mets last season. So that’s one concern that’s not accounted for. He goes on to detail another issue:
However, these values collected by the PITCHf/x system can vary quite widely from day to day for a given batter. For example, look at the sz_top and sz_bot values recorded by the PITCHf/x system for Brian McCann from 2007 through May 2011.
You can see the chart at the source article site. He continues:
The top of McCann’s strike zone varies by as much as a foot and the bottom of the strike zone by half a foot! Batters may make changes in their stances that result in small tweaks to the height of their strike zones, but surely nothing approaching that magnitude.
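If you wanted to check that kind of day-to-day wobble for any batter yourself, the arithmetic is just the largest recorded value minus the smallest. Here’s a minimal sketch, again my own and not the article’s method. The readings below are invented numbers for a hypothetical batter, not McCann’s actual data, and I’m assuming the values are in feet.

```python
def spread(values):
    """Largest minus smallest value, i.e. how far a boundary wandered."""
    return max(values) - min(values)


# Invented per-game readings in feet for one hypothetical batter.
# These are NOT real PITCHf/x values.
sz_top_readings = [3.2, 3.5, 3.9, 3.4, 4.1]
sz_bot_readings = [1.5, 1.7, 1.9, 1.6, 1.4]

top_spread = spread(sz_top_readings)
bot_spread = spread(sz_bot_readings)
print(f"top varies by {top_spread:.1f} ft, bottom by {bot_spread:.1f} ft")
```

A spread approaching a full foot at the top is hard to chalk up to a stance tweak, which is the point of the excerpt.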
He then walks through several ways of arriving at a more accurate strike zone, using a height-based estimate, what umpires call, and where pitches land. It’s fairly technical, but even for a non-math guy like me, the rigorous and imaginative nature of his search was fascinating.
One of his conclusions has far-reaching ramifications:
The utility and accuracy of a Zone Evaluation system that is used to grade major-league umpires based upon the unreliable PITCHf/x sz_top and sz_bot measurements is also called into question.
And he closes the piece with:
Short of having a more reliable method for measuring the actual stance of batters from video, it seems to make the most sense to set the top and bottom boundaries based upon the height of the batter and the average height of the pitches that the batter sees, as scaled to the average umpire zone. That is probably not a workable solution for the robotic home-plate umpires that many fans desire. Such a system would be slow to capture changes in batter stances and could be subject to manipulation. However, given the current data, it seems to be the best approach for analysis of strike zone data, and it is much more accurate than using the boundaries supplied in the PITCHf/x data.
So while it may not be as much fun, it probably makes sense to take Gameday’s strike zone with a grain of salt. While it’s got a much better chance of being accurate on the corners than at the knees and letters, some parts of it are just as subject to human error as the guy standing behind the catcher.