The case for and against Keith Hernandez's enshrinement in the Hall of Fame (HoF) has been described in several online forums, for instance in a well-researched blog post by Chris Bodig[i]. I enter this interesting discussion from the perspective of having "no horse in the race." I am not a rabid Mets fan; in fact, I did not know much about Keith Hernandez prior to this project. I was aware he played an important role in the Mets' 1986 World Series, appeared in a Seinfeld TV episode, and is a well-regarded color commentator for Mets baseball games. That's about it. However, I was intrigued by the Bodig article, which laid out the complex case for and against his enshrinement. That got me wondering if I might be able to add something to the conversation, perhaps through a (relatively speaking) objective analysis. Another aspect that drew me in was the observation that every HoF controversy article seems to make its argument by piling one numerical statistic on top of another. This led me to speculate that a data synthesis and visualization tool described by Dertinger and Dertinger[ii],[iii] might shed some light on the case.
The Dertinger and Dertinger approach involves the repurposing of software known as ToxPi[iv] to analyze MLB players' performance. The initial reports demonstrated that the program and its output are particularly useful for ranking, categorizing, and generally comparing the performance of multiple players. The Hernandez controversy encouraged me to investigate whether this flexible data analysis tool might help the baseball community objectively consider HoF candidates, including those who, rightly or wrongly, have been passed over. Thus, the current project involved comparing Hernandez's performance to that of his contemporaries, as well as several HoF first basemen whose careers did not overlap with his, in order to provide additional benchmarking opportunities. To my knowledge, Keith Hernandez is the first such player to be evaluated in this way, for this purpose.
The analyses presented here used data from the FanGraphs site[v]. Analyses focused on MLB first basemen who were active in a similar timeframe to Hernandez (i.e., between 1970 and 1995) and played at least 5 seasons. To qualify as a season, the player needed to appear in a minimum of 100 games. As noted above, several HoF first basemen who primarily played before 1970 (e.g., Orlando Cepeda) or later (e.g., Frank Thomas) were included in these analyses in order to provide additional context. In total, 30 first basemen were evaluated, 7 of whom are in Cooperstown: Harmon Killebrew, Orlando Cepeda, Tony Perez, Willie McCovey, Eddie Murray, Jeff Bagwell, and Frank Thomas.
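The selection criteria above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the row layout, the function name eligible_players, and the sample rows are hypothetical, and the sketch assumes that all 5 qualifying seasons must fall inside the 1970 to 1995 window, which the text does not spell out. The HoF first basemen added from outside the window for context would have to be appended separately.

```python
from collections import defaultdict

MIN_GAMES = 100      # games needed for a season to count (from the text)
MIN_SEASONS = 5      # qualifying seasons needed (from the text)
ERA = (1970, 1995)   # Hernandez-era window (from the text)

def eligible_players(season_rows):
    """Return sorted names of players with enough qualifying seasons in ERA."""
    counts = defaultdict(int)
    for row in season_rows:
        in_era = ERA[0] <= row["season"] <= ERA[1]
        if in_era and row["games"] >= MIN_GAMES:
            counts[row["player"]] += 1
    return sorted(p for p, n in counts.items() if n >= MIN_SEASONS)

# Hypothetical season rows, not real FanGraphs data.
rows = (
    [{"player": "Player A", "season": y, "games": 150} for y in range(1975, 1981)]
    + [{"player": "Player B", "season": y, "games": 90} for y in range(1975, 1985)]
    + [{"player": "Player C", "season": y, "games": 120} for y in range(1992, 1999)]
)

print(eligible_players(rows))  # ['Player A']
```

Player B never reaches 100 games in a season, and Player C has only four qualifying seasons inside the window, so only Player A passes.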
Two types of analyses were conducted. The first relied on what we'll refer to here as "traditional" statistics. These analyses considered the slash line statistics AVG, OBP, and SLG. Additionally, the number of times a player was selected to an All-Star team, won a Gold Glove, or was voted league MVP was tabulated. These statistics were all considered at the career level. Note that since many players' seasons predate the introduction of the Silver Slugger Award (1980), it was not factored into the analyses.
The second type of analysis relied on what is referred to here as "advanced" statistics. These analyses considered Def, Off, and WAR. As explained at the FanGraphs site, Def is "Defensive Runs Above Average," and measures a player's defensive value relative to league average. Unlike some other advanced statistics that provide information about a player's value relative to league average, Def adds in a positional adjustment to facilitate comparisons across positions. Off is "Offensive Runs Above Average." Off combines FanGraphs' park-adjusted Batting Runs with Weighted Stolen Base Runs, Weighted Double Play Runs, and Ultimate Base Running in order to give players credit for the quality and quantity of their offensive performance. WAR is "Wins Above Replacement." It is a widely utilized statistic that attempts to summarize a player's total contribution to their team in one metric. According to FanGraphs, WAR was devised to estimate an answer to the question: if a particular player got injured and the team needed to replace them with a freely available Triple-A minor leaguer, how much value would the team be losing?
Let's briefly go over ToxPi and its output. The ToxPi interface is based on a Java-executable script that is freely available at the Toxicological Prioritization Index website[vi]. A version based on the R statistical programming language became available in February 2022, and can be found at the same site. In addition to the main Java script, the single, compressed download includes a user manual, all libraries, and example data files.
Visually, ToxPi profiles are represented as circles that have been divided into a user-defined number of "slices." For the analyses described here, each circle represents a different MLB player, and each slice within a circle represents a particular performance statistic. The distance each slice protrudes from the center of a circle is proportional to a player's performance. The lowest value found for any particular statistic within the dataset under consideration (in our case, 30 players) is given a value of 0, and the highest value is given a value of 1. All intermediate values are scaled proportionally. In this way, the worst performer for any particular statistic will show a slice with no protrusion, and the best performer will show a maximal protrusion, one that extends all the way to the circle's perimeter. Other players will exhibit intermediate-length slices. Besides calculating individual slice scores, the program also calculates an overall "ToxPi Score." This is a summation of slice scores, and is also rescaled to a value between 0 and 1.
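The scaling just described is ordinary min-max normalization, and a minimal Python sketch may make it concrete. This is not ToxPi's own code. The function names and the sample counts are hypothetical, and two details are assumptions: the aggregate score is taken to be the mean of the equally weighted slice scores (one simple way to keep a sum between 0 and 1), and a statistic on which every player ties is scored 0.

```python
def slice_scores(values):
    """Min-max scale one statistic across all players to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # every player tied; scoring this as 0 is an assumption
        return {player: 0.0 for player in values}
    return {player: (v - lo) / (hi - lo) for player, v in values.items()}

def toxpi_scores(stats):
    """Average the equally weighted slice scores for each player."""
    scaled = {name: slice_scores(vals) for name, vals in stats.items()}
    players = next(iter(stats.values())).keys()
    return {p: sum(s[p] for s in scaled.values()) / len(scaled) for p in players}

# Hypothetical career counts, not real player data.
stats = {
    "GoldGloves": {"Player A": 10, "Player B": 0, "Player C": 5},
    "AllStar": {"Player A": 4, "Player B": 2, "Player C": 10},
}

print(slice_scores(stats["GoldGloves"]))
# {'Player A': 1.0, 'Player B': 0.0, 'Player C': 0.5}
print(toxpi_scores(stats))
# {'Player A': 0.625, 'Player B': 0.0, 'Player C': 0.75}
```

Player B is last in both statistics and so has no protruding slices, while Player C edges out Player A overall despite having half as many Gold Gloves.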
As described in previous articles, it is possible to give different "weights" to each slice/statistic, meaning they are assigned different levels of importance. This is accomplished by assigning the slices different arc angles (widths). However, this feature was not utilized for the analyses described here.
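For readers curious how weighting would work had it been used, here is a small Python sketch under stated assumptions. The function names, the weights, and the slice scores are all hypothetical, and the sketch assumes that a slice's arc width is its share of the total weight and that the aggregate score is the weighted mean of the slice scores. With equal weights, as in the analyses reported here, every slice receives the same arc and the score reduces to a plain average.

```python
def arc_angles(weights):
    """Convert slice weights to arc widths in degrees that sum to 360."""
    total = sum(weights.values())
    return {name: 360.0 * w / total for name, w in weights.items()}

def weighted_score(slices, weights):
    """Weighted mean of one player's slice scores (each already in [0, 1])."""
    total = sum(weights.values())
    return sum(weights[name] * slices[name] for name in slices) / total

# Hypothetical weighting that counts WAR twice as heavily as Def or Off.
weights = {"Def": 1, "Off": 1, "WAR": 2}
# Hypothetical slice scores for a single player.
player = {"Def": 1.0, "Off": 0.5, "WAR": 0.25}

print(arc_angles(weights))              # {'Def': 90.0, 'Off': 90.0, 'WAR': 180.0}
print(weighted_score(player, weights))  # 0.5
```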
Figure 1 shows two player profiles and is provided to familiarize readers with the basic ToxPi structure. Besides distilling complex statistics into informative summary graphics, the software provides quantitative results in the form of Slice Scores, which correspond to individual performance metrics, and an aggregate performance metric, the ToxPi Score.
Results and Discussion
The traditional statistics AVG, OBP, and SLG, as well as Gold Gloves, All-Star appearances, and MVP awards, were evaluated for 30 MLB first basemen using ToxPi software. Resulting ToxPi Profiles for every player are shown in Figure 2. Slice Scores and aggregate ToxPi Scores accompany each image. These images are arranged from highest to lowest ToxPi Score as one would read a page of a book: from left to right, top to bottom. Figure 3 plots the ToxPi Scores in ascending order. From these graphics, it is apparent that Hernandez had a very high aggregate ToxPi Score. Indeed, it ranked 2nd; only Frank Thomas had a higher value. It is also noteworthy that Hernandez's traditional statistics-based ToxPi Score is higher than those of six of the seven HoF first basemen in this group of 30 players.
ToxPi software has a hierarchical clustering module that automatically groups similar ToxPi Profiles. This can be useful for evaluating the degree to which players' performance characteristics are similar. Hierarchical clustering results are provided in Figure 4. Generally speaking, Keith Hernandez clusters next to Hall of Fame first basemen such as Harmon Killebrew and Frank Thomas. Interestingly, at least according to this analysis, Hernandez's profile is most similar to the well-rounded profile of Don Mattingly, another first baseman whose Hall of Fame candidacy is currently contentious.
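To give a feel for how this kind of grouping works, here is a bare-bones agglomerative clustering sketch in Python. It is not ToxPi's implementation. The article does not say which distance measure or linkage rule ToxPi applies, so Euclidean distance and average linkage are assumptions, as are the function name and the four made-up slice-score profiles.

```python
from itertools import combinations
from math import dist

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []

    def gap(a, b):
        # Mean Euclidean distance over all cross-cluster pairs of profiles.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters that are currently closest together.
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        merges.append((a, b, gap(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical slice-score vectors, not real player data.
profiles = {
    "Player A": (0.9, 0.8, 0.9),
    "Player B": (0.8, 0.8, 1.0),
    "Player C": (0.1, 0.9, 0.3),
    "Player D": (0.2, 1.0, 0.2),
}

for a, b, d in average_linkage(profiles):
    print(a, "+", b, f"at distance {d:.3f}")
```

With these made-up profiles, Players A and B merge first and Players C and D second, and the two pairs join only in the final step. That mirrors how players with similar slice patterns end up on neighboring branches of a dendrogram.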
ToxPi was also used to analyze, synthesize, and visualize the advanced statistics Def, Off, and WAR for the 30 MLB first basemen. Resulting ToxPi Profiles for every player are shown in Figure 5. Slice Scores and aggregate ToxPi Scores accompany each image. Figure 6 plots the ToxPi Scores in ascending order. Using these advanced statistics, some players' rankings changed markedly compared to the traditional statistics analyses. For instance, Jeff Bagwell took over the 1st position from Frank Thomas, who fell to number 7. While Thomas's prodigious offensive contributions are still apparent, these ToxPi Profiles make it clear that his defensive skills were the weakest among this group of 30 players, and this affected his overall ranking. Interestingly, Hernandez's relative ranking using these advanced statistics was the same as it was when we considered the traditional metrics. That is, in both cases, he exhibited the second-highest ToxPi Score.
ToxPi software's hierarchical clustering results based on three advanced statistics are provided in Figure 7. As with the traditional statistics, these analyses also show Hernandez clustering alongside HoF first basemen. More specifically, this analysis places Hernandez's profile in a subgroup that includes Tony Perez and Orlando Cepeda. That being said, Hernandez's profile is somewhat offset, owing to his remarkable Def.
ToxPi-based player performance analyses provided interesting insights into the Keith Hernandez HoF case. A key component of this methodology involved synthesizing carefully chosen performance metrics into composite scores. Even while player performances were distilled into single values, the associated ToxPi visuals provided a clear indication of where they excelled, and where they were subpar.
Hernandez's ToxPi Profiles revealed defensive excellence compared to his contemporaries, and importantly, compared to HoF first basemen in general. While many fans acknowledge Hernandez may very well be the best defensive first baseman in the history of the sport, ToxPi analyses also make it clear that his offense was sustained at a high level over the course of his career, and that it also contributes to his case for the HoF.
In conclusion, whether traditional or advanced statistics are considered, ToxPi-based integrated analyses support the contention that Keith Hernandez belongs in the HoF. This is clearly something that should be addressed, and corrected, by the Veterans Committee. Note that while the ToxPi-based assessments made no attempt to address sportsmanship, leadership, and other intangibles, it is evident that Hernandez's qualities in these areas are also of HoF caliber and support his enshrinement (e.g., see the Bodig article).
[i] Chris Bodig, "Why Keith Hernandez Belongs in the Hall of Fame," July 9, 2022, accessed March 3, 2023. https://www.cooperstowncred.com/keith-hernandez-belongs-hall-fame/.
[ii] Benjamin Dertinger, Stephen Dertinger, "Using the Toxicological Prioritization Index to Visualize Baseball," July 7, 2021, accessed March 3, 2023. https://community.fangraphs.com/author/sdert/.
[iii] Benjamin J. Dertinger, Stephen D. Dertinger, "Baseball, Hot Dogs, and ToxPi: An Approach for Visualizing Player Performance Metrics," Baseball Research Journal, Spring, 2022, accessed March 3, 2023. https://sabr.org/journal/article/baseball-hot-dogs-and-toxpi-an-approach-for-visualizing-player-performance-metrics/.
[iv] Skylar W. Marvel, Kimberly To, Fabian A. Grimm, Fred A. Wright, Ivan Rusyn, David M. Reif, "ToxPi Graphical User Interface 2.0: Dynamic Exploration, Visualization, and Sharing of Integrated Data Models," BMC Bioinformatics 19 (2018) 80, accessed March 3, 2023. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2089-2/.
[v] FanGraphs, accessed March 3, 2023, https://www.fangraphs.com/.
[vi] ToxPi: Toxicological Prioritization Index, accessed March 3, 2023, https://toxpi.org/.