New Stat Alert: SIERA
Those money-grubbing sabermetricians are at it again! Furthering their journey down that money trail, talented Baseball Prospectus writers Matt Swartz and Eric Seidman unveiled a new pitching statistic, "Skill Interactive Earned Run Average" (SIERA, for short). It is rooted in the DIPS concept, like FIP and tRA, and was explained in a series of posts this past week:
Part 1
Part 2
Part 3
Part 4
Part 5
The formula for the new stat should instantly convert flat-earthers to saber-lovers:
SIERA = 6.262 – 18.055*(SO/PA) + 11.292*(BB/PA) – 1.721*((GB-FB-PU)/PA) +10.169*((SO/PA)^2) – 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) – 4.027*(BB/PA)*((GB-FB-PU)/PA)
Ouch. Commence blog posts longing for simpler times when a math degree wasn't necessary to enjoy the crack of the bat, smell of the grass and taste of a cold beer at the old ballpark.
In all seriousness, this seems like a useful new stat to use side-by-side with FIP and the like, but who knows. I'll let people smarter than I examine it, offer criticisms and determine how much value it adds to what is already available. That's one of the great aspects of the sabermetric community -- when something new is introduced, there's a lineup of usual suspects ready to frisk it. If the new stat is bogus, it will be exposed and forgotten.
Swartz and Seidman nicely outlined what SIERA should accomplish in their Part 1 piece -- click through to check them all out. Here is a preview, in which they used Johan Santana as an illustration for one of the SIERA attributes:
2. Allows for the fact that a low fly-ball rate (and therefore, a low HR rate) is less useful to pitchers who strike out a lot of batters (e.g. Johan Santana's FIP tends to be higher than his ERA because the former treats all HR the same, even though Santana’s skill set portends his bombs allowed will usually be solo shots).
Johan's ERA has historically outperformed DIPS and perhaps this is a reason why. For more on Santana, check out Part 5.
Courtesy of the SIERA calculator at Braves blog Capitol Avenue Club (BP hasn't yet put up SIERA leaderboards), here are the 2009 SIERAs for potential 2010 Mets pitchers, minimum 50 big league innings. Various other stats are also listed:
| Pitcher | SIERA | FIP | xFIP | tERA | ERA |
|---|---|---|---|---|---|
| Pedro Feliciano | 3.07 | 3.55 | 3.05 | 3.44 | 3.03 |
| Johan Santana | 3.69 | 3.79 | 4.13 | 3.49 | 3.13 |
| Sean Green | 3.76 | 4.42 | 4.31 | 4.18 | 4.52 |
| Francisco Rodriguez | 3.88 | 4.01 | 4.32 | 3.88 | 3.71 |
| Nelson Figueroa | 4.13 | 4.31 | 4.50 | 4.88 | 4.09 |
| Bobby Parnell | 4.49 | 4.30 | 4.74 | 3.97 | 5.30 |
| Mike Pelfrey | 4.57 | 4.39 | 4.52 | 4.56 | 5.03 |
| John Maine | 4.92 | 4.57 | 5.09 | 4.90 | 4.43 |
| Oliver Perez | 5.24 | 6.40 | 6.08 | 6.40 | 6.82 |
| Pat Misch | 5.59 | 5.37 | 5.19 | 5.82 | 4.12 |
Looks like we found Ollie's new favorite stat.
Comments
Seriously, man
they need to get out of their mother's basement, stop reading spreadsheets and get out into the real world.
"We're investigating the investigative procedure of the investigation of Tony Bernazard"---Omar Minaya (he really didn't say it but he would)
interesting
i’ve been reading the updates all week and i’m looking forward to the sabr community’s response
they should really replace the coefficients with letters
And define the letters below the equation.
Do you really have 1-in-1000 level confidence in those regressions?
And written in LaTeX format, it wouldn’t look as much like flamebait
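A rough sketch of what that suggestion could look like, with letters in place of the numbers and the values defined underneath. The symbols k, w, g and c_0 through c_7 are labels chosen here for illustration, not notation from the Baseball Prospectus articles; the values are copied from the formula quoted in the post.

```latex
\begin{align*}
\mathrm{SIERA} &= c_0 + c_1 k + c_2 w + c_3 g + c_4 k^2 + c_5 g^2 + c_6\,k g + c_7\,w g \\[4pt]
k &= \frac{\mathrm{SO}}{\mathrm{PA}}, \qquad
w = \frac{\mathrm{BB}}{\mathrm{PA}}, \qquad
g = \frac{\mathrm{GB} - \mathrm{FB} - \mathrm{PU}}{\mathrm{PA}} \\[4pt]
c_0 &= 6.262, & c_1 &= -18.055, & c_2 &= 11.292, & c_3 &= -1.721, \\
c_4 &= 10.169, & c_5 &= -7.069, & c_6 &= 9.561, & c_7 &= -4.027
\end{align*}
```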
No truer comment than this one
I couldn’t stop laughing when I saw those coefficients. Not to mention the fact that I haven’t seen that kind of data mining since pre-qualifying exams in graduate school. Is this projection going to get us much over the 65-70% projection accuracy hump? We would get a lot further by developing better models for projecting playing time.
The coefficients shouldn't matter for a defense of the thing.
The question isn’t whether the coefficients have 7 decimal places, it’s whether the regressors make good theoretical sense. And this model isn’t actually that complicated—it’s just a per-plate-appearance model with 3 variables, some exponents and some interaction terms.
That’s right, this model, as complicated as it looks, says that there are only 3-5 things (3-5!) that can explain a pitcher’s ability: strikeouts, walks, and the relative rate of groundballs, popouts, and flyballs.
I'm not even gunna try...
by wrightttxgirlllx3 on Feb 13, 2010 11:52 AM EST via mobile
this
This is the type of thing that makes it so hard to get into SABR stats.
Goodbye Sir Dr. Sen. Brain SOCKS! D.D.S.R.S.V.P
Eh, I just don't have the patience.
I’ll slowly learn by reading how you guys use it in arguments. That’s how I learned all the other ones, anyway.
by wrightttxgirlllx3 on Feb 13, 2010 2:07 PM EST
it's saber
SABR is the Society for American Baseball Research.
Sabermetrics is the term coined by Bill James.
by firejerrynow on Feb 13, 2010 2:22 PM EST
Two things I struggle with
1. I still have trouble with the weight given to each of these factors. How are the multipliers determined and why? Is this a subjective weight based on the creators’ opinion of the goodness/badness of certain outcomes? Or is there some generally established principle that drives these weights?
2. When assembling a pitching staff (beyond the top tier guys), is it better to look at a number like this and get the most well-rounded pitchers, or is it better to look at individual components of a pitcher’s makeup to see who fits best with the defense and park factors they will be most affected by?
Multipliers
These particular coefficients for SIERA, as I understand, are determined through linear regression. That is, they’re empirically derived from previous years’ data and not logically deduced in some manner.
Why?
To be blunt, our goal was to beat everyone at predicting park-adjusted ERA in the following season, regardless of HR/FB treatment, and beat everyone but FIP and tRA in terms of same-year predictive value.
So you adjust your coefficients to match your goal as best you can.
From the Book Blog:
“The real issue is that Santana has a career BABIP of .263 with men on base and .295 with bases empty.
The real issue is that Santana gives up 0.59 HR per 20 bases empty PA, and 0.44 per 20 men on base PA."
That is the main reason Santana outperforms his FIPs.
The only difference between SIERA and xFIP
Is that SIERA tries to find relationships between events, while xFIP keeps everything independent. Unfortunately, the only way to do that is through multi-variable regression, and that’s not going to be as intuitive or as easy to break down as Linear Weights (which is what FIP and xFIP are based on).
Correction...
I just got off the phone with Bill James, this formula is already outdated:
SIERA = 6.262 – 18.055*(SO/PA) + 11.292*(BB/PA) – 1.721*((GB-FB-PU)/PA) +10.169*((SO/PA)^2) – 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) – 4.027*(BB/PA)*((GB-FB-PU)/PA)
Adjusting for higher than average humidity in 2009, the first parameter should now be 6.268 instead of 6.262.
I know this is controversial, but we must advance the cause. Carry on.
by Mex_17 on Feb 13, 2010 8:38 PM EST
They've arrived
by firejerrynow on Feb 14, 2010 7:00 AM EST
I like how this stat handles the relationship of groundballs and walks
it’ll take me a little while to fully absorb it all, but it seems like a welcome addition to the pitching stat team. I’ll be honest, though: I’m glad I don’t have to calculate it myself. I’m a little surprised that Pelf has such a high SIERA, as I figured it would be a bit kinder to his pitching style.
by KeithsMoustache on Feb 16, 2010 11:09 AM EST
Seriously?
This is a relatively simple stat. It looks complicated because of all the decimals, but
Pitcher’s ability = f(SO,BB,GB,FB,PU,PA) isn’t that complicated. You can do this in Excel. And I don’t mean “one can do this in Excel”, I mean ANYONE can do this in Excel.
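For anyone who would rather not open a spreadsheet, here is a minimal sketch of the same calculation in Python, using the coefficients exactly as quoted in the post. The function name `siera` and its argument names are made up for illustration, and the sample stat line below is hypothetical, not a real pitcher's.

```python
def siera(so, bb, gb, fb, pu, pa):
    """Compute SIERA from raw counts, per the formula quoted in the post.

    so, bb: strikeouts and walks
    gb, fb, pu: ground balls, fly balls, pop-ups
    pa: plate appearances (batters faced)
    """
    if pa <= 0:
        raise ValueError("pa must be positive")

    k = so / pa              # strikeout rate per plate appearance
    w = bb / pa              # walk rate per plate appearance
    g = (gb - fb - pu) / pa  # net ground-ball rate per plate appearance

    return (6.262
            - 18.055 * k
            + 11.292 * w
            - 1.721 * g
            + 10.169 * k ** 2   # squared strikeout term
            - 7.069 * g ** 2    # squared ground-ball term
            + 9.561 * k * g     # strikeout x ground-ball interaction
            - 4.027 * w * g)    # walk x ground-ball interaction


# Hypothetical line: 100 PA, 20 SO, 10 BB, 40 GB, 25 FB, 5 PU
print(round(siera(so=20, bb=10, gb=40, fb=25, pu=5, pa=100), 2))  # 4.1
```

Three rates go in and one number comes out; everything else is just the coefficients.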
