Team WAR vs Actual Wins
This is a MUST-SEE for all you WAR (wins above replacement) fans out there. For those of you who are skeptical, skepticize no more!
Correlation between WAR and actual wins: +0.74.
(Best possible correlation = +1.00)
Those of you familiar with statistics will recognize this as a very solid correlation: r = 0.74 means WAR accounts for roughly 55% of the variance in team wins. Therefore, the WAR statistic is just as reliable as it is credible.
WAR is actually a measure of how many projected wins a player contributes to the team (judging by the findings), though it does not entirely account for the dynamics of a team's performance among the multiple players involved; that is probably the primary source of the deviation between WAR and actual wins. It is nearly impossible, in every case, to determine which of the players involved in a play contributes the most. For example, who should get more credit for throwing the runner out at first on a 1-3 play: the pitcher or the first baseman?
Given that the team is greater than the sum of its players, much of a player's contribution to WAR is implied by circumstance rather than measured directly (there are far too many factors and variables involved to make exact measurements possible), which could account for a few of the inaccuracies of the WAR statistic. But overall, it is valid enough to be considered a gold standard if you want to estimate how many games a team would win with any assortment of players, even though it does not take the entire scope of teamwork dynamics (e.g. lineup protection) into consideration.
WAR, though a great and reliable statistic most of the time, must in certain situations be taken with a grain of salt.
Reference:
WAR / games won = projected win shares
about 1 month ago
sj10689
46 comments
Comments
I'd be interested
in knowing which teams the outliers on the graph are.
You don't cheer for the Mets. You drink for the Mets.
by Kevin H on Oct 7, 2009 9:27 AM EDT
Here's a FanGraphs Link on the Subject
I figured it didn’t need its own FanShot.
The correlation this year was .84, which suggests the methodology could be improving. IMO the biggest improvement will be when HITf/x starts to get used for defensive metrics like UZR.
Interesting note from the FanGraphs piece: the Rays deviated the most from their WAR-expected wins (by 13ish).
by Balagast on Oct 7, 2009 1:33 PM EDT
Bizarre, the link didn't work. I'll try again.
by Balagast on Oct 7, 2009 1:34 PM EDT
Agreed...I think the offensive metrics are far more accurate at this point than the defensive ones.
It’s getting better, but UZR could certainly benefit from better data.
"We're just as bad as the old Mets, but this time nobody's laughing"
-Dallas Green
by Schmidtxc on Oct 7, 2009 9:27 PM EDT
The correlation between wins and WAR has almost no bearing on how good WAR is
by vivaelpujols on Oct 7, 2009 11:14 PM EDT
It's still a nice correlation that puts mouthbreathers in their place.
"I dunno. I never smoked any Astroturf"
-Tug McGraw
by squid92 on Oct 8, 2009 12:16 AM EDT
Another point to consider.
Doesn’t replacement level change every year? In that analysis they use a constant 48.6 wins as replacement. I would think it would affect the data if they didn’t use the replacement-level win expectancy for that given year.
by Balagast on Oct 7, 2009 9:34 PM EDT
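Balagast's point about the fixed baseline can be made concrete with a small sketch. It assumes the simple model implied above, expected wins = replacement-level wins + team WAR, with 48.6 as the constant baseline. The 40-WAR team and the alternative 47.0 baseline are made-up illustration values, and the function names are hypothetical.

```python
# Sketch only: WAR-expected wins on top of a fixed replacement baseline.
REPLACEMENT_WINS = 48.6  # constant baseline quoted in the comment above


def expected_wins(team_war, replacement_wins=REPLACEMENT_WINS):
    """Wins implied by a team's total WAR added to the replacement baseline."""
    return replacement_wins + team_war


def baseline_shift_effect(team_war, baseline_a, baseline_b):
    """Change in expected wins when the baseline moves from a to b."""
    return expected_wins(team_war, baseline_b) - expected_wins(team_war, baseline_a)


if __name__ == "__main__":
    hypothetical_team_war = 40.0  # made-up figure, not a real team
    print(round(expected_wins(hypothetical_team_war), 1))
    print(round(baseline_shift_effect(hypothetical_team_war, 48.6, 47.0), 1))
```

Because a different baseline shifts every team in a given season by the same amount, it would not change the correlation within a single year; it could matter when several seasons with different baselines are pooled together.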
I disagree with this
Here was my response:
http://www.hardballtimes.com/main/blog_article/war-vs.-wins/
There is no doubting that WAR is a good stat, but saying it has a high correlation to wins doesn’t mean anything.
by vivaelpujols on Oct 7, 2009 10:57 PM EDT
Hmm
I’m not sure what the problem is.
Let’s hypothesize there is a linear relationship between WAR and the “true” number of wins. We assume that the actual number of wins will have some variance from the “true” number, which we call luck or whatever.
So you wouldn’t want the correlation to be 1, since that means that the variance in wins from luck is zero, i.e. the actual number of wins is the true number of wins. But if WAR has a high correlation, that would still be okay, depending on how much variance you’re assuming actual wins have from the “true” wins.
by mnbv on Oct 8, 2009 12:12 AM EDT
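mnbv's argument can be checked with a rough simulation. The sketch below assumes that a team's "true" wins are measured perfectly (standing in for an ideal WAR) and that actual wins add independent, normally distributed luck; the spreads of 9 and 6 wins are illustration values rather than measured figures. Under those assumptions the correlation cannot exceed talent_sd / sqrt(talent_sd^2 + luck_sd^2).

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_correlation(n_teams=3000, talent_sd=9.0, luck_sd=6.0, seed=0):
    """Correlation between "true" wins and actual wins (true wins + luck)."""
    rng = random.Random(seed)
    true_wins = [81 + rng.gauss(0, talent_sd) for _ in range(n_teams)]
    actual_wins = [t + rng.gauss(0, luck_sd) for t in true_wins]
    return pearson(true_wins, actual_wins)


if __name__ == "__main__":
    ceiling = 9.0 / math.sqrt(9.0 ** 2 + 6.0 ** 2)
    print(f"simulated r: {simulate_correlation():.3f}")
    print(f"theoretical ceiling: {ceiling:.3f}")
    print(f"r with no luck: {simulate_correlation(luck_sd=0.0):.3f}")
```

With luck switched off the correlation goes to 1; with any luck at all, even a perfect talent measure falls short of it.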
That's right
You would want WAR and wins to have some correlation so you would know it’s in the right range, but you wouldn’t want it to have too high of a correlation either.
by vivaelpujols on Oct 8, 2009 12:17 PM EDT
One of these days someone is going to have to explain to me what "replacement" really means.
To me, baseball is an organic game with ups and downs for each individual player. Trying to put a hard numeric value on a player is something I have trouble understanding.
Please don’t flame, and don’t point me to a link I won’t understand. I’m just a confused old man, and I really don’t have the energy. I just don’t get why people get so hung up on SABR. Sure, they have some useful stats, but I say there is more to baseball than just numbers. (and no….I’m not talking about grit or heart….I am talking about ebb and flow).
by fxcarden on Oct 8, 2009 9:47 AM EDT
There's definitely more to baseball than numbers.
Hell, like mnbv noted above, there’s an element of luck involved (see Tampa Bay’s 2009 season, for one). However, as fans sitting in our moms’ basements tapping away at keyboards, we’re not privy to a lot of the other stuff that goes into the game. We can speculate about how Player X is in the locker room, but we don’t really know. We can speculate that Player Y is a headcase, but we don’t really know. We can speculate that Player Z is selfish and cares only about his own numbers, but we don’t really know. We get “hung up” on statistical analysis because it’s really all that we have to go by.
I’m not saying you or I won’t notice when guys are dogging it in the field, but at least that’s something we can see. But besides the numbers and what we see with our own eyes watching the games, there really isn’t much else for us to knowledgeably talk about.
One question for you: what do you mean by “ebb and flow”?
"He's definitely mixing it into his repertoire. That's French for 'repertoire' " - Keith Hernandez
by Catsmeat Potter-Pirbright on Oct 8, 2009 10:19 AM EDT
What I mean is...
A guy has a great week followed by the same guy having a shit week, or month, or year, or even individual games. How on earth do you quantify that with a single number?
In other words, take the shittiest player (let’s say Francoeur for argument’s sake), and let’s say he wins you 5 games in 2010 with solo walk-off HRs. Does that mean he is now a stud based on WAR? I mean, he WON you 5 games. His potential replacement (let’s say Pagan) does the opposite and grounds into 5 game-ending DPs. Does that mean he (Pagan) sucks? He LOST you 5 games.
In other words, how can a single number project what a player will or will not do?
I understand that it is mostly an educated guess, but a lot of times those are a long way from reality.
Another example that comes to mind: Ollie P’s 15 wins probably made his value rise, and now he sucks ass.
That is what I mean by Ebb and Flow.
Shit, now I gave myself a headache.
Also, I hope your mom’s basement is warm and dry. Just kidding.
by fxcarden on Oct 8, 2009 11:33 AM EDT
You're thinking of WPA with that right there.
That would show you how much someone contributes on a game-by-game basis in terms of performance. However, over a long period of time, that performance CAN be attributed to a single number, showing the context-neutral performance of the player. I mean, if a better player were in for Francoeur, maybe they’d be winning before the opportunity arose to hit the home run.
by squid92 on Oct 8, 2009 11:51 AM EDT
OK I think I get it now
The key words in your explanation are “context neutral”.
Thanks.
by fxcarden on Oct 8, 2009 2:37 PM EDT
I don't think you understand the concept
“Does that mean he is now a stud based on WAR? I mean, he WON you 5 games. His potential replacement (let’s say Pagan) does the opposite and grounds into 5 game-ending DPs. Does that mean he (Pagan) sucks? He LOST you 5 games.”
WAR is used to evaluate the entire body of work based on a season-sized sample of data. An individual event like hitting the HR to win the game doesn’t give you +1 WAR, because that HR alone didn’t win that game. Players can’t (precisely) control single events like that walk-off HR.
That HR instead counts almost the same as any other HR, because its perceived value is greater only due to the situation. Is that player’s value (individual contribution) greater because the other events of the game made that HR’s situation seem more important? No.
Also, stats like WAR are only really relevant when using a large sample to account for the ebb-and-flow nature you are describing. Yes, even in large samples there can be anomalies that skew the data, but for the most part those things will balance out over a large enough sample.
WAR is a popular number for summing up a player’s contribution because it is designed to encompass all the things a player contributes (hitting, fielding, baserunning). Here’s an example to help illustrate what I’m saying. Think of the value of your car. That overall value is based on the sum of various characteristics of the car that can individually be quantified (mileage, physical condition, age, original value). Similarly, WAR looks at individual metrics that measure specific things and groups them into an overall number.
by Balagast on Oct 8, 2009 12:01 PM EDT
So..
How do you measure a player’s WAR over his career? Is it averaged, or cumulative?
Also, how does the walk-off HR not win the game?
by fxcarden on Oct 8, 2009 2:41 PM EDT
Yes, a walk-off HR “wins the game” in the sense that it ends the game, but that singular effort was not the reason the team won. The entire team had to combine enough pitching, fielding, hitting, etc. to put them in that position.
The best thing that player can do individually in that spot is to hit that HR, but is that player’s contribution to winning any less if he hits a HR when they are down by 3 and there is no one on base? No.
WAR isn’t situational, and most assume that “clutch hitting” doesn’t really exist. I personally would value a player less if (with a large sample size) they have demonstrated that they hit poorly under pressure, but that is another issue.
The general purpose of WAR is to have an objective, quantified overall value for a player.
by Balagast on Oct 8, 2009 5:13 PM EDT
makes sense
So how is the baseline “replacement” value arrived at?
by fxcarden on Oct 8, 2009 7:58 PM EDT
By my understanding.
The replacement level is supposed to emulate the production of a player who could be called up from AAA as a fill-in type. Basically a AAAA-type player, as people like to call them.
by Balagast on Oct 8, 2009 9:57 PM EDT
Player Y = Jose Reyes?
"[The Giants] beat us down. We were beat by a grown-man team, a team we want to be like one day. They came in here and took it to us. Out-manned us, out-gunned us. ... It wasn't even close." - Raheem Morris, 9/27/09
by cjmulrain on Oct 8, 2009 11:36 AM EDT
Could be.
It might also be Mike Pelfrey or Ollie Perez.
by Catsmeat Potter-Pirbright on Oct 8, 2009 12:24 PM EDT
Lemme take a stab at it.
Baseball is, in the final analysis, a team game where results are measured in wins. What we’re trying to find is the relationship between individual performance and team wins.
When you say:
“In other words, take the shittiest player (let’s say Francoeur for argument’s sake), and let’s say he wins you 5 games in 2010 with solo walk-off HRs. Does that mean he is now a stud based on WAR? I mean, he WON you 5 games.”
That simply isn’t accurate. That home run in isolation did not “win” those games; some combination of the offense, pitching and defense surrounding Francoeur kept the game tied until that point. If Jeff Francoeur tried to play baseball by himself he’d lose every time.
So when we say a player is worth X number of wins, what we’re really referring to is the difference between the number of games a team wins with that player and the number it would win without him.
So if the Mets didn’t have Francoeur in right field, who WOULD they have? They wouldn’t necessarily have an average baseball player – there are really more below-average baseball players than there are average and above players. But they wouldn’t simply field no baseball player – someone would have that playing time instead.
What we call “replacement level” is simply a rough sketch of what the bare-minimum alternative looks like on the typical team – a non-prospect in AAA, or a veteran who can be signed for the minor league minimum.
by cwyers on Oct 8, 2009 5:05 PM EDT
Exactly ... well put
Just to add onto the replacement level concept.
It’s essentially just a baseline value that you can compare players against. The actual level of replacement doesn’t really matter as long as you use that same level when calculating each player’s WAR.
by Balagast on Oct 8, 2009 5:16 PM EDT
OK, so what I think I understand is..
“Replacement” is just an arbitrary starting point, and everyone is measured off the same starting point.
So, in the simplest terms, if the base was 500 AB / .285 AVG / .340 OBP, etc., those with numbers higher than that would have + WAR, and those below would have - WAR? I assume the final value is weighted to some degree. Yes?
And last (sorry to be a pain), how do you assign a $ value to a WAR value?
by fxcarden on Oct 8, 2009 8:03 PM EDT
$4.5 mil per win
Going by the average contracts signed each year, that’s what the money works out to.
by squid92 on Oct 8, 2009 9:09 PM EDT
I have a SERIOUS problem with the $4.5 million per win concept!!!
A linear progression of salary that is deserving of a player based on his WAR value is ALL WRONG! If payroll efficiency has any say in it, it should, at least, be more of a quadratic progression. Based on the RS-RA pythagorean formula and total payroll, it should go something like this:
$4.5 million per win (average) x 2,430 wins = $10.94 billion (estimate)
Avg. number of players on a team roster (including Sept. 1 call-ups): 27.5, or 825 players total across 30 teams (not accounting for roster movements)
Avg. salary: $6.65 million
Avg. WAR: 2.95
Avg. salary per win: $4.5 million per win [(2 x avg. salary) / WAR]
Adj. WAR $ value Formula:
$928,000 x (WAR^1.82) = est. salary
Example, under “ideal” conditions:
$928,000 x (2.95^1.82) = $6.65 million
I mean, come on! Doesn’t my formula make a lot more sense!? Especially when you consider that better players are given more money at a rate of progression that is clearly NOT linear! If it were linear, it would not follow the Borasian Baseball Business model (or BBB model*). Of course, my formula is far from perfect (many little unaccounted factors/variables involved), but it’s MUCH closer to reality than that 4.5 million per win bullcrap!
* not to be confused with fiscal projections based on customer service
Another conclusion: According to my model, a player whose WAR is 0.553 earns the league minimum of $316,000 instead of $2.49 million! How’s THAT for realistic!?
"The picture looked like I was in the dugout, but they got it all wrong. I absolutely was never in the dugout."
- Mr. B.V. Incognito
by sj10689 on Oct 10, 2009 1:25 AM EDT
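The two salary models argued over above can be put side by side in a few lines. This is only a sketch of the formulas as quoted in the thread (a flat $4.5 million per win versus $928,000 x WAR^1.82); the function names are hypothetical, and the guard against negative WAR is an assumption, since neither comment says how negative values should be handled.

```python
LINEAR_RATE = 4_500_000   # dollars per win, as quoted by squid92
POWER_COEFF = 928_000     # coefficient in sj10689's formula
POWER_EXP = 1.82          # exponent in sj10689's formula


def linear_salary(war):
    """Salary under the flat dollars-per-win model."""
    return LINEAR_RATE * war


def power_salary(war):
    """Salary under the power-law model; negative WAR is rejected (assumption)."""
    if war < 0:
        raise ValueError("the power-law model is undefined for negative WAR")
    return POWER_COEFF * war ** POWER_EXP


def crossover_war():
    """Positive WAR at which both models pay the same amount."""
    return (LINEAR_RATE / POWER_COEFF) ** (1 / (POWER_EXP - 1))


if __name__ == "__main__":
    for war in (0.553, 2.95, 6.0, 9.0):
        print(f"WAR {war}: linear ${linear_salary(war):,.0f}, "
              f"power ${power_salary(war):,.0f}")
    print(f"models agree at WAR of about {crossover_war():.2f}")
```

Below the crossover point the power-law model pays less than the flat rate, and above it more, which is the non-linear progression the comment is arguing for.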
CONCLUSIVE STATS
Mean WAR: 2.95
Median WAR: 2.39
Estimated percentile quantitative progression (based on WAR levels):
[√(WAR/9.56)] x 100%
Note: WARs of 9.56 or higher are capped at the 100th percentile. This progression formula is more of a general guideline than an actual scientific method of evaluating skill percentile – it is just an estimate.
by sj10689 on Oct 10, 2009 3:09 AM EDT
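The percentile guideline above is simple enough to sketch directly. The function name is hypothetical, and treating negative WAR as the 0th percentile is an assumption, since the comment only specifies the cap at 9.56.

```python
import math

MAX_WAR = 9.56  # WAR at or above this maps to the 100th percentile


def war_percentile(war):
    """Estimated percentile: sqrt(WAR / 9.56) x 100, limited to 0..100."""
    clamped = min(max(war, 0.0), MAX_WAR)  # negative WAR -> 0 is an assumption
    return math.sqrt(clamped / MAX_WAR) * 100.0


if __name__ == "__main__":
    for war in (0.0, 2.39, 2.95, 9.56, 12.0):
        print(f"WAR {war}: {war_percentile(war):.1f}")
```

Since 2.39 / 9.56 = 0.25, the quoted median WAR of 2.39 lands on the 50th percentile, which appears to be how the 9.56 constant was chosen.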
Use your formula and the WAR formula to predict player salary...
…and see what does a better job.
by cwyers on Oct 10, 2009 2:39 PM EDT
I'm up for the challenge!
Let’s use the Mets’ roster according to fangraphs.com
I must note that for a normally productive player whose season was cut short by injury, this does not project well in a real-world sense. In that case a “Projected WAR” must be used, which could correlate with WAR $ value.
Let’s demonstrate in a face-off between me and the other guy in the WAR $ value.

The player values based on performance and playing time can be argued to a certain extent, but do you honestly think and/or believe that the Mets have played like what a $101.5 million team is supposed to play like? Yeah, neither do I! $56 million is much, much closer to the team’s performance based on value. There’s no question about that.
My system wins, by technical KO.
by sj10689 on Oct 11, 2009 4:49 AM EDT
The ugly truth deciphered
The statistics in devilish red are hard to read, so here they are, deciphered:

by sj10689 on Oct 11, 2009 4:59 AM EDT
He's asking you to "predict" not to compare
What you should do is make a projection for each player who will be a free agent this year. I would suggest doing a 5/4/3 weighted average of the previous three years’ WAR (weighted by plate appearances), then adding some regression and a simple aging model. Use your model to predict what each of those players will eventually be signed for, then compare that with FanGraphs’ model.
by vivaelpujols on Oct 11, 2009 5:38 AM EDT
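One way to read the projection recipe above is sketched below. Only the 5/4/3 weights and the plate-appearance weighting come from the comment; the regression method (phantom plate appearances at 0 WAR), the 1,200 figure, the age window and the 5%-per-year aging step are all assumptions chosen for illustration, and every name is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Season:
    war: float  # WAR earned in that season
    pa: int     # plate appearances in that season


def aging_factor(age, peak_start=27, peak_end=33, step=0.05):
    """Crude multiplier: growth before the peak window, decline after it."""
    if age < peak_start:
        return 1.0 + step * (peak_start - age)
    if age > peak_end:
        return max(0.0, 1.0 - step * (age - peak_end))
    return 1.0


def project_war(seasons, age, target_pa=600, regress_pa=1200):
    """Project next-season WAR from up to three seasons, most recent first."""
    if not seasons:
        raise ValueError("need at least one season")
    weights = (5, 4, 3)
    weighted_war = sum(w * s.war for w, s in zip(weights, seasons))
    weighted_pa = sum(w * s.pa for w, s in zip(weights, seasons))
    denominator = weighted_pa + regress_pa
    if denominator <= 0:
        raise ValueError("no playing time to project from")
    # Phantom weighted PAs at 0 WAR pull the rate toward replacement level.
    rate = weighted_war / denominator
    return rate * target_pa * aging_factor(age)


if __name__ == "__main__":
    # Made-up three-year history, most recent season first.
    history = [Season(3.1, 620), Season(2.4, 580), Season(4.0, 650)]
    print(f"projected WAR at age 30: {project_war(history, age=30):.2f}")
    print(f"projected WAR at age 35: {project_war(history, age=35):.2f}")
```

The projected WAR could then be fed into whichever salary model is being tested and compared against the contract each free agent actually signs.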
That'll take a while
I’ll work on it tonight, then see which results I get. For purposes of foresight, I will scale player progression up through 27 years of age, and then scale it down from 33 onward. But considering the circumstances, it’s definitely worth a shot.
by sj10689 on Oct 11, 2009 12:29 PM EDT
Regardless
The question shouldn’t be how well WAR correlates with wins within a given season. The big question about WAR should be how predictive it is year over year. Obviously with the best and worst players it’s more likely to be constant. But what about the large group in the middle? It seems that there are a lot of players who have significant fluctuations from season to season. Is WAR just another stat that tells you how well a player has done in the past? Or is it a more predictive true talent evaluator?
by Reg Dunlop on Oct 8, 2009 2:25 PM EDT
I would like to see a WAR that was based only on the running projection of a player
per 150 games. WAR is a mix of measuring performance and predicting future performance. It is context neutral, but it doesn’t use xFIP or weight the data.
by EtSuKe on Oct 8, 2009 9:06 PM EDT
Exactly
WAR isn’t good at predicting performance, because it is purely results based. However, if you look at underlying reasons for success or lack thereof (BABIP, LOB%, LD%, etc.), it’s easier to understand how the fluctuation may occur.
by squid92 on Oct 9, 2009 12:25 AM EDT
Any measure of past performance is going to measure future performance to some extent.
At least, so long as the past performance is somewhat skill based.
People tend to view WAR as a “skill” metric sometimes because of FIP, but it’s not supposed to be one. FIP is used not because BABIP isn’t a pitcher skill but because it’s a fielder skill – if you don’t separate a pitcher’s run prevention from that of his defense, you’re double-counting the defense’s skill in run prevention.
by cwyers on Oct 9, 2009 1:29 AM EDT
“Any measure of past performance is going to measure future performance to some extent.”
Tell that to the guys on Wall Street.
by fxcarden on Oct 9, 2009 9:39 AM EDT
WAR/650 PAs or WAR/215 IP (SP), 70 IP (RP) is more accurate
Though to account for one’s average batting order position (regular everyday player), I suggest this formula for the projected number of PAs:
(569 + [18 x (9 – avg. lineup position per PA)]) x (team OPS+ / 100) x (games played / 154)
For example, an average everyday cleanup hitter (154 games in the 4 spot) on the Washington Nationals (team OPS+ = 95) would get about 626 PAs in 2009, assuming that player is a regular. Adam Dunn has 668 PAs in 159 games played this season; scaled to 159 games the projection is about 646 PAs, so he has about 22 more PAs than projected.
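The plate-appearance formula above translates directly into code. This sketch just restates the formula as written in the comment; the function and parameter names are hypothetical.

```python
def projected_pa(avg_lineup_pos, team_ops_plus, games_played):
    """Projected PAs from lineup slot, team OPS+ and games played."""
    base = 569 + 18 * (9 - avg_lineup_pos)
    return base * (team_ops_plus / 100) * (games_played / 154)


if __name__ == "__main__":
    # Cleanup hitter on a team with OPS+ of 95, as in the example above.
    full_season = projected_pa(4, 95, 154)
    dunn_games = projected_pa(4, 95, 159)
    print(f"154 games: {full_season:.0f} PAs")
    print(f"159 games: {dunn_games:.0f} PAs")
    print(f"difference from 668 actual PAs: {668 - dunn_games:.0f}")
```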
As for pitchers, their numbers can simply be scaled based on their workload. It’s easy to adjust for RPs, but not as much for SPs. I suggest the following one for SPs:
(IP*/GS) x 32.5
* Don’t count the number of IP in relief outings, if any!
by sj10689 on Oct 12, 2009 5:56 PM EDT
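The pitcher scaling in the comment above can be sketched the same way. The starter workload formula, (IP / GS) x 32.5, and the 215 IP and 70 IP baselines are taken from the comment; the function names, the error handling and the sample pitcher are assumptions for illustration.

```python
SP_BASELINE_IP = 215.0  # full-season workload for a starter, per the comment
RP_BASELINE_IP = 70.0   # full-season workload for a reliever, per the comment


def projected_sp_innings(ip_as_starter, games_started, starts_per_season=32.5):
    """Full-season innings for a starter; relief innings must be excluded."""
    if games_started <= 0:
        raise ValueError("games_started must be positive")
    return ip_as_starter / games_started * starts_per_season


def war_per_workload(war, innings, baseline_ip=SP_BASELINE_IP):
    """Scale WAR to a standard workload, e.g. WAR per 215 IP for a starter."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return war / innings * baseline_ip


if __name__ == "__main__":
    # Made-up starter: 130 IP over 20 starts, worth 2.0 WAR.
    print(f"projected IP: {projected_sp_innings(130, 20):.2f}")
    print(f"WAR per 215 IP: {war_per_workload(2.0, 130):.2f}")
    print(f"reliever WAR per 70 IP: {war_per_workload(1.0, 50, RP_BASELINE_IP):.2f}")
```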