Baseball Musings
July 12, 2006
New Deal Statistics

The All-Star game gives me a chance to criticize one of my least favorite statistics, Win Probability Added (WPA). The All-Star game graph shows my concern with this stat, and with how some people are using it to evaluate players. Notice how little the early-game homer counts for Vlad vs. how much the late-game triple counts for Young. Each gave the AL a one-run lead, but because Young's hit came late, it's more valuable. If Vlad's homer had been the only run in the game, its value would not change. He would have had the most important hit in the game, yet a late-inning leadoff triple by the NL (one that never results in a run) would likely be rated more valuable.
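To make the timing effect concrete, here is a rough sketch of how a win-probability calculation treats the same go-ahead run at different points in a game. It is a toy model, not the method behind the graph: the per-inning run distribution is made up, both teams are assumed equal, base/out states are ignored, and a tie after the last inning is counted as a coin flip. The names RUNS_PER_INNING, win_probability, and wpa_of_run are hypothetical.

```python
from collections import defaultdict

# Assumed chance of scoring 0, 1, 2, or 3 runs in one turn at bat.
# These numbers are illustrative only, not fitted to real data.
RUNS_PER_INNING = {0: 0.73, 1: 0.15, 2: 0.07, 3: 0.05}


def future_margin_distribution(innings_left):
    """Distribution of (our future runs - their future runs) when each
    team still has `innings_left` turns at bat."""
    dist = {0: 1.0}
    for _ in range(innings_left):
        nxt = defaultdict(float)
        for margin, p in dist.items():
            for ours, p_ours in RUNS_PER_INNING.items():
                for theirs, p_theirs in RUNS_PER_INNING.items():
                    nxt[margin + ours - theirs] += p * p_ours * p_theirs
        dist = dict(nxt)
    return dist


def win_probability(lead, innings_left):
    """Chance of winning from a given lead; a final tie counts as 0.5."""
    total = 0.0
    for margin, p in future_margin_distribution(innings_left).items():
        final = lead + margin
        if final > 0:
            total += p
        elif final == 0:
            total += 0.5 * p
    return total


def wpa_of_run(lead_before, innings_left):
    """Win probability added by one run scored at the given game state."""
    return (win_probability(lead_before + 1, innings_left)
            - win_probability(lead_before, innings_left))


# The same tie-breaking run, early (8 innings left) and late (1 inning left).
early = wpa_of_run(0, 8)
late = wpa_of_run(0, 1)
print(f"early go-ahead run: {early:+.3f}")
print(f"late go-ahead run:  {late:+.3f}")
```

Under these assumptions the late run is credited with more win probability than the early one, even though both put the same run on the scoreboard. That gap is the behavior at issue here.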

WPA is great if you're a gambler trying to bet on the game as it progresses. But it's lousy if you're trying to buy talent. Let's say some GMs decide this is something worth using to evaluate players. Players would then try to maximize their value in this system. Since most plate appearances, positive or negative, result in very little change, the strategy for a player would be to try only in the late innings of close games! A player with a few hits in key situations looks like a star, while someone who contributes throughout the game but fails late looks like a loser. I just don't buy it.

This statistic is a tool to explain what happens, not a measure of a player's ability. To use it as such is wrong.


Posted by David Pinto at 11:27 AM | Statistics
Comments

Sounds similar to the drawbacks and quandaries of GWRBIs.

Posted by: Devon at July 12, 2006 12:24 PM

I think the value of WPA is to try to measure "clutch" - according to the old fashioned "late and close" situations and the like. Some people may take it too seriously, but I don't think people are evaluating players purely based on WPA. If they are, they're morons.

I think it does give you a pretty good affirmation of A-Rod's clutch production compared to other players (like Jeter). But you can't use it to say Derek Jeter's a better player than A-Rod, because that's not true. But you could use it to say "I'd rather have Jeter up with the game on the line." It's a useful tool if you don't overvalue it.

Posted by: Sully at July 12, 2006 12:28 PM

If people are using WPA to evaluate players, they are misguided and should be set straight. I wouldn't characterize WPA as a "misleading" stat. It's only when used out of context that it becomes misleading.

Keep up the good work Dave.

Posted by: Matt at July 12, 2006 12:34 PM

With respect to Sully's comment about using WPA to examine "clutch" hitters, may I risk blatant self-promotion by mentioning that I have been doing exactly that?

http://clutchiness.blogspot.com

And I certainly have to agree with Sully: I don't think anyone is suggesting, or has suggested, basing management-level personnel decisions on WPA. It doesn't make any claims to a predictive element. But players' performances during a game change the outcome of that game; WPA measures those changes. Will a player continue to affect the game in the same way? Who knows. The bottom line is that the effect actually occurred.

That's what WPA does: it tells the story of those in-game performances, one by one, from when it's anybody's ballgame to when one team gets a win and the other a loss.

Posted by: Dan at July 12, 2006 12:38 PM

It's about the best measure I know of for describing what happened and how each play contributed towards each team. I wouldn't mind one bit if WPA was used as the sole basis of the MVP award (though that does leave out character plusses and minuses).

The one use of WPA as a predictive stat that I like is that over a large sample size (3 years, say), it can show you the impact of players who hit into double plays frequently. There was an article a while back on someone named Eric Van, I believe, who wrote up a nice piece on how Bellhorn's strikeouts weren't such a bad thing as in many cases, they came at a time when a batter would be likely to hit into a double play if he didn't K. Van ended up getting hired by the Red Sox.

WPA also debits players who do inefficient things like sac bunt with a man on first and no outs in the first inning (or, like Beltran, try to steal third base with two outs in an inning). The downside, again, is that I'd bet it takes about 3 years for the randomness of the importance of the situations a player bats in to even out to somewhere near average (and for the randomness of the success/failure of the sac bunts and steal attempts to get near their true mean).

Posted by: Mike at July 12, 2006 02:48 PM

I think Sully is exactly right.
Who would you rather have: the guy who hits a two-run homer in the second inning of every game you win by one run, or the guy who hits the two-run homer in the ninth inning of every game you win by one run? Obviously, they are equally valuable. Sure, the second guy is more 'clutch', no disputing that; but they're equally valuable.

Here's the theoretic defect in WPA, I believe: it uses the probability distribution that is known at the time of the at-bat. This seems natural, but from the point of view of measuring value of a player, it's completely arbitrary.

Posted by: Jamie at July 12, 2006 09:03 PM

Some of this will be redundant, but I'm going to say it anyway.

To use WPA to say "Player A is better than Player B" is flawed. To use WPA to say "Player A had a better game/season/career" is, I believe, reasonable.

More importantly, WPA tells the story of a game fantastically well. It nearly achieves the impossible: Turning emotion and drama and passion into a quantifiable, tangible thing.

Posted by: Vince at July 12, 2006 10:15 PM

Why don't you look at the situation from the perspective of the pitcher? In the hypothetical 1-0 game, Vlad's home run is the most important event and he is the MVP, and the MVGoat would then be Brad Penny. Does that seem fair?

As for the actual game, does anyone think that the Least Valuable Player was not Hoffman?

Posted by: Chris M at July 12, 2006 11:33 PM

I think this is one of those stats that helps you evaluate what *did* happen, but has little predictive power about what is likely to happen in the future. It's a bit like Wins or RBI that way. I think its value, besides maybe determining MVPs, may lie more in evaluating game strategies... Sully's right about the randomness factor meaning it only really makes sense over very large sample sizes. In cases like that, where maybe you're gauging the difference between a play that makes you 15% more likely to get a win vs. one that makes you 14% more likely to get a win, I'm not going to fault a manager or player for playing it as he sees it.

Posted by: Adam Villani at July 13, 2006 01:24 AM

I think we have a strong consensus: it's great to describe what has happened, as it quantifies the feelings of fans.

It should not be used by upper management, as the amount of noise is rather overwhelming.

You can click my name to see my comments on my blog.

Posted by: tangotiger at July 13, 2006 10:27 AM