November 24, 2006
Probabilistic Model of Range, Rightfielders, 2006
It seems that every year I run the PMR for rightfielders, I encounter the same problem, and it has to do with Ichiro Suzuki:
Probabilistic Model of Range, Rightfielders. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Velocity for Fly Balls.
Player | In Play | Actual Outs | Predicted Outs | DER | Predicted DER | Difference |
Reggie Sanders | 1942 | 170 | 150.73 | 0.088 | 0.078 | 0.00992 |
Carlos J Quentin | 1156 | 96 | 85.83 | 0.083 | 0.074 | 0.00879 |
Casey Blake | 2586 | 210 | 191.62 | 0.081 | 0.074 | 0.00711 |
Damon J Hollins | 1440 | 134 | 124.64 | 0.093 | 0.087 | 0.00650 |
Mark DeRosa | 1654 | 125 | 115.00 | 0.076 | 0.070 | 0.00605 |
Kevin Mench | 1541 | 112 | 102.93 | 0.073 | 0.067 | 0.00588 |
Ryan Freel | 1122 | 101 | 94.94 | 0.090 | 0.085 | 0.00540 |
Jose Guillen | 1774 | 164 | 154.51 | 0.092 | 0.087 | 0.00535 |
Jay Gibbons | 1107 | 97 | 91.66 | 0.088 | 0.083 | 0.00482 |
J.D. Drew | 3472 | 284 | 267.61 | 0.082 | 0.077 | 0.00472 |
Alex I Rios | 2862 | 218 | 205.27 | 0.076 | 0.072 | 0.00445 |
Juan Encarnacion | 3085 | 219 | 208.81 | 0.071 | 0.068 | 0.00330 |
Vladimir Guerrero | 3258 | 253 | 243.64 | 0.078 | 0.075 | 0.00287 |
Emil Brown | 1349 | 110 | 106.51 | 0.082 | 0.079 | 0.00259 |
Jacque Jones | 3476 | 275 | 266.55 | 0.079 | 0.077 | 0.00243 |
Austin Kearns | 3928 | 346 | 337.89 | 0.088 | 0.086 | 0.00206 |
Moises Alou | 2026 | 154 | 150.84 | 0.076 | 0.074 | 0.00156 |
Russell Branyan | 1163 | 87 | 86.07 | 0.075 | 0.074 | 0.00080 |
Bobby Abreu | 4047 | 293 | 292.60 | 0.072 | 0.072 | 0.00010 |
Trot Nixon | 2700 | 212 | 211.95 | 0.079 | 0.079 | 0.00002 |
Joe Borchard | 1060 | 84 | 84.06 | 0.079 | 0.079 | -0.00006 |
Jeff B Francoeur | 4434 | 317 | 317.93 | 0.071 | 0.072 | -0.00021 |
Brad B Hawpe | 3769 | 280 | 281.06 | 0.074 | 0.075 | -0.00028 |
Jay Payton | 1173 | 89 | 89.42 | 0.076 | 0.076 | -0.00036 |
Ichiro Suzuki | 3252 | 250 | 251.21 | 0.077 | 0.077 | -0.00037 |
Shawn Green | 3393 | 220 | 222.29 | 0.065 | 0.066 | -0.00068 |
Jason Lane | 2049 | 155 | 156.74 | 0.076 | 0.076 | -0.00085 |
Randy Winn | 1996 | 184 | 185.72 | 0.092 | 0.093 | -0.00086 |
Milton Bradley | 2518 | 191 | 194.41 | 0.076 | 0.077 | -0.00136 |
Jermaine Dye | 3915 | 305 | 310.61 | 0.078 | 0.079 | -0.00143 |
Nick Markakis | 2843 | 240 | 244.33 | 0.084 | 0.086 | -0.00152 |
Geoff Jenkins | 3333 | 247 | 254.04 | 0.074 | 0.076 | -0.00211 |
Michael Cuddyer | 3637 | 245 | 259.18 | 0.067 | 0.071 | -0.00390 |
Jeromy Burnitz | 1988 | 120 | 128.64 | 0.060 | 0.065 | -0.00435 |
Bernie Williams | 1347 | 98 | 104.01 | 0.073 | 0.077 | -0.00446 |
Jeremy R Hermida | 2003 | 157 | 166.44 | 0.078 | 0.083 | -0.00471 |
Xavier Nady | 2560 | 187 | 202.29 | 0.073 | 0.079 | -0.00597 |
Magglio Ordonez | 3893 | 258 | 281.26 | 0.066 | 0.072 | -0.00598 |
Brian Giles | 4169 | 298 | 332.48 | 0.071 | 0.080 | -0.00827 |
That's right, Ichiro is very slightly negative (actually, I'd call him neutral). But people who watch him disagree with this finding. He ranks at the top in centerfield, indicating he can chase down balls.
My belief is that Ichiro plays deep in rightfield to take away the long hits. He's making a tradeoff between catching balls that might go for doubles, triples or home runs and giving up short singles that a fielder playing at normal depth would catch. When he goes to center, he plays more conservatively there since he's not used to the position, but in right he takes chances.
One suggestion I've received over the years I've presented this data is to use the actual distance of balls rather than the velocity of the ball as a parameter for outfielders. I've always felt velocity was a pretty good proxy for distance, and it allowed me to have the same model for infielders and outfielders. But I thought of a way to incorporate the distance without changing the model. I simply divide the distance by 100, except on ground balls and low line drives. Basically, on balls that infielders have a chance to field, use velocity. On balls that are too high for them to field, use distance. Here's a table using a model that mixes the two.
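A minimal sketch of that mixing rule, under stated assumptions: the ball-type labels and the function name are hypothetical, and the scale of the velocity value is whatever the velocity-based model already uses, which isn't spelled out here.

```python
# Hypothetical labels for balls an infielder has a chance to field;
# the actual data's categories may be named differently.
INFIELD_REACHABLE = {"ground", "low_liner"}


def model_parameter(ball_type: str, velocity: float, distance_ft: float) -> float:
    """Return the single parameter passed to the model for one batted ball.

    Ground balls and low line drives keep the velocity value.
    Everything else uses distance in feet divided by 100.
    """
    if ball_type in INFIELD_REACHABLE:
        return velocity
    return distance_ft / 100.0
```

The point of the rule is that the model keeps one parameter slot for infielders and outfielders alike; only the value that goes into the slot changes with the type of ball.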
Probabilistic Model of Range, Rightfielders. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Distance for Fly Balls.
Player | In Play | Actual Outs | Predicted Outs | DER | Predicted DER | Difference |
Ryan Freel | 1122 | 101 | 92.89 | 0.090 | 0.083 | 0.00723 |
Carlos J Quentin | 1156 | 96 | 87.98 | 0.083 | 0.076 | 0.00694 |
Damon J Hollins | 1440 | 134 | 125.21 | 0.093 | 0.087 | 0.00610 |
Jay Payton | 1173 | 89 | 82.55 | 0.076 | 0.070 | 0.00550 |
Juan Encarnacion | 3085 | 219 | 206.96 | 0.071 | 0.067 | 0.00390 |
Jose Guillen | 1774 | 164 | 157.88 | 0.092 | 0.089 | 0.00345 |
Moises Alou | 2026 | 154 | 147.83 | 0.076 | 0.073 | 0.00305 |
Reggie Sanders | 1942 | 170 | 164.27 | 0.088 | 0.085 | 0.00295 |
Ichiro Suzuki | 3252 | 250 | 241.19 | 0.077 | 0.074 | 0.00271 |
Mark DeRosa | 1654 | 125 | 120.88 | 0.076 | 0.073 | 0.00249 |
Alex I Rios | 2862 | 218 | 210.94 | 0.076 | 0.074 | 0.00247 |
Jacque Jones | 3476 | 275 | 266.89 | 0.079 | 0.077 | 0.00233 |
J.D. Drew | 3472 | 284 | 276.20 | 0.082 | 0.080 | 0.00225 |
Joe Borchard | 1060 | 84 | 81.92 | 0.079 | 0.077 | 0.00196 |
Emil Brown | 1349 | 110 | 108.41 | 0.082 | 0.080 | 0.00118 |
Randy Winn | 1996 | 184 | 181.86 | 0.092 | 0.091 | 0.00107 |
Vladimir Guerrero | 3258 | 253 | 249.59 | 0.078 | 0.077 | 0.00105 |
Austin Kearns | 3928 | 346 | 342.73 | 0.088 | 0.087 | 0.00083 |
Casey Blake | 2586 | 210 | 208.24 | 0.081 | 0.081 | 0.00068 |
Geoff Jenkins | 3333 | 247 | 245.22 | 0.074 | 0.074 | 0.00054 |
Milton Bradley | 2518 | 191 | 190.14 | 0.076 | 0.076 | 0.00034 |
Bobby Abreu | 4047 | 293 | 292.75 | 0.072 | 0.072 | 0.00006 |
Jermaine Dye | 3915 | 305 | 305.35 | 0.078 | 0.078 | -0.00009 |
Nick Markakis | 2843 | 240 | 240.60 | 0.084 | 0.085 | -0.00021 |
Jeff B Francoeur | 4434 | 317 | 318.59 | 0.071 | 0.072 | -0.00036 |
Brad B Hawpe | 3769 | 280 | 281.37 | 0.074 | 0.075 | -0.00036 |
Trot Nixon | 2700 | 212 | 214.27 | 0.079 | 0.079 | -0.00084 |
Jason Lane | 2049 | 155 | 157.68 | 0.076 | 0.077 | -0.00131 |
Russell Branyan | 1163 | 87 | 88.64 | 0.075 | 0.076 | -0.00141 |
Jeremy R Hermida | 2003 | 157 | 160.05 | 0.078 | 0.080 | -0.00152 |
Michael Cuddyer | 3637 | 245 | 251.88 | 0.067 | 0.069 | -0.00189 |
Xavier Nady | 2560 | 187 | 191.96 | 0.073 | 0.075 | -0.00194 |
Shawn Green | 3393 | 220 | 226.92 | 0.065 | 0.067 | -0.00204 |
Magglio Ordonez | 3893 | 258 | 268.74 | 0.066 | 0.069 | -0.00276 |
Jay Gibbons | 1107 | 97 | 100.99 | 0.088 | 0.091 | -0.00361 |
Kevin Mench | 1541 | 112 | 119.34 | 0.073 | 0.077 | -0.00476 |
Brian Giles | 4169 | 298 | 318.55 | 0.071 | 0.076 | -0.00493 |
Bernie Williams | 1347 | 98 | 104.84 | 0.073 | 0.078 | -0.00508 |
Jeromy Burnitz | 1988 | 120 | 137.33 | 0.060 | 0.069 | -0.00872 |
As you can see, Ichiro moves up the rankings. I'd be curious to know what people think of each of these methods. Does one ranking strike you as more correct than the other?
Looking at Encarnacion and Sanders, whom I am most familiar with (with the caveat that I'm familiar with Sanders circa 2005 on the Cardinals), the second table seems more accurate to me.
As for some other NL Central RFs that I see a lot, I think Ryan Freel could quite possibly be the best RF, and Geoff Jenkins probably isn't as bad as the first model says or as good as the second.
I'm inclined to prefer the second method as I would have a bias toward the methodology that would consider more data. It strikes me that, if anything, knowing distance would be more important and easier to track than velocity.
On another note, I would appreciate it if you would post Wily Mo Pena's numbers for 2006. I'm curious if the numbers continue to support the notion that he plays better in center than in right (subjectively that is what I see). Thanks.
John, it would be interesting if you made another chart of the differences of the two charts. This way we could see who moved up and down with the different formula. I noticed that Russell Branyan moved up quite a bit on the 2nd formula, too. I never really thought of him as a decent fielder at any position, but I could be wrong on that. Anyone have an opinion of his fielding in RF?
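A rough sketch of such a chart of differences, using only a handful of Difference values copied from the two tables above. The function name and the choice of players are just for illustration; the full chart would include every row.

```python
# (velocity-model Difference, distance-model Difference), copied from the tables.
differences = {
    "Ichiro Suzuki": (-0.00037, 0.00271),
    "Russell Branyan": (0.00080, -0.00141),
    "Kevin Mench": (0.00588, -0.00476),
    "Jay Payton": (-0.00036, 0.00550),
    "Casey Blake": (0.00711, 0.00068),
}


def model_shift(diffs):
    """Return (player, distance minus velocity) pairs, largest gain first."""
    shifts = [
        (name, round(dist - vel, 5))
        for name, (vel, dist) in diffs.items()
    ]
    return sorted(shifts, key=lambda item: item[1], reverse=True)


for name, shift in model_shift(differences):
    print(f"{name:<16} {shift:+.5f}")
```

A positive value means the player rates better under the distance model than under the velocity model; a negative value means the opposite.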
By the velocity model:
Pena in CF: -0.00312
Pena in RF: -0.00483
By the distance model:
Pena in CF: -0.00211
Pena in RF: -0.00653
Yes, he plays better in center, but I wouldn't call him good.
I like the second system better for one reason: it dislikes Shawn Green's defense more than the first system. As a big Mets fan who had to avert my eyes whenever the ball was hit to the right side of Carlos Beltran, I am confident that Green is the slowest outfielder I've ever seen. I know that the purpose of Sabermetrics is to downgrade the analytical weight accorded to first-person impressions, but I'm fairly confident in this one.
I'd also prefer the second model. From the games I watched this year, it seems to concur with what my eyes told me more accurately than the velocity-based one. It passes the sniff test.
The second model makes much more sense. Bernie Williams was flat-out atrocious this year in RF; there's no way he could be better than anyone except Jeromy Burnitz.
Also, Jacque Jones is pretty bad out there too, so the first model's high ranking of him just doesn't make much sense.
I agree that the second model makes more sense, but I base this purely on observing Jay Gibbons play RF. The second system rates him dramatically below his replacement, Nick Markakis, and seems to honor everyone in Baltimore's impression that Gibbons, while a standup guy, moves like a tree in RF.
Well, assuming your hypothesis is correct, the next step is to find out whether some "outs" are worth more than others. Saving a double is more valuable than saving a single, but at what cost? It's the defensive equivalent of OBP vs. slugging. Defenders might also be compelled to position themselves differently based on the score, runners on base, etc. Since even just five more caught balls make a significant difference in these models, it's difficult to just "give away" an out. But playing shallow to try to make a play on a winning run in the 9th, or the opposite, playing deep to prevent a double, must happen to players at least a few times over the course of a season, and that would make quite a difference to these rankings. Perhaps the argument is that it pretty much averages out for everyone, but I think you'd still need to show that.
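To put a number on the "five more caught balls" point, here is a small sketch. It assumes the Difference column is actual outs minus predicted outs, divided by balls in play, which is inferred from the table values rather than stated in the post. The function name is made up for illustration.

```python
def difference(actual_outs: float, predicted_outs: float, in_play: int) -> float:
    """DER minus predicted DER, assuming both are outs divided by balls in play."""
    return actual_outs / in_play - predicted_outs / in_play


# Ichiro's row from the velocity-based table: 3252 in play, 250 outs, 251.21 predicted.
base = difference(250, 251.21, 3252)

# The same row with five more balls caught.
plus_five = difference(255, 251.21, 3252)

print(f"as listed:     {base:+.5f}")
print(f"five more outs: {plus_five:+.5f}")
print(f"change:         {plus_five - base:+.5f}")
```

On 3252 balls in play, five outs is a swing of about 0.0015, which is larger than the gap between many adjacent players in either table.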
Very interesting to see Mench move from one of the better RF's with the first method all the way down to the bottom with the 2nd one.
Also, Jacque Jones is pretty bad out there too, so the first model's high ranking of him just doesn't make much sense.
Actually...not really.
Jones has very good range and tracks a lot of balls down. His major weakness is his pathetic arm. But in terms of range, he has been quite good.
I didn't even know Jose Guillen played this year.
I agree that I find it hard to believe that Gibbons outranks Markakis.
Though I've only seen Markakis and Abreu play a single time, I'm surprised that both are ranked so low. Both seemed to get a measure on the ball right off the bat and have a good sense where to play the ball on a carom if it was out of their range.
There's no question that distance travelled is an absolute requirement, and I don't believe that how hard the ball is hit can even come close to being a proxy for it.
You also don't want "distance/100" unless you take the absolute difference from the fielder's normal position. That is, if a corner OF plays 275 feet from home plate, then you need to take the absolute difference between the ball's distance and 275 feet.
I would also combine the distance ball travelled and zone, since all you care about is how many feet the fielder has to run, and how long does it take him to run that distance.
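A rough sketch of those two ideas, with loud assumptions: the 275-foot depth is only the example figure used above, an angle in degrees stands in for "zone", and the function names are hypothetical. How long the fielder takes to cover the ground is not modeled here.

```python
import math

# Example depth from the comment above, not a measured value.
NORMAL_DEPTH_FT = 275.0


def depth_offset(distance_ft: float, normal_depth_ft: float = NORMAL_DEPTH_FT) -> float:
    """Absolute difference between the ball's distance and the normal depth."""
    return abs(distance_ft - normal_depth_ft)


def run_distance(
    ball_dist_ft: float,
    ball_angle_deg: float,
    start_dist_ft: float = NORMAL_DEPTH_FT,
    start_angle_deg: float = 0.0,
) -> float:
    """Feet from the fielder's starting spot to the landing spot.

    Both points are given as (distance from home plate, angle), and the
    gap between them comes from the law of cosines.
    """
    gap = math.radians(ball_angle_deg - start_angle_deg)
    squared = (
        ball_dist_ft ** 2
        + start_dist_ft ** 2
        - 2.0 * ball_dist_ft * start_dist_ft * math.cos(gap)
    )
    # Guard against a tiny negative value from floating-point rounding.
    return math.sqrt(max(squared, 0.0))
```

With this framing, a ball hit 225 feet and a ball hit 325 feet straight at the fielder both come out as a 50-foot run, whereas plain distance/100 would treat them as very different plays.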
While keeping the same model for IF and OF may be nice, it's hardly an overriding goal.
In short, scrap it, in favor of two models.
And, of course, you need to track XBH as well, since a guy who positions deep is purposely making that tradeoff.
You *may* have similar issues with 3B.