Baseball Musings
November 28, 2006
Probabilistic Model of Range, Third Basemen, 2006

There's been a suggestion to present the data in a different format, so I'm going to try that with the third basemen. I'm also reporting only the mixed velocity/distance model here, since people seem to like that model better. At some point, I'll redo the tables for the positions posted earlier. Here's the ranking of the third basemen based on the difference between actual and predicted DER.

Probabilistic Model of Range, Third Basemen. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Distance for Fly Balls.
Player In Play Actual Outs Predicted Outs DER Predicted DER Difference
Joe Crede 3962 436 397.55 0.110 0.100 0.00971
Freddy Sanchez 2527 285 265.88 0.113 0.105 0.00757
Pedro Feliz 4278 420 391.93 0.098 0.092 0.00656
Brandon Inge 4278 506 479.75 0.118 0.112 0.00614
Adrian Beltre 4159 416 393.60 0.100 0.095 0.00539
Maicer E Izturis 2069 182 171.49 0.088 0.083 0.00508
Scott Rolen 3788 390 371.79 0.103 0.098 0.00481
Mike Lowell 3990 429 411.96 0.108 0.103 0.00427
Morgan Ensberg 2917 289 276.96 0.099 0.095 0.00413
Ryan W Zimmerman 4383 382 365.01 0.087 0.083 0.00388
Andy M Marte 1348 141 135.81 0.105 0.101 0.00385
Corey Koskie 1847 189 182.02 0.102 0.099 0.00378
David Bell 3716 347 334.10 0.093 0.090 0.00347
Willy Aybar 1388 106 102.60 0.076 0.074 0.00245
Eric Chavez 3607 362 353.27 0.100 0.098 0.00242
Nick Punto 2256 217 212.22 0.096 0.094 0.00212
Miguel Cabrera 4010 349 342.51 0.087 0.085 0.00162
Vinny Castilla 1755 161 158.59 0.092 0.090 0.00138
Chad A Tracy 3930 339 337.78 0.086 0.086 0.00031
Hank Blalock 3374 293 292.07 0.087 0.087 0.00027
Melvin Mora 4109 372 372.59 0.091 0.091 -0.00014
David A Wright 4041 356 359.00 0.088 0.089 -0.00074
Troy Glaus 3586 324 326.88 0.090 0.091 -0.00080
Aramis Ramirez 3934 333 336.63 0.085 0.086 -0.00092
Chipper Jones 2811 247 250.06 0.088 0.089 -0.00109
Mark T Teahen 2954 286 289.22 0.097 0.098 -0.00109
Abraham O Nunez 1876 182 184.40 0.097 0.098 -0.00128
B.J. Upton 1326 114 115.79 0.086 0.087 -0.00135
Mark DeRosa 1098 97 99.17 0.088 0.090 -0.00197
Alex Rodriguez 3968 330 338.71 0.083 0.085 -0.00219
Wilson Betemit 1831 142 146.67 0.078 0.080 -0.00255
Garrett Atkins 4385 358 375.87 0.082 0.086 -0.00408
Edwin Encarnacion 2908 252 265.44 0.087 0.091 -0.00462
Aubrey Huff 2133 193 203.79 0.090 0.096 -0.00506
Aaron Boone 2748 221 235.26 0.080 0.086 -0.00519
Tony Batista 1354 114 124.03 0.084 0.092 -0.00741
Rich Aurilia 1109 101 112.09 0.091 0.101 -0.01000

As you can see, Joe Crede earned that gold glove. Now here's the same list using just outs, and sorted by 100*Actual Outs/Predicted Outs.

Probabilistic Model of Range, Third Basemen. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Distance for Fly Balls. Sorted by Out Ratio.
Player In Play Actual Outs Predicted Outs Out Difference Out Ratio
Joe Crede 3962 436 397.55 38.45 109.67
Freddy Sanchez 2527 285 265.88 19.12 107.19
Pedro Feliz 4278 420 391.93 28.07 107.16
Maicer E Izturis 2069 182 171.49 10.51 106.13
Adrian Beltre 4159 416 393.60 22.40 105.69
Brandon Inge 4278 506 479.75 26.25 105.47
Scott Rolen 3788 390 371.79 18.21 104.90
Ryan W Zimmerman 4383 382 365.01 16.99 104.65
Morgan Ensberg 2917 289 276.96 12.04 104.35
Mike Lowell 3990 429 411.96 17.04 104.14
David Bell 3716 347 334.10 12.90 103.86
Corey Koskie 1847 189 182.02 6.98 103.84
Andy M Marte 1348 141 135.81 5.19 103.82
Willy Aybar 1388 106 102.60 3.40 103.32
Eric Chavez 3607 362 353.27 8.73 102.47
Nick Punto 2256 217 212.22 4.78 102.25
Miguel Cabrera 4010 349 342.51 6.49 101.89
Vinny Castilla 1755 161 158.59 2.41 101.52
Chad A Tracy 3930 339 337.78 1.22 100.36
Hank Blalock 3374 293 292.07 0.93 100.32
Melvin Mora 4109 372 372.59 -0.59 99.84
David A Wright 4041 356 359.00 -3.00 99.17
Troy Glaus 3586 324 326.88 -2.88 99.12
Aramis Ramirez 3934 333 336.63 -3.63 98.92
Mark T Teahen 2954 286 289.22 -3.22 98.89
Chipper Jones 2811 247 250.06 -3.06 98.78
Abraham O Nunez 1876 182 184.40 -2.40 98.70
B.J. Upton 1326 114 115.79 -1.79 98.46
Mark DeRosa 1098 97 99.17 -2.17 97.82
Alex Rodriguez 3968 330 338.71 -8.71 97.43
Wilson Betemit 1831 142 146.67 -4.67 96.82
Garrett Atkins 4385 358 375.87 -17.87 95.25
Edwin Encarnacion 2908 252 265.44 -13.44 94.94
Aubrey Huff 2133 193 203.79 -10.79 94.71
Aaron Boone 2748 221 235.26 -14.26 93.94
Tony Batista 1354 114 124.03 -10.03 91.91
Rich Aurilia 1109 101 112.09 -11.09 90.11
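
For anyone who wants to check the last two columns, here is a minimal Python sketch that recomputes them for three rows copied from the table. The function names are made up for illustration; the only formulas assumed are the ones stated in the post, Actual Outs minus Predicted Outs and 100*Actual Outs/Predicted Outs.

```python
# Three rows copied from the table above: (player, actual outs, predicted outs).
rows = [
    ("Joe Crede", 436, 397.55),
    ("Alex Rodriguez", 330, 338.71),
    ("Rich Aurilia", 101, 112.09),
]


def out_difference(actual_outs, predicted_outs):
    """Outs made above (positive) or below (negative) the model's prediction."""
    return actual_outs - predicted_outs


def out_ratio(actual_outs, predicted_outs):
    """100 * actual / predicted, so 100 means exactly as many outs as predicted."""
    return 100.0 * actual_outs / predicted_outs


for player, actual, predicted in rows:
    diff = out_difference(actual, predicted)
    ratio = out_ratio(actual, predicted)
    print(f"{player:15s} {diff:7.2f} {ratio:7.2f}")
```

Rounded to two decimals, these match the Out Difference and Out Ratio columns for those three players.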

As you can see, the order is almost exactly the same. From this chart, the Indians should be happier with Marte at third than Boone. And Alex Rodriguez must have made up for all those errors someplace else, since he's only down 8 outs. Freddy Sanchez did it all, winning a batting title and playing a great third base. The Joe Randa injury was the best thing to happen to Pittsburgh last year.

Please let me know which presentation you like better in the comments.


Comments

I like the "100 is average" scale a LOT better. Interesting stuff! Thanks!

Now if only Double-E could quit making so many E's for my Reds!

Posted by: Dan at November 28, 2006 06:55 PM

beltre may be a huge bust with the bat but he plays stellar defense.

Posted by: tony at November 28, 2006 07:32 PM

Your positive rating of Miguel "The Butcher" Cabrera is perplexing to say the least.

Posted by: bmc at November 28, 2006 10:22 PM

I like the order of magnitude difference - the percentages are much easier on the eyes than the raw DER difference, especially for individuals.

I like more what it does to the data itself. Using the out ratio removes bias for players who got easy-to-field balls (high predicted DER).

Example:

Third basemen A and B both see 2500 balls in play.

Player A sees many balls hit toward third base and weakly, has a high predicted DER and is expected to make 300 outs. He fields well and makes 320 outs. His DER delta is .008 and his out ratio is 106.67.

Player B sees few balls hit toward third base and hard, has a low predicted DER and is expected to make 200 outs. He fields extremely well and makes 217 outs. His DER delta is .0068 and his out ratio is 108.5.

DER Difference rates player A higher but Out Ratio (correctly) rates player B higher. Player A got more "extra" outs but player B's outs are more impressive given the opportunities he got.

Math:

Oa = Actual Outs
Op = Predicted Outs
B = Balls in Play

Out Ratio = Oa/Op

Out Ratio is a rate statistic of how many outs made per outs the player "should have made."

DER Difference = (Oa-Op)/B

DER Difference is a rate statistic of how many extra outs are made (or lost) per ball seen by the *team*. It ignores some of the information we have available: how many balls the player should have caught at his position.
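
The two definitions can be checked against the Player A and Player B example with a short Python sketch. The function names are hypothetical, and the ratio is multiplied by 100 here only to match the "100 is average" scale used in the second table.

```python
def der_difference(actual_outs, predicted_outs, balls_in_play):
    """(Oa - Op) / B: extra outs per ball in play seen by the team."""
    return (actual_outs - predicted_outs) / balls_in_play


def out_ratio(actual_outs, predicted_outs):
    """100 * Oa / Op: outs made per out the player "should have made"."""
    return 100.0 * actual_outs / predicted_outs


# (actual outs, predicted outs, balls in play) from the example above.
players = {
    "A": (320, 300, 2500),
    "B": (217, 200, 2500),
}

for name, (oa, op, b) in players.items():
    print(name, round(der_difference(oa, op, b), 4), round(out_ratio(oa, op), 2))
```

Sorting on the first number puts A ahead of B, and sorting on the second puts B ahead of A, which is the disagreement described above.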

Posted by: pmaynard at November 28, 2006 10:30 PM

I like the "100 is average" tweak as well. That is one of the reasons why I have always been fond of ERA+ and OPS+.

On another note, I thought you were partial to using both distance AND velocity, whereas now you are using only distance... May I ask why?

Posted by: Tom G at November 29, 2006 07:05 AM

One more vote for the second chart... love the 100-base scale.

Posted by: Mike at November 29, 2006 08:15 AM

Not surprisingly, I like the 2nd table as well. Now if you drop the decimals in the last 3 columns (which are distracting and imply far more precision than is real), I think you'll have a really great presentation of the data.

Posted by: Guy at November 29, 2006 09:49 AM

I think you're insinuating that Crede won a Gold Glove this year, and he didn't. Chavez robbed him.

Posted by: Gregg at November 29, 2006 10:41 AM

Veteran White Sox watchers will agree, Crede does have solid defense. It is to their credit that they have stuck with him while his bat catches up. With his contract coming to an end, and Boras as an agent, he's going to be a very rich young man. I like the ranking using "100"; it's easy to follow and understand.

Posted by: Jim at November 29, 2006 11:12 AM

The "out difference" should be rounded to the whole number.

If we use that as a precision-level, then the out ratio needs to be rounded to the first decimal place.

That is:
400/400*100=100.00
401/400*100=100.25
402/400*100=100.50
403/400*100=100.75
404/400*100=101.00

Each one is one out more, so there's no need to show it to the second decimal place. You can reasonably argue that the out ratio should be rounded to zero decimal places, since a guy who is +0 and a guy who is +4 are probably the "same" when you consider the uncertainty level of the metric.
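
As a rough illustration of the same point, this Python sketch shows how far a single extra out moves the out ratio for a given number of predicted outs. The helper name `points_per_out` is made up for the example; the only thing assumed is the 100*actual/predicted formula.

```python
def points_per_out(predicted_outs):
    """Change in out ratio (100 * actual / predicted) caused by one extra out."""
    return 100.0 / predicted_outs


# 400 is the round number used above; 112.09 and 479.75 are the smallest and
# largest Predicted Outs in the table (Aurilia and Inge).
for predicted in (400, 112.09, 479.75):
    print(f"predicted outs {predicted:7.2f}: one out = {points_per_out(predicted):.2f} points")
```

With 400 predicted outs, each out is worth a quarter of a point, so the second decimal place never carries a whole out's worth of information.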

***

As for maynard's reasoning of differential and ratio, it doesn't apply as much as he's stating. The extreme among the 3B "predicted" is .080 to .112. If you have 4000 balls in play, one guy has 320 expected outs, and the other guy has 448 expected outs. If they are both 10% above (i.e., out ratio of 110), one is +32 and the other is +45.

However, in terms of significance, the first guy, through no fault of his own, actually did have fewer opps, and therefore, his 110 score is actually less meaningful than the second guy's 110 score.

The "true" answer is to do what I do, and convert the scores into z-scores. You will find that the "true" answer will lie somewhere between the "out difference" and the "out ratio".
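
To see why a z-score would fall in between, here is one possible sketch in Python. It is an assumption, not necessarily the calculation described above: it treats every ball in play as an independent trial with the same out probability (predicted outs divided by balls in play), which simplifies the real model, and the function name `z_score` is made up for the example.

```python
import math


def z_score(actual_outs, predicted_outs, balls_in_play):
    """Outs above prediction in standard deviations, under a simple binomial assumption."""
    p = predicted_outs / balls_in_play              # assumed common out probability
    sd = math.sqrt(balls_in_play * p * (1.0 - p))   # binomial standard deviation of outs
    return (actual_outs - predicted_outs) / sd


# The two hypothetical fielders above: 4000 balls in play, both at an out ratio of 110.
low = z_score(1.10 * 320, 320, 4000)    # predicted DER .080, about +32 outs
high = z_score(1.10 * 448, 448, 4000)   # predicted DER .112, about +45 outs

print(round(low, 2), round(high, 2), round(high / low, 2))
```

The out ratio calls these two fielders equal and the out difference puts the second one 40% ahead. Under this assumption the z-score puts him ahead by a smaller margin, because the standard deviation grows roughly with the square root of the predicted outs.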

Nonetheless, since David is now showing both side-by-side, we have no issue. Great job to David.

Posted by: tangotiger at November 29, 2006 11:19 AM

yeah, Crede was robbed again for Gold Glove, but somehow he won 3b Silver Slugger over A Rod.

Posted by: Rob at November 29, 2006 11:40 AM

"yeah, Crede was robbed again for Gold Glove, but somehow he won 3b Silver Slugger over A Rod."

Cash wise, this probably evens out. Chavez better watch out: unless he hits and fields well next year that award is Crede's.

Posted by: MikeQ at November 29, 2006 05:22 PM

I'm inclined to like the first table a bit better. What about sorting the data by out difference, however? That would give the clearest picture of who the most valuable defenders actually were.

Posted by: Phil at November 29, 2006 10:25 PM