Baseball Musings
November 05, 2007
Probabilistic Model of Range, Defense Behind Pitchers

One thing PMR can measure is how lucky pitchers were, by comparing the predicted DER (the share of balls in play turned into outs) with the actual DER behind them. The following table rates pitchers with at least 300 balls in play against them:

Probabilistic Model of Range, Defense Behind Pitchers, 2007. Visit Smoothed Distance Model. 2007 Data Only
Pitcher Team In Play Actual Outs Predicted Outs DER Predicted DER Ratio
Chien-Ming Wang NYY 643 448 414.94 0.697 0.645 107.97
Jeremy Guthrie Bal 527 375 356.60 0.712 0.677 105.16
Dustin McGowan Tor 484 346 330.21 0.715 0.682 104.78
Sean Marshall ChC 330 231 221.12 0.700 0.670 104.47
Roger Clemens NYY 307 215 205.94 0.700 0.671 104.40
Brian Bannister KC 540 393 376.52 0.728 0.697 104.38
Jarrod Washburn Sea 627 440 422.32 0.702 0.674 104.19
Mike Bacsik Was 414 291 279.42 0.703 0.675 104.14
Tom Glavine NYM 674 474 455.80 0.703 0.676 103.99
Jason Hirsh Col 340 252 242.53 0.741 0.713 103.91
Ted Lilly ChC 586 427 411.59 0.729 0.702 103.74
Braden Looper StL 581 416 401.23 0.716 0.691 103.68
Chris Sampson Hou 414 292 281.66 0.705 0.680 103.67
Cole Hamels Phi 495 348 336.00 0.703 0.679 103.57
Brad Penny LAD 643 450 435.62 0.700 0.677 103.30
Dontrelle Willis Fla 667 442 428.96 0.663 0.643 103.04
Yovani Gallardo Mil 318 216 209.67 0.679 0.659 103.02
Jesse Litsch Tor 371 259 251.51 0.698 0.678 102.98
Jason Bergmann Was 332 248 241.02 0.747 0.726 102.90
Anthony Reyes StL 332 236 229.52 0.711 0.691 102.82
Curt Schilling Bos 485 338 328.75 0.697 0.678 102.82
Chuck James Atl 484 352 342.43 0.727 0.707 102.80
Nate Robertson Det 573 389 378.44 0.679 0.660 102.79
Aaron Cook Col 572 401 390.44 0.701 0.683 102.70
Tim Lincecum SF 389 277 269.87 0.712 0.694 102.64
Jon Garland CWS 705 493 480.70 0.699 0.682 102.56
Steve Trachsel Bal 491 351 342.47 0.715 0.697 102.49
Daisuke Matsuzaka Bos 555 384 375.16 0.692 0.676 102.36
Noah Lowry SF 502 349 340.97 0.695 0.679 102.35
Tim Hudson Atl 722 504 492.65 0.698 0.682 102.30
C.C. Sabathia Cle 701 476 465.40 0.679 0.664 102.28
Chad Durbin Det 417 304 297.37 0.729 0.713 102.23
Carlos Zambrano ChC 610 439 429.45 0.720 0.704 102.22
Micah Owings Ari 461 332 324.88 0.720 0.705 102.19
James Shields TB 615 435 425.93 0.707 0.693 102.13
Erik Bedard Bal 431 306 299.70 0.710 0.695 102.10
Jake Westbrook Cle 481 329 322.43 0.684 0.670 102.04
John Lackey LAA 668 459 450.14 0.687 0.674 101.97
Oliver Perez NYM 483 341 334.57 0.706 0.693 101.92
Justin Verlander Det 577 407 399.34 0.705 0.692 101.92
Barry Zito SF 608 441 432.73 0.725 0.712 101.91
Roy Halladay Tor 722 497 488.79 0.688 0.677 101.68
Jason Marquis ChC 626 440 432.86 0.703 0.691 101.65
Zack Greinke KC 350 239 235.18 0.683 0.672 101.63
Buddy Carlyle Atl 335 229 225.46 0.684 0.673 101.57
A.J. Burnett Tor 414 301 296.47 0.727 0.716 101.53
Johan Santana Min 555 394 388.14 0.710 0.699 101.51
Jake Peavy SD 571 409 403.20 0.716 0.706 101.44
Kyle Kendrick Phi 401 284 280.03 0.708 0.698 101.42
Greg Maddux SD 681 466 459.63 0.684 0.675 101.39
Tim Wakefield Bos 600 425 419.24 0.708 0.699 101.37
Fausto Carmona Cle 654 463 456.92 0.708 0.699 101.33
Kelvim Escobar LAA 572 387 382.00 0.677 0.668 101.31
Joe Blanton Oak 750 520 513.28 0.693 0.684 101.31
Rich Hill ChC 527 378 373.19 0.717 0.708 101.29
Odalis Perez KC 494 325 320.93 0.658 0.650 101.27
Matt Morris SF 473 315 311.16 0.666 0.658 101.23
Carlos Silva Min 699 485 479.14 0.694 0.685 101.22
Adam Eaton Phi 525 356 351.83 0.678 0.670 101.19
Felix Hernandez Sea 567 372 367.73 0.656 0.649 101.16
Wandy Rodriguez Hou 536 366 361.86 0.683 0.675 101.14
Vicente Padilla Tex 407 270 266.96 0.663 0.656 101.14
Aaron Harang Cin 642 451 446.11 0.702 0.695 101.10
Livan Hernandez Ari 704 488 482.76 0.693 0.686 101.08
Orlando Hernandez NYM 388 299 295.82 0.771 0.762 101.08
Jamie Moyer Phi 633 432 427.41 0.682 0.675 101.08
Ian Snell Pit 606 413 408.93 0.682 0.675 101.00
Andy Pettitte NYY 690 457 452.68 0.662 0.656 100.96
Tom Gorzelanny Pit 642 439 435.75 0.684 0.679 100.75
Matt Albers Hou 362 247 245.52 0.682 0.678 100.60
Lenny DiNardo Oak 430 302 300.28 0.702 0.698 100.57
John Danks CWS 427 289 287.39 0.677 0.673 100.56
Mark Hendrickson LAD 395 262 260.58 0.663 0.660 100.55
Jorge Sosa NYM 361 256 254.94 0.709 0.706 100.42
Brandon Webb Ari 692 480 478.35 0.694 0.691 100.34
Carlos Villanueva Mil 318 229 228.36 0.720 0.718 100.28
John Maine NYM 527 377 376.07 0.715 0.714 100.25
Justin Germano SD 426 302 301.31 0.709 0.707 100.23
Chad Billingsley LAD 400 279 278.70 0.697 0.697 100.11
Ben Sheets Mil 431 307 306.74 0.712 0.712 100.09
Roy Oswalt Hou 675 456 456.10 0.676 0.676 99.98
Jered Weaver LAA 514 348 348.13 0.677 0.677 99.96
Mike Mussina NYY 512 335 335.31 0.654 0.655 99.91
Josh Beckett Bos 566 385 385.40 0.680 0.681 99.90
Matt Chico Was 548 380 380.44 0.693 0.694 99.88
Matt Belisle Cin 570 378 378.52 0.663 0.664 99.86
Shaun Marcum Tor 456 329 329.69 0.721 0.723 99.79
Jeff Weaver Sea 511 340 340.84 0.665 0.667 99.75
Derek Lowe LAD 604 412 413.67 0.682 0.685 99.60
Kameron Loe Tex 464 305 306.28 0.657 0.660 99.58
Joe Saunders LAA 358 235 236.04 0.656 0.659 99.56
Brad Thompson StL 451 307 308.45 0.681 0.684 99.53
Josh Fogg Col 556 381 383.08 0.685 0.689 99.46
Horacio Ramirez Sea 361 231 232.31 0.640 0.644 99.44
Jeff Francis Col 662 447 449.57 0.675 0.679 99.43
Miguel Batista Sea 615 415 417.51 0.675 0.679 99.40
Paul Byrd Cle 686 465 467.91 0.678 0.682 99.38
Gil Meche KC 663 459 462.21 0.692 0.697 99.31
Claudio Vargas Mil 419 281 283.02 0.671 0.675 99.29
Mark Buehrle CWS 648 455 458.82 0.702 0.708 99.17
Boof Bonser Min 539 359 362.02 0.666 0.672 99.17
Javier Vazquez CWS 583 409 412.68 0.702 0.708 99.11
Edwin Jackson TB 516 333 336.02 0.645 0.651 99.10
Bartolo Colon LAA 328 205 206.87 0.625 0.631 99.09
Tony Armas Jr. Pit 305 208 209.93 0.682 0.688 99.08
Jorge de la Rosa KC 431 285 287.91 0.661 0.668 98.99
Jason Jennings Hou 319 214 216.25 0.671 0.678 98.96
Edgar Gonzalez Ari 324 228 230.41 0.704 0.711 98.96
Chris Young SD 448 336 339.55 0.750 0.758 98.96
Julian Tavarez Bos 455 307 310.39 0.675 0.682 98.91
Woody Williams Hou 632 443 448.01 0.701 0.709 98.88
Daniel Cabrera Bal 608 415 419.74 0.683 0.690 98.87
Bronson Arroyo Cin 661 449 454.60 0.679 0.688 98.77
Kyle Lohse Cin 426 293 296.71 0.688 0.697 98.75
Cliff Lee Cle 317 216 218.74 0.681 0.690 98.75
Paul Maholm Pit 583 391 396.00 0.671 0.679 98.74
Chad Gaudin Oak 603 413 418.34 0.685 0.694 98.72
Ervin Santana LAA 457 302 306.05 0.661 0.670 98.68
Doug Davis Ari 597 400 405.62 0.670 0.679 98.61
Sergio Mitre Fla 522 343 347.92 0.657 0.667 98.59
Adam Wainwright StL 654 441 447.57 0.674 0.684 98.53
Byung-Hyun Kim Fla 316 212 215.40 0.671 0.682 98.42
Ramon Ortiz Min 324 217 220.56 0.670 0.681 98.39
Kevin Correia SF 306 217 220.82 0.709 0.722 98.27
Kevin Millwood Tex 571 364 370.63 0.637 0.649 98.21
Jeremy Bonderman Det 533 354 360.70 0.664 0.677 98.14
Scott Baker Min 454 302 308.06 0.665 0.679 98.03
Dan Haren Oak 661 457 466.27 0.691 0.705 98.01
Randy Wolf LAD 309 205 209.32 0.663 0.677 97.93
Jeff Suppan Mil 708 472 482.96 0.667 0.682 97.73
Josh Towers Tor 347 229 234.38 0.660 0.675 97.71
Matt Cain SF 571 409 419.20 0.716 0.734 97.57
John Smoltz Atl 586 400 410.60 0.683 0.701 97.42
Brandon McCarthy Tex 340 232 238.54 0.682 0.702 97.26
Taylor Buchholz Col 305 207 212.87 0.679 0.698 97.24
Andy Sonnanstine TB 408 272 280.02 0.667 0.686 97.13
Brian Burres Bal 378 249 256.88 0.659 0.680 96.93
Brett Tomko LAD 339 219 226.04 0.646 0.667 96.89
Joe Kennedy Oak 346 242 250.10 0.699 0.723 96.76
Scott Kazmir TB 534 346 358.19 0.648 0.671 96.60
Chris Capuano Mil 456 297 307.78 0.651 0.675 96.50
Robinson Tejeda Tex 302 204 212.16 0.675 0.703 96.16
David Wells SD 416 271 282.44 0.651 0.679 95.95
David Bush Mil 594 395 412.87 0.665 0.695 95.67
Zach Duke Pit 399 246 258.54 0.617 0.648 95.15
Jose Contreras CWS 647 420 441.74 0.649 0.683 95.08
Kip Wells StL 522 342 360.50 0.655 0.691 94.87
Scott Olsen Fla 578 366 387.16 0.633 0.670 94.53
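
For readers who want to check a row, here is a minimal sketch of how the columns appear to relate. The function names are my own for illustration, and the formulas are inferred from the table rather than taken from the PMR code: DER is outs divided by balls in play, and Ratio is 100 times actual outs over predicted outs, apparently computed before the DERs are rounded.

```python
def der(outs: float, in_play: int) -> float:
    """Defensive efficiency: share of balls in play turned into outs."""
    return outs / in_play


def ratio(actual_outs: float, predicted_outs: float) -> float:
    """Actual outs as a percentage of predicted outs (100 = as expected)."""
    return 100.0 * actual_outs / predicted_outs


# Chien-Ming Wang's row from the table above.
in_play, actual_outs, predicted_outs = 643, 448, 414.94

print(round(der(actual_outs, in_play), 3))         # 0.697
print(round(der(predicted_outs, in_play), 3))      # 0.645
print(round(ratio(actual_outs, predicted_outs), 2))  # 107.97
```

Dividing the rounded DERs instead (0.697 / 0.645) gives about 108.06, not the 107.97 shown, which is why the sketch works from the out totals.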

Chien-Ming Wang comes out on top by far, which is not surprising given the Yankees' overall defensive rating. What bothers me about Wang, however, is the low level of his predicted DER. You would think that someone who gets a lot of ground balls would be somewhat higher. The following chart breaks down Wang by ball in play type:

CM Wang by Batted Ball Type, 2007
Batted Ball Type   In Play   Actual Outs   Predicted Outs   DER     Predicted DER   Ratio
Fly                112       101            98.85           0.902   0.883           102.18
Liner               92        29            16.14           0.315   0.175           179.66
Grounder           377       291           269.40           0.772   0.715           108.02
Bunt Grounder        6         4             4.20           0.667   0.700            95.24
Bunt Fly             1         1             1.00           1.000   1.000           100.00
Fliner (Fly)        29        13            14.12           0.448   0.487            92.09
Fliner (Liner)      26         9            11.23           0.346   0.432            80.12
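
One way to read this chart is as outs above expectation, that is, actual outs minus predicted outs for each batted ball type. The sketch below does only that subtraction on the numbers in the chart. The variable names are mine, and this is not how PMR itself is computed.

```python
# (batted ball type, in play, actual outs, predicted outs) from the chart above.
WANG_BY_TYPE = [
    ("Fly", 112, 101, 98.85),
    ("Liner", 92, 29, 16.14),
    ("Grounder", 377, 291, 269.40),
    ("Bunt Grounder", 6, 4, 4.20),
    ("Bunt Fly", 1, 1, 1.00),
    ("Fliner (Fly)", 29, 13, 14.12),
    ("Fliner (Liner)", 26, 9, 11.23),
]

# Outs above expectation for each type: actual minus predicted.
extra_outs = {
    name: actual - predicted
    for name, _in_play, actual, predicted in WANG_BY_TYPE
}
total_extra = sum(extra_outs.values())

for name, extra in sorted(extra_outs.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {extra:+6.2f}")
print(f"{'Total':15s} {total_extra:+6.2f}")
```

Liners and grounders together supply nearly all of the surplus, while both fliner categories come in slightly below expectation.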

Notice that the defense behind Wang caught far more line drives than predicted. Line drives tend to fall for hits, so by adding about thirteen extra outs on liners, the Yankees really helped Wang. So Chien-Ming got a bit lucky that way. Grounders, however, are where the defense really shone. They picked up about twenty-two more outs than expected on ground balls (291 actual versus 269.40 predicted). How did they do that? The Yankees made a lot of plays on low-probability vectors:

Wang Ground Balls by Vector, 2007
Vector   In Play   Actual Outs   Predicted Outs   DER     Predicted DER   Ratio
28        8         6             7.02            0.750   0.877            85.52
29       17        13            12.05            0.765   0.709           107.90
30       29        21            17.57            0.724   0.606           119.49
31       28        27            24.76            0.964   0.884           109.04
32       19        18            18.43            0.947   0.970            97.66
33       32        29            26.75            0.906   0.836           108.40
34       17        12             9.48            0.706   0.558           126.59
35       11         9             7.38            0.818   0.671           121.97
36       23        14            13.01            0.609   0.566           107.58
37       22        12            13.66            0.545   0.621            87.82
38       27        24            23.07            0.889   0.854           104.04
39       31        30            25.58            0.968   0.825           117.26
40       22        19            17.41            0.864   0.792           109.11
41       34        24            17.12            0.706   0.504           140.19
42       27        17            19.83            0.630   0.734            85.73
43       11         9             9.71            0.818   0.883            92.67
44       10         5             4.56            0.500   0.456           109.71

The vectors go from a low of 28 at the third base line to a high of 44 at the first base line. By looking at the Predicted DER column, you can see where the holes are in the infield. Vector 30 represents the hole between third and short, vectors 34-37 the area around second base where ground balls go into center field, and vector 41 the hole between first and second. Note that Wang does well in the holes, as if the defense were shifted a bit toward first base. Both the line drive and ground ball data make me wonder if someone was doing a very good job of positioning the Yankees fielders. I don't know who was in charge of that, but in the case of Wang, it worked.
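
To put a rough number on "does well in the holes", the sketch below splits the ground ball table into the vectors named above as holes (30, 34-37 and 41) and everything else, then totals each group. Treating exactly those six vectors as the holes is my assumption for illustration, and the helper name is made up. The table as posted also sums to 368 balls in play, so the totals cover only the vectors listed.

```python
# (vector, in play, actual outs, predicted outs) from the ground ball table above.
GROUNDERS = [
    (28, 8, 6, 7.02), (29, 17, 13, 12.05), (30, 29, 21, 17.57),
    (31, 28, 27, 24.76), (32, 19, 18, 18.43), (33, 32, 29, 26.75),
    (34, 17, 12, 9.48), (35, 11, 9, 7.38), (36, 23, 14, 13.01),
    (37, 22, 12, 13.66), (38, 27, 24, 23.07), (39, 31, 30, 25.58),
    (40, 22, 19, 17.41), (41, 34, 24, 17.12), (42, 27, 17, 19.83),
    (43, 11, 9, 9.71), (44, 10, 5, 4.56),
]

# Assumption: the "holes" are exactly the vectors the post names.
HOLES = {30, 34, 35, 36, 37, 41}


def summarize(rows):
    """Total a group of vectors and compare actual outs with predicted outs."""
    in_play = sum(r[1] for r in rows)
    actual = sum(r[2] for r in rows)
    predicted = sum(r[3] for r in rows)
    return {
        "in_play": in_play,
        "actual": actual,
        "predicted": predicted,
        "extra_outs": actual - predicted,
        "ratio": 100.0 * actual / predicted,
    }


holes = summarize([r for r in GROUNDERS if r[0] in HOLES])
elsewhere = summarize([r for r in GROUNDERS if r[0] not in HOLES])

for label, s in (("Holes", holes), ("Elsewhere", elsewhere)):
    print(f"{label:10s} in play {s['in_play']:3d}  "
          f"extra outs {s['extra_outs']:+6.2f}  ratio {s['ratio']:6.2f}")
```

Run on the table as posted, the six hole vectors hold 136 of the 368 listed grounders but account for about 14 of the roughly 22 ground ball outs above expectation.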

That brings up a point I haven't made in a while. Range is probably a poor word for the ability measured here. Range implies that the fielder can move a long way to get a ball. But sometimes anticipating where the ball gets hit is just as important. So the ability to move and the ability to position are two factors in what the model means by range.

On the other end of the spectrum, Matt Cain not only received no run support, he didn't get much defensive support either. And the defense behind Kazmir was just ridiculous. Here's a pitcher who keeps balls in play to a minimum, and his defense can't turn the few hit to them into outs.

I'll start on individual positions tomorrow.


Posted by David Pinto at 06:20 PM | Defense | Probabilistic Model of Range
Comments

David,
you only show 17 vectors in the ground ball chart for Wang (and 368 balls in play, vs 377 listed in the batted ball type chart). I think you're missing a vector.

Posted by: joe arthur at November 6, 2007 04:30 AM

Good stuff. I love looking at this stuff in such fine detail.

Posted by: Rally at November 6, 2007 11:25 AM

David-

You often hear baseball commentators, especially former players, talk about how much the defense prefers to play behind a pitcher who works quickly because it keeps them on their toes, more into the game, etc. I suppose the same could be said for a pitcher who doesn't walk many batters, which implies that there is less time between the batted balls.

Is there anything in this data that would either support or refute that claim?
I note that Justin Verlander ranks much higher than Jeremy Bonderman. Verlander works extremely quickly, Bonderman not so much so, at least by my observation.

(as an aside, we hear so many pronouncements by analysts, many of which sound right, many of which don't. I always enjoy seeing folks like you put those pronouncements to the test.)

Posted by: tom at November 6, 2007 04:57 PM

It would be cool if "average time between pitches" was available. We could look at walks or strike %, as somebody like Daniel Cabrera is going to keep his fielders waiting regardless of whether he works fast or not, but if fielders look worse behind him it could be just that he's pitching behind and letting the hitters get into good hitting counts.

Posted by: Rally at November 7, 2007 09:19 AM

As a general rule, GB pitchers get more outs on GB than FB pitchers do. This is actually built into MGL's UZR model.

And, it's easy to see why. The extra 30 outs that Wang is getting are not necessarily due to the Yanks' fielders, but to Wang himself.

I'm not sure how David is calculating the "predicted DER" here. It sounds like he's using the knowledge of Wang's handedness, but not his GB tendency. It might mean that "predicted DER" is based on both the fielders and Wang himself, so I'm not sure what the data is supposed to tell us.

Posted by: tangotiger at November 7, 2007 03:20 PM

where do you get these vectors to make these stats? is there any way i can get the vector data?

Posted by: Kobe Bryan at November 8, 2007 08:53 PM

Contact BIS, and give them a few hundred (thousand?) dollars with a promise that it's for personal-use only.

STATS will want thousands for sure.

Posted by: tangotiger at November 9, 2007 04:15 PM