Baseball Musings
February 17, 2005
Testing Lineups

James Click has an excellent article at Baseball Prospectus in which he tests different theories on how to construct a lineup (subscription required). One caveat: his charts don't match the data in the tables. (I don't know if that makes a difference, but I'm waiting to hear from Click to see why there's a discrepancy.)

Interestingly, the data he has shows that descending OBA is about the best lineup you can construct:

Now it’s time to start mixing things up and having a little fun. In an effort to generate an optimal lineup structure, the first step is to verify some of the basic underlying principles. First, the idea that players with higher AVG, OBP or SLG should be higher in the lineup can easily be tested. To avoid tainting the results, each player will have the same stats except for the stat being tested. For example, when testing AVG, each player will have the same OBP and SLG. The program will be given six different lineups, two for each of three “teams.” Each of the three teams will have one statistic in which they all differ and the other two will remain the same. These three teams will be analyzed twice, once with the variant statistic in descending order and once in ascending order. Further, the range of the difference in the variant statistic will be closely mapped to actual major league distribution. So despite the occasional Bonds, the program won’t have anyone with a .605 OBP.

After running each lineup, the program produced the following results. Below are the minimum number of runs, the mean, the maximum, and the first and third quartiles (the 25th and 75th percentiles). From the numbers, a fair idea of the curve of each lineup can be gathered.

Lineup     Min   25th pct   Mean   75th pct   Max

Avg Desc   672     752      780      806      923
Avg Asc    662     755      782      808      919

Obp Desc   705     790      818      846      947
Obp Asc    660     762      792      821      926

Slg Desc   676     762      790      816      912
Slg Asc    656     747      777      805      926

Great work here, and one more positive for OBA.
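
For readers without a subscription who want to poke at the idea, here is a minimal sketch of this kind of simulation. It is not Click's program. The hit mix in HIT_MIX, the OBP values in OBPS, the rule that every runner advances exactly as many bases as the batter, and all of the function names are assumptions made for illustration, so the run totals it prints should not be expected to match the table above.

```python
import random

# Assumed hit mix per non-walk plate appearance, shared by every batter:
# key = bases gained on the hit, value = probability. Not Click's data.
HIT_MIX = {1: 0.170, 2: 0.050, 3: 0.005, 4: 0.030}
AVG = sum(HIT_MIX.values())  # .255 for every batter

# Assumed OBPs for a nine-man lineup, listed in descending order.
OBPS = [0.400, 0.380, 0.360, 0.345, 0.330, 0.320, 0.310, 0.300, 0.290]


def plate_appearance(obp, rng):
    """Return 'BB', 0 (out), or 1-4 (bases on a hit) for one plate appearance."""
    # Only the walk rate differs between batters, so only OBP varies.
    walk_rate = (obp - AVG) / (1 - AVG)
    if rng.random() < walk_rate:
        return "BB"
    roll = rng.random()
    cumulative = 0.0
    for bases, share in HIT_MIX.items():
        cumulative += share
        if roll < cumulative:
            return bases
    return 0


def advance(bases, outcome):
    """Apply a walk or hit to (first, second, third); return (bases, runs)."""
    first, second, third = bases
    if outcome == "BB":
        # Runners move only when forced.
        runs = 1 if (first and second and third) else 0
        third = third or (first and second)
        second = second or first
        return (True, second, third), runs
    # Simplification: the batter and every runner advance `outcome` bases.
    positions = [i + 1 for i, on in enumerate(bases) if on] + [0]
    runs = 0
    new = [False, False, False]
    for pos in positions:
        pos += outcome
        if pos >= 4:
            runs += 1
        else:
            new[pos - 1] = True
    return tuple(new), runs


def simulate_season(lineup, rng, games=162):
    """Total runs over `games` nine-inning games (no extras, no walk-offs)."""
    runs = 0
    for _ in range(games):
        slot = 0
        for _ in range(9):
            outs, bases = 0, (False, False, False)
            while outs < 3:
                outcome = plate_appearance(lineup[slot], rng)
                slot = (slot + 1) % 9
                if outcome == 0:
                    outs += 1
                else:
                    bases, scored = advance(bases, outcome)
                    runs += scored
    return runs


def mean_runs(lineup, seasons=100, seed=0):
    """Average runs per season over several simulated seasons."""
    rng = random.Random(seed)
    return sum(simulate_season(lineup, rng) for _ in range(seasons)) / seasons


if __name__ == "__main__":
    print("OBP desc:", mean_runs(OBPS))
    print("OBP asc: ", mean_runs(OBPS[::-1]))
```

The model leaves out double plays, steals, errors, and runners taking an extra base, so treat any gap between the two lineups as a rough direction rather than a run estimate.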

Update: The charts were incorrect and have been fixed. The data in the above table is correct.
Posted by David Pinto at 05:18 PM | Management | TrackBack (1)
Comments

How does this jibe with all the studies that say lineup order doesn't really have much effect?

Posted by: sabernar at February 17, 2005 06:30 PM

I'm not a BP subscriber-- had I discretionary income I would be, but... with that caveat of ignorance-- this just looks wrong as explained...

If I were doing this I'd choose real stat years from a realistic lineup and simply alter the one stat I was studying in each run, and the order of the lineup. Because-- for example-- if everybody in the lineup is equally likely to double, doesn't that predetermine the result? Or rather, doesn't that automatically make the position of high OBP players in the lineup almost irrelevant? Wherever they hit, they're just as likely to be driven in, except in cases where another high OBP moves them over, no? If everybody batting in front of Bob Basher has the same likelihood of being on base when he drills one onto the highway, he'll drive in about the same number of runs wherever he hits, won't he? The difference will be limited to the cases where Danny Dinger keeps clearing the bases in front of him, eh? And if everybody has the same OB and SL what does it matter that the BAs vary? The guys with more singles will drive in more runs than the guys with more walks, but we already knew that.

It sounds to me as if he's asking if it makes a difference if the sluggers bat right behind the high OBP guys but trying to answer it by looking at what happens when only the sluggers are on the field, or only the "walkers" are on the field.

Of course descending order works best in that case-- you KNOW you are giving the extra at bats at the top of the order to the better hitters, as they only vary on one axis... and yet the differences from "upside down" lineups aren't great because you've already removed most of the things which could CAUSE differences... you've removed the noise from the study by removing the subject of the study. It's vanilla pudding with a few walnuts, not trail mix... or am I misunderstanding?

Posted by: john swinney at February 17, 2005 07:00 PM

so you'd bat barry bonds and albert pujols and todd helton and frank thomas leadoff?

if i recollect, batting little giambi (with team high OBA) leadoff was, um, not proved to be real too good an idea...

Posted by: lisa gray at February 17, 2005 08:05 PM

I vaguely remember Bill James doing something similar.

The reason the OBP thing works is pretty basic...in order to score, you need men to get on base, the more men you get on base, the more you will score. It also pushes the lineup along which causes more men to come up, blah blah blah.

So obviously, if you have higher OBP before lower, then you will get more team at bats and more runs as a result.

Posted by: mrbisco at February 17, 2005 09:36 PM
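
The "more at bats" part of mrbisco's point is plain arithmetic: a game ends partway through the order, so the top slots bat at least as often as the bottom ones. A minimal sketch, where the function name and the 38-plate-appearance game are illustrative assumptions:

```python
def plate_appearances_by_slot(total_pa):
    """Split a team's plate appearances in one game across the nine slots."""
    full_turns, extra = divmod(total_pa, 9)
    # The first `extra` slots come up one more time than everyone else.
    return [full_turns + (1 if slot < extra else 0) for slot in range(9)]


# A hypothetical game in which the team sends 38 men to the plate.
print(plate_appearances_by_slot(38))  # [5, 5, 4, 4, 4, 4, 4, 4, 4]
```

This says nothing about how much each extra plate appearance is worth in a given slot, which is the point David raises below.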

In theory, this sounds right, but it is not. Different events have different values in different spots in the order. So, for example, a big time HR hitter with a somewhat low OBP (for a big-time slugger) is best near the middle of the batting order. A guy who gets on base a lot, but has little power is better at the top of the order. A guy with a very low OBP and a high SLG is best near the bottom of the order. You can't simply sort by OBP. Pujols is much more valuable in the 3-4 slot than as a leadoff hitter. So is (likely) Bonds. On the other hand, Jason Kendall has much less (relative) value in the middle of a batting order than at the top.

Posted by: David at February 18, 2005 01:18 AM

Intuition seems to be proven in the data showing more runs scored as SLG% increases and OBP% decreases. Do these trends then show OPS having a null effect on batting orders? I wonder which trend is stronger? Presumably the OBP% trend given its larger discrepancy between ascending and descending.

Posted by: Eric at February 18, 2005 05:50 AM

In my research, batting order optimization can add 5 to 15 runs per 162 games. The least-appreciated spot is the #2 spot. This player should be just as good as your #4 hitter, except he should derive more value from his walks than his HR. But, we are talking about a few runs for any one spot. If you've got a guy with 25 HR and 110 walks (and is the best hitter on the team), put him #2. If he's got 40 HR and 50 walks (and best hitter on the team), put him #4.

Personally, I consider finding 10 runs "a lot", since that's 1 win, and teams pay $2 million for that.

Posted by: tangotiger at February 18, 2005 10:15 AM
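
Taking tangotiger's figures at face value (10 runs per win, $2 million per win), the 5 to 15 run range converts as follows. The constant and function names are only labels for his numbers:

```python
RUNS_PER_WIN = 10            # tangotiger's rule of thumb
DOLLARS_PER_WIN = 2_000_000  # his figure for what teams pay per win


def lineup_value(extra_runs):
    """Convert extra runs per season into (wins, dollars)."""
    wins = extra_runs / RUNS_PER_WIN
    return wins, wins * DOLLARS_PER_WIN


for runs in (5, 10, 15):
    wins, dollars = lineup_value(runs)
    print(f"{runs:>2} runs = {wins:.1f} wins = ${dollars:,.0f}")
```

On those assumptions the range works out to between half a win and a win and a half per season.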

I think I didn't make a clear comment yesterday, so I will try to line up my ducks where they can all be seen...

If I understand correctly, this fellow is running three sets of numbers each for three teams:

A team on which everyone hits say .250, slugs 450 but has varying OBPs and is run in ascending, descending and mixed lineups,

A team on which everyone bats 250, has an OBP of say 330, and varies in SP, run in the same orders,

And a team on which everyone gets on base at a 330 clip, slugs 450, but has varying BAs, treated likewise.

And my point is that it's the INTERACTION of OBP and SP which is the question in lineup design-- if you flatline one or the other, you aren't eliminating noise, you're eliminating the question... your best hitters will automatically be the ones who do best in the variant stat since the other dimension is steady. Extra at bats for those hitters will surely overbalance any other issues...

If I were doing this I'd choose a real historical team which had near average performance in all three stats, and compare their actual lineup (probably AN actual lineup if they were average) with an ascending order and a descending order lineup for each stat, and do the same for versions with each stat randomly varied (within the bounds of possibility-- can't run the slugging 50 points below BA). You will still find out whatever varying each stat individually can tell you, but they're running in a constant but "real" context...

I would have to adjust history to give each player in the lineup the same number of plate appearances, and create a composite player to cover platoons. Oh and the "random" variations would have to be paired-- if I draw "minus ten points" and "player 3" out of a hat I'll need to draw another player immediately and assign him "plus ten points," and the last player will get "zero" automatically... If I draw a player who doesn't have ten points to lose, I'll redraw... Etc.

As it stands (if I understand correctly) I don't see how he can find out anything which is not already known. (Of course the whole question has been well studied already anyway, but...)

Posted by: john swinney at February 18, 2005 01:07 PM
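
The paired-draw scheme john swinney describes can be sketched roughly as below. The function name, the step sizes, and the floor parameter are assumptions, and where he would redraw a player who has no points to lose, this version swaps the roles within the pair or skips the pair instead:

```python
import random


def paired_adjustments(stats, steps=(0.005, 0.010, 0.015), floor=0.0, rng=None):
    """Return a copy of `stats` with offsetting minus/plus tweaks made in pairs."""
    rng = rng or random.Random()
    order = list(range(len(stats)))
    rng.shuffle(order)  # the "hat" the players are drawn from
    adjusted = list(stats)
    # Take the shuffled players two at a time; an odd man out gets zero.
    for loser, gainer in zip(order[0::2], order[1::2]):
        step = rng.choice(steps)
        if adjusted[loser] - step < floor:
            loser, gainer = gainer, loser  # swap roles instead of redrawing
        if adjusted[loser] - step < floor:
            continue  # neither player can give up the points; skip the pair
        adjusted[loser] -= step
        adjusted[gainer] += step
    return adjusted


# Hypothetical lineup in which everyone starts with a .330 OBP.
print(paired_adjustments([0.330] * 9, rng=random.Random(3)))
```

Because every minus is matched by an equal plus, the lineup's total in that stat stays the same, which is what keeps the context constant from run to run.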

The conclusion---that OBP is the best single stat for sorting lineups---doesn't bother me. What does bother me is that OBP Asc beats AVG Desc and SLG Desc in many segments. What does this say? (assuming the study was done properly) To me, it says that it is of supreme importance that your high OBP hitters be next to each other in the lineup. In OBP Desc, the high OBP's are at the top of the lineup, in OBP Asc the high OBP's are at the bottom of the lineup. In both OBP lineups, the high OBP hitters are clustered together. In the other lineups, they're (presumably) scattered.

If you get the best run production by clustering your high OBP hitters (and putting them near the top of the lineup), I wonder if the folk wisdom of having your best hitters at #3 and #4 is good wisdom. Based on my (hastily-put-together) conclusions based on James Click's results, the best lineup might be to put your best hitter (best OBP) at the #3 spot, 2nd best hitter at #4, 3rd best at #2, 4th best at #5, 5th best at #1, 6th best at #6, 7th best at #9, 8th best at #7 and 9th best at #8. That way, OBP gradually changes through the lineup; the best hitters are clustered around the #3 spot.

I don't have a BP subscription; anyone want to ask James to give this lineup a try?

Posted by: Jason at February 18, 2005 05:16 PM
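
For anyone who wants to try it, Jason's ordering reduces to a lookup from hitter rank to lineup slot. A minimal sketch, where the names and the sample OBPs are made up for illustration:

```python
# Slot for the best hitter, 2nd best, ..., 9th best, per Jason's comment.
SLOT_FOR_RANK = [3, 4, 2, 5, 1, 6, 9, 7, 8]


def clustered_lineup(obps):
    """Arrange nine OBPs so the best hitters cluster around the #3 spot."""
    ranked = sorted(obps, reverse=True)
    lineup = [None] * 9
    for rank, obp in enumerate(ranked):
        lineup[SLOT_FOR_RANK[rank] - 1] = obp
    return lineup


# Hypothetical OBPs; the result lists slots #1 through #9 in batting order.
print(clustered_lineup([0.400, 0.380, 0.360, 0.345, 0.330,
                        0.320, 0.310, 0.300, 0.290]))
# [0.33, 0.36, 0.4, 0.38, 0.345, 0.32, 0.3, 0.29, 0.31]
```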

I'm kinda surprised that this is even being argued. Maybe I am missing something, but it seems incredibly obvious that a lineup with descending OBP would produce more than others. Of course this is only theory...

I'll test it, I guess and see what happens. I'll have two teams play against each other with the same players but different lineups and see what occurs.

Posted by: mrbisco at February 20, 2005 12:36 AM

i am interested to know how many of the people commenting here have actually had to put a lineup together, in real life mind you, not into a simulation. none of these stats are adjustable to intangibles. i know that small ball is considered a national league thing but if i am putting together a lineup to produce runs (which is really the goal right?), my #1 spot is a contact hitter with speed that can leg out the infield singles. my #2 spot is a guy who can bunt and draw walks. his whole goal is to move the leadoff guy over. OBP doesn't figure into that spot. my #3 is my best contact hitter that can drive the lead off guy around. my #4 is my 2nd best contact hitter. contrary to popular opinion, there is no reason the #4 should be the power hitting grand slam guy. the odds of 1, 2 and 3 getting the bases loaded are slim. My #5 is actually my power hitter. My #9 is usually my second fastest guy because if he gets on, the top of the order is coming up. #6, 7 & 8 are gut calls depending on how guys are playing. I am just trying to say there is way more than OBP or any single statistic that can judge the best lineup.

Posted by: trevor at July 5, 2005 02:34 PM