Baseball Musings
January 28, 2004
The Importance of OBP

Yesterday I got together for lunch with one of my readers, Dominic Rivers. Dominic graduated from the Sports Management program at UMass. He interned for the Pirates and has been looking for another job within baseball. Dominic told me about an article he published on-line, where he tries to determine how much weight on-base percentage should get in the on-base+slugging formula using linear regression. I find one statement very interesting:


Nevertheless, there are some aspects of this data that are difficult to explain. Despite my deeply held intuitive belief that on-base percentage is always more important than slugging percentage, the two-year time period chart shows three eras where SLG appears to be more important than OBP. Those eras are 1981-1982, 1989-1990, and 1990-1991. Oddly enough, these periods happened to produce the lowest “r-squared” totals. For those who haven’t taken a stats class, or who had one but didn’t pay attention, “r-squared” tells you how much of the variation is explained by the regression equation. So for example, in the 2001-2002 time period ‘runs scored’ were approximately 88.2% (r-squared of .882) determined by OBP and SLG. The other 11.8% can be explained by other stuff. This “other stuff” might include baserunning, clutch hitting, and number of times reaching base on an error. Why does a “low r-squared” period correspond with a period where SLG is important? My opinion is that these eras, 1989 in particular, are characterized by a great deal of offensive parity. For example, in the National League in 1989, every team besides Atlanta had an on-base percentage in the range of .305 to .321. In the American League in 1989, all but two teams scored between 4.13 and 4.78 runs per game. But these years are anomalies, and hence, I would not recommend that Major League GMs attempt to build an offense based on numerous low-OBP/high-SLG Dave Kingman types.
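
For anyone who wants to see what that r-squared figure measures, here is a minimal sketch of a two-variable regression of runs per game on OBP and SLG. The team numbers are synthetic, generated from made-up coefficients purely for illustration, and the function name fit_runs is my own; none of this is Dominic's actual data or code.

```python
import numpy as np

def fit_runs(obp, slg, runs_per_game):
    """Least-squares fit of runs/game = a + b*OBP + c*SLG.

    Returns the coefficients (a, b, c) and the r-squared of the fit.
    """
    X = np.column_stack([np.ones(len(obp)), obp, slg])
    y = np.asarray(runs_per_game, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted = X @ coef
    ss_res = np.sum((y - predicted) ** 2)   # variation left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation in runs/game
    return coef, 1.0 - ss_res / ss_tot

# Synthetic "league" of 30 teams (assumed values, not real team stats).
rng = np.random.default_rng(0)
obp = rng.uniform(0.300, 0.360, size=30)
slg = rng.uniform(0.380, 0.460, size=30)
runs = -5.0 + 20.0 * obp + 8.0 * slg + rng.normal(0.0, 0.15, size=30)

coef, r2 = fit_runs(obp, slg, runs)
print("intercept, OBP weight, SLG weight:", coef)
print("r-squared:", r2)
```

An r-squared of .882 would mean the fitted line accounts for 88.2% of the team-to-team variation in runs per game, with the remainder left in the residuals.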

This is just what I would have expected. If teams are very close in OBP, slugging will dominate. If they are close in slugging, OBP will dominate. But there's another lesson to be learned here as well. There's more than one way to score runs. Having a team with a high OBP is a great way to score runs, maybe the best way to score runs, but it's not the only way. You can do just fine with high slugging averages. You can do fine with high batting averages. You can do fine by being okay in all of those and just being lucky. As with so many things in life, there is no one right answer.


Posted by David Pinto at 08:33 PM | Statistics | TrackBack (2)
Comments

A couple of technical notes:

Looks like he used the regression coefficients to determine which was more "important" than the other. My problem with this is that OBP and SLG have two different scales. He's really saying that a point of OBP is worth more than a point of SLG, which is different from saying that OBP is more important than SLG. I would have compared the t-stats instead of the coefficients.
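
To make the scale point concrete, here is a rough sketch of how t-stats could be computed alongside the coefficients for the same kind of regression. It uses the textbook ordinary least squares standard-error formula; the function name and the synthetic league (where OBP barely varies between teams) are assumptions for illustration, not the article's data.

```python
import numpy as np

def t_statistics(obp, slg, runs_per_game):
    """OLS coefficients and their t-stats for runs/game ~ OBP + SLG."""
    X = np.column_stack([np.ones(len(obp)), obp, slg])
    y = np.asarray(runs_per_game, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]            # observations minus parameters
    sigma2 = resid @ resid / dof             # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)    # covariance of the estimates
    std_err = np.sqrt(np.diag(cov))
    return coef, coef / std_err

# Synthetic league with heavy OBP parity (assumed values, not real data).
rng = np.random.default_rng(1)
obp = rng.uniform(0.318, 0.322, size=30)     # teams nearly identical in OBP
slg = rng.uniform(0.350, 0.480, size=30)     # teams spread out in SLG
runs = -5.0 + 20.0 * obp + 8.0 * slg + rng.normal(0.0, 0.10, size=30)

coef, t = t_statistics(obp, slg, runs)
print("coefficients (intercept, OBP, SLG):", coef)
print("t-stats      (intercept, OBP, SLG):", t)
```

In a setup like this the true per-point weight on OBP is larger, yet SLG carries the bigger t-stat, because SLG is where the teams actually differ.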

Also, the year-to-year fluctuations are almost certainly just random variations. There's no reason to think that the relationship between OBP and SLG changes over time, except to the extent that OBP and SLG levels themselves change. Competition has nothing to do with the inherent relationship between OBP and SLG.

Posted by: studes at January 29, 2004 07:28 AM

First, I think that Dominic Rivers is stating that OBP correlates to Runs scored better than SLG. That's all. He then talks too much about the exceptions.

The last commenter is right about the scales, but heads in the wrong direction. Take two guys with an 850 OPS, one with a 400 OBP and 450 SLG, the other with 350 and 500. Most stat heads agree that the first player is better because he generates fewer outs. It's also true that it's harder to generate 50 points of OBP than 50 points of SLG.

Posted by: Ken at January 29, 2004 09:34 AM

Ken, what do you mean that I headed in the wrong direction? I was careful not to head in any direction at all. At least, that's what I was trying to do.

Posted by: studes at January 29, 2004 12:19 PM

Ken and studes:

Sincere thanks to both of you for taking the time to review my work and share constructive criticism. The full article, which I hope you'll read if you haven't, touches on many different things - possibly too many. Since I published it last May, some people have latched on to certain things that they find interesting and/or passionately agree/disagree with. The anecdote that Dave Pinto discusses is one example.

The MAIN point that I was trying to get across in the article “How Orlando Palmeiro Got Hosed: Exactly How Full of S is OPS?” is that the increasing prevalence of OPS in the baseball lexicon is further denigrating the high OBP, low SLG hitter. This is not just happening to casual baseball fans, but some self-professed statheads as well. Yes, most statheads can tell you that a point of OBP is worth more than a point of SLG (aside - I'd be curious to take a poll of major league general managers - I doubt that 100% of them would know this). But in a world where shorthand and bullet-points are king, too many people's eyes look at an .810 OPS on a stat sheet and think "good" then see a .790 OPS and think "bad."

With all of the money that's at stake in real baseball, it is still possible to exploit this gap in perception and reality.

Erubiel Durazo's 2003 season wasn't all that disappointing, for example. But his OPS stayed quite low compared to most top 1b/dh types. This allowed the A's to avoid a costly arbitration battle and sign him to a contract that is south of those given to various valuable "4+" service class players.

p.s. Thanks to Dave P. for posting my article.

Posted by: Dominic Rivers at January 29, 2004 02:37 PM

I'm a post-Moneyballer, so I apologize if I am missing the boat on this. What happens if you use isolated power instead of SLG? Since BA is a component in both variables, this might make it trickier to attribute changes in either variable to hitting. Great study. I like it.

Posted by: JC at January 29, 2004 03:22 PM

Thanks for your comments, Dominic. I agree with your basic premise. Have you seen "GPA", which is Aaron Gleeman's made-up stat? It's (1.8*OBP plus SLG)/4. The nice thing about it is that it has the same scale as historic batting average, so, in theory, the average fan may be able to relate to it.
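
As a quick illustration of the formula, here it is applied to Ken's two hypothetical 850 OPS hitters from earlier in the thread. The function names are mine; the only thing taken from GPA itself is the (1.8*OBP plus SLG)/4 formula.

```python
def ops(obp, slg):
    """Plain on-base plus slugging."""
    return obp + slg

def gpa(obp, slg):
    """GPA as described above: (1.8 * OBP + SLG) / 4."""
    return (1.8 * obp + slg) / 4.0

# Ken's two hypothetical hitters, both with an .850 OPS.
high_obp = (0.400, 0.450)
high_slg = (0.350, 0.500)

for name, (obp, slg) in [("high OBP", high_obp), ("high SLG", high_slg)]:
    print(f"{name}: OPS {ops(obp, slg):.3f}, GPA {gpa(obp, slg):.4f}")
```

OPS cannot separate the two, while GPA puts the high-OBP hitter at .2925 and the high-SLG hitter at .2825.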

JC, in response to your question, the real reason OPS (or a variation of it) works is because of its similarity to the weights embedded in linear weights or runs created. Doubling OBP brings those implicit weights closer to the linear weights, particularly the weights given to outs, but adding ISO alone throws them off. If you just add ISO instead of SLG, for instance, a walk would be worth the same as a single, which isn't right.

Posted by: studes at January 29, 2004 04:31 PM

JC sent me an email where he makes an observation that I'm shocked nobody has made in the 8 months or so since I wrote the article.

[i]OBP and SLG ought to affect runs/game differently in the AL than the NL because of the DH. If you get on base in the AL, there is a better chance that someone will bat you around than in the NL because of that pesky pitcher's spot. I reestimated your regressions for several time-periods from 1961-2003. I think the most relevant number for DePodesta would be the post-expansion AL. From 1998-2003 the regression results are:

RG = -5.84 + 22.92 (OBP) + 7.21 (SLG) + e

r2 = .92

That is a ratio of 3.18 [/i]
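
For readers following along, the ratio is just the OBP coefficient divided by the SLG coefficient. Here is a small sketch that plugs numbers into JC's equation; the example team line (.340 OBP, .430 SLG) is a made-up input, and the function name is mine.

```python
# Coefficients from the 1998-2003 AL regression quoted above.
INTERCEPT = -5.84
OBP_WEIGHT = 22.92
SLG_WEIGHT = 7.21

def predicted_runs_per_game(obp, slg):
    """Runs per game implied by the quoted equation (error term ignored)."""
    return INTERCEPT + OBP_WEIGHT * obp + SLG_WEIGHT * slg

# How many points of SLG one point of OBP is worth under this fit.
ratio = OBP_WEIGHT / SLG_WEIGHT
print(f"OBP:SLG ratio = {ratio:.2f}")

# Hypothetical team line, chosen only to show the equation in use.
print(f"Predicted R/G = {predicted_runs_per_game(0.340, 0.430):.2f}")
```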

Posted by: Dominic Rivers at February 1, 2004 11:50 AM

Thanks for posting this, Dominic. I just have a small update on the OBP:SLG ratio after discovering a small data error. It changes from 3.18 to 2.94, which is still approximately 3:1. Just trying to be precise.

Posted by: JC at February 1, 2004 12:40 PM