Baseball Musings

Statistics Archives

May 09, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:44 AM | Comments (0) | TrackBack (0)
May 08, 2008
Thursday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:34 AM | Comments (0) | TrackBack (0)
May 07, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:00 AM | Comments (0) | TrackBack (0)
May 06, 2008
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 06:48 AM | Comments (0) | TrackBack (0)
May 05, 2008
Good for Tim
Permalink

I didn't get the Buck-McCarver broadcast on Saturday, but I wish I had so I could have heard this:

Joe remarked that Fukudome had gone 4-4 the day after his cover issue hit the stand. After musing over Fukudome's ignorance of Cubs history and the variety of curses associated with them, he then asked Tim if he believed in curses or jinxes.

Now, it would be normal for Tim to play along with this silly idea. Tim however chose to rather bluntly shoot it down. "No," he said, "I don't believe in curses or jinxes or anything like that."

Buck then decided to bait McCarver by talking about how poorly McCarver had played after his two appearances on the cover of SI. McCarver responded again bluntly: "Can't a guy just play badly? Why can't a guy just not play well? You don't need some curse or jinx to play poorly. Haven't we come far enough as a society not to believe in those things?"

Thanks, Tim, for saying that so well. Hat tip to BBTF.

Posted by StatsGuru at 11:36 AM | Comments (0) | TrackBack (0)
Monday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:42 AM | Comments (0) | TrackBack (0)
May 04, 2008
Sunday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:24 AM | Comments (0) | TrackBack (0)
May 03, 2008
Saturday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:19 AM | Comments (0) | TrackBack (0)
May 02, 2008
Friday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:37 AM | Comments (0) | TrackBack (0)
May 01, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:27 AM | Comments (1) | TrackBack (0)
April 30, 2008
Wednesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:37 AM | Comments (0) | TrackBack (0)
April 29, 2008
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:08 AM | Comments (0) | TrackBack (0)
April 28, 2008
Monday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:19 AM | Comments (0) | TrackBack (0)
April 27, 2008
Rare Occurrence
Permalink

This is really bucking the odds:

Consecutive losses for the Rockies despite leading in the eighth inning or later each game. It happened to only one other team in the past 100 years, the '78 Giants.

Posted by StatsGuru at 09:49 AM | Comments (3) | TrackBack (0)
Sunday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:14 AM | Comments (0) | TrackBack (0)
April 26, 2008
Saturday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 10:01 AM | Comments (0) | TrackBack (0)
April 25, 2008
Friday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:09 AM | Comments (0) | TrackBack (0)
April 24, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:08 AM | Comments (0) | TrackBack (0)
April 23, 2008
Mid-Week Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:21 AM | Comments (0) | TrackBack (0)
April 22, 2008
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 06:42 AM | Comments (0) | TrackBack (0)
April 21, 2008
Patriots Day Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:25 AM | Comments (0) | TrackBack (0)
April 20, 2008
Sunday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:07 AM | Comments (0) | TrackBack (0)
April 19, 2008
Saturday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:46 AM | Comments (0) | TrackBack (0)
April 18, 2008
Friday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:58 AM | Comments (0) | TrackBack (0)
April 17, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:45 AM | Comments (0) | TrackBack (0)
April 16, 2008
Mid-Week Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:19 AM | Comments (0) | TrackBack (0)
April 15, 2008
More Data, More Research
Permalink

Mike Fast at The Hardball Times looks at how new tracking technologies might change the game of baseball. I must admit I haven't been keeping up with PITCHf/x as much as I should. My first thought is a probabilistic model of the strike zone. Given handedness of the batter and pitcher, the count, velocity, release point and coordinates of the pitch when crossing the plate boundary (a line extending out from the front of the plate), what is the probability of:

  • A strike call
  • Contact on a swing
  • Fair in play
  • Type of ball in play
  • A base hit

There are a lot more pitches than balls in play, so one should be able to develop a better model than PMR with the data for a season. Then you can see which pitchers and hitters perform above or below average on these types of pitches. This will keep us busy for years to come.
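
The simplest version of such a model is a lookup table: bucket pitches by context and location, then count how often an outcome occurs in each bucket. The sketch below does that for called strikes only. The `Pitch` record, its field names, and the half-foot cell size are all assumptions made for illustration, not PITCHf/x's own schema, and a real model would need smoothing across neighboring buckets.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pitch:
    # Hypothetical record; field names are illustrative only.
    batter_hand: str      # "L" or "R"
    pitcher_hand: str     # "L" or "R"
    count: str            # balls-strikes, e.g. "1-2"
    px: float             # horizontal location at the plate, in feet
    pz: float             # vertical location at the plate, in feet
    called_strike: bool   # umpire's call on a taken pitch


def bucket(p, cell=0.5):
    """Context plus a coarse location cell (cell size in feet)."""
    return (p.batter_hand, p.pitcher_hand, p.count,
            int(p.px // cell), int(p.pz // cell))


def strike_probabilities(pitches, cell=0.5):
    """Empirical P(called strike) for each bucket of taken pitches."""
    totals = defaultdict(int)
    strikes = defaultdict(int)
    for p in pitches:
        key = bucket(p, cell)
        totals[key] += 1
        strikes[key] += int(p.called_strike)
    return {key: strikes[key] / totals[key] for key in totals}
```

The same counting approach would extend to contact, fair balls and hits by swapping the outcome field, at the cost of thinner samples in each bucket.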

Posted by StatsGuru at 10:33 AM | Comments (1) | TrackBack (0)
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:20 AM | Comments (0) | TrackBack (0)
April 14, 2008
Monday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:56 AM | Comments (0) | TrackBack (0)
April 13, 2008
Sunday Update
Permalink

The Day by Day Database is up to date.

I apologize for the lateness of the update. I spent the night suffering from a stomach ailment.

Posted by StatsGuru at 11:00 AM | Comments (2) | TrackBack (0)
April 12, 2008
Saturday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:18 AM | Comments (0) | TrackBack (0)
April 11, 2008
Friday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:35 AM | Comments (0) | TrackBack (0)
April 10, 2008
Thursday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:20 AM | Comments (0) | TrackBack (0)
April 09, 2008
MattaMagic
Permalink

According to PythagenMatt, the Royals are the best team in the AL.

Posted by StatsGuru at 08:58 AM | Comments (2) | TrackBack (0)
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:08 AM | Comments (0) | TrackBack (0)
April 08, 2008
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:19 AM | Comments (0) | TrackBack (0)
April 07, 2008
First Place Marlins
Permalink

The Marlins defeat the Nationals 10-7 tonight and move a game ahead of Atlanta at the top of the NL East. They've been outscored 47 to 31, however, so don't expect that placement to last.
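
The reasoning behind that warning can be put in numbers with a Pythagorean expectation. The sketch below assumes the classic Bill James form with an exponent of 2; variable-exponent versions such as PythagenMatt would give a slightly different figure.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Marlins after this game: 31 runs scored, 47 allowed.
marlins = pythagorean_pct(31, 47)
print(f"{marlins:.3f}")  # 0.303
```

A team that plays to a .303 expectation does not usually sit in first place for long.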

Posted by StatsGuru at 11:06 PM | Comments (1) | TrackBack (0)
What's the Probability?
Permalink

A nice probability quiz at The Numbers Guy. Here are two interesting ones:

4. In baseball, suppose the American League champion is better than the National League champion, such that it has a 55% probability of winning each game against the NL champ. Then the NL champ nonetheless will win a best-of-seven-games series four in 10 times. What is the smallest odd number, X, for which a World Series between these two league champs that is best-of-X will ensure that there's a 95% probability of a just result -- the superior AL champ winning?

5. Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind each of the other two, a cow. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, to reveal a cow. He then says to you, "Do you want to change your choice to door No. 2?" Is it to your advantage to switch your choice, assuming you prefer cars to cows?

I believe the answer to four is 383. I have a Python script that computes 95% confidence intervals, and the low end of 383 is 192. The low end of 381 is 190.
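
Question 4 can also be checked directly with an exact binomial sum rather than a confidence interval. The sketch below is not the script mentioned above; the function names are made up for illustration. Note that the question asks for a one-sided 95% chance, while the low end of a two-sided 95% interval corresponds to roughly a 97.5% chance, so the exact calculation should come out noticeably lower than 383.

```python
from math import comb


def series_win_prob(games, p=0.55):
    """P(the better team wins a majority of an odd number of games).

    Playing out every game gives the same winner as stopping once a
    team clinches, so a plain binomial tail is enough.
    """
    need = games // 2 + 1
    q = 1.0 - p
    return sum(comb(games, k) * p**k * q**(games - k)
               for k in range(need, games + 1))


def shortest_fair_series(p=0.55, target=0.95):
    """Smallest odd series length whose win probability reaches target."""
    games = 1
    while series_win_prob(games, p) < target:
        games += 2
    return games


print(round(series_win_prob(7), 3))  # 0.608, so the NL wins about 4 in 10
print(shortest_fair_series())
```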

On question 5, the answer is to switch. I love asking this question. The answer is if you switch, you win the car 2/3 of the time. When you make the first choice, you'll be wrong 2/3 of the time. If you switch, you'll be right 2/3 of the time!
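
For anyone who still doubts the 2/3, the nine equally likely combinations of car location and first pick can simply be enumerated. This is a minimal sketch, assuming the host always opens a cow door and chooses at random when he has two to pick from.

```python
from fractions import Fraction
from itertools import product


def monty_hall(switch):
    """Exact win probability, enumerating car position and first pick."""
    doors = range(3)
    wins = Fraction(0)
    for car, pick in product(doors, doors):
        # The host opens a door that is neither the pick nor the car.
        openable = [d for d in doors if d != pick and d != car]
        for opened in openable:
            weight = Fraction(1, 9) / len(openable)
            final = pick
            if switch:
                final = next(d for d in doors if d not in (pick, opened))
            wins += weight * int(final == car)
    return wins


print(monty_hall(switch=False))  # 1/3
print(monty_hall(switch=True))   # 2/3
```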

Posted by StatsGuru at 10:44 PM | Comments (12) | TrackBack (0)
How Clutch is Ortiz
Permalink

Cyril Morong does some significance testing on the numbers Bill James supplied on David Ortiz's clutch hitting. The only place Ortiz comes close to being significantly better in those situations is in extra-base hits:

Moving to XB%. He had a rate of 14.8% under normal circumstances while he had a 17.5% rate in the James clutch situations, for an 18% higher rate. The Z-score was 1.96 using the normal dropoff of about .013. So this is very close to being significant. His extra-base hit performance may truly be clutch.

If you think about it, that's what you really want in those situations, batters who can move runners a long distance with one hit. With a man on first, a single is nice, but an extra-base hit gives the runner a much higher probability of scoring.

Hat tip, BBTF.

Posted by StatsGuru at 11:53 AM | Comments (1) | TrackBack (0)
Monday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:21 AM | Comments (0) | TrackBack (0)
April 06, 2008
Closing the Door
Permalink

Joe Pawlikowski sends along this nugget:

The Angels do not give away ninth-inning leads. They have won 161 consecutive games when leading after eight innings. It's the longest active streak in the majors. The last ninth-inning loss came April 19, 2006, at Minnesota.

Since the start of the 2006 season, K-Rod has blown 10 saves. Either those were earlier in the game, or after the Angels took a lead in the top of the ninth (or in extra innings), or the Angels ended up winning the game anyway. I'd like to see the rest of the list in that time. I'm sure there aren't that many leads blown after eight innings.

Posted by StatsGuru at 10:23 AM | Comments (0) | TrackBack (0)
Sunday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:13 AM | Comments (0) | TrackBack (0)
April 05, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:08 AM | Comments (0) | TrackBack (0)
April 04, 2008
Friday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:35 AM | Comments (0) | TrackBack (0)
April 03, 2008
Thursday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 07:29 AM | Comments (0) | TrackBack (0)
April 02, 2008
Clutch Truce?
Permalink

The Other Fifteen wants a truce over clutch hitting.

This isn't a cry for fusion, or balance, or peaceful coexistence. The world wouldn't be a better place if newspaper articles all read "Today the Cubs and Brewers recorded 27 outs apiece in a contest at Wrigley Field, which revealed almost nothing about the two teams due to the small sample size involved." Nor would the world be a better place if VORP started including Steely-Eyed Resolve as one of its components.

What I am asking for is a simple truce: believers in clutch, I as a student of sabermetrics will stop telling you that clutch doesn't exist, or is insignificant, or what have you, if you will stop insisting that its existence in any way, shape or form has an impact on impartial evaluations of player performance. Do we have a deal?

No deal. There are clutch hits, which fit the narrative of the game discussed in the post. All players get clutch hits; that does not make them clutch hitters. When David Ortiz hits a walk off home run, there is no doubt in my mind it was a clutch hit. When Luis Sojo wins a World Series game with a hit, there's no doubt it was a clutch hit. That doesn't make them clutch hitters.

The narrative that X delivered in the clutch is fine. The narrative that he often delivered in the clutch is fine. The narrative that X is a clutch hitter based on five or six at bats doesn't work.

Hat tip, The Hardball Times.

Posted by StatsGuru at 04:50 PM | Comments (6) | TrackBack (0)
Mid-Week Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 06:56 AM | Comments (0) | TrackBack (0)
April 01, 2008
James Q & A
Permalink

Freakonomics publishes a Q & A with Bill James, where fans sent in questions. I like this one about the Cubs:

Q: Why can't the Chicago Cubs get into the World Series? Is it the small park? Low salaries? The curse of the billy goat? Does sabermetrics provide any insights?

A: Talking about the origins of it -- the Cubs fell into a trench in history in the late 1930's, when almost all baseball teams built farm systems, but the Cubs for several years refused to do so. This put them behind the curve, crippled them for the 1950's, and really the organization did not fully overcome that until about 1980.
Since 1980 they have had several teams that could have wandered into a World Series, with better luck. They haven't had any one overpowering team -- like the 1984 Tigers, or the 1992 Blue Jays, or the 1998 Yankees -- that was so good that it demanded a seat at the Last Banquet of Fall. And, unless you have a team that good, you're at the mercy of the fates.

Posted by StatsGuru at 05:57 PM | Comments (2) | TrackBack (0)
Tuesday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 08:14 AM | Comments (0) | TrackBack (0)
March 31, 2008
Monday Update
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 09:39 AM | Comments (0) | TrackBack (0)
March 28, 2008
What's in a Name?
Permalink

On Baseball and the Reds doesn't like the term sabermetrics.

While I know this is probably a minority opinion, I really dislike--almost despise--the term "sabermetrics." Maybe it's just because I didn't grow up with Bill James. But that term has always sounded both pompous and half-baked to me--like we're trying to claim some kind of grand authority or officiality by coming up with an official-sounding name for what we do.

I think at least part of the backlash against "sabermetrics" has as much to do with that name as anything else. I've occasionally interacted with a local reporter in Cincinnati for some stat-inspired articles on the Reds over the past year, and one thing I've tried to stress (as have the other folks like me who have contributed to these articles) is to try to avoid calling us sabermetricians. I don't want to give people that as a reason for ignoring some of the ideas we advocate.

I'd much prefer it if everyone just called what we do what it is--baseball research. There's nothing really special about it...we're just searching for better understanding of how the game works.

I used to work for a company called Dragon Systems, Inc. The name came from the owner's hobby of collecting Chinese Dragons. You can see the logo here. The company built the best speech recognition software available, but other business people would constantly complain about the name and the logo. They'd tell us no one knows what you do by the name. They'd say the logo looks like you're a Chinese restaurant. They were probably right, but the owners kept the name and the logo and built a very successful business because they built a damn good product.

The upside of the name was that when you said, "I work for Dragon Systems," everyone had to ask what the company did. If I said, "I work for Voice Products of America," they'd say that's nice and move on. The same is true for sabermetricians. Baseball researcher, big deal. Sabermetrician, what's that about?

My good friend Jim Storer is married to a doctor at Yale Medical School. She was at a reception for new fellows, and the various new doctors are being introduced. The MC notes that one is a sabermetrician, and asks, "Does anyone know what that is?" Linda raises her hand and answers, "Sadly, yes." That great bit of comedy doesn't happen if he's a baseball researcher.

So Justin, if you don't like the term, don't use it. Be a baseball researcher. But don't deny others the fun of being a sabermetrician.

Posted by StatsGuru at 08:58 AM | Comments (2) | TrackBack (0)
March 27, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 02:39 PM | Comments (1) | TrackBack (0)
March 25, 2008
Daily Dose of Data
Permalink

The Day by Day Database is up to date.

Posted by StatsGuru at 11:02 PM | Comments (0) | TrackBack (0)
March 24, 2008
Cosmic Stats
Permalink

Cosmic Log looks at the science of baseball statistics and quotes me from the AAAS meeting I spoke at in February.

Posted by StatsGuru at 04:49 PM | Comments (0) | TrackBack (0)
March 20, 2008
Ultimate Lineup
Permalink

Baseball Notes looks at the production of each team lineup slot in 2007 and comes up with the best combined lineup in the majors. There are some surprises in there as well.

Posted by StatsGuru at 03:24 PM | Comments (0) | TrackBack (0)
March 17, 2008
K-GB
Permalink

Rich Lederer at the Baseball Analysts looks at the relationship between strikeout and groundball rates among starting pitchers. Rich graphs the two rates against each other and groups pitchers by quadrant. What I think is interesting are the outliers on the high side of strikeouts and ground balls. If you take the five pitchers with the highest strikeout rates, you get a rotation of:

  • Erik Bedard
  • Scott Kazmir
  • Johan Santana
  • Jake Peavy
  • A.J. Burnett

If you do the same for the five highest GB% pitchers:

  • Derek Lowe
  • Fausto Carmona
  • Tim Hudson
  • Brandon Webb
  • Felix Hernandez

The high K rotation is all aces, while the high ground ball rotation, with the exception of Webb, is mostly number two starters.

Posted by StatsGuru at 12:27 PM | Comments (2) | TrackBack (0)
March 13, 2008
Spring to Summer
Permalink

My latest column at SportingNews.com explores the relationship between spring training and regular season records.

The Baseball Musings pledge drive continues through March. Please consider making a donation.

Posted by StatsGuru at 04:21 PM | Comments (2) | TrackBack (0)
March 09, 2008
Complicated Stats
Permalink

Joe Posnanski takes on the idea that the new stats are too complicated:

But here's why this whole "It's so confusing" argument amuses me so much: People will tell you that the new stats are too convoluted or manufactured ... and yet there are NO stats more convoluted and manufactured than the basic statistics that baseball has been built around for more than 100 years.

Some of this should be obvious. Batting average? It's ridiculous. Preposterous. Imagine that no one had ever come up with batting average before ... and then someone on a blog came up with this idea:

Blogger: I have come up with a new statistic. It involves balls put in play. I call it batting average.
Establishment: Great! How's it work?
B: See, what we'll do is, we'll take the number of hits that the batter has and divide it by the number of at-bats that he has in order to determine how often he gets a hit.
E: That sounds like on-base percentage. What's the difference?
B: Well, it's all in what you call "at-bats." For one thing, we don't count walks.
E: What do you mean you don't count walks?
B: They don't count. We take plate appearances and subtract walks. They never happened.
E: How can a walk never happen?
B: It just doesn't.
E: Aren't walks good things? Like in Little League, we always say "Walk's as good as a hit."
B: I hate walks. They're gone. So let's say a guy comes to the plate 12 times, and he gets four hits and walks twice ...
E: Right ... that's a .500 on-base percentage.
B: Exactly, but if you just subtract the walks, you will see that he has a .400 batting average.
E: Um, OK.
B: But there are other things. If you hit a fly ball, and someone tags up and scores a run, that does not count as an at-bat.
E: Why not?
B: Because you are sacrificing yourself for the betterment of the team? I call it a sacrifice fly. Get it?
E: Well, what are you sacrificing if it doesn't even count against your stats?
B: You just are, OK?
E: What if you hit a ground ball and the runner scores.
B: How's that?
E: Let's say the infield's back and a guy hits a ground ball to get the run in. How do you score that?
B: No, that's not a sacrifice fly.
E: Why not? Doesn't that accomplish the same thing?
B: It just isn't. Come on, pay attention. What's it called. Sacrifice FLY? Hello! He didn't hit a fly ball.
E: It just seems to me ...
B: Sacrifice bunts also do not count as at-bats. And when you get hit by a pitch ... doesn't count.
E: You don't get any statistical notice for getting hit by a pitch?
B: Like it never happened.
E: I'm afraid to ask this: What happens if you reach on an error.
B: That's the beauty of this system. According to my new batting average, you're out.
E: But you're not really out.
B: I know. Isn't it great?
E: Why does this have to be so complicated?
B: It's batting average! It will take over the world!

I like to explain this by asking a person to define an at bat as what it is, rather than what it isn't. You can't do it. Joe takes on ERA as well. It's a typical great Posnanski post.
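
The point shows up clearly once the definitions are written as code: an at bat can only be computed by subtraction. The sketch below uses the standard definitions of batting average and on-base percentage, but the `BattingLine` class and its field names are made up for illustration, and rarer exclusions such as catcher's interference are left out.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    plate_appearances: int
    hits: int
    walks: int = 0
    hit_by_pitch: int = 0
    sac_flies: int = 0
    sac_bunts: int = 0

    @property
    def at_bats(self):
        # An at bat is whatever is left after the exclusions.
        return (self.plate_appearances - self.walks - self.hit_by_pitch
                - self.sac_flies - self.sac_bunts)

    @property
    def batting_average(self):
        return self.hits / self.at_bats

    @property
    def on_base_percentage(self):
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks + self.hit_by_pitch
                   + self.sac_flies)
        return reached / chances


# Joe's example: 12 trips to the plate, four hits, two walks.
line = BattingLine(plate_appearances=12, hits=4, walks=2)
print(line.at_bats)                      # 10
print(f"{line.batting_average:.3f}")     # 0.400
print(f"{line.on_base_percentage:.3f}")  # 0.500
```

Reaching on an error needs no field at all: it stays in the at bats and adds nothing to the hits, which is exactly the complaint in the dialogue.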

The Baseball Musings pledge drive continues through March. Please consider making a donation.

Posted by StatsGuru at 07:50 PM | Comments (0) | TrackBack (0)
March 03, 2008
Win Share Aging
Permalink

The Baseball Crank publishes his latest age adjustments for Established Win Share Levels.

Posted by StatsGuru at 07:57 PM | Comments (0) | TrackBack (0)
February 25, 2008
Wouldn't You Like to be a Vorpy Too?
Permalink

Fire Joe Morgan revels in being called a VORPY by Jon Heyman.

It's a historic day. For years, man has waited for just the right term to use when insulting other men who love baseball numbers just a little too much. (What are they, gay for numbers? Probably.) And now, just like the wait for Shrek 3, that wait is ogre.

Jon Heyman has called us VORPies.

Now we can do that scene from Spartacus (and In and Out) in which we all stand up and declare, "I'm a VORPY!"

Posted by StatsGuru at 07:25 PM | Comments (7) | TrackBack (0)
February 16, 2008
Boston Symposia
Permalink

I participated in an AAAS symposium today on New Techniques in the Evaluation and Prediction of Baseball Performance. Thanks to Ed Aboufadel of Grand Valley State University for the invitation. Shane Jensen presented his SAFE system, a more sophisticated version of the Probabilistic Model of Range (PMR). Steve Wang showed new ways of visualizing data, concentrating on managers. Both were very interesting, and Alan Schwarz kept us on our toes as the moderator.

I talked about the Probabilistic Model of Range, and you can view the slide show here. One nice thing at this conference was a press conference after the talk. I've never done one of those before, and I must say the science writers asked very good questions. This was an unusual topic for this meeting but it went over well.

Update: AP covered the talk.

Update: Some browsers can't run the slide show. It works with IE. For those who can't, you can download the actual PowerPoint presentation.

Download PowerPoint 2007 version. Unfortunately, the charts I used aren't compatible with PowerPoint 2003.

Also, word that Jeter is at the bottom of the list of shortstops doesn't play well with Yankees fans.

Posted by StatsGuru at 06:25 PM | Comments (3) | TrackBack (0)
January 29, 2008
Holding the Bannister
Permalink

Brian Bannister just became the favorite pitcher of sabermetricians.

Posted by StatsGuru at 02:42 PM | Comments (2) | TrackBack (0)
January 24, 2008
Upping the Ante
Permalink

MGL at The Book blog ups the ante on Tango Tiger's clutch project.

Here is the kicker. I am willing to donate a substantial sum of money to a charity chosen by one side of the debate - the "non-sabermetric" side of course, if they win. We would have to define "winning" - maybe best of 3, if we do 3 things, like clutch, batter/pitchers, and hot/cold. Or we can do each one separately.

If the sabermetric side wins, I will also donate money, but that will be to a charity of our choice and it will be less money.

I'm not sure how much, but it would be on the order of $10,000 for them and $5,000 for us. What the heck. Anything to make a point. If this flies, let none of my/our detractors/naysayers EVER say that I won't put my money where my mouth is! This should generate some good publicity and might encourage the media and perhaps some insiders to participate.

Mitchel is looking for members of the baseball media and baseball insiders to contribute to this project. If you're one of them, I hope you'll participate.

Posted by StatsGuru at 12:30 PM | Comments (0) | TrackBack (0)
January 22, 2008
Retaining Win Shares
Permalink

Over at The Book, win share aging curves.

Posted by StatsGuru at 03:21 PM | Comments (0) | TrackBack (0)
January 15, 2008
Predicting Clutch
Permalink

Tom Tango puts forward a great idea for the 2008 season to examine the idea of clutch hitting. He's asking for your help in ironing out the details.

Posted by StatsGuru at 12:05 PM | Comments (0) | TrackBack (0)
January 01, 2008
Lahman Database Updated
Permalink

The latest version of Sean Lahman's baseball database is available for download. It covers the history of MLB through 2007.

Posted by StatsGuru at 12:54 PM | Comments (0) | TrackBack (0)
December 11, 2007
Random Ortiz
Permalink

Dan Fox finds a lot lacking in Bill James's latest Sports Illustrated article on clutch hitting. Fox:

First and foremost, the article seems to promote the idea that after the now famous study titled "Do Clutch Hitters Exist?" published in the 1977 Baseball Research Journal by Dick Cramer, that little to no work has been done on the subject of clutch hitting and that what has been done has had an in-grained bias. Quite to the contrary, the topic has been the subject of almost continual debate with a variety of studies published over the years as documented on Cyril Morong's fine site. And more recently there have been several very good analyses done as I discussed in the introduction to my Schrodinger's Bat column of March 1, 2007.

This is not a typical James entry from the Baseball Abstracts. There are no details of the method used or why the results are interesting. For example, he produces a table of David Ortiz in clutch situations since 2002 and notes the numbers are impressive.

That's the regular season; I understand he's had a couple of hits in postseason as well. It's a pretty good record; in fact, you kind of have to see more data to understand how good it is. We've started an award for the major leagues' clutch hitter of the year, based on the data, and David could pretty much win it any year. Only a handful of players a year drive in 30 runs in clutch situations. As to whether these data prove that David is a clutch hitter ... I ain't going there. This discussion has been messed up for 30 years because we got our shoulders way out in front of our shoelaces. From now on, I'm holding back.

My problem with the Ortiz table is there's no reason to believe it's not random. In 394 at bats, Ortiz produced 127 hits. Over the six seasons covered, Ortiz hit .298. A .298 hitter has a 95% confidence interval of 100 to 135 hits. Ortiz hit 35 home runs in that number of at bats. His home run rate during those six years was .07236. The 95% confidence interval for home runs is 19-39. So he's at the high end of the range, but still in the range. He had a 1 in 8 shot at hitting that many home runs.
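
Those ranges can be approximated with a few lines of Python. This is a sketch under stated assumptions, not the script used for the figures above: it treats each at bat as an independent trial at the six-season rate, uses a normal approximation for the intervals, and uses an exact binomial sum for the home run tail. The function names are made up for illustration.

```python
from math import comb, sqrt


def normal_interval(n, p, z=1.96):
    """Approximate 95% range for successes in n trials at rate p."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean - z * sd, mean + z * sd


def upper_tail(n, p, k):
    """Exact binomial P(X >= k)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


AT_BATS = 394
print(normal_interval(AT_BATS, 0.298))    # hits
print(normal_interval(AT_BATS, 0.07236))  # home runs
print(upper_tail(AT_BATS, 0.07236, 35))   # chance of 35 or more homers
```

The intervals should land close to the 100 to 135 and 19 to 39 ranges quoted above, and the tail probability should come out in the neighborhood of 1 in 8.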

When James says, "kind of have to see more data to understand how good it is," I take it he means compared to other players. Until I see the other data, I remain unconvinced that clutch hitting is more than random noise.

Bill, in a way, makes my point for me:

The other question everybody asks now is "How do you determine what is a clutch at-bat?" I'll have to stiff you on that one for right now. I'll explain it generally and leave the details for some other time.

"Clutch" is a complicated concept, containing at least seven elements:


  1. The score,

  2. The runners on base,

  3. The outs,

  4. The inning,

  5. The opposition,

  6. The standings,

  7. The calendar.


All these items whittle down the at bats to a very small number, and it is very difficult to find significance in small numbers of plate appearances. For now, I'll stick to my mantra that good hitters are the good clutch hitters.

Posted by StatsGuru at 08:41 AM | Comments (8) | TrackBack (0)
November 17, 2007
Visual Aid
Permalink

Josh Kalk introduced his first version of a web-based tool for viewing PITCHf/x data. Right now, it allows you to see in two dimensions where a pitch passed the batter. For example, here's Alex Rodriguez. Compare him to Alfonso Soriano. Soriano swings at a lot more pitches out of the strike zone. However, when Alex swings out of the zone, it's usually a swing and a miss, while Soriano often makes contact. Alex appears to look for a ball in the strike zone, and when he's fooled the result is a swinging strike. Soriano appears to do a better job of putting the bat on the ball, although the result is often a foul tip on what should be a ball. I hope Josh's next enhancement is a way to view a particular result. I'd like to explore, for example, what batters get the most called strikes outside the zone. Does lack of selectivity really expand the strike zone from the umpire's view?

Posted by StatsGuru at 11:22 AM | Comments (0) | TrackBack (0)