Baseball Musings
August 03, 2006
What are the Odds

Chase Utley is 22 games from tying Joe DiMaggio's hit streak, 10 games from tying Rose and Keeler for the NL record. Let's take a look at the odds of him making it.

For the probability of getting a hit, I like to use something I call hit average. It's a very simple calculation, hits divided by total plate appearances. I use this instead of batting average because the calculation is less complicated. Batting average is a conditional probability:

BA = P(H|AB)

By Bayes' Rule, we can rewrite that:

BA = P(AB|H) * P(H) / P(AB)

But the probability of an AB given a hit is one, so:

BA = P(H)/P(AB)

If we substitute the formulas for the probability of a hit and an at bat we get:

BA = (Hits/PA)/(AB/PA)

Where PA is plate appearances. This finally yields:

BA = Hits/AB

I bet you never saw anyone derive batting average before. :-)

Instead, I'm just going to use hits/PA as my probability. I'm going to do two calculations, one based on Utley's career numbers, the other based on his season numbers. This will give us a range of probabilities.

Utley's career hit average is .2599, while this season it's .2938. For hit averages, .290 is excellent. The hit average is the probability of Chase getting a hit in a single plate appearance. But games are made up of multiple plate appearances, and he only needs to get a hit in one of those to keep the streak alive. Utley is averaging 4.6 plate appearances per game. So, in a ten-game span we expect he'll have six games with five plate appearances and four games with four. We can use the Binomial Distribution to figure out the odds of his getting at least one hit in a four-PA game and in a five-PA game.

Probability of Utley Getting at Least One Hit in a Game
PA in Game    Career Probability    2006 Probability
4             .700                  .751
5             .778                  .824
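The per-game numbers come from the complement rule: the chance of at least one hit in n plate appearances is one minus the chance of going hitless in all n. Here is a minimal Python sketch of that step, assuming every plate appearance is an independent trial with the same hit average. The function name is made up for illustration.

```python
def prob_at_least_one_hit(hit_average: float, plate_appearances: int) -> float:
    """P(at least one hit) = 1 - P(hitless in every PA), assuming independent PAs."""
    return 1.0 - (1.0 - hit_average) ** plate_appearances


# Hit averages quoted in the post (hits / plate appearances).
CAREER = 0.2599
SEASON_2006 = 0.2938

for pa in (4, 5):
    career = prob_at_least_one_hit(CAREER, pa)
    season = prob_at_least_one_hit(SEASON_2006, pa)
    print(f"{pa} PA: career {career:.3f}, 2006 {season:.3f}")
```

Rounded to three places, the results should line up with the table values.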

Since each game is an independent event, we can just multiply the game probabilities to get the probability of a particular streak. For example, he needs ten games to tie Rose and Keeler. So based on his 2006 hit average, that would be six games at .824, or .824^6 (^ is the sign for an exponent), times four games at .751 (.751^4). That works out to a 0.1 chance of tying the NL record, or 10%. Those are pretty good odds. Here are the various odds in a table:





Probability of Utley Tying a Hit Streak
Streak          Career Probability    2006 Probability
NL Record 44    .053                  .100
ML Record 56    .0015                 .006
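The streak arithmetic can be sketched the same way. This is a rough illustration rather than the exact calculation behind the table. It assumes independent games and reuses the post's hit averages. For the 22-game run to DiMaggio it also assumes a split of 13 five-PA games and 9 four-PA games, which is a guess based on 4.6 PA per game, since the post doesn't state the split. The function names are hypothetical.

```python
def prob_at_least_one_hit(hit_average: float, plate_appearances: int) -> float:
    """P(at least one hit in a game), assuming independent plate appearances."""
    return 1.0 - (1.0 - hit_average) ** plate_appearances


def prob_streak(hit_average: float, games_with_5_pa: int, games_with_4_pa: int) -> float:
    """Multiply per-game probabilities, assuming games are independent."""
    p5 = prob_at_least_one_hit(hit_average, 5)
    p4 = prob_at_least_one_hit(hit_average, 4)
    return p5 ** games_with_5_pa * p4 ** games_with_4_pa


# Hit averages quoted in the post.
CAREER = 0.2599
SEASON_2006 = 0.2938

# (five-PA games, four-PA games); the 13/9 split is an assumption, not from the post.
targets = {
    "NL Record 44 (10 more games)": (6, 4),
    "ML Record 56 (22 more games)": (13, 9),
}

for label, (five_pa, four_pa) in targets.items():
    career = prob_streak(CAREER, five_pa, four_pa)
    season = prob_streak(SEASON_2006, five_pa, four_pa)
    print(f"{label}: career {career:.4f}, 2006 {season:.4f}")
```

With these inputs the results land close to the table. Small differences can come from rounding the per-game probabilities before multiplying.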

The odds are against him reaching either of these streaks, but the NL record is certainly doable, especially based on this year's performance.


Posted by David Pinto at 08:02 AM | Hit Streaks | TrackBack (0)
Comments

David, I was wondering what the odds were for Joe D or Pete Rose to reach their respective streaks after 34 games. It would be interesting to see Utley's probabilities relative to DiMaggio's and Rose's.

Posted by: Teddy at August 3, 2006 09:17 AM

Many years ago I did a calculation for DiMaggio's hit streak and it came to a 1 in 10,000 chance of happening. I'll have to check, but at the time, Joe had a hit average of around .314.

Posted by: David Pinto at August 3, 2006 10:14 AM

I join Teddy's ponder.

Posted by: Yazif at August 3, 2006 10:20 AM

You might find this interesting:

http://www.hiremetheo.com/wordpress/?p=47

For DiMaggio, I found that on average, 1 in every 490,000 hits would begin a 56-game hitting streak.

For reference, I found Hershiser's 59 scoreless innings to be much less likely (though I believe he barely broke the record of 58).

Posted by: Mike at August 3, 2006 10:55 AM

I bet guys working on serious hit streaks swing at more pitches, converting some walks into hits (and some into outs, of course). That increases Utley's chances.
But maybe pitchers throw fewer strikes to guys working on long hit streaks? That would drop his chances.

Posted by: Jamie at August 3, 2006 11:35 AM

This may be the most pointless thing I've ever done since projected starters are relatively unpredictable, but if Utley's going to do it he will have to go thru these guys, in this order:

Marquis, Glavine, El Duque, Maine, H. Ramirez, Hudson, James, Arroyo, Ramirez, Milton, Pedro, Glavine, El Duque, Maine, Livan, Armas, TBD WAS Starter, Zambrano, Prior, Hill, TBD Cubs Starter, Pedro

Given the Cubs and Nats both have injury depleted rotations, the above can be considered completely wrong. But it's still fun to speculate that he could do it against Pedro Martinez... And that he'll have to face at least 6 LHP including Glavine 2x who he's a career 0 for 3 against. He's also 2 for 16 against WAS starters this year.

Posted by: Nate at August 3, 2006 11:47 AM
