Fridays Above Replacement

“Luck” is a very divisive word among baseball fans of late.  The word gets thrown around a lot when discussing common statistical measures, such as Batting Average on Balls In Play (BABIP) for hitters or Left On Base Percentage (LOB%) for pitchers.  What does “luck” really mean, though, in the context of hitting and batted balls?

The way I see it, “luck” is just a stand-in phrase for a much longer explanation about the variance of batted balls that come off the bats of major league hitters.  I think all fans inherently understand that Hitter A is more likely to get a hit if he hits a screaming line drive than Hitter B if he hits a lazy fly ball.  So, if Hitter A makes an out on the liner and Hitter B gets a hit off the can-of-corn, it’s much easier to say Hitter A was unlucky and Hitter B was lucky.

So, how does one measure if a hitter is getting lucky or unlucky over a given period of time?  If you watch enough of a hitter’s at-bats, you can generally tell how lucky or unlucky a hitter is getting.   However, few of us can watch every at-bat of an entire team attentively enough to get an accurate picture.  Even if we can, we are likely to fall prey to some kind of bias.

Because of this reality, we have eXpected Batting Average on Balls In Play (xBABIP)!  Not only do we generally know the things that help a BABIP, but we also know what things can hurt a BABIP.  So with a little math, we can come up with a multi-variable equation for what a hitter’s BABIP should be, given those variables.  Then we compare what BABIP should be versus what it is and we have our luck factor!
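
That comparison is just a subtraction.  Here is a minimal sketch in Python, where the function name `luck_points` and the sample BABIP values are illustrative assumptions rather than figures taken from the chart:

```python
def luck_points(babip: float, xbabip: float) -> int:
    """Return actual BABIP minus expected BABIP, in 'points' (thousandths)."""
    return round((babip - xbabip) * 1000)


# Hypothetical hitters, invented for illustration only.
print(luck_points(0.340, 0.300))  # positive: outperforming the batted-ball profile
print(luck_points(0.221, 0.300))  # negative: underperforming the batted-ball profile
```

A positive result reads as "lucky" in the sense used here, and a negative result as "unlucky."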

The xBABIP equation I’ll be using is a modified version of what is found here, which has been refined and modified a few times by a few different folks.  The modification I’m making is replacing the Bill James Speed Score (Spd) with FanGraphs Base Running Runs Above Average (BsR).  Spd is an older formula that only takes into account things like triples rate, steal rate, and stolen base success rate.  I probably don’t have to tell you that triples rate is highly variable and can depend as much on your park as on your speed.  BsR gives a more balanced approach, with more variables that more quickly estimate your overall speed as it relates to getting hits.  I took BsR and scaled it to a per-150-games baseline and then inserted it into the existing xBABIP equation in such a way as to keep the relative impact of the speed component on xBABIP the same.
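
The per-150-games scaling can be sketched as a simple rate conversion.  The function name is made up here, and how the scaled value is weighted inside the xBABIP equation is not shown, because the article does not spell that part out:

```python
def bsr_per_150(bsr: float, games: int) -> float:
    """Scale a cumulative BsR total to a 150-game baseline."""
    if games <= 0:
        raise ValueError("games must be positive")
    return bsr * 150 / games


# Hypothetical example: 2.0 BsR accumulated over 50 games.
print(bsr_per_150(2.0, 50))
```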

In addition to foot speed (which has, admittedly, a small impact), what else goes into xBABIP?  Line drive rate, fly ball rate, hard-hit rate, infield fly ball rate (popups), and opposite-field hit rate.  All of these things either positively or negatively correlate to BABIP.
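
Structurally, an equation like this is a weighted sum of those rates plus an intercept.  The sketch below shows only that shape: every weight is a made-up placeholder with the sign one would expect (line drives, hard contact, opposite-field hits, and speed help; fly balls and popups hurt), not a value from the equation used in this article.

```python
# Placeholder weights for illustration only; NOT the article's coefficients.
PLACEHOLDER_WEIGHTS = {
    "ld_pct": 0.40,        # line drive rate (as a fraction, e.g. 0.21)
    "fb_pct": -0.10,       # fly ball rate
    "hard_pct": 0.15,      # hard-hit rate
    "iffb_pct": -0.30,     # infield fly ball (popup) rate
    "oppo_pct": 0.10,      # opposite-field rate
    "bsr_per_150": 0.001,  # speed component, BsR scaled to 150 games
}
PLACEHOLDER_INTERCEPT = 0.20  # also a placeholder


def xbabip_sketch(rates: dict) -> float:
    """Weighted sum of batted-ball inputs plus an intercept (hypothetical weights)."""
    missing = PLACEHOLDER_WEIGHTS.keys() - rates.keys()
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    return PLACEHOLDER_INTERCEPT + sum(
        weight * rates[name] for name, weight in PLACEHOLDER_WEIGHTS.items()
    )
```

With real coefficients in place of the placeholders, the output would be the expected BABIP that gets compared against the actual one.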

So, why am I writing about this?  Because we want to see what the Reds are doing, of course!  Without further delay, here is a chart!  We love charts, right?


Currently, Adam Duvall has been the luckiest Red in terms of BABIP-xBABIP difference, at 40 points above expectation.  Watching his at-bats, we can see why this is.  In the last few weeks, he’s had a few bloopers and a few broken-bat bleeders find holes for base hits. This, coupled with him generally hitting the ball hard, gives him a high-ish BABIP.

Currently, Joey Votto is the most unlucky Red in terms of BABIP-xBABIP difference, at 79 points below expectation.  Again, this shouldn’t be a surprise to anyone who has watched most of his at-bats.  He’s routinely hitting the ball hard, but it has been going right at people quite often.  For example, in the last 2 games, he’s hit a 105 mph and a 103 mph line drive right at Lonnie Chisenhall in right field.  A few feet either way and those are doubles, a few degrees more loft and they are homers, a few degrees less loft and they fall in for a single.  Plain old bad luck.

Jay Bruce is the poster child for normalcy.  He’s running a well above average BABIP and a well above average xBABIP, and the two are almost identical.  He’s hitting the ball hard, hitting more line drives than he usually does, and hitting fewer fly balls than he usually does.  Add all that together and it’s a good indication that Bruce should be running a higher AVG than usual, and he is.   The shift doesn’t seem to be hurting Bruce as much this year, likely due to the increase in line drives.

Billy Hamilton is an interesting case.  His speed scores are off the charts, which would seem to suggest he should be getting a lot of infield hits and bunt singles.  This isn’t the case because of Billy’s somewhat unique profile.  He generally can’t hit the ball hard enough to keep infielders honest, which causes them to play so far in that it effectively removes bunt singles and infield hits from Billy’s profile.  Billy, no matter how long he plays, will likely always under-perform his xBABIP.

Also interesting to note is that Tucker Barnhart has the highest xBABIP of any Red.   This is due, mostly, to the fact that Tucker is destroying the baseball to the tune of a 36.9 Hard%, and a 29.2 LineDrive%.   The Hard% is 2nd on the team to Votto, and the LineDrive% is first on the team.  He’s also not popped up a single time.

Some of you may be thinking a 119-point spread is pretty large (from +40 to -79) and you’d be correct.  BABIP is one of those stats that takes a long time to reach any level of stability.  Many hundreds of balls-in-play over thousands of at-bats are required before we can be confident in a player’s true-talent BABIP.
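
One rough way to see why: if each ball in play is treated as an independent coin flip with a fixed hit probability (a simplifying assumption made here, not something this article's equation relies on), the standard error of an observed BABIP shrinks only with the square root of the number of balls in play.

```python
import math


def babip_standard_error(true_babip: float, balls_in_play: int) -> float:
    """Binomial standard error of an observed BABIP (independence assumed)."""
    if not 0.0 <= true_babip <= 1.0:
        raise ValueError("true_babip must be between 0 and 1")
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return math.sqrt(true_babip * (1.0 - true_babip) / balls_in_play)


# A hypothetical .300 true-talent hitter at a few sample sizes.
for n in (100, 400, 1600):
    se = babip_standard_error(0.300, n)
    print(f"{n:>5} balls in play: +/- {se * 1000:.0f} points (one standard error)")
```

Under that assumption, a .300 hitter’s observed BABIP over 100 balls in play carries a standard error of roughly 46 points, so a wide spread across a roster early in a season is not surprising.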

So, let’s look at these same players, but over their entire careers and see how much we should actually trust our friend Mr. xBABIP.  The following chart shows career BABIP and career xBABIP, sorted by career xBABIP.


If we take the four players with the largest sample size (Phillips, Votto, Bruce, and Cozart) we see that their career BABIPs are all within 13 points of their career xBABIPs.  This should give us a pretty good feeling that xBABIP, even on a partial season sample, should be a decent estimation of what reality should be.

The other hitters on the list have a larger variance due to the fact that they simply haven’t had enough at-bats for their BABIP to begin to stabilize and become a true depiction of talent level.  In my pre-season analysis of Eugenio Suarez, I presented why I think he’ll be a BABIP over-performer.

Of particular note, I think, is that Votto’s .355 career BABIP is 4th in MLB history behind Ty Cobb (.379), Rogers Hornsby (.365), and Rod Carew (.359). I am only counting people who started their careers after 1900 and accumulated at least 4000 PA.  If I lower the threshold a bit to 1500 PA, Christian Yelich becomes 2nd in MLB history at .366.  Watch out for that kid. He can hit.  Mike Trout and Starling Marte (!) also sneak into the top 10 all-time.

If you view BABIP as an overall measure of how well a hitter strikes the ball (contact quality) and how well the hitter uses the entire field and mixes up his ball-in-play types (LD, FB, GB), not many have ever been better than Votto.  It is unfortunate that he’s had a poor start to the season, but I still appreciate watching him go at his craft, spraying hard-hit liners right at Lonnie Chisenhall.

So why did I write this article?  I don’t really know.  Seemed like a fun topic and an excuse for me to do a little math.

What should the take-away be? Probably something like “don’t assume a high (or low) BABIP means a hitter is getting lucky or unlucky without first looking at their peripherals.”  Also, something like “Duvall is getting a bit lucky, Bruce and Cozart are striking the ball very well right now, and most of the other Reds are getting a bit unlucky.”   Maybe you already knew that intuitively, but now you know it mathematically!

BABIP figures courtesy of FanGraphs.

Note: A lot of work has been done recently on creating an xBABIP equation using only inputs from Statcast.  I haven’t had the time to fully explore this, but using actual measured results seems promising.

32 Comments

  1. Ever the optimist my takeaway here is actually very positive. Going forward I expect Bruce and Cozart to continue to hit well. I expect Votto to go on a tear. And I expect only Duvall to fall back a bit while Barnhart, Suarez and Phillips should all see an uptick in their numbers. That’s good. And even though Hamilton hasn’t been lucky and probably won’t be moving forward, I do like the more slashing style of hitting I’ve been seeing from him lately. The Reds offense hasn’t been awesome this year, but neither has it been awful and I expect it to improve based on these numbers and not just wishful thinking. Now about that bullpen…

  2. Love your stuff, Patrick – well done!

  3. I simply don’t buy lucky or unlucky, in the real world or in sports. People make their own luck. Duvall is working hard and has earned his mild success; nothing lucky about it.
    Joey Votto seems like he is striking out twice a game…. that seems to me to go beyond any sort of statistical scrutiny… he is on pace for 180K’s for the season. Looks to me as if he is getting pounded inside and hasn’t been able to adjust.

    • Do you think a baseball player can aim a ground ball between the shortstop and third baseman? If luck isn’t involved, how can you explain BABIP variation from month-to-month?

      Even a casual fan can see that line drives hit right at outfielders are unlucky and that check-swing infield hits are lucky.

      • One way you could explain variations is that hitters play a different set of teams monthly, teams with different skill levels in their fielders. E.g.: we all saw Peraza miss fly balls that Hamilton would have gotten to this past week. I guess you could say the guy was unlucky to play against better fielders one particular month, I suppose. But that’s like saying every time the Reds play a good team, they’re unlucky to have to play them. No, it’s just the schedule. I agree luck is something of a factor, but tends to even out over time for most guys, so it’s not worth worrying about.

        BABIP is somewhat interesting, but again it’s trying to predict the present based on the past, and nearly all players have off years and a point in their careers where they start to decline for good. I’ve yet to see a formula that predicts with even mild accuracy whether a guy’s gonna have a bad year or not, or when he’ll decline for good. No one in their right mind, even thru spring training, could have guessed Votto would be as bad as he has been so far. Certainly no amount of saber formulas predicted this. Sabermetrics isn’t visionary, it’s reactionary. I’m not saying it’s not useful, just that it’s very limited in its accuracy of prediction. Human beings simply do not respond well when asked to perform with machine-like repetition. Formulas suit machinery much better than humans. No formula and maybe a few people can predict injuries, no one can predict decline, no one can predict lack of focus or emotional issues. A lot of life is mysterious, as is a lot of baseball.

        That’s not to say a formula can’t be found to predict the percentage of chance a player is due to have a bad year. I suppose you could analyze how many bad spells great players, average players, and bad players have had during past careers and come up with some formula of predictability for whatever stage a guy is at. Someone will do that someday soon, maybe already has. Then we can see how useful it is.

        • It has nothing to do with prediction.

          A guy with a .200 BABIP and .300 xBABIP should NOT be considered to be a .300 BABIP guy going forward. That’s the prediction part. It should not be used as a prediction.

          What we CAN say is that for the past, this hitter has been the recipient of more unfavorable outcomes than should be reasonably expected based on his batted ball profile.

        • Thanks for agreeing with me, Patrick. Glad you get it. BABIP is interesting, but of little value predicting whether a guy is “due or not”. Bad luck does not always even out. The law of averages doesn’t always correct itself on individual players. Some guys have more good luck than others. Karma is mysterious and inexplicable. It’ll never fit into a sabermetric formula. Hence the inherent limits to sabermetrics.

          One issue. When you say “this hitter has been the recipient of more unfavorable outcomes than should be reasonably expected based on his batted ball profile”, it seems to me you inadvertently used the expression “reasonably expected” incorrectly. “Reasonably expected” is a euphemism for “predictability”. When a weatherman predicts rain tomorrow, he means we can reasonably expect it, based on his view of past weather models and present data pointing to tomorrow. At least he better mean it, or why have him in the first place? He may be wrong, of course, but he’s predicting it. My point is that we both agree BABIP is not predictive. No sabermetrics formulas are. It’s easy to sometimes think we can predict a player’s future better from formulas, but we can’t. Heck, even the sun will no longer shine some day. We can take nothing for granted in life. Maybe gravity and the basic laws of physics, but beyond that, nobody knows what will happen tomorrow. Sabermetrics is fun as an exercise, but has no real value in predicting a player’s future. My position is it may cripple a coach’s ability to trust his instincts and gut feelings about a player, and that would be a bad thing, not a good thing.

    • You may have missed my opening. The word “luck” doesn’t mean what you think it means.

      It means “batted ball variance” but it’s much easier to say “luck” than have to write a paragraph every time to avoid using the word “luck.”

      After last night’s game, Joey Votto is 1st in the NL in hard-hit percentage at 46.8%. No one in the league has hit the ball hard more consistently than Votto, yet his numbers are pedestrian. How do you explain that other than “bad luck?”

      • Patrick, maybe you’ve covered this somewhere and I’ve missed it… but I notice how often you cite hard hit % and exit velocity with the new data available. How correlative is hard hit % with success as a hitter? Sometimes I wonder if it matters as much as it seems it might intuitively… we have more ability to see some of these things, but does it necessarily mean anything, or is there a general range where exit velocity tops out as explanatory of anything?

        • I guess it depends on what your definition of “success as a hitter” is. A lot of guys with very high Hard% have middling or below-average BABIPs because they hit too many fly balls and too many infield popups. But those same hitters may have high wRC+ because they hit a lot of home runs.

          I think the distinction here is that Hard% has some arbitrary cutoff, which Baseball Info Solutions (BIS) hasn’t disclosed (to my knowledge). Let’s say that cutoff is 95mph. Well, most 95 MPH fly balls will be deep outs. Usually balls need to be around 97-98mph before they can become homers, unless they are right down a line with optimal trajectory. So if you are Khris Davis or Jose Bautista, for example, you have a low BABIP even with a high Hard% because of all the fly balls.
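
          To make the cutoff idea concrete, here is a minimal sketch that computes a Hard%-style rate from a list of exit velocities. The 95 mph threshold is the same stand-in used above, not BIS’s actual (undisclosed) definition, and the function name and sample velocities are hypothetical.

```python
def hard_hit_pct(exit_velocities_mph, cutoff_mph=95.0):
    """Percent of batted balls at or above an assumed exit-velocity cutoff."""
    if not exit_velocities_mph:
        raise ValueError("need at least one batted ball")
    hard = sum(1 for velo in exit_velocities_mph if velo >= cutoff_mph)
    return 100.0 * hard / len(exit_velocities_mph)


# Invented sample: two balls above the assumed cutoff, two below it.
print(hard_hit_pct([105.0, 103.0, 80.0, 90.0]))
```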

          So, to that end, I took all players with 1500 or more PA (sample of 583) from 2002 to 2016 and ran a correlation between BABIP and all other measures and between wRC+ and all other measures. Here are the results. (This is R [correlation coefficient], not R-squared [coefficient of determination]. Left it as R to show positive/negative relationship.)

          BABIP to IFFB%: -0.619
          BABIP to Pull%: -0.509
          BABIP to FB%: -0.488
          BABIP to LD%: 0.430
          BABIP to GB/FB: 0.413
          BABIP to Oppo%: 0.411
          BABIP to GB%: 0.366
          BABIP to Soft%: -0.227
          BABIP to Hard%: 0.191
          BABIP to HR/FB%: 0.77 (interesting b/c HR aren’t counted in BABIP)

          As we can see from that, for pure BABIP maximization, there are many things that are more important than merely hitting the ball hard. Such as not pulling the ball too much, or not hitting too many fly balls. Hard% is even less important than not hitting the ball softly (Soft%). The thing that makes Hard% important is that you can hit a ball hard AND hit a line drive. Or hit a ball hard AND go the other way. Hitting the ball hard increases your expected output across any batted ball profile.
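
          For anyone who wants to reproduce this kind of number, a correlation coefficient (Pearson’s R) between two columns can be computed as sketched below. The sample values are invented for illustration and are not the 583-player data set; the function name is likewise just a label chosen here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Invented sample: popup rate (IFFB%) versus BABIP for five made-up hitters.
iffb_pct = [4.0, 7.5, 10.0, 12.5, 16.0]
babip = [0.330, 0.315, 0.300, 0.290, 0.270]
print(round(pearson_r(iffb_pct, babip), 3))  # negative: more popups, lower BABIP
```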

          Now, as far as overall production (wRC+) goes, Hard% is much more important. Here’s the same correlations to wRC+:

          wRC+ to Hard%: 0.741
          wRC+ to HR/FB: 0.690
          wRC+ to Med%: -0.517
          wRC+ to Soft%: -0.491
          wRC+ to GB%: -0.280
          wRC+ to IFFB%: -0.279
          wRC+ to GB/FB: -0.241
          wRC+ to FB%: 0.238
          wRC+ to Oppo%: -0.186
          wRC+ to Pull%: 0.137
          wRC+ to LD%: 0.109

          If you want to produce runs, hitting the ball hard is the single most important thing you can do as a hitter. HR/FB rate is really just a combination of Hard% and Pull%.
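
          Since the figures above are R rather than R-squared, squaring them gives the share of variance a single measure accounts for in a simple linear fit. A small sketch using the two Hard% values quoted in the lists (the helper name is made up here):

```python
def r_squared(r: float) -> float:
    """Convert a correlation coefficient R to the coefficient of determination."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("R must be between -1 and 1")
    return r * r


for label, r in (("BABIP to Hard%", 0.191), ("wRC+ to Hard%", 0.741)):
    print(f"{label}: R = {r:+.3f}, R^2 = {r_squared(r):.3f}")
```

          Read that way, Hard% alone accounts for roughly 4% of the variance in BABIP but about 55% of the variance in wRC+ in that sample.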

          So, you can interpret this how you want, but since Hard% is very easy to find and state, and people inherently understand it (105 mph is better than 104 mph), I like it!

          Also of note is that there are certain exit velocity buckets where less is better. Line drives, for example, that are hit too hard (like Votto the last 2 nights) can carry to the outfielder for an out. If they are hit around 80mph, they will fall in front of the fielder for a single. So, there are definitely nuances of all this, but the statement “Hitting harder is better,” is true for any somewhat normally distributed batted ball profile.

        • Wow… thank you for chasing all that data down. Very interesting how much it ranges in correlation to wRC+ and BABIP… but it does kind of answer my thoughts as to just b/c Votto is hitting the ball hard, it doesn’t necessarily indicate his BABIP is on its way up. Like you said, lots of nuance.

      • First, Patrick, thanks for the series of articles–they’re serving me as a primer in new stats, and a much-needed primer, at that. I wonder if Votto’s bad luck (hitting the ball hard, but often right at a fielder) is the intended result of the other teams pounding the inside part of the plate. To my eye, he seems to be pulling the ball more, which makes a certain intuitive sense, and is thus more affected by a defensive shift. Just a thought, and despite your good example, a thought based on the eye test.

    • Is the inference from this that Duvall is working hard and Votto is not? That the second half of last season Joey worked hard, but decided to rest on his laurels and is coasting? All we are ever told about Votto is that he is relentlessly dedicated to his craft.

      When Jay Bruce goes hot, is he somehow working hard, and when he goes cold is he mailing it in?

      Not saying that practice, conditioning, video work, etc don’t help, just saying I feel like it’s somewhat specious to assign “hard worker” to success and “not working hard enough” to people struggling in this game of variations.

      • I can’t agree enough with you on this. Baseball is hard. Hitting a baseball is the hardest thing to do perhaps in all sports. There is so much nuance and variance in the game that we love. If people don’t think there is luck involved, they’ve never been around a lot of players. Baseball players as a whole are the most superstitious bunch of people I’ve ever been around. Luck is a factor. We sometimes call it “The Baseball Gods” and ballplayers always want to be on the right side. Confidence ties directly into luck as well. Sometimes you get discouraged because you hit rockets that find gloves and sometimes it takes just one little bleeder to know that your luck is changing. The Baseball Gods are on your side again.

    • Dan, to discount the idea of luck (or variance, if you will) is ignorant of real life. If you’ve ever played baseball before, you know full well that luck plays a major role in the outcome of any particular at-bat.

      I’m badly fooled by the pitcher, and make terrible contact with the ball, but a little dribbler down the third base line gets me an infield hit. Did I really do anything to earn that hit myself? No, that is simply luck. Or let’s say the pitcher busts me inside, and all I do is hit a little pop-up that drops into no-man’s land in the outfield. Did I deserve that hit? Absolutely not. Both times I SHOULD have gotten out, but good fortune was smiling on me.

      Now let’s say I’m seeing the ball really well. I hit a hard line drive into the outfield, but the ball happens to go right toward the left fielder and I’m out. If the ball had been just 5 feet in either direction away from the fielder it’s at least a double, but because of luck it’s an out. Did I deserve an out? Not really, but in this case luck was not on my side.

      The only way one can truly believe that luck plays no role, is to believe that a hitter can place the ball exactly where he wants it. Meaning, if he hits it right at the outfielder, then he should have done a better job of not hitting it there. If you have ever played the game, you know that is impossible. You can try to pull the ball, or try to go the other way, but you still have little control over whether the ball goes right at the fielder or not. And that’s what this whole discussion is all about.

      As for Joey Votto, of course the strikeouts are not due to bad luck. He is definitely off this season in that aspect. But all those screaming liners that have gone right at people? I don’t know what you would have expected him to do differently.

      • Good point on Votto’s strikeouts, DOC. Whatever he’s done to change his approach to combat inside pitching, the increase in K’s has been the result.

        Also, he’s gotten more high and away pitches called strikes this year than last, to the tune of 10-15% more called strikes on those pitches.

        If Votto can continue this type of K-rate until next week, I’ll explore it a bit further.

        • Yeah. When I’m defending Votto, I don’t mean to take away any responsibility he has with regard to his slow start. He DOES lead the team in strikeouts with 45, well ahead of Bruce, Duvall, and Suarez. Part of the reason his BA is just .215 (!) is that he has not put the ball into play enough.

          But when he has put it in play, he has really dealt with some bad luck, or the wrong side of statistical variance, as you will. He is hitting the ball harder than he ever has for his career, yet with only 10 XBH to show for it. Part of the reason for his low BABIP this season could be attributed to a slightly higher GB rate and slightly decreased LD rate. But that should be canceled out somewhat by the higher hard% too. Perhaps the strangest part of Votto’s stat line is to see he only has 4 doubles so far this year, an indication that when he hits the ball into the outfield, it’s more often than not going right at somebody.

  4. Thanks for this post! I’ve found myself frustrated with the Reds offense especially Votto’s struggles. But this post provides significant perspective on the Reds offense especially Votto.

  5. What Statcast is going to be capable of in 5 years is going to revolutionize how players are evaluated. There will always be room for a scout’s eyes but the amount and type of actual measured game data and what can be done with it from an analytic standpoint will be staggering.

    • Completely agree. Can’t wait!

      • LWblogger2, wondering why the excitement at the possibility of more sophisticated formulas. Much of the new data will be so “nuanced” it will have diminishing returns in value. Kinda like whether a new red car has a little more touch of blue in the paint formula this year than it had last year. How much does that matter in car performance? And remember, it’s reactionary data, therefore neither predictive nor visionary. I wonder if saber formulas will get to the point of including how many massages a guy is getting after games, or how often he clips his toenails. If the staggering pile of this new data is not predictive, and none of it is, how much good will it be in reality? Scouts, coaches and player development guys will always be in charge of evaluations. Their eyeballs, ears, and feel will always be the guiding force. They already have tons of info/data at their fingertips. Too much info can be distracting and crippling. Tony Perez made it all the way to the HOF with a “See the ball, hit the ball” philosophy. Paralysis by analysis is very real.

        We cannot assume Joey Votto will fix his huge increase in Ks this season, nor can we expect his BABIP to make it up to his xBABIP levels. We can hope, we can even expect, but we cannot assume or assure others. We can only say it’s unlike him in the past. Well, we already knew that without BABIP. We can see it. Whether it’s Batting Average, WAR, BABIP, or whatever, we never know for sure what’s coming from a player one season to next. Eyeballs and ears of good baseball people tend to notice subtleties and changes much quicker and clearer than math formulas. Heart, attitude, and confidence will never be measurable by mathematics, but experienced baseball people can spot them in an instant. And these are far more important traits than how often a guy hits to the opposite field.

        Jez sayin’.

        • First off, I know a thing or two about scouting and about traditional player evaluation. You’re 100% right in that it is very important, and that it isn’t going away, nor should it go away. The big difference that the Statcast data gives us that scouts and even newer statistics don’t give us is concrete, physical, measured, data. There isn’t guesswork involved in an exit velocity or in a player’s running speed. Statcast is a scout’s dream come true. It provides the scout with so many tools to provide a better evaluation of what his eyes and ears are telling him. Do you get where I’m coming from? It also provides the analytics people with a wealth of data that has no guesswork involved and in fact can simplify and not further complicate things from an analytical perspective. That’s why I’m excited.

        • This is not specifically directed at anyone in particular… but I always find it odd why people have a hard time trusting math in baseball, but they trust math in every other walk of life… such as trusting the math that keeps the roof over your head from crushing you to death, or the math that makes this message go over the internet so you can read it, or the math and physics that keeps you from wrecking your car every time you get into it.

        • Patrick, that’s a great observation. The distinction boils down to mere personality conflicts. Some people don’t like to be told how to enjoy their pastime or how to think or to change their way of thinking or even be presented with a different view point. And some folks are just plain contrarians. Not necessarily believing the position they are taking but taking it just because it’s argumentative.

          Can’t really change that way of thinking for some. But there are many folks that are open to different approaches and view points. Best to just recognize what it is and move along. There’s always a Yahoo message board that’s waiting for a fight to be picked.

  6. Holy smokes Patrick! Your breakdown in the comments above is even more interesting than an already interesting main article. A ton of great work in that comment!

    • Slow Fridays where all your direct reports are gone make for plenty of time to work on a spreadsheet. 🙂

  7. It’s nice to be able to put some numbers to the often invisible idea of ‘luck’ or variance. I guess it’s good to see that most of our hitters – Votto especially – are currently hit with ‘bad luck’ and possibly due for some hits falling in. Interesting article!

  8. maybe some day hard hit balls will be factored into “possible runs scored” and then we can say the Reds are “unlucky” at scoring runs and we should be in first place. Hopefully Votto isn’t considering it an unlucky factor and is attempting to make an adjustment. I think it is as simple as that. Pitchers made an adjustment to Votto and he hasn’t readjusted. Duvall is an unknown player and teams are still learning how to pitch the guy. Love the article not picking a fight but I do find it odd how much luck gets tossed in on a sabermetric primary website.
    Something has changed and the specific formula hasn’t been presented to me that is the unluck.
    If we go by hard hit balls then Schebler or whatever his name is would still be up with the Reds playing. I don’t find the hard hit balls stat track very useful.

    • What adjustment is there to be made when you hit a hard line drive, but it goes right at the outfielder? Do you not want your players to try to hit hard line drives as much as possible?

      Again, unless you’re trying to say that players can (or should) be able to control EXACTLY where the ball will end up, then there is an element of luck involved. Whether a line drive goes right at a fielder, or drops in a few feet away, is not something that can be controlled by the hitter. It’s just not.

    • Dan, no one is trying to explain away Votto’s increased strikeouts this year as some form of bad luck. He’s not making contact nearly enough, and that is really hurting him. That is where the adjustment needs to be made.

      But when he has made contact, he hasn’t gotten the results that he should’ve based on how the ball has been hit. Just the same as saying that Duvall has played well, but has also benefited from some poorly hit balls that still managed to become hits. That’s where the luck discussion comes in.

    • Perhaps you should re-read the beginning of the article rather than getting hung up on the word “luck.”

      It’s fine if you don’t find hard hit balls useful. I’m sure you also don’t find the math behind load-bearing walls useful, but I bet you’re happy the engineers took the time to figure it out. 😉 Hopefully in a few years you’ll be happy some baseball fans decided to pay attention to hard-hit balls.
