Advanced Metrics

Hitting as well as can be expected

The Statcast system – Doppler radar and high-definition optical cameras – is installed in every major league park. Since 2015, Major League Baseball has tracked movement in a baseball game at 30 frames per second. On any pitch, Statcast records more than 600 measurements, which over a season is four billion data points.

This deluge of information provides more ways to measure and evaluate players. Critics say it detracts from the enjoyment of baseball, but for the average fan, data can produce richer narratives. Did Scott Schebler’s shoulder injury lower the velocity of his throws from the outfield? Is Joey Votto swinging at a steeper launch angle? Is Raisel Iglesias throwing his slider with more break? Is Billy Hamilton the fastest runner in baseball?

For the business of running a major league team, wonderment takes a back seat to analysis. Measurement and evaluation are crucial to understanding your team and players you are interested in acquiring. The big payoff is if an analytics department can design leading-edge statistics using that data to better predict player performance and therefore uncover meaningful markers for building rosters.

A few stats based on the new data are making their way into the public sphere. One is expected weighted on-base average (xwOBA), a way to evaluate hitters.

Can xwOBA offer insight into the Reds roster and pending offseason decisions? Before we get to that, let’s work through what the statistic is and then what it means.

[If learning details of new statistics doesn’t crank your gears and methodology makes your eyes glaze over, skip to the final section titled How xwOBA Informs Reds Decisions for the fun stuff.]

Start with On-Base Average (OBA)

The starting point for xwOBA is an older statistic, on-base average (OBA). OBA is another name for on-base percentage (OBP). It’s an easy statistic no matter what you call it. OBA measures how often a hitter gets on base. More formally, OBA is a rate statistic that credits a hitter with each safe event (walk, hit-by-pitch, single, double, etc.).

OBA does share a deficiency with batting average (AVG): it doesn’t differentiate between positive outcomes. All ways of getting on base count the same. Home runs improve your batting average and OBA the same as singles. So as an individual measure, it offers no way to show how much power the hitter possesses.

One way to fix this shortcoming is to count the extra bases a hitter achieves – one extra base for a double, two for a triple and three for a home run. This is the basis for the popular statistic we know as Slugging Percentage (SLG), which counts total bases per at-bat, and a more modern formulation, Isolated Power (ISO), which counts only the extra bases.

While SLG and ISO add important information to AVG and OBP, they aren’t as precise as possible. Doubles aren’t quite worth twice as much as singles in creating runs, and home runs aren’t four times better.
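For readers who like to see the arithmetic, here is a minimal Python sketch of the traditional rate stats discussed so far. Both stat lines are invented for illustration, and the formulas are the standard public definitions (OBA from hits, walks, HBP and sacrifice flies; SLG as total bases per at-bat; ISO as SLG minus AVG), not anything specific to this post.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One hitter's counting stats. Every number used below is made up."""
    ab: int
    bb: int
    hbp: int
    sf: int
    singles: int
    doubles: int
    triples: int
    hr: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr


def avg(b: BattingLine) -> float:
    return b.hits / b.ab


def oba(b: BattingLine) -> float:
    # Every safe event counts exactly once, whatever its type.
    return (b.hits + b.bb + b.hbp) / (b.ab + b.bb + b.hbp + b.sf)


def slg(b: BattingLine) -> float:
    total_bases = b.singles + 2 * b.doubles + 3 * b.triples + 4 * b.hr
    return total_bases / b.ab


def iso(b: BattingLine) -> float:
    # Extra bases only: SLG with the singles-level credit removed.
    return slg(b) - avg(b)


# Two hypothetical hitters with the same number of hits, walks and HBP.
slap_hitter = BattingLine(ab=500, bb=60, hbp=5, sf=5,
                          singles=150, doubles=0, triples=0, hr=0)
slugger = BattingLine(ab=500, bb=60, hbp=5, sf=5,
                      singles=90, doubles=30, triples=2, hr=28)

for name, line in [("slap hitter", slap_hitter), ("slugger", slugger)]:
    print(f"{name}: AVG {avg(line):.3f}  OBA {oba(line):.3f}  "
          f"SLG {slg(line):.3f}  ISO {iso(line):.3f}")
```

The two made-up hitters have identical AVG and OBA, and only SLG and ISO tell them apart. That is the deficiency described above.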

To address that we can weight positive hitting outcomes based on run creation instead of extra bases. Every method of getting on base contributes to run scoring in one or more of three ways: (1) driving in runners already on base, (2) advancing runners already on base, and (3) putting a new runner on base. Each type of safe event affects those three factors differently.

That brings us to “weighting” OBA based on run creation.

Weight Hitting Outcomes

Weighted on-base average (wOBA) is a statistic first introduced by Tom Tango, Mitchel Lichtman and Andrew Dolphin in The Book: Playing the Percentages in Baseball (2006).

wOBA weights each positive hitting outcome (HBP, unintentional BB, 1B, 2B, 3B, HR) by the average number of runs it produces. The weights aren’t hypothetical; they are empirical. Run production is based on actual baseball games played across the league in a given season. For example, the weighted value of a single comes from the average run production of every single hit in major league games.

[The reason you haven’t seen wOBA cited at Redleg Nation much is that it’s basically the same as weighted runs created plus (wRC+), which we do use frequently. wRC+ takes the data for wOBA and puts it on a scale where 100 is league average (that’s the “plus”). That scale makes comparison to average and calibrating year-to-year variations easier.]

Back to weighting on-base outcomes. Here are the weights for each safe event so far in 2017:

  • wBB = .693

  • wHBP = .723

  • w1B = .877

  • w2B = 1.231

  • w3B = 1.550

  • wHR = 1.976

The weights are constructed to produce a league-wide wOBA equal to league OBA.
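As a rough illustration, the 2017 weights listed above can be plugged into a short Python function. The stat line is invented. The denominator (AB + BB – IBB + SF + HBP) follows the wOBA formula FanGraphs publishes, which is an assumption here since this post doesn’t spell it out.

```python
# 2017 weights as listed above.
WEIGHTS_2017 = {
    "ubb": 0.693,   # unintentional walk
    "hbp": 0.723,
    "1b": 0.877,
    "2b": 1.231,
    "3b": 1.550,
    "hr": 1.976,
}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr,
         weights=WEIGHTS_2017):
    """Weighted on-base average for one stat line.

    The denominator is an assumption taken from FanGraphs' published
    formula: AB + BB - IBB + SF + HBP.
    """
    ubb = bb - ibb  # intentional walks earn no credit
    numerator = (weights["ubb"] * ubb
                 + weights["hbp"] * hbp
                 + weights["1b"] * singles
                 + weights["2b"] * doubles
                 + weights["3b"] * triples
                 + weights["hr"] * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Hypothetical season: 500 AB, 60 BB (5 intentional), 5 HBP, 5 SF,
# 90 singles, 30 doubles, 2 triples, 28 home runs.
example_woba = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
                    singles=90, doubles=30, triples=2, hr=28)
print(f"wOBA: {example_woba:.3f}")

# How many singles is a home run worth in run creation?
hr_in_singles = WEIGHTS_2017["hr"] / WEIGHTS_2017["1b"]
print(f"One HR is worth about {hr_in_singles:.2f} singles")
```

By these weights a home run is worth a bit more than two singles, not four, which is the correction to slugging percentage described earlier.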

Weighting transforms a statistic (OBA) from one that doesn’t differentiate between types of hits to one that reflects the productiveness of the hitter in terms of run creation. That covers not just his runs batted in (RBI), which are easy to count, but also the precise role that advancing runners and putting runners on base plays in scoring runs.

wOBA had been around for a decade before the Statcast data arrived. That brings us to expected weighted OBA.

Add Expectations

Statcast measures batted balls for Exit Velocity and Launch Angle. Every combination of Exit Velocity and Launch Angle produces a specific average wOBA. Again, these aren’t hypothetical numbers. The wOBA values are based on the outcomes of baseballs hit that way this season. For example, take every ball hit at 103 mph Exit Velocity and 25° of Launch Angle and figure out how many runs those balls produced.

From that set of data – the wOBA for every EV and LA combination – we can figure out how many singles, doubles, triples and home runs a batter should have hit, independent of fielding/defense and random chance about where the balls fall. That’s because we know the EV and LA combination on every ball that every batter did actually hit.

Take that data for a hitter’s actual batted balls, add in the established run-creation weights for his actual walks, HBP and sacrifice flies, and you come up with a statistic that shows what his run production was, independent of factors beyond the hitter’s control.

That’s expected weighted OBA.
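The steps above can be sketched in a few lines of Python. This is a simplified illustration, not Statcast’s actual method: the fixed-width EV/LA buckets, the helper names and every batted ball in the toy data are assumptions made up for the example. It reuses the 2017 weights listed earlier.

```python
from collections import defaultdict

# wOBA value of each batted-ball result (2017 weights from this post).
WOBA_VALUE = {"out": 0.0, "1b": 0.877, "2b": 1.231, "3b": 1.550, "hr": 1.976}
W_UBB, W_HBP = 0.693, 0.723


def bucket(ev, la, ev_step=3, la_step=4):
    """Group similar batted balls; the bucket widths are arbitrary choices."""
    return (int(ev // ev_step), int(la // la_step))


def build_table(league_balls):
    """Average wOBA value of every league batted ball in each EV/LA bucket."""
    totals = defaultdict(lambda: [0.0, 0])
    for ev, la, outcome in league_balls:
        cell = totals[bucket(ev, la)]
        cell[0] += WOBA_VALUE[outcome]
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def xwoba(batted_balls, table, ubb, hbp, strikeouts):
    """Expected value of each batted ball plus actual walks and HBP."""
    expected = sum(table.get(bucket(ev, la), 0.0) for ev, la in batted_balls)
    numerator = expected + W_UBB * ubb + W_HBP * hbp
    # Batted balls plus strikeouts stand in for AB + SF in this sketch.
    denominator = len(batted_balls) + strikeouts + ubb + hbp
    return numerator / denominator


# Toy league sample: (exit velocity mph, launch angle degrees, result).
league = [
    (103, 25, "hr"), (104, 26, "hr"), (103, 27, "out"), (104, 24, "2b"),
    (75, -10, "out"), (76, -9, "out"), (75, -11, "1b"), (76, -10, "out"),
]
table = build_table(league)

# A hypothetical hitter: one barrel, one weak grounder, a walk, a strikeout.
hitter_xwoba = xwoba([(103, 25), (76, -10)], table,
                     ubb=1, hbp=0, strikeouts=1)
print(f"xwOBA: {hitter_xwoba:.3f}")
```

Note that the hitter’s actual results never enter the calculation for batted balls. Only how hard and at what angle he hit them matters, which is what strips out defense and luck.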

xwOBA removes random elements of defensive plays and luck. Like wOBA, it’s put on the scale of league OBP. League-average OBP is .322.

What is a good wOBA or xwOBA? Here is a guide:

  • .390 = Top 10 of qualified players

  • .370 = Great

  • .340 = Above Average

  • .322 = Average for qualified players

  • .310 = Below Average

  • .300 = Poor

  • .290 = Bottom 5% of qualified players

Research shows xwOBA offers better predictive value of a hitter’s future run production than his actual wOBA.

This is a fancy way of backing up traditional baseball scouting. For decades, baseball scouts have said something like: “Player A is only batting .250, but he’s sure been hitting the ball hard. The hits just haven’t fallen in. I expect he’ll produce more in the future.”

Today, radar and optical cameras allow us to take that scout’s insight, validate it and attach an exact number. That’s a powerful tool.

The top 12 major league hitters by xwOBA include: Aaron Judge, Mike Trout, J.D. Martinez, Freddie Freeman, Paul Goldschmidt, Anthony Rizzo, Giancarlo Stanton, Bryce Harper … and Joey Votto. So it’s a decent sorting stat.

OK, that’s the method. Let’s apply it to the Reds and a few of their upcoming roster decisions.

How xwOBA Informs Reds Decisions

Here’s a table showing the wOBA, xwOBA and the difference between the two numbers for the Reds regular position players.

The final column is the player’s xwOBA rank among hitters with at least 200 plate appearances. 320 players – or about 10 per major league team – meet that benchmark.

Positive numbers in the third column mean expected hitting exceeds actual hitting and argue for optimism regarding that player going forward. Negative numbers (in red) indicate the opposite. Remember, studies show that xwOBA predicts future run production better than wOBA.
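Reading the third column amounts to a subtraction and a sign check, sketched below in Python. The hitters and their numbers are placeholders, not the Reds’ actual figures, and the 10-point tolerance for calling a gap meaningful is an arbitrary assumption.

```python
def gap(woba, xwoba):
    """Third column: expected minus actual."""
    return round(xwoba - woba, 3)


def outlook(woba, xwoba, tolerance=0.010):
    """Label a hitter by the sign of the gap; the tolerance is a guess."""
    difference = gap(woba, xwoba)
    if difference > tolerance:
        return "optimism"   # hit better than his results show
    if difference < -tolerance:
        return "caution"    # results ran ahead of contact quality
    return "neutral"


# Placeholder rows: (name, wOBA, xwOBA). None of these are real players.
rows = [
    ("Hitter A", 0.340, 0.365),
    ("Hitter B", 0.360, 0.330),
    ("Hitter C", 0.300, 0.301),
]

# Sort by xwOBA, best first, the same way the final column ranks players.
for name, woba, xwoba in sorted(rows, key=lambda r: r[2], reverse=True):
    print(f"{name}: wOBA {woba:.3f}  xwOBA {xwoba:.3f}  "
          f"diff {gap(woba, xwoba):+.3f}  {outlook(woba, xwoba)}")
```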

Do these numbers tell us anything about decisions confronting the Reds this offseason?

Well the data is in a table, so it must mean something. Or four things.

1. Zack Cozart and Scooter Gennett

Bright red warning signs are flashing for both players. Their expected production based on how they’ve hit is well below actual production. This confirms a common-sense reading of their career numbers. xwOBA is a strong indicator that neither player will repeat his huge 2017 season. In Cozart’s case, an xwOBA of .336 is still well above average.

2. Adam Duvall and Scott Schebler

Duvall and Schebler have had similar years in actual production (see their respective wOBA). Which one to keep? Other factors offer split advice: Duvall has an edge in defense, Schebler in age. But xwOBA strongly leans toward Schebler. He’s hit the ball hard and not seen the expected production yet. Duvall, just the opposite. Clear evidence for how the Reds should handle the crowded outfield.

3. Billy Hamilton and Jose Peraza

Hamilton and Peraza are at the bottom of many league-wide batting stats. That’s because neither hits the ball hard. So it isn’t surprising that xwOBA is awful for both, since it’s based on the quality of ball striking. But Peraza has a clear edge. xwOBA is another piece of evidence to throw on the pile that shows we shouldn’t expect Hamilton to get better at the plate.

4. Joey Votto

We aren’t worthy of Joey Votto. But we didn’t need Doppler radar pulses to tell us that.

66 thoughts on “Hitting as well as can be expected”

  1. Very fascinating content Steve. Granted, offensively this table illustrates that BHam and Duvall might be the best to trade this winter. But those 2 gold gloves both players are probably going to be awarded will tell a different tale.
    Gennett certainly seems like a trade-high candidate, even more so after this evidence. Re-signing Cozart looks more and more implausible. I don’t think the Reds can keep both Duvall and Schebler, and this suggests that Duvall might as well be the sell-high candidate instead of Schebler.
    Dovetail this information with salaries and the 2018 budget, and a clearer picture is emerging of what might transpire on the roster front this winter.
    Nice work Steve.

  2. Excellent stuff, sir! Thanks for posting this. I’m not sure I change a re-signing decision with it though. I’d still re-sign Cozart, but use this data to draw a hard line in the sand on numbers. If that’s too low for him, then he walks.

    Opposite with Scooter: if a trade is made I demand a return equal to his current, not predicted, production, but if this isn’t granted, I make future decisions internally based on the predicted.

    When you say ‘clear evidence’ about what to do in the OF, what do you mean? I think this data makes it cloudier than ever before! Schebler looks like a potential All-Star, but he isn’t one yet. And BHam’s biggest asset to the team isn’t captured in this stat.

    Thanks again for your thought-provocation!!

    • Pretty sure he means it’s a no-brainer that the Reds should keep Schebler over Duvall. Schebler’s xwOBA: above average. Duvall’s: poor.

  3. I could probably look this up, but how does striking out, but reaching base (on passed ball or wild pitch on third strike) factor in? Avg, OBP, and the advanced stats?

    • Like an intentional walk or reaching on an error, it isn’t treated as a positive batting outcome.

      • Just curious, why is a HBP worth more than a walk? With a walk there is always a chance you can advance more than one base due to the possibility of a wild pitch, while with a HBP you only get one base max.

        • That’s an interesting question. First of all, it’s supported by the facts of what has occurred in games. The weights are descriptive, not theoretical.

          So the question is why are HBP more associated with run scoring than are BB? A few theories have been advanced. The one I like the best is that HBP is more of an indication that the pitcher doesn’t have control than is a walk. Sometimes pitchers work around hitters and walk them, knowing they can get out of innings. Pitchers rarely hit batters on purpose. Maybe there is a mental impact on the pitcher from hitting a batter that carries over to subsequent batters. Walks do tend to occur when they do less damage.

          • I think your last sentence is the one that makes the most sense. HBPs are mostly randomly distributed, so they can happen at really bad times, whereas walks can sometimes be pitch-arounds that help the pitcher, like walking Votto to get to Duvall before Price finally dropped him in the lineup.

  4. Great work Steve. I was surprised to see the variance between Duvall and Schebler.

    Quick question: do shifts skew the data? Meaning, does Schebler’s left-handedness and the resulting shifts used against him depress his output in a way that could be the norm moving forward?

    • Re: OF – Me, too. Also the drop off for Gennett and Cozart.

      I haven’t read anything about shifts in relation to xwOBA, but there is probably stuff out there on it. I did think about it a while and my guess is that batters who hit into shifts put themselves at comparative disadvantage vs. expected production. That might explain some of what’s going on with Schebler, but I don’t know. It doesn’t seem to show up for Votto or Gennett, other lefties who hit into it.

      I think you’ve hit on a good area of thought about xwOBA – hitters who have tendencies in where they hit the ball (like extreme pull hitters) and who deviate systematically from what the average hitter – and thus the average wOBA associated with an EV/LA combination – does. Hitting into a shift might be another example of that.

      • We already have spray charts for players. I don’t know if this is based on the eye test or not. Statcast should easily be able to do this. Since it can get the vertical launch angle, it should be able to get horizontal. With this, it could be weighted differently for lefties and righties, and also based on LHP/RHP.

  5. It’s not easy to look up xwOBA. The best/easiest way I know is at Baseball Savant, the site that MLB uses to house Statcast results. Use the Statcast Search function there, and (1) switch Player Type to Batter from the default Pitcher, (2) choose xwOBA in the Sort By field, and (3) set a minimum number of plate appearances high enough to rule out pitchers to make it easier to read through.

    https://baseballsavant.mlb.com/statcast_search

    Pretty soon, I expect to see xwOBA start to appear in mainstream sites like FanGraphs and Baseball-Reference. I’ve seen more and more writers use it.

  6. I really…REALLY…enjoyed that article. I had one question about Schebler and the “hard hit balls that coulda/shoulda been hits”. Schebler appears to me to be more of a pull hitter than Duvall. Just my impression but let me know if otherwise. It also seems that he sees more shifts from opposing teams. As a result, many of these rockets he hits see a glove that might not be there otherwise. How exactly is this (the shift element) factored into these more sophisticated metrics?

    It also appears to me that we give Schebler a lot more slack for his numbers because of the injury that he played through…but I’m not totally convinced. Now…don’t get me wrong…I like the guy but I’m not sure with the defense that Duvall is not the preferred keeper. Can you convince me otherwise? Great job!

    • See my reply to sultanofswaff’s comment above about the shift. I don’t know the answer for sure. But my impression of how xwOBA works is that it would over-value hitters who face shifts a lot and hit into them. It would mainly affect singles. There might be a similar dynamic with extreme pull hitters.

      Regarding Schebler v. Duvall – they each pull the ball about 44% of the time. It would have to be an extreme pull – down the line – to cause under-valuation by xwOBA.

  7. So where is the metric that incorporates speed into the player’s worth? wRC and wOBA should be thrown out the window for players like Billy and Peraza, since they aren’t power hitters, and their biggest asset isn’t used in such statistical metrics. Such a stat would also negatively affect just about everybody else. Maybe WAR comes close??

    • You can’t “throw out the window” the measurements of hitting just because certain players are bad at them. Billy Hamilton is TERRIBLE at hitting and producing runs because he hits the ball so softly. But yes, you have to add a speed stat in to get a full picture, and defense. xwOBA is just about producing runs at the plate.

      Billy Hamilton isn’t a magic run producer with his speed. Speed doesn’t drive runners in or advance runners, two of the three parts of producing runs. xwOBA does under-value fast runners in the sense that they can get on base (or farther on base) than on an average hit. For example, a fast runner can reach first base on more topped balls than a slow runner. But Hamilton’s batting average is still really low.

      Here’s some research showing the effect of base running speed on xwOBA is low:

      https://www.vivaelbirdos.com/2017/8/21/16176812/statcast-sprint-speed-tommy-pham-yadier-molina-albert-pujols-cardinals

      To answer your question: FanGraphs has a speed/base running component in their WAR calculation. It looks at net value of stolen bases plus other base running and figures out how many runs above average the speed is worth. Hamilton leads MLB with 10.5 runs above average. That’s good for one win/year. It’s essentially all of his value. His lousy offense cancels out his great defense.

      Hamilton’s overall WAR – accounting for offense, defense and base running – is 1.2 at FanGraphs, 0.9 at Baseball Reference and 0.9 at Baseball Prospectus. Very low.

      • Thanks for the response….very well explained.
        For argument’s sake, if 60 of Billy’s singles were doubles and he had 0 stolen bases he would only have the equivalent of about 1 WAR?

        I understand he is not advancing runners or driving them in when he steals a base. But he is advancing himself, why is that not in the computation?

        • The difference in a double and single (listed above) is about .35 runs… so if he were to turn 60 singles into doubles, that would be 60*.35, or about 21 runs. Based on the current runs-per-win for fWAR of 10.07 this season, that’s just over 2 WAR.

          Then, you’d have to subtract his net stolen base value, which is 7 runs according to FG (meaning 3.5 of the 10.5 Steve mentions above is from other base running), and you’d be at 14 runs, or right about 1.4 WAR.

          It’s not in xwOBA, but it shows up in actual wOBA, so with him, you’d have to do the sort of calculation we just did, along with some other stuff, to try and tease out the effect of him turning “expected singles” into “actual doubles.”

          To be honest, I think the effect is quite low. He’s got 17 doubles this year and I bet maybe only half of them were the “expected single” variety.

        • xwOBA is meant to evaluate hitting, not overall performance. WAR, which does incorporate stolen bases, measures the overall value of a player.

          A single and SB is nowhere near as valuable in run production as a double. As you point out, doubles are better at driving in runners. If a single doesn’t drive a runner in from 1st or 2nd, adding a SB on top of it wouldn’t help. Same with advancing runners.

          But adding 60 SB would increase the chances of your own run scoring, that’s why it’s added in WAR. It’s not in xwOBA because that’s not what it measures.

    • Billy’s wOBA/xwOBA differential might be the only one above that may be related to talent, as opposed to luck. He can use his speed tool to get an extra base out of a batted ball, sometimes. That .03 difference is enough to move him 1-2 tiers in Steve’s guide above; it is the difference between below average and above average, or between above average and great. The big problem, as Steve rightly responded, is that the final number, even after the discrepancy bump, is still SO low that it’s a full tier below the “bottom 5%” tier.

      This saddens me, because what Billy excels at- taking extra bases and making crazy diving catches- is a very entertaining part of baseball, it just isn’t as valuable (in runs created or wins) as hitting for power and drawing walks. I suspect it still has great value in terms of ticket sales and dollars, which matters to owners quite a bit.

      This is a bit of a pet peeve of mine: the analytics are encouraging plate discipline and walks, while viewers want shorter games and flashy plays. Baseball is looking at pitch timers while simultaneously providing stats to players that suggest working an 11-pitch walk is a great thing. Baseball should be using its stats to try and find possible rule changes that would lead to shorter, more entertaining play; what can we do to make putting the ball in play early in the count more valuable (in runs/wins) than plate discipline? If that becomes more valuable, then different skill sets are encouraged and promoted from lower levels, and we may end up with a better product on the field at the MLB level.

      • The most productive thing is always going to be swinging at good pitches to hit, and minimizing swinging at bad pitches to hit.

        I don’t think there’s ever a way around that because of physics and the length of the bat and whatnot.

        • Andy is dead on though with respect to entertainment value. It really is boring to watch home runs and walks and strike outs and not much of anything else. The game’s long term future is in jeopardy. Basketball in some ways has surpassed baseball because not enough is happening on the field.

  8. The ‘problem’ as we discuss offseason moves, especially trades (and the trade value of each player), is that the other teams all have this info, too. And more, much more data, as well. If Duvall is already past peak age and is only predicted to be the 233rd best hitter in the league, it would take a unique situation for a trading partner to be willing to send much value back in the Reds’ direction in a trade. If Gennett is expected to be just below average in hitting while being below average in base running and defense (per FanGraphs), he won’t gain the Reds much in most trade scenarios – again, finding a trading partner that would more highly value him for their own reasons is key. Or conversely, finding a way to maximize players’ values to the Reds’ production next year (e.g. platoons) can bring the Reds more value than a trade.

    • This is a great point. I’m working on a post (nowhere near done yet) about possible Reds trades that take this thinking into account. I think readers here will find them provocative and interesting.

    • I feel a little bad for Duvall. He had a .904 OPS at the end of June. He was still at a healthy .845 through August 19th. He was right near the top of the NL in extra-base hits into July as well. Not to mention he’s an excellent LFer and leading the league in assists. I think he just wears down? Baseball is hard and playing baseball every day w/diabetes has got to be difficult? It’s that simple.

      Price is not a good manager and doesn’t utilize his personnel correctly. I don’t know how many times he’d play Tucker vs a lefty and then turn around and give Mesoraco a spot start vs a righty. Hamilton leading off while Winker is 6th, etc. How many weeks did he pitch Bob Steve 1 inning a week in a blowout? How did that help develop him? Develop his bank account compared to AAA maybe? Price.is.an.idiot PERIOD! Duvall may not be a legitimate All-Star but he’s not Jonny Gomes either.

      • I have often wondered what Duvall’s production would be if he were to get regular rest. I agree with your assessment – course no way to prove it at the moment but if he stays with the Reds next year, he needs at least one day off a week.

        • And that, more than any other reason, is why extending Price’s contract for 2018 was an absolutely horrendous move. Not only will Price not put the team and players in the best situation to succeed, he will continue to risk the health of players.

      • Only thing I can add is he did it to Duvall again just like he did last year. Price just does the same things over and over but in all honesty he has never been held accountable. I expect him to do the same things in 2018.

  9. That is my problem with most of the “we should trade x for a starting pitcher” ideas: usually it is based on the premise that player x is either horrible or is due for a regression. The rest of the league has the same info on these guys, so unless they really value some specific metric higher than everyone else, i.e., base running, you aren’t getting a #1 or #2 pitcher in return. Obviously if some front office is obsessed with getting players who can steal a base and will overpay for a guy like Hamilton then you take it. I find it unlikely you can package Hamilton, Scooter, and some AAA guy that wasn’t good enough to make the Reds rotation for a cost-controlled #2 starter. The other clubs aren’t basing the decision off ESPN top ten highlights of diving catches and four-home-run games.

  10. Steve, is the data for xwOBA available in periodic or segmented results (i.e., monthly, pre-All-Star break, customized dates, etc.)? I was specifically wondering how injuries might impact season-long results. That might impact predictability for players like Duvall, who struggles late in the season, Schebler, who played with a barely functioning shoulder, or Cozart, who is playing with two gimpy quads.

  11. Well, there it is. I knew it would be just a matter of time before Sabermetrics was used to justify getting rid of Cozart and Gennett. It sounds as if xwOBA is your bible. “This is what xwOBA says, therefore we do what xwOBA says! ALL HAIL XWOBA!” Let’s hope the Reds don’t listen to this.

  12. “Well the data is in a table, so it must mean something. Or four things.”

    A man after my own heart!

  13. After reading some of the comments above about Duvall, resting, and the FG article about Tony Cingrani, it’s possible another team may take him as part of a LF platoon that gives him regular rest and only plays him in situations advantageous to his strengths, while hopefully having another OFer to complement that. Perhaps someone like Jesse Winker would have been this season…

  14. An all time great RLN discussion here…thank you, Chad and all who have followed as editors and contributors.

  15. I would love to know how many teams construct their own metrics as opposed to using known metrics such as those discussed in this post. I suspect that the teams hiring at the Ph.D. level in machine learning and data mining are constructing their own.

    • I think pretty much every team has their own calculation for some kind of WAR equivalent, as a quick and dirty measure. I also imagine that most analytic departments have proprietary statistics that they generate and use for player evaluation. I mean, these guys are doing this at least 8 hours a day, 5 days a week. In a lot of cases I’m sure it’s even more. What is being done with the Statcast data that isn’t public domain? There is a lot more out there than what Baseball Savant has. The amount of data is staggering. I’d love to know what teams are doing with it.

  16. Incisive, cutting edge stuff Steve, oh, er…I mean, Mr. Jonah Hill 🙂

  17. Already mentioned above but bears repeating – Schebler may hit the ball harder but so often he simply pulls the ball into the teeth of the shifts employed against him. Surprised actually he only pulls the ball 44% of the time; it feels more like two thirds at least. If he could develop into a ‘hitter’ like I’ve seen from the Royals or the Cardinals players, he’d slap the ball the other way or bunt easily to get on base – heck, during this amazing Indians run I saw a few Indians do exactly that.

    If it’s a choice between Schebler and Duvall for me Duvall’s defense is far better than Schebler’s and outweighs how much harder Schebler hits the ball. Until the second half of the year when Duvall wore down again, Duvall also hit much better with runners on base than Schebler.

    It’s not either or though; given guys get hurt, you need depth, and because both Duvall and Schebler are relatively inexpensive wrt their production in 2018, it is better served for the Reds to keep both and rest them both plenty or platoon them until those times Billy is hurt diving for a ball or bunting badly. Winker of course should play everyday because he will contribute more than either Duvall or Schebler on a day in day out basis.

    • I’m willing to bet that what you’re perceiving has a lot to do with Schebler’s pull % when he hits the ball on the ground. I haven’t looked but I bet it’s just as you expect, much higher than 44%. The 44% is for all balls he hits including the ones in the air and on a line. On grounders it’s probably higher. If I get a few, I’ll look it up.

      • 2015 – 66.7%
        2016 – 61.2%
        2017 – 54.5%

        Going the right way, at least!

        Although, shifting is more of a pull-to-center thing, since they usually have a guy standing right by 2nd base.

        So, let’s look at Opposite Field rate.

        2015 – 16.7%
        2016 – 6.8%
        2017 – 9.8%

        So, yeah – 90.2% of his grounders are pulled or up the middle. About as easy to shift on as you can get.

  18. Price won’t platoon Billy as long as he is still a Red. He will play every day and bat leadoff. Hard for me to keep Winker on the bench but it’s equally hard to keep 30-homer outfielders on the bench. Don’t know if the nation can handle Billy getting 5 at bats a game while one of the other 3 guys watches from the dugout. I know I can’t, but if we think Price is going to sit Billy if he is healthy then, well, we know better.

  19. Interesting and thought-provoking. After you get past the hypothetical trade options, which only the real market can determine, the only real decision is Cozart.
    My #1 priority would be to sign Suarez to a long term extension.
