Defensive metrics and Eugenio Suarez

Baseball has been a game of numbers and keeping stats from the start. The New York Morning News printed the first baseball box score in 1845. Other newspapers followed suit with printed tables of statistics after each game.

One of the earliest baseball outcomes measured was the defensive error. Philadelphia shortstop Billy Shindle made 122 of them in 1890. That remains the single-season record. Herman Long holds the major league career record with 1096 errors committed between 1889 and 1904. Those records appear safe unless players start using gloves from the 19th century.

“An error is a mistake by a fielder that allows a batter to reach base, or a runner to advance an extra base, or allows an at bat to continue after the batter should have been put out.” (Baseball-Reference) An official scorer judges whether an error has occurred based on whether the play could have been made with “ordinary effort.”

For decades, broadcasters and fans have used the number of errors committed and that stat’s first cousin fielding percentage as the criteria to evaluate the defensive skill of players.

The profound weaknesses in that practice are obvious. Skills such as defensive range and arm strength remain unconsidered. In fact, a player with above-average range exposes himself to the real possibility of more errors, because he reaches more balls.

Consider this example: Two fielders are confronted with an identical set of 100 ground balls. Fielder A gets to 60 of them and makes a perfect fielding play each time. 10 times his arm is too weak to beat the runner. Fielder A recorded 50 outs on the 100 ground balls. The number of errors he commits is zero and his fielding percentage a pristine 1.000.

Fielder B gets to all 100 ground balls, but 10 bounce off his glove or roll through his legs. He also makes five inaccurate throws to first base. The official scorer assigns 15 errors to Fielder B and his fielding percentage is .850. But Fielder B has recorded 85 outs, compared to 50 by Fielder A.

Fielder B is more valuable to his team, but to the average viewer – and traditional box score reader – he looks much worse.
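The arithmetic behind the two fielders can be checked with a short sketch. It assumes the simplified definition of fielding percentage as error-free chances divided by total chances; the function and variable names are illustrative only.

```python
def fielding_percentage(chances: int, errors: int) -> float:
    """Share of fielding chances handled without an error."""
    if chances <= 0:
        raise ValueError("chances must be positive")
    return (chances - errors) / chances


# Fielder A: reaches 60 of the 100 balls, commits no errors,
# but 10 runners beat his throw, so only 50 become outs.
a_chances, a_errors = 60, 0
a_outs = a_chances - 10

# Fielder B: reaches all 100 balls, with 10 fielding errors
# and 5 throwing errors; everything else becomes an out.
b_chances, b_errors = 100, 10 + 5
b_outs = b_chances - b_errors

a_pct = fielding_percentage(a_chances, a_errors)
b_pct = fielding_percentage(b_chances, b_errors)

print(f"Fielder A: {a_pct:.3f} fielding pct, {a_outs} outs")
print(f"Fielder B: {b_pct:.3f} fielding pct, {b_outs} outs")
```

Fielding percentage only sees the balls a fielder touches, so the 40 balls Fielder A never reached do not count against him at all.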

Errors would be an incomplete and misleading indicator of defensive skill even if they were determined in an objective way. Instead, they are based on what a human being thinks should have happened. Bill James has pointed out that a baseball error is the only major statistic in sports determined that way.

Think of the questionable scoring decisions – whether due to an honest mistake or hometown bias – we see all the time. If you doubt that those judgments are subjective, consider that players can call the press box and have scoring decisions overturned. And yet, these decisions are the foundation not only for traditional defensive stats, but also for ERA, one of the most trusted measurements used to evaluate pitchers.

Errors and fielding percentage aren’t meaningless. In fact, they are important components of evaluating defensive skill. But alone they fall far short in providing a complete picture.

Accurate measurements are crucial as defense is being stressed by teams more than ever. Players like Jason Heyward and Elvis Andrus have earned massive contracts in large part because of the defensive skill they offer.

To correct the shortcomings in traditional defensive stats, we’ve seen the creation of several advanced measurements (metrics). Stats like Ultimate Zone Rating and Defensive Runs Saved among others have become available to the public at sites like FanGraphs.

Many fans and broadcasters question the reliability of the new defensive metrics. Variables like range and arm strength are hard to measure. There is no agreement on which defensive metric(s) is best, and different metrics produce dissimilar results. Sample sizes are pretty small considering the tiny number of difficult batted balls hit to each fielder in a season. A couple of unlucky or injury-induced bad plays can make a big difference. And even the folks who created the metrics warn against taking them as gospel.

Skepticism is understandable. Let’s try to demystify what’s going on.

New defensive metrics attempt to answer two simple questions: How many plays should the defender have made and how many did he make?

Ultimate Zone Rating (UZR) is set on a scale where 0 (zero) is league average. A positive UZR means the player was better than average as a defender, a negative UZR means below average. UZR is expressed as the number of runs a player saved or cost his team due to defense. If you want a detailed explanation, read this post by Mitchel Lichtman, who came up with UZR.

UZR uses a zone-based method. The field is broken up into different zones and players are assigned value based on how many plays they make in each zone, compared to how often other players make the play. When Billy Hamilton makes a catch in centerfield that no other CF snags, he gets more credit than for a routine play. UZR uses many years of data when deciding how much to award or penalize a player. It’s also park adjusted.
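The zone idea can be sketched in a few lines. This is a deliberately simplified illustration, not the actual UZR formula: it counts plays above average rather than runs, skips the park adjustment, and the zone names and league conversion rates are hypothetical.

```python
def zone_plays_above_average(plays, league_rate):
    """Sum credit over the balls hit into a fielder's zones.

    plays: list of (zone, made) tuples; made is True when the out was recorded.
    league_rate: dict mapping zone -> fraction of such balls other fielders
                 turn into outs.
    """
    total = 0.0
    for zone, made in plays:
        rate = league_rate[zone]
        # A made play earns the share of fielders who would have missed it;
        # a missed play costs the share who would have made it.
        total += (1.0 - rate) if made else -rate
    return total


# Hypothetical conversion rates for two zones.
league_rate = {"routine": 0.95, "deep_gap": 0.10}

# One routine catch and one catch almost nobody else makes.
plays = [("routine", True), ("deep_gap", True)]
credit = zone_plays_above_average(plays, league_rate)
print(f"plays above average: {credit:.2f}")
```

Under these assumed rates the deep-gap catch is worth 0.90 plays above average and the routine one only 0.05, which is the sense in which a rare catch earns more credit than a routine play.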

Defensive Runs Saved (DRS) is calculated by The Fielding Bible, run by John Dewan. DRS is similar to UZR in many ways. It uses the same scale, with zero as league average, uses zones and adjusts for park factors. The biggest difference is that DRS uses a one-year sample when comparing plays. DRS also uses smaller zones, because its creators think that makes the ratings more precise, and it factors in defensive positioning.

Both UZR and DRS are based on every play in each game. People employed by Baseball Info Solutions watch game video and code players and zones. They determine ball velocity and angle using objective criteria. But humans don’t estimate the difficulty of a play. A computer algorithm compares one play to others.

But even that limited role for humans is about to change. Statcast – MLB’s revolutionary, state-of-the-art tracking technology – is now measuring defensive data far more precisely than was ever possible before. Its data will take the small residual human element out of coding. UZR and DRS can use Statcast data to refine estimates of the exact amount of ground a player covers, as well as the velocity and angle of the hit ball. Statcast also provides an exact location for all defensive players at the time of the pitch.

“Statcast is taking the zone-based fielding models further and further,” said Sam Grossman, Reds assistant general manager, to our Redleg Nation Q&A group earlier this month. “We’re doing a lot of things we couldn’t do two years ago when we just had the zone and where we thought the player started. It gives us a much more accurate view of what a guy’s range is.”

Yes, Statcast still has data gaps that must be plugged. But it’s a gigantic stride forward in measurement, with improvements each year.

With that context, let’s talk Eugenio Suarez and what to make of his defensive ability.

Suarez has a positive UZR score in 2016. His range (+5.5 runs) and double play (+0.4 runs) skills have offset his negative error (-3.1 runs) factor. Arm strength – and that’s a plus for Suarez – isn’t measured for infielders. Overall, Suarez has a positive 3.4 UZR when extrapolated to a 150-game season. Suarez has a +1 runs DRS score. That stands in stark contrast to his extremely negative ratings from 2015 when he played shortstop.
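The way those components combine can be checked with a quick sketch. The component values are the ones quoted above; the proportional scaling to 150 games is an assumption made for illustration and may not match how the published UZR/150 figure is computed, and the "implied games" number is derived from that assumption rather than taken from any source.

```python
# Suarez's 2016 UZR components as quoted in the text, in runs.
components = {"range": 5.5, "double_play": 0.4, "errors": -3.1}

# Total runs above average: range and double plays offset the errors.
uzr_total = round(sum(components.values()), 1)


def per_150(runs: float, games: float) -> float:
    """Scale a run total to a 150-game pace (simple proportional assumption)."""
    if games <= 0:
        raise ValueError("games must be positive")
    return runs * 150.0 / games


# Playing time that would turn the raw total into the quoted 3.4 per 150 games,
# under the proportional assumption above.
quoted_uzr_150 = 3.4
implied_games = uzr_total * 150.0 / quoted_uzr_150

print(f"raw UZR: {uzr_total:+.1f} runs")
print(f"implied games: {implied_games:.1f}")
```

The raw total comes to +2.8 runs, so the error penalty eats a little more than half of what his range adds.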

There’s sense to be made from all that. The skills that are portable from Suarez’s time at shortstop – range and turning double plays – are his strengths at 3B relative to others who play that position. Suarez gets to more balls. But he’s still the same bad-hands, erratic-throws guy that he was at shortstop for the Reds last year.

Suarez leads all NL players in errors committed with 22 (Billy Shindle can rest easy). Suarez was similarly error-prone at shortstop, with 19 in just 96 games for the Reds in 2015. Errors are glaring; range relative to other 3B is less obvious, even to those of us who watch every night. That explains the paradox of how Suarez often can look so bad in the field yet have positive advanced metrics. Moving his shaky glove from SS to 3B has helped increase his net defensive value to the team.

It’s also important to note that Suarez’s 2016 numbers are all pretty small in absolute terms. We’re dealing with defensive runs above average, not wins. The rule of thumb is that a 10-run swing equals one win. Most individual runs don’t impact the win or loss in a game. In a depressed run environment, maybe 9 runs per win is right.
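To put the rule of thumb in concrete terms, here is a minimal sketch of the runs-to-wins conversion. The function name is illustrative, and the 10 and 9 runs-per-win figures are the rough values mentioned above, not precise constants.

```python
def runs_to_wins(runs: float, runs_per_win: float = 10.0) -> float:
    """Convert defensive runs above average into approximate wins."""
    if runs_per_win <= 0:
        raise ValueError("runs_per_win must be positive")
    return runs / runs_per_win


suarez_uzr_150 = 3.4  # runs per 150 games, as quoted in the article

wins_at_10 = runs_to_wins(suarez_uzr_150)        # standard rule of thumb
wins_at_9 = runs_to_wins(suarez_uzr_150, 9.0)    # depressed run environment

print(f"{wins_at_10:.2f} wins at 10 runs per win")
print(f"{wins_at_9:.2f} wins at 9 runs per win")
```

Either way, Suarez's defensive value works out to roughly a third of a win over a full season.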

A UZR or DRS of +5 is considered “above average” but not “great.” It takes a score of +10 runs to be “great.” A player with a +15 UZR/DRS score is considered Gold Glove caliber. Billy Hamilton’s UZR/150 is 16.8 runs and his DRS is +14.

A defensive player with a UZR or DRS that’s +/- a few runs should be considered simply average. That’s Eugenio Suarez at third base.
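The rough scale above can be written as a small lookup. The +5, +10 and +15 cutoffs come from the text; treating anything inside plus or minus 5 runs as "average" and anything at or below -5 as "below average" is an assumption added here, since the article only says "a few runs" and gives no negative tiers.

```python
def defensive_tier(runs: float) -> str:
    """Map a UZR or DRS value (runs above average) to a rough label."""
    if runs >= 15:
        return "Gold Glove caliber"
    if runs >= 10:
        return "great"
    if runs >= 5:
        return "above average"
    if runs > -5:   # assumed band for "+/- a few runs"
        return "average"
    return "below average"  # assumed mirror of the positive scale


# Values quoted in the article.
ratings = {
    "Suarez UZR/150": 3.4,
    "Suarez DRS": 1.0,
    "Hamilton UZR/150": 16.8,
    "Hamilton DRS": 14.0,
}
for name, runs in ratings.items():
    print(f"{name}: {runs:+.1f} -> {defensive_tier(runs)}")
```

Note that the two metrics can put the same player in different tiers, as they do for Hamilton here, which is one reason to read them together rather than in isolation.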

23 thoughts on “Defensive metrics and Eugenio Suarez”

  1. Nice analysis. It did seem that Suarez improved as he got more comfortable. And he does make some great throws.

    Now Billy’s numbers are outstanding … but we knew that, didn’t we?

  2. If I’m understanding this article right, you do highlight the importance of errors. In other words, you seem to be saying that errors should not be entirely overlooked or dismissed as a measure of defensive value, but that they shouldn’t be taken as the primary or sole source of measurement. I get that, I do, but on the other hand you seem to be saying that we should be ok with a relatively high error count so long as his other defensive metrics are average to above average or better. I admit that this confuses me and seems conflicting. How it’s possible for a player to have positive defensive metrics in spite of a relatively high or flat out high error total is beyond my comprehension at this point. But, as of right now, I don’t know if I could live with player B’s 15 errors just because he has more putouts. Yes, I understand that 85 putouts are better than 50, but at the same time the 15 errors for player B would be hard to ignore. The ideal, obviously, is to find a player that has a high number of putouts combined with a low error total. But something tells me that those kinds of players are few and far between and that we should be ok with players who commit a decent amount of errors so long as their putouts are relatively high, since that seems to be the kind of player most readily available because of their sheer numbers. Now, I apologize if I’ve missed the ball (so to say) on some things (which I’m willing to admit because I’m just not sure about how to take the seemingly conflicting stats of average or positive defensive metrics of a player with a goodly amount of errors). The main area I think I could be wrong on is the amount of available players who have both a high putout total and a low error total. I simply don’t know how prevalent those kinds of defensive players are in relation to average defensive players (ones who have positive defensive metrics with a relatively high error total).

    Like most (or hopefully all) baseball fans though, I don’t like errors being committed on a regular or even semi-regular basis by my team’s players, and I don’t know if I could live with that year-in & year-out (even if his putouts are high).

    • PS: I also apologize if I’ve misunderstood what exactly you were trying to say about players with positive defensive metrics with a decently high or high error total. I’m sorry, I’m just very confused about what’s exactly being said in this article and how I’m supposed to take these kinds of players.

      • I think Steve’s illustration, provided in paragraphs 6 through 8, will give you the gist of how a player with good range and a strong arm can compensate for making errors, even a lot of errors.

    • I think the key is that on the 50 balls that player A didn’t get to, they would most probably be hits. So with player A you have 50 hits compared to B only having 15 runners reach base. 15 runners compared to 50 is the value.

    • Each team only gets 27 outs – just as not making outs is basically the most fundamental aspect of offense, converting balls in play into outs is the single most important thing you can do on defense. Player B created more than an entire game’s worth of additional outs, over and above Player A. That’s huge.

      Don’t get stuck on the errors thing – we’re used to thinking errors = bad because we all grew up watching and reading that message over and over again. But errors were really just a proxy, a stand-in for whether the player was making an out on defense or allowing the other team’s hitter to reach base safely. And now we understand that there are other ways in which a defender can turn batted balls into outs, aside from just avoiding errors.

      • Adrian J Loder, errors are bad, that’s why they’re called errors. A lot of errors is even worse.

  3. I’m a big Eugenio fan but I’m actually more concerned with his bat! He’s 2 for his last 29 and really hasn’t had a very good year w/the bat despite the 20 HRs. You can see him trying to take pitches and work the pitcher but his swing just seems too long!

    Senzel will take over 3rd at some point and Herrera might end up as a better option at 2b. I’m not giving up on Suarez but he looks like a utility guy or trade bait unless he picks it up! When he’s right and spraying the ball from foul line to foul line then he looks like he could hit .280+ w/some power every year!

  4. I have heard all year that Jay Bruce (this year) is one of the worst defensive outfielders (based on metrics such as UZR) in the majors. I have not watched him since he went to New York, but throughout his career here he was a very good defensive outfielder. To me, I didn’t see anything change significantly this year while he was with Cincinnati from previous years (defensively). Just wondering if anyone else saw the terrible ratings on Bruce and wondered how that negative rating could be justified.

    • Purely by eye test, I thought that Jay had declined significantly over the last few years, mainly in not getting to some balls that it seemed he would have gotten before. I also find it hard to believe that he is one of the worst defensive outfielders, though.

      • I agree. To me the greatest fall off with JB has not been horizontal so to speak but rather that he has trouble going back on balls and while often in the area doesn’t complete plays that my eyes and memory seem to think he would have in the past.

        In terms of the metrics Steve presented… Several folks have theorized here that JB actually suffered by playing next to BHam because BHam ranges so far into the gap that he often ends up catching balls Bruce would have caught in the past and still could get to and thus JB suffered a “false negative” impact on his range rating.

        Also a general comment about playing RF at Citi Field versus GABP: there is a lot more outfield space to cover at Citi.

        • Bruce was terrible the last couple years. I saw him run bad routes time after time. He was especially bad when it was hit right at him.

    • I started watching Jay on defense closely this year because I was a little skeptical of the numbers too. The more I watched, the more it really makes sense. He really takes some terrible routes to balls. There were multiple times this year a player got a double that should have been cut off for a single. That adds up.

  5. It seems to me that there should be some accounting of whether the errors committed (or the ball not gotten to, if below average range/arm strength) led to runs scored. I suppose in a totally random sense, such an error (or range) could occur at any time, and therefore, the more errors (range), the more likely a run scores and therefore errors (range) is a good surrogate stat for hurting the team. But maybe errors (range) are not totally random and depend on game situations. It would also seem that their impact on a game certainly is dependent on the game situation.

  6. A-E-I-O-U. Eugenio looked better in the field as the season went on. He has nicely snagged a bunch of hot drives that have come his way. And sometimes he flubs on the routine. He had 19 errors last year that broke down to 13 fielding and 6 throwing at SS. This year so far at 3B he has 22 errors that break down as 15 fielding and 7 throwing. That is a boatload of E’s.
    This makes me think that Suarez is better suited for 2B.
    Reds Asst. GM Nick Krall talked in the RLN/RR Q&A session about player value and value to the team (Iglesias as starter vs. reliever).
    Steve, is Suarez’s value better realized for the Reds at 3B or 2B going forward? This assumes of course that BP is not in the equation altogether.

    • Value becomes a comparative equation. Will Suarez be more valuable than Herrera at 2B? Will Suarez be more valuable than Senzel at 3B? Those appear to be the leading candidates for 2B & 3B going forward right now, but things have a way of getting tossed around in the wash. Senzel hasn’t played above A ball yet and Herrera was still not fully recovered from his shoulder issue. The Reds also have several infield prospects who could force their way into the discussion along with a probable top 5 draft choice in the upcoming rule 4 draft (picture another Nick Senzel in the picture!).

      • Rule 4 draft has more arms at the top than there are bats. But a whole season of college and high school has to play out.
        In a way, way too early forecast, UF catcher JJ Schwartz and FL high school SS Mark Vientos are at the top. Vientos is 6’3″ and 180 and is a RH hitter. Good swing and good glove.
        I’d be leaning for the SS at this early juncture.

  7. Steve’s analysis of defensive metrics is very good. I like the defensive stats because it gives a broader and more complete view of a player’s defensive capabilities. Hamilton covers the most ground in CF for the Reds since Eric Davis. Hopefully the Reds concentrate on drafting the best hitters they can find. They play in the best HR park in baseball and they should find players who can take advantage of it. The Rockies have the best hitting park due to the spacious OF which allows a lot of balls to fall in for hits. GABP is the opposite of that but it provides a friendly launching pad for power and near power hitters.

  8. I tend to believe most of the defensive stats, in general, but it’s hard to get excited about a guy with great defense when he is playing at a position or in a lineup where more offense is needed. BHam may be worth 4 WAR total, but nearly all of it is on defense, and it shows, as the fact is this team has no pitching. With a real pitching staff, a sub-1 WAR hitter leading off is hard to win with. We need a true leadoff hitter with 3 WAR on offense and average defense. Also defensive value declines rapidly with age so it’s a bad bet for the future to assume steady value.

    That said, Votto needs to remediate himself on defense or his huge offensive contributions will be lessened. On the bases too!

  9. As Votto gets older the Reds should increase the number of days off that he gets. Also, a potential new 2B, SS, and even 3B in the next 1 or 2 years could make the case for Suarez being a super utility guy that plays the entire infield. He could potentially get in 120 or 130 games a season in a role like that.

    • I think as he ages, it would make sense to perhaps get him a couple days off per month. That works out to him starting about 150 games. Every body is different though so we’ll just have to see how much/little rest he needs.

  10. Nice analysis. I actually would have gone even further than you did, though, and suggest that fielding percentage is basically meaningless in modern baseball. Why? Back in the day, when fields had uneven terrain, rocks and pebbles in the dirt, un-raked infield dirt, equipment that wasn’t uniform, no lights for nighttime play and non-existent/rudimentary gloves, a lot more errors were made, something you mentioned briefly.

    One of the things we need in order to properly evaluate a player’s quality of play is context – how does this performance compare to the player’s peers. In the 19th and early 20th centuries, because errors were much more common you had an actual continuum of performance to look at. If Player X made 100 errors playing 3rd base and Player Y made only 10, that was an easily-demonstrable difference in their defensive play.

    Now, though, with strictly-manicured fields, lights coming on when the sun starts going down, state-of-the-art gloves, etc., even the most error-prone players make between 20 and 30 errors as opposed to 60-100. This also compresses the overall distribution of error performance, so that it’s much harder to find statistically significant variances. When the difference between Suarez and the middle of the pack 3B, in terms of errors made, is only 12 errors, that’s much harder to be sure of when you’re trying to assess true error-prevention skills.

    And Suarez is an outlier in this regard. When comparing between most players, you’re looking at comparisons like .990 fpct. vs .982, or .979 vs .988. The standard deviation of the distribution of errors has shrunk enormously, rendering the stat more-or-less irrelevant in today’s game. A player who makes 30 errors is still probably more error prone than a player making only 3 or 4 in the same number of chances, but when comparing anyone other than the two extremes, it’s hardly even worth talking about.

    Also, as mentioned above, errors are essentially a proxy for “does this player convert batted balls into outs or not”. Back in the old days, even a player with good range was likely to make errors, as were players with good arm strength, due to the field and equipment conditions. As such, errors made a good proxy – those other performance factors didn’t distinguish players as much. Now that errors are so much less frequent, they no longer work nearly as well in this proxy capacity, and we have to peel back the onion, as you wrote about, and examine those other, long-neglected factors.
