Baseball Strategy

In Defense of The Closer

With a bolstered cadre of arms, bullpen management has become a hot topic in the Nation. How should Raisel Iglesias be used to limit his risk of injury? What about Michael Lorenzen? Is Tony Cingrani the closer? Why does Bryan Price continue to use Ross Ohlendorf in high-leverage circumstances?

Different managers take different approaches to handling their bullpens. A key variable here is the extent to which distinct roles are assigned to distinct relievers. Dearly departed Dusty Baker was well known for his rigid bullpen roles: at occasionally varying times, Arthur Rhodes the LOOGY (Lefty One Out GuY), Jonathan Broxton in the 8th, and Aroldis Chapman in the 9th.

The argument in favor of this management style is a simple one: relief pitchers, like all workers, perform best when they know what is expected of them. Pitchers perform better if they know when (early? late?) and under what circumstances (close? blowout?) they’ll be called upon, and what will be expected of them (one out? multiple innings?). As Jonathan Papelbon summarized earlier this year, “when you’re in the bullpen and you can feel comfortable about when you’re going to pitch, it makes your life a lot easier.”

Bullpen roles are not all sunshine, however. Many analysts argue that these roles prevent a team’s best relievers from pitching in the most important moments or against the best hitters. These analysts laud managers who use their bullpen less rigidly.

Amid this variety, one factor is consistent: every manager wants a closer. Some go long stretches without finding one, due to injury or poor performance, but all managers prefer a single 9th-inning security blanket.

This is very bizarre. The appearances of closers—typically the best relievers in a bullpen—are dictated by the parameters of the arbitrary save statistic. The role of closers is to collect saves; to get the last three outs in games where their team is ahead, but the game is not a blowout. In no other circumstance, in baseball or other sports, is strategy so consistently and transparently beholden to statistical accounting.

This seems the epitome of backwards thinking. But is it? Is there anything behind the argument that closers are more effective when fulfilling a role? Is the frailty of closer psyches—which are only human psyches, after all—such that it may actually be beneficial for managers to remain beholden to the save?

This post will make the statistical case that, yes, the closer role makes a meaningful difference to relief pitcher performance.

**

One fact that stands out in the attempt to make this case is that closers perform noticeably better—according to advanced metrics—when pitching in traditional save situations. Or, at least they did throughout the 2011-2015 seasons.

For purposes of this analysis, a closer is a pitcher (a) with 40 or more relief appearances who (b) makes at least 40% of these relief appearances in a traditional save situation: 9th inning, 1-3 run lead, no outs, and no runners on base.
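The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ReliefAppearance` record and its field names are hypothetical, each record describes the game state when the reliever entered, and the post does not say whether the 40-appearance threshold applies per season or across the whole period.

```python
from dataclasses import dataclass


@dataclass
class ReliefAppearance:
    """Game state when a reliever entered (hypothetical record layout)."""
    inning: int       # inning in which he entered
    lead: int         # his team's lead at entry (negative if trailing)
    outs: int         # outs already recorded in the inning
    runners_on: int   # baserunners at entry


def is_traditional_save_situation(app: ReliefAppearance) -> bool:
    # 9th inning, 1-3 run lead, no outs, no runners on base.
    return (
        app.inning == 9
        and 1 <= app.lead <= 3
        and app.outs == 0
        and app.runners_on == 0
    )


def is_closer(appearances, min_appearances=40, min_save_share=0.40):
    # (a) at least 40 relief appearances ...
    if len(appearances) < min_appearances:
        return False
    # (b) ... at least 40% of which began in a traditional save situation.
    save_entries = sum(is_traditional_save_situation(a) for a in appearances)
    return save_entries / len(appearances) >= min_save_share
```

For example, a pitcher with 40 appearances, 16 of them starting in a traditional save situation, would just qualify under this reading.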

Closer Situation           | K%        | BB%        | HR%
Non-save                   | 26.3%     | 8.3%       | 1.9%
Save                       | 28.2%     | 7.4%       | 2.1%
Difference                 | 7% better | 11% better | 6% worse
Statistically significant? | Yes       | Yes        | No

As you can see in the table above, the average closer has a 7% better strikeout rate, an 11% better walk rate, and a 6% worse home run rate in save situations. According to an A/B test, the improvements in K% and BB% are statistically significant, while the worse HR% is not.
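For readers who want to check the significance claims, here is a minimal sketch of one common way to run this kind of A/B comparison: a two-proportion z-test. It is an assumption that this resembles the test used for the post, which does not name its exact method. The rates are the rounded values from the table, and the batter counts (14,726 in save situations, 12,936 in non-save situations) are the ones the author reports in the comments below.

```python
import math


def two_proportion_z_test(rate_a, n_a, rate_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    # Pooled rate under the null hypothesis that both situations share one true rate.
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Batter counts from the author's comment; rates rounded as in the table.
NON_SAVE_BATTERS = 12936
SAVE_BATTERS = 14726
RATES = {  # stat: (non-save rate, save rate)
    "K%": (0.263, 0.282),
    "BB%": (0.083, 0.074),
    "HR%": (0.019, 0.021),
}


def summarize(alpha=0.05):
    """Map each stat to (z, p-value, significant at the chosen alpha)."""
    results = {}
    for stat, (non_save, save) in RATES.items():
        z, p = two_proportion_z_test(non_save, NON_SAVE_BATTERS, save, SAVE_BATTERS)
        results[stat] = (z, p, p < alpha)
    return results


if __name__ == "__main__":
    for stat, (z, p, significant) in summarize().items():
        print(f"{stat}: z = {z:+.2f}, p = {p:.4f}, significant = {significant}")
```

At a 5% threshold (also an assumption), this sketch flags the K% and BB% differences as significant and the HR% difference as not, in line with the table. Because the inputs are rounded, the exact z-scores will differ slightly from whatever the original analysis produced.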

At a glance, these numbers are consistent with the theory that closers perform better when pitching in their expected role.

However, these outcomes could also be plausibly explained by the pressure of the 9th inning. Down to their final at bat, hitters are tempted to be more aggressive and swing for the fences. This riskier approach sometimes pays off (hence the closers’ worse HR%), but is often unsuccessful, which would explain their better K% and BB%.

We need to unpack this data a bit more to demonstrate that the role-ness of the 9th inning makes a difference to closers.

**

One way to isolate the impact of the closer role from the impact of 9th inning batter aggression is to separate first-time closers from veteran closers.

The narrative here is straightforward: being a closer is not an easy job, and it takes pitchers a while to become comfortable with the role.

The following charts show the performance of first-time closers and veteran closers. First-time closers are those who did not close in the previous season. Veteran closers are those who did. With 2011 used as a baseline, this data includes only 2012-2015.

First-Time Closer Situation | K%        | BB%        | HR%
Non-save                    | 26.9%     | 7.9%       | 2.0%
Save                        | 27.9%     | 7.3%       | 2.3%
Difference                  | 4% better | 8% better  | 11% worse
Statistically significant?  | No        | No         | No

Veteran Closer Situation    | K%        | BB%        | HR%
Non-save                    | 27.7%     | 8.4%       | 2.0%
Save                        | 29.7%     | 7.4%       | 2.0%
Difference                  | 7% better | 14% better | 3% better
Statistically significant?  | Yes       | Yes        | No

There are two notable differences between these two groups. First, veteran closers significantly improve their K% and BB% in save situations. First-time closers also improve their performance in these areas, but the gain is smaller and not statistically significant.

Second, veteran closers actually reduce their HR% in save situations—in contrast to first-time closers, whose HR% increases in the same situations. (Note that neither of these variations is statistically significant.)

This data supports the narrative stated above. Comfortable in their role, experienced closers raise their game in save situations. First-time closers, by contrast, show roughly the variation one would expect if it were attributable solely to batter aggressiveness.

This is compelling evidence that roles can bring out the best in a closer.

**

It’s clear that a rigid closer role is in the best interest of the pitcher. In addition to racking up saves, consistently pitching in save situations has historically improved a reliever’s peripheral stats, a recipe for a larger contract. Even outside of the competitive drive to become the bullpen ace, it makes financial sense for relievers (such as, say, Raisel Iglesias) to angle for this role.

It is not clear that a rigid closer role is in the best interest of the team. There may be moments more important to the outcome of a game than the last 3 outs. It may make sense for managers to use their best reliever—presumptively, the team’s closer—in the 7th inning to escape a 2-on, no-out jam and preserve a 1-run lead. Or in the 8th inning against the heart of the opposing team’s lineup. Etc., etc.

However, in deciding to use a closer in high-leverage but non-save situations, managers should be aware that they are likely not getting the best version of their reliever.

Closers pitch in two types of non-save situations: (1) in a blowout, to get in regular work; and (2) in close games, along the lines described above. The following chart excludes blowouts and compares closer performance in save situations to closer performance in high-leverage non-save situations—when the game is within three runs.

Closer Situation           | K%         | BB%        | HR%
High-leverage, Non-save    | 25.9%      | 8.9%       | 1.9%
Save                       | 28.2%      | 7.4%       | 2.1%
Difference                 | 17% better | 19% better | 9% worse
Statistically significant? | Yes        | Yes        | No

The difference in K% and BB% is fairly astounding (note that the HR% change is not statistically significant). Considering K%-BB%, closers in save situations pitch like 2016 Madison Bumgarner, while closers in high-leverage non-save situations pitch like 2016 Jon Gray. I’d rather have Jon Gray pitch the 8th inning of a tie game than Ross Ohlendorf, but I may be tempted to roll the dice with Ohlendorf if it saves Bumgarner for the 9th—or for tomorrow night’s game.

(I’m lying. I would never roll the dice with Ohlendorf. But you get the point.)

**

Managers can go overboard, but it’s clear that the inclination to slot relievers into particular roles, or at least into a closer role (8th inning guys showed no comparable boost), does tend to make a difference to performance. Managers would be foolish to ignore this comfort level when making bullpen management decisions.

**

A brief comment on method: I used Retrosheet’s event files to isolate reliever situations and extract rate data. I found Retrosheet a very rich source of data, and am happy to share my code if others are interested.

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at “www.retrosheet.org.”

37 thoughts on “In Defense of The Closer”

  1. Not sure why a 9% worse home run rate is not statistically significant when a 7% better strikeout rate is. Also, the performance differences between veteran closers and inexperienced closers might indicate that the job takes time to learn, but it also might indicate that veterans tend to be proven commodities, while inexperienced pitchers are not and may never become viable MLB pitchers. Interesting premise, though, and I do like seeing a contrarian (for RLN) view presented for discussion.

    • Hello,

      Thanks for reading! I’m happy to get into the statistical stuff a bit more down here.

      The answer to your first question is that A/B tests are (appropriately) skeptical of large marginal changes in low conversion rates.

      For example, if Aroldis Chapman gives up 1 HR in his first 100 batters, and 2 HR in his second 100 batters, that’s a 100% change, but can be easily explained as randomness.

      If Chapman strikes out 30 of his first 100 batters, and 45 of his second 100, that’s only a 50% change, but the higher volume makes us more confident that something non-random is happening.
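      To put rough numbers on the two Chapman examples above, here is a small sketch using Fisher's exact test, which suits counts as small as 1 and 2 home runs. This is a technique swapped in for illustration, not necessarily the A/B test used for the post, and the function name is made up for the example.

```python
from math import comb


def fisher_exact_two_sided(hits_a, n_a, hits_b, n_b):
    """Two-sided Fisher exact p-value for two groups of batters faced."""
    total_hits = hits_a + hits_b
    total = n_a + n_b

    def prob(k):
        # Hypergeometric probability that group A accounts for exactly k
        # of the total events, if the events were spread at random.
        return comb(total_hits, k) * comb(total - total_hits, n_a - k) / comb(total, n_a)

    observed = prob(hits_a)
    lowest = max(0, total_hits - n_b)
    highest = min(total_hits, n_a)
    # Add up every split that is at least as unlikely as the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # 1 HR vs 2 HR in 100 batters each: a 100% change on tiny counts.
    print("HR:", fisher_exact_two_sided(1, 100, 2, 100))
    # 30 K vs 45 K in 100 batters each: a 50% change on larger counts.
    print("K: ", fisher_exact_two_sided(30, 100, 45, 100))
```

      The home run comparison produces a p-value far above any usual threshold, while the strikeout comparison falls below 5%, which is the "matter of volume" point in code form.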

      To your second point, remember that each chart compares a pool of relievers to itself. The same proven commodity veteran closers excel in save situations and (relatively) struggle in non-save situations.

      • Thanks for clarifying, Eli. I should read more carefully before I react. It was ever thus.

    • That’s called “survivor bias” – the pitchers who aren’t good enough never become veterans.

      • Yup. Pitchers who don’t excel as closers relative to pitching at other points during the game may not be asked to close any more. But this is consistent with the impact of the closer role, no?

        To clarify again: the argument here is not that second- or third-time closers are better than first-time closers. It’s that they get a bigger marginal performance boost from pitching in save situations. In other words, veteran closers outperform themselves by a greater margin when pitching in save situations than do first-time closers.

      • Presumably this could be teased out by taking the crop of “veteran” relievers (i.e., survivors) and looking at their first x (statistically powerful enough number) innings. Did they improve over time? Or were they just really good relievers to begin with?

    • In regard to your statistical significance question, I’d hazard a guess that it’s because K% is calculated in comparison to every batter faced, while HR% is calculated in comparison to every fly ball induced. For instance, for every 100 batters faced:

      K% – 20% would be 20 strikeouts, 27% would be 27 strikeouts (difference of 7)

      HR% – 2.1% would be .525 HRs (100AB * 25%FB rate * 2.1%HR rate) while 1.9% would be .475 HRs. A difference of .05 HRs; basically splitting hairs.

      It’s a matter of scale.

      But I could be wrong.

  2. Just pointing out that the 9th inning represents a random set of batters, whereas a high-leverage situation represents a situation where the pitcher is more likely to face a lineup’s best hitters. So it makes sense that a pitcher would have better statistics in a save situation where they’re just as likely to face the worst 3 batters as they are the best 3 batters. If I were a pitcher I would want a role as well. If my salary is based on my statistics then why would I want my statistics to be made against a team’s top hitters in the highest leverage situations when I could build my stats versus 3 random batters and get a save stat to boot?

    • Great thought on a player’s stats and the connection to salary. From a player’s perspective, why risk it?

    • Hi Bob,

      Your point about 9th inning hitter randomness is a good one. This probably explains at least some of the variation from Madison Bumgarner to Jon Gray.

      I don’t think it explains all of it, though. The improved save-situation performance of closers who have served as closers in previous years–and who face the same random collection of 9th inning batters as would a first-time closer–indicates that there’s something about the situation / role that brings out the best in pitchers who are comfortable with it.

      • Also consider that the “high leverage” situation may itself face some randomness. For example, though there may be 2 on and none out, it could also be that the reliever is facing the bottom of the order.

  3. Very cool. I love numbers and charts. I can’t help but wonder what would happen if the ‘save’ statistic was removed and only the ‘hold’ statistic was used. Would that make the 8th inning seem just as important as the 9th inning if they earned the same stat?

    • Eddie,

      This response is tangential to your question, which imagines a different statistical universe, but I performed the same analysis for setup guys as for closers.

      I found (1) setup guys / 8th inning guys are much less common than closers; and (2) for the few setup guys that have existed over the past five years, it’s hard to identify any performance variation between a typical hold situation (8th inning / 1-3 run lead) and a non-hold situation.

      So perhaps the logic of bullpen roles can only be extended so far.

    • Interesting idea. The “hold” stat is rarely discussed, but often just as, if not more, important.

  4. Excellent counterargument to other posts. I really enjoy seeing the debate and a champion from both sides of things. Incredibly insightful and thank you for this post.

  5. Good write up, Eli! This is making me question something I thought, intuitively, would have had no difference, which doesn’t happen to me very often!

    I’m curious if I could ask the sample size for the non-save vs save situations? I am thinking the save situations sample should be much, much larger, but I don’t know how much larger.

    Thanks again.

    • Patrick,

      Overall, the samples are fairly similar. Here are the specifics:

      Overall closers
      Save situations: 14,726 batters
      Non-save situations: 12,936 batters
      High-leverage situations: 8,517 batters

      First-time closers
      Save situations: 5,924 batters
      Non-save situations: 5,928 batters

      Closers who have closed in previous years
      Save situations: 5,720 batters
      Non-save situations: 4,326 batters

      • So first-time closers are used far more often in non-save situations than veterans. Makes sense.

  6. When you compare first-time and veteran closers and use 2011 as a baseline, I think you also need to show how relief pitching changed from 2011-2015. For instance, if pitching improved, and all relievers, not just closers, had a 7% better k% and a 14% better bb%, the data means something different than if pitching did not change at all.

  7. In my mind it all comes down to depth. If your dominant reliever pitches in the high leverage situations then you still need to use the guys that suck in the 9th. Ohlenhomer is a bad pitcher in the 7th…he’s a bad pitcher in the 9th.

    The Nasty Boys had “generally” defined roles. Had Charlton been the closer and Myers the 7th inning guy I believe the end result still would’ve been about the same.

    • I believe the point about depth is valid to what happens in the real world.

      A manager is likely to hold his best reliever until the end of the game because at that point he has fewer or even zero outs left to come back if the lead is lost.

      However if a manager is in a situation where he has one or more other pitchers he trusts almost as much as his top guy, he may then be more willing to use his top guy in a higher leverage situation in the 7th or 8th because if the top guy is successful, not only has he held the lead, he has set the table for the following pitcher to face a softer part of the opponent’s order. I see a lot of this approach in how Price is using Iglesias, although I’m not sure how much of it is by design versus him just backing into it because of the unusual circumstances surrounding Iglesias’ availability.

    • Isn’t that “Ohlenhoover”? I love that moniker. It’s a shame about Hoover. I really like him and he generally outperformed his peripherals. Guess that luck finally caught up with him.

  8. This is very interesting, but I wonder if you are comparing the right things when comparing closers in high leverage non-save situations vs. closers in save situations.

    In a high leverage non-save situation, someone has to pitch. The closer has, theoretically, been the best bullpen arm. If he isn’t pitching in that high leverage, non-save situation someone who is by definition worse than your closer is.

    So I think it would make more sense to compare closers in high leverage non-save situations vs. non-closers in high leverage non-save situations. You could also then compare closers in save situations vs. non-closers in save situations.

  9. If relief pitchers perform better in clearly defined roles, then it seems like it must be psychological. I mean there’s no physical or biological reason for it. So it seems to me that it could be changed. If you groom your relievers to always be ready to come into any situation you should get better results. When you are used to doing one specific task at one certain time, you get comfortable with it and it’s tough to go outside that comfort zone. But if you are trained from the very start to perform many different tasks at many different times you can in time become comfortable with all of the tasks.

    I think a smart organization would change their bullpen usage throughout the entire organization so that guys are becoming comfortable in more fluid roles.

    • All brains do not work the same. There are underlying physical reasons for many aspects of “psychological” behavior.

      It is likely some guys would take to the change and perhaps be even more comfortable. Some would be neutral; some would struggle but adapt successfully. Others would not successfully adapt.

      • I agree, but think the reverse is also true: there are underlying psychological reasons for many aspects of physical behavior. To TCT’s point: I don’t know (I mean, I really don’t) whether comfort in fluid situations is a commonly possible human adaptation, since it seems likely that people have been striving for predictability since time began. Some people (clutchy people) can deal with uncertainty, but maybe not constantly or indefinitely. Just a thought. Great discussion.

        • I pretty much agree right down the line. Like you I would suspect (but admit I have no way of knowing), that folks who “automatically” thrive outside of routine would be the outliers.

          Seems to me that many folks who pride themselves in the ability to function in that mode are actually very much into control via bringing things back to an orderly state which is the ultimate in routine seeking, right?

        • Plenty of professions where you have to be ready to go at a moment’s notice. Think firemen, SWAT teams, emergency doctors, etc. Heck, basketball players who come off the bench do it. It’s not like they get 10 warmup shots before they check in.

          I think you have to have a certain degree of mental toughness to make the major leagues in the first place. I find it hard to believe that a pitcher who could thrive in a traditional role couldn’t learn to be successful coming in at different times and different situations in the game.

          • Exactly. Why couldn’t their “role” be to come in at different times of the game? The whole thing seems silly to me. Strict roles also make the players unprepared when a deviation is necessary.

            Joe Maddon lets his players know at the start that he’s going to move them around the lineup and use them in different positions. His teams seem to do OK. Yes, he still uses a closer pretty much the same way everyone does. I think that’s the influence of money more than anything – keeping the closer happy.

  10. Personally, I’d take Iglesias, Cingrani, and Lorenzen and have them set to pitch the 8th and 9th innings every three days. If there’s a blowout, give them a day. But make sure they work those two innings. I’d use the rest of my relievers situationally.

    • What if you added Finnegan and had a 4 man availability for innings 7-9…. 2 lefties and 2 righties would give symmetry and depth for any situation.

      • If Disco goes 8 or Homer 7 and 1/3….fine…..you have your traditional Iglesias closer….but if Straily runs out of gas with 2 on and 1 out in the 6th….you have your shut down bullpen.

  11. I think part of this is the routine for “closers” going into the game. A lot of guys start getting loose around the 6th or 7th. They also usually have a good amount of time in the pen to get loose before coming into the 9th. In some of the other situations that come up, they might not have done that 6th/7th/8th inning routine and may not have much time to get warmed up. Some of it may be “between the ears” but I’m willing to bet a lot of it is that prep routine leading up to the 9th inning and their appearance. Ballplayers, especially pitchers, tend to like routine so perhaps there is some psychological stuff in there as well.

  12. Any chance you looked at the numbers based on the amount of the lead? Would be interesting to see the differences between save situations with a 1 run lead, 2 run lead, and 3 run lead. Because if the metaphysics of being in a save situation itself is causing pitchers to “bear down” or whatever, we should see the effect lessen as the lead grows.

    I’m not asking you to do more work, just curious what your thoughts are on that, because I could see a scenario where numbers are a lot better with a 1 run lead, perhaps, but then with 2-3-4-5-6-blowout are all the same, and the 1-run lead numbers are skewing the 1-2-3 run lead (save) situations.

    • This is an interesting question. I added a “1 run save” flag to my closer appearances table and recalculated, both for closers overall and for first-time and veteran closers separately.

      In 1-run save situations, as compared to other game situations, veteran closers have:
      * 11% higher K% (statistically significant)
      * Unchanged BB%
      * Increased HR% (not statistically significant)

      In 1-run save situations, as compared to other game situations, first-time closers have:
      * Slightly worse K% (not statistically significant)
      * Slightly worse BB% (not statistically significant)
      * Unchanged HR%

      It’s hard to draw too many conclusions from this, except that veteran closers really turn on the Ks in 1-run save situations.

      Happy to share the data and/or code if you’d like to investigate further!
