2017 Reds

Bullpen fairy tales

A two-part narrative is developing around the Reds bullpen. It’s a simple story with a scary ending. First part: The bullpen has been bad lately and getting worse. Second: That’s because the relievers are getting tired from being used too much. The villain in this story is a shady cabal known as the Starting Rotation. Lately, every time the Reds bullpen gives up a run, people point an accusing finger at the bullpen workload. As if that’s the only reason for pitchers to perform worse one month than the weeks before.

Before this claim hardens into conventional wisdom, let’s take a closer look to see if there’s anything to it.

Once upon a time, June was so bad …

Is the Reds bullpen actually getting worse as the season has worn on? Let’s look at the right data broken down by month.

First caveat – months are arbitrary endpoints. There’s no principled reason to divide the bullpen’s performance into three parts (April, May and June) instead of two or four or five other than convenience. Given that reservation, here’s the data. The most important columns are xFIP and SIERA: [Data from FanGraphs and from before Monday’s game.]

The Reds bullpen pitched great in April, the only month that Bryan Price practiced anything revolutionary in using his relievers. That’s probably a coincidence, but the performance wasn’t entirely a product of luck – where the balls fell – it had a super-high strikeout rate, a relatively low walk rate, a normal home run rate and a low hard-hit ball rate. Its xFIP and SIERA were a full run lower than league average bullpen performance.

May was a much worse month for the bullpen than April, even worse than June has been so far. That’s based on factors that pitchers control (measured by xFIP and SIERA). Reds relievers had a lower strikeout rate and higher walk rate in May compared to June.

The “June bullpen is so bad” tale is fiction born from reliance on one statistic – ERA – for that claim. General criticisms of ERA and ERA-based metrics – fielding dependence, random sequencing, inherited runners, park factors, luck on balls in play – are well known. But ERA is a particularly lousy way to measure bullpen performance over the few innings a bullpen throws in a month. The underlying measures – strikeouts, walks, swinging strikes, pitch velocity, first-pitch strikes and hard-hit balls – don’t support the “June so bad” narrative. A big factor in the June ERA rise has been a higher BABIP, an outcome mostly out of the pitcher’s control.
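For anyone wondering what “factors that pitchers control” looks like mechanically, here is a rough Python sketch of FIP and xFIP based on the formulas FanGraphs publishes. The 3.10 constant and the 13 percent league home-run-per-fly-ball rate are assumptions for illustration (the real constant is reset each season so that league FIP lines up with league ERA), and the stat line is made up, not any Reds reliever’s.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: only homers, walks, hit batters and strikeouts count.

    The constant is an assumed placeholder, not the actual 2017 value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.13, constant=3.10):
    """Same as FIP, but actual homers are replaced by fly balls times a league HR/FB rate.

    Both lg_hr_per_fb and constant are assumed values for this sketch.
    """
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical reliever month: 11 IP, 12 K, 4 BB, 0 HBP, 10 fly balls allowed.
# The only thing that changes between the two FIP lines is how many flies left the park.
fip_no_homers = fip(hr=0, bb=4, hbp=0, k=12, ip=11)
fip_three_homers = fip(hr=3, bb=4, hbp=0, k=12, ip=11)
xfip_either_way = xfip(fb=10, bb=4, hbp=0, k=12, ip=11)

print(f"FIP with 0 HR:  {fip_no_homers:.2f}")
print(f"FIP with 3 HR:  {fip_three_homers:.2f}")
print(f"xFIP either way: {xfip_either_way:.2f}")
```

Over 11 innings, three fly balls clearing the fence move FIP by several runs while xFIP doesn’t budge. That is the sense in which a monthly reliever ERA (or even FIP) is lumpy.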

So the frightening “bullpen getting worse lately because they are too tired” narrative doesn’t make it through the first chapter when examined closely. The bullpen is not getting worse. May was rock bottom, at least so far.

In a faraway land the bullpen is so tired …

Next, let’s look at the notion that the bullpen is tired, which is based on two claims. The first evidence is really just inference based on the (now disproven) claim that the bullpen has pitched worse recently. The second basis is that because the Reds starting rotation has been so poor, the Reds bullpen has pitched more innings than other bullpens. This is true, to a point.

To determine and compare bullpen workloads, we have to decide the best way(s) to measure it. There are two obvious metrics. One is looking at the total number of appearances made by a reliever/bullpen. The second is the total number of innings the reliever/bullpen has pitched. Because most relief appearances are about an inning long, these numbers tend to move in lockstep, but not always.

Photo: Sam Greene, Cincinnati Enquirer

The Bullpen Is So Tired narrative is based entirely on the number of innings pitched. The Reds bullpen has pitched 301 innings, the most in the NL by 21 innings over the Orioles. The median number of bullpen innings pitched is 260 and the lowest is 211.1 by the Nationals.

The number of reliever appearances tells a different story. The Reds rank twelfth out of fifteen in the NL. Bryan Price has used his bullpen 232 times. The New York Mets lead the NL with 264 appearances, while Dusty Baker’s Nats again are last with 217. The median is 241. The Reds are closer to the bottom than the top of this measurement.

The reason the Reds are first in one measure of workload and nearly last in the second is obvious. Bryan Price has had to use relievers for multiple innings (variations of long relief) many times compared to other teams. On the one hand, that Reds relievers have fewer appearances mitigates the heavy workload because warming up is a big part of the stress on pitchers. Reds relievers have warmed up at a lower rate than average.
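A quick back-of-the-envelope check makes the gap concrete. This small Python sketch divides bullpen innings by relief appearances using only the Reds and Nationals totals quoted above; the variable names are mine, and “211.1” is read as 211 and one-third innings, the usual box-score convention.

```python
# Bullpen totals quoted in the text above.
reds_ip, reds_apps = 301, 232
nats_ip, nats_apps = 211 + 1 / 3, 217  # "211.1" means 211 and 1/3 innings

reds_rate = reds_ip / reds_apps
nats_rate = nats_ip / nats_apps

print(f"Reds:      {reds_rate:.2f} innings per relief appearance")
print(f"Nationals: {nats_rate:.2f} innings per relief appearance")
```

Reds relievers are averaging roughly 1.3 innings every time the bullpen door opens, while Nationals relievers are just under one inning per outing.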

But there’s no getting around the total number of innings the bullpen has pitched. That’s also real workload. So let’s keep digging on that part of it.

Using yearlong totals for bullpen innings pitched is an unreliable way to measure how much the current relievers have pitched. That’s especially true for a team like the Reds that cycles its long relievers to and from its AAA affiliate. Pitchers who are no longer in the Reds bullpen have thrown 76.1 of the 301 innings. That list includes Robert Stephenson (24.2 IP), Cody Reed (12 IP), Barrett Astin (8 IP), Lisalverto Bonilla (11.1 IP) and Jake Buchanan (14.1 IP) among others.
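Innings totals like 24.2 and 11.1 can’t be added as ordinary decimals, since the digit after the point counts outs, not tenths. Here is a minimal Python sketch, with helper names of my own choosing, that converts to outs, sums the five pitchers named above, and works out what share of the bullpen’s innings has come from relievers no longer on the roster.

```python
def ip_to_outs(ip):
    """Convert box-score innings ('24.2' = 24 innings plus 2 outs) to total outs."""
    whole, _, extra_outs = str(ip).partition(".")
    return int(whole) * 3 + int(extra_outs or 0)


def outs_to_ip(outs):
    """Convert total outs back to box-score innings notation."""
    return f"{outs // 3}.{outs % 3}"


# Innings for the departed relievers named in the text.
named = {
    "Stephenson": "24.2",
    "Reed": "12",
    "Astin": "8",
    "Bonilla": "11.1",
    "Buchanan": "14.1",
}

named_outs = sum(ip_to_outs(ip) for ip in named.values())
departed_outs = ip_to_outs("76.1")  # all departed relievers, per the text
bullpen_outs = ip_to_outs("301")    # season bullpen total, per the text

named_ip = outs_to_ip(named_outs)
others_ip = outs_to_ip(departed_outs - named_outs)
departed_share = departed_outs / bullpen_outs

print(f"Named pitchers: {named_ip} IP, others: {others_ip} IP")
print(f"Share of bullpen innings from departed relievers: {departed_share:.1%}")
```

The five named pitchers account for 70.1 of the 76.1 innings, and the departed group as a whole has thrown about a quarter of the bullpen’s season total.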

It doesn’t make sense to look at the total number of innings (or appearances) by the bullpen to determine if a specific reliever, such as Wandy Peralta or Michael Lorenzen, is tired or overworked. The five who have been there all year are Wandy Peralta (34 IP), Drew Storen (31 IP), Michael Lorenzen (40.1 IP), Blake Wood (35.2 IP) and Raisel Iglesias (36.1 IP). The other two members of the current pen are Austin Brice (26.1 IP) and Tony Cingrani (12.1 IP).

Long, long ago they pitched too many innings … 

Let’s look at the five relievers who have pitched for the Reds all year. These charts use numbers from before Monday’s game.

Photo: Kareem Elgazzar, Cincinnati Enquirer

Raisel Iglesias is on course to pitch 79 innings. His strikeout and walk rates have been pretty consistent, which is reflected by his month-to-month xFIP being highly stable. His numbers across the board are better in May than April, other than ERA. The ERA spike is due to one game where he gave up more earned runs (4) than he has the rest of the season. See: ERA is a lousy stat for relievers, especially over short time frames. He is on pace for 67 appearances, right at league average. Iglesias’ numbers show no sign of his tiring.

Photo: Kareem Elgazzar, Cincinnati Enquirer

Michael Lorenzen is on course to pitch 86 innings. He had a good first month, with his highest K% and lowest BB%. But his BABIP has been all over the place, which is reflected in his ERA. Note that his xFIP for May and June are about the same and higher than league average. Lorenzen is on pace for 66 appearances, below league average. Not much evidence of fatigue with Lorenzen, unless you think a guy in his shape was worn out by May 1.

Photo: Shae Combs, Cincinnati Enquirer

Wandy Peralta pitched like one of the greatest relievers in the history of baseball in April. There was an element of luck in his ERA (BABIP of .143). But most of his success goes back to strikeouts and walks, variables that he controls. His xFIP was just 1.66 in April. His numbers are a lot worse for both May and June. You could look at the monthly data and suggest he’s tiring, but Peralta has only pitched 11 innings per month, which is smack dab league average, not overuse. He’s on pace to pitch 70 innings.

Photo: Sam Greene, Cincinnati Enquirer

Drew Storen had a good April, a terrible, awful May and an average June. June is about what you’d expect from him based on his long career arc. The reason his xFIP spiked in May was that he walked as many hitters as he struck out. You can argue about his effectiveness, but there is zero indication that he’s fatigued: June has been better than May, not worse. He also hasn’t been overused, on pace to pitch 68 innings.

Photo: Sam Greene, Cincinnati Enquirer

Blake Wood is on pace to pitch 78 innings, which is at the upper end of reliever use. As he’s slid down the bullpen pecking order, Wood has taken on more game-out-of-reach innings. He actually pitched well in April, but had terrible luck with batted balls (.406 BABIP). Note how his ERA was 5.23 in April but his xFIP (2.75) was his best of any month. But again, his worst month was May, based on what he controls. The big jump in ERA from May to June is a BABIP artifact. Look at how his strikeouts (way up) and walks (way down) have gone in the other direction. With a below league average xFIP in June, it’s hard to point at him getting worn down.

That leads to the question of how much workload is typical and how much is excessive.

Over the past three seasons, the average full-time relief pitcher has thrown about 11 innings per month. The median number of innings pitched in a season was consistent between 2016 (67 IP) and 2015 (69.2 IP). The average number of appearances – 67 in 2016 and 68 in 2015 – is also pretty stable.
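The link between a season total and a monthly rate, and the idea of being “on pace” for a number, is simple proportion. Here is a minimal Python sketch. It assumes a six-month, 162-game season and a straight-line projection; the reliever in the example is hypothetical, and this is not necessarily how the pace figures quoted in this post were produced.

```python
SEASON_GAMES = 162
MONTHS_IN_SEASON = 6  # assumption: April through September


def on_pace(total_so_far, team_games_played):
    """Straight-line projection of a counting stat over a full season."""
    return total_so_far * SEASON_GAMES / team_games_played


# The 2016 median season for a full-time reliever, from the text.
median_ip_2016 = 67
ip_per_month = median_ip_2016 / MONTHS_IN_SEASON

# Hypothetical reliever with 30 innings at the exact midpoint of the season.
projected_ip = on_pace(30, 81)

print(f"Median workload: {ip_per_month:.1f} IP per month")
print(f"Hypothetical reliever on pace for {projected_ip:.0f} IP")
```

A 67-inning season works out to a bit over 11 innings a month, which is where the monthly benchmark comes from.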

In 1990, Rob Dibble pitched 98 innings and Randy Myers 86.2. It’s true that Norm Charlton only threw 50.2 innings from the Nasty Boys bullpen, but he also had 100+ innings as a starter that year. Tim Layana (remember him?) threw 80 innings that year. In 2001, Scott Sullivan threw 103.1 innings, the same year that Danny Graves and Jim Brower each had 80 IP.

Even today, many relievers throw well above average. Last year, six relievers threw more than 80 innings. Another 27 were between 70 and 80 IP.

Raisel Iglesias and Michael Lorenzen are on pace to be at the high end of IP numbers (although not appearances). But keep in mind they are both former starting pitchers who appear to be in excellent condition. Using them for more innings has been a declared feature, not a bug, of the team’s plan. It’s hard to imagine an extra inning or two at this point of the year has affected their pitching.

Bottom line, for the “tired bullpen” theory to be right, a pitcher would have to be throwing an above-average number of innings or appearances and be getting worse in June. There isn’t a single pitcher who fits that description.

And they pitched happily ever after … 

Our expectations for the Reds bullpen were unrealistic based on how well it pitched in April. As fans, I suppose we can choose to believe that the underlying talent is exceptional. In that case, it’s natural to look for excuses like workload to explain the bullpen’s regression. We can also believe in the Easter Bunny, Bigfoot and the Tooth Fairy.

The simplicity of the beleaguered bullpen makes an attractive story, but facts stubbornly prove otherwise. The relievers haven’t been overused, June hasn’t been their worst month, and they were never going to keep up their April excellence. Another factor – one that I didn’t look at in detail here – is that the competition might simply be tougher in June. Six games vs. the Dodgers, three against the Nationals, etc. Maybe that variable, not fatigue, is the dominant one.

It’s no surprise the bullpen fell to earth with a thump in May. Remember, the vast majority of relief pitchers are erratic. This bullpen has actually rebounded a bit in June. But Wandy Peralta, Drew Storen and Blake Wood continue to see a lot of important innings. There’s no track record on which to build a belief that this bullpen will be outstanding. Even Michael Lorenzen has been worse than league average for two months.

So far, the front office has been successful in soaking up some of the long-relief and other innings with pitchers who cycle back to Louisville. There may come a time when the young, sacrificial arms run out and that neat trick quits working. Then you might see appearances and/or innings pile up above average for the current staff. So far, that isn’t the case.

Overuse is a tidy June narrative, but false. The bullpen just isn’t that great.

31 thoughts on “Bullpen fairy tales”

  1. Just a note, you are right on target. The “tired bullpen” tale being pushed by others, including the “Fox” announcers (Thom & Chris), is getting tired. Any way of sending them this information? On second thought, they might not understand it!!!

    • Chris is usually one who agrees with advanced stats and more forward thinking in baseball. I watch FSO broadcasts, but can’t remember – is it just Thom pushing this narrative or is Chris doing it as well?

    • A small quibble: Being in generally good shape isn’t necessarily the same as having a healthy pitching arm. Neither Lorenzen nor Iglesias has been a starter recently enough for that to reasonably figure in their arm condition now, or so it seems to me. And it’s worth noting that both got injured as starters. I agree that it doesn’t seem as though the workload in half a year should have resulted in a decline in performance, but something does, even though, as Steve documents, the decline is not as great as the narrative has it.

  2. A very excellent breakdown. Steve is becoming a myth buster. Is a TV show on the horizon? I would have liked to have seen the stat WHIP included, as I believe it is one of the better stats to use to judge and evaluate relievers.
    Great analysis and summation.

    • AS OF 6/27/17

      NAME          ERA   Innings  WHIP
      Hernandez, A  1.35  6.2      .45
      Iglesias, R   1.73  36.      .96
      Peralta, W    3.71  34       1.09
      Cingrani, T   2.19  12       1.14
      Lorenzen, M   3.35  40       1.19
      Storen, D     2.61  31       1.26
      Brice, A      5.47  26       1.37
      Wood, B       4.54  35       1.54
       

      • Damned chart, sorry
        WHIP
        Hernandez .45
        Iglesias .96
        Peralta 1.09
        Cingrani 1.14
        Lorenzen 1.19
        Storen 1.26
        Brice 1.37
        Wood 1.54

  3. Hear, hear for this article! The tired bullpen myth doesn’t hold water. Most of the guys rarely pitch on back-to-back days (it seems). And like the article mentioned, it appears (no proof other than watching a lot of games) that there’s not a ton of warming up without coming into the game. I’d like to see Iggy and Lorenzen get to 100 IP this year.

  4. Drew Storen has hit 6 batters in 31 IP so far. He has also struck out the side on 9 pitches in a game. Talk about Jekyll and Hyde.

  5. WHIP is a fantasy baseball stat. And we’re all used to using it to measure pitchers in that context. But it has severe weaknesses, especially with small samples.

    WHIP measures walks and hits per inning pitched (hit batters aren’t counted). Walks are included directly in BB% and also play a big role in the ERA-estimator stats like FIP, xFIP and SIERA.

    The problem with WHIP is the Hits component, which has a number of the same weaknesses as ERA. Hits are highly sensitive to fielding, parks and luck. That’s why you won’t find hits in any of the estimators like FIP, xFIP or SIERA.

    • I understand the argument against WHIP. But the thing about WHIP is, it lets you know which relievers are putting guys on base and which ones are not. It is important for all relievers, but for the 7th-8th-9th inning relievers, and middle of the inning relievers, that is inherently important.
      I know, roles in the bullpen. But that is a horse of another color.

      • But it’s not the reliever putting guys on base if it’s the fielder putting guys on base or if it’s just luck that the ball was hit where it was. That’s random chance putting guys on base. WHIP includes a lot of stuff outside the pitcher’s control. The part that isn’t – walks – is easy to capture directly or with other stats.

  6. FIP would be preferable to xFIP or SIERA in this case because it’s not adjusting the home run rate. These guys either did or didn’t give up the home runs they did or didn’t give up.

    • Yeah, but the idea is that most of the time, home runs are a function of fly balls, not pitching. So they’re lumpy data that skew stats. Better to measure fly balls (SIERA) than actual home runs. If you like the “did or didn’t give up” stats, then we’re back at ERA. xFIP and SIERA put emphasis on strikeouts and walks, which is where I like it for relievers and starters.

        • That’s right. Problem is that FIP treats all home runs as entirely a function of pitching. It doesn’t recognize at all the role that being a fly ball pitcher plays. So the data (home runs) come in lumpy. Using all the stats is a good idea. FWIW, FIP doesn’t support the idea that the bullpen is collapsing in June, either. Only ERA does.

          xFIP and SIERA consistently do better than FIP in predicting future performance. All three beat ERA.

          • I see issues like this as a break point between what we can do with our somewhat limited resources and what the teams are probably doing in their analytics departments with greater resources in human power, data capture and processing power.

            The common sabermetric stats are good for establishing norms and making level-field comparisons among players.

            However, if I’m running a team, I’m less interested in a level-field comparison and more interested in what a guy is going to do pitching in the parks in my division, to the players in my division first and then my league at large, which speaks to Doug’s point. I would guess such analysis is part of what the teams are paying big money to develop and hold closely as trade secrets.

          • But isn’t a fly ball pitcher more likely to give up home runs? Now and in the future?

          • Yep. Ground balls are more likely to be hits. Fly balls are more likely to be doubles and home runs. Home runs are a function of fly balls. About 13% of fly balls become home runs.

  7. Hey Steve – while this involves a whole other set of data, would you mind chiming in on the total innings workload for those relievers being shuttled back and forth between the Majors and MiLB, since throwing innings in the minors is presumably as taxing as doing so in the Majors? The shuttling doesn’t take place in a vacuum, after all. Thanks!

    • The guys on the minors-majors shuttle are generally used for long relief with the Reds. Games that are usually already out of hand. If we start to see pitchers coming back up to the Reds from Louisville getting important innings and costing the Reds games, then we should take a closer look at whether they are overused. Like I said in the post, there may well be a limit to mitigating the stress on the major league bullpen this way. That might produce an overused Reds bullpen. I’m not making the argument that overuse can’t occur, just that it hasn’t with the key five bullpen pitchers so far. Soon Cingrani should be in that mix. He’s thrown 7.2 innings in June.

  8. Steve, great article, and I like the statistics for the entire bullpen by month. For the individual stats by month, it concerns me that one bad performance can skew things so much, which is why I would be hesitant to include them. For instance, if you take out Lorenzen’s three walk game, his BB% is between 4-5%. That’s a drastic difference and would affect the predictive stats as well.

    Does that concern you at all? I don’t think it hurts your main conclusion. Just trying to figure out if I’m missing something as I think through it.

    • Think that was part of his reservations for focusing on month to month stats. Arbitrary time periods with short sample sizes. But using that framework does try and address the narrative that the bullpen is supposedly getting worse as the season wears on.

    • You’re right. One bad performance can have a big impact. But part of looking at the question of overuse has to be looking at specific pitchers, not just the bullpen as a whole. And that includes measuring outcomes somehow. ERA is way too clumpy for relievers and dependent on too many factors outside the reliever’s control. But we have to measure reliever outcomes somehow. Lorenzen did walk those three hitters. But yeah, reliever stats are highly sensitive to single bad performances and when you break them down into time periods to analyze trends, you’re getting even smaller sample sizes.

  9. The longer statistics are compiled for a pitcher or the staff, the more likely that outliers (good and bad) will appear. You expect a normal distribution around a mean for a given pitcher (good, bad, or average). Every pitcher will have “bad days”, e.g., Iglesias’ meltdown against the Dodgers two weeks ago and the game he lost on Friday night against the Nationals (perhaps the best hitting team in the NL). The longer the season goes and the more often they pitch, the closer a bad day comes to a certainty.
    Iglesias is very good, but not Superman.

    Combine this with the fact that over the last few weeks, the Reds have played against a series of very good teams, teams that are better than they are.
    Soon, this run of playing very good teams will be over, and the Reds will have a run of playing more mediocre teams (similar level of talent). I would guess that there will be a run where the stats will get better. Call that a SWAG. And they might win more than lose, for a stretch.

    The Reds (the team, not just the Bull Pen) have lost a lot of games recently because they are not as good as the opposition, and all their weaknesses (especially their starting pitching) are exposed. And the more their bullpen is exposed (more innings, like that debacle on Saturday when Homer pitched), the more likely those performance outliers (getting bombed) will show up.

    Yes, getting clobbered will make you tired. You will feel shell-shocked, I think. Nobody that pitches likes to get clobbered.

  10. Interesting to see Blake Wood’s stats, he is certainly not as bad as I thought he had been. I think one week where he seemed to do really poorly stuck in my mind instead of actual stats beyond an inflated ERA. The high BABIP makes me wonder . . . isn’t Wood known to be a ground ball pitcher with few home runs? And aren’t ground balls more likely to fall in for hits?

    I do think that Wood is not as bad as I had him in my mind, but as a more mediocre and older option in the bullpen I wouldn’t mind if some guys succeeding in AAA got a chance to show something. Maybe not even at Wood’s expense, just in general.

    • Ground balls do have a higher BABIP than fly balls. Of course, less likely to become doubles or home runs. 🙂

  11. It appears that Austin Brice has been sent down and Kevin Shackleford called up. It is good to see someone doing well get a chance, but I am surprised that Brice gets sent down. I know last night he gave up quite a few runs, but overall I thought he had done fairly well so far.

    Also, when might the Reds announce Finnegan’s immediate future? DL? 10 or 60-day? Roster move?

    • I think that Brice had been used a lot (tired, I ain’t no ways tired!), but they just wanted a fresh arm for the next few days, since Brice pitched a lot yesterday. And over the weekend.
      AAA is going to be R & R for Brice; he will be used more sparingly for a couple of weeks. As Arnold might say…..I’ll be back!

  12. Bryan Price (MLB Network) said that Finnegan will be out “at least” several weeks. So, who’s up next?

  13. Looking at the 2-14 slide starting with the west coast trip to LA, the Reds have only got 4 quality starts (6/14, 6/16, 6/19 & 6/25). Three of those were for 6 innings and they won two of those games. Over that stretch, the starters have pitched 69.33 innings basically averaging 4 1/3 a game.

    Bullpen has gotten rocked for 4 or more runs in 7 games in that stretch. To be fair, really only 3 of those games cost the Reds a close game (6 run blowup 6/11 vs. Dodgers, 6/14 vs. Padres and the extra inning game against the Nats 6/23).

    I think a big part is that the Reds pitching/bullpen ran into the best two offenses in the National League and they carved them up further exposing the rotation. Nats and Dodgers have been doing that to a few teams this year – LA is on a crazy hot streak right now.
