2015 Reds / Thinking Inside the Box

Mythbusters: ‘SABR-teams’ don’t steal bases

On average, how many games do you need to attend before you see a stolen base? It seems like Billy Hamilton steals a few every night, so for Reds fans, the stolen base is about as sure of a bet as Pete Rose. Yet if you look closely, the stolen base is slowly fading from the game: this year there will be the fewest stolen bases per game since 2005. If you attend a random game in 2015, there is about a one-in-twenty-five chance you will see a successful stolen base. What is driving this decline in thievery? Are players getting slower? A newfound morality regarding the sanctity of the base progression? Tighter pants?

From FanGraphs, it is fairly easy to show the decline in stolen bases per game. Here are the numbers for 2000 through 2015:

Year Games SB SB/gm
2000 62068 2924 0.047
2001 61325 3103 0.051
2002 61602 2750 0.045
2003 61148 2573 0.042
2004 61111 2589 0.042
2005 59935 2565 0.043
2006 60495 2767 0.046
2007 60785 2918 0.048
2008 60759 2799 0.046
2009 59989 2970 0.050
2010 59813 2959 0.049
2011 58099 3279 0.056
2012 59122 3229 0.055
2013 58596 2693 0.046
2014 58819 2764 0.047
2015 55913 2403 0.043
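
As a quick check, the SB/gm column can be recomputed from the other two columns. The sketch below is in Python (an arbitrary choice) and copies three completed seasons from the table; the names are illustrative only. It also shows, purely as an assumption, how the rate changes if the denominator is the approximate number of scheduled games in a season (30 teams times 162 games, divided by two teams per game) instead of the Games column as downloaded, a point a commenter raises below.

```python
# (year, games as listed, stolen bases) copied from the table above.
ROWS = [
    (2005, 59935, 2565),
    (2011, 58099, 3279),
    (2014, 58819, 2764),
]

# Assumption (not from the table): a full schedule is roughly
# 30 teams * 162 games / 2 teams per game.
APPROX_SCHEDULED_GAMES = 30 * 162 // 2


def sb_per_game(stolen_bases, games):
    """Stolen bases divided by a games denominator."""
    return stolen_bases / games


for year, games, sb in ROWS:
    as_listed = sb_per_game(sb, games)
    per_scheduled = sb_per_game(sb, APPROX_SCHEDULED_GAMES)
    print(f"{year}: {as_listed:.3f} per listed game, "
          f"{per_scheduled:.2f} per scheduled game (approx.)")
```

Either denominator gives the same year-to-year shape, since the choice only rescales the level of the series.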

Graphically, these numbers are a bit easier to digest:

[Chart: stolen bases per game by year, 2000-2015]

What is interesting is that teams began stealing more after the steroid era than they did during it. If this trend were rationally driven, the explanation could be that teams did not want to risk an out for one base when they had sluggers up and down the lineup.

Yet the number of stolen bases per game peaks in 2011 and quickly collapses. Part of this may be due to the decline in the number of base runners. As we have written about extensively in the past, baseball from 2010 onward could be called “revenge of the pitcher”. A slight pushback against this explanation is that the relationship between OBP and the number of stolen bases is essentially zero (r^2=0.001).

One common explanation for the decline in stolen bases is that analytically oriented teams don’t try to steal many bases. This argument states that prior to sabermetric thinking, teams were both overvaluing the benefit of moving a runner up one base and downplaying the risk of that runner getting thrown out.

The run expectancy charts are brutal on stolen bases: with no outs, if a runner moves from first to second, a team’s run expectancy only increases by about 0.20 runs. Yet if that team loses the runner and creates an out, the drawback is 0.60 runs (from 0.85 to 0.25). Put another way, a team that steals three bases but is thrown out once is no better off than if it had never run.

If a team can diminish the risk of getting thrown out, then the benefit of moving up a base becomes more reasonable. Because of the small benefit of moving up and the deep drop-off should the runner get caught stealing, most methods put the “break even” point at about a 75% success rate. And that’s just to get back to a point where you would be no better off than before.
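
The break-even arithmetic can be sketched in a few lines of Python. This is only an illustration using the two approximate run values quoted above (0.20 gained, 0.60 lost); the function names are made up for the example, and real run expectancy tables vary by season and base-out state.

```python
def break_even_rate(gain, loss):
    """Success rate p at which p * gain equals (1 - p) * loss."""
    return loss / (gain + loss)


def expected_change(success_rate, gain, loss):
    """Average change in run expectancy per steal attempt."""
    return success_rate * gain - (1 - success_rate) * loss


# Approximate values quoted in the text for a runner on first with no outs.
GAIN_ON_STEAL = 0.20   # runs gained by reaching second
LOSS_ON_CAUGHT = 0.60  # runs lost when the runner is thrown out

rate = break_even_rate(GAIN_ON_STEAL, LOSS_ON_CAUGHT)
print(f"break-even success rate: {rate:.0%}")

for success_rate in (0.65, 0.70, 0.75, 0.80):
    change = expected_change(success_rate, GAIN_ON_STEAL, LOSS_ON_CAUGHT)
    print(f"{success_rate:.0%} success: {change:+.3f} runs per attempt")
```

Below the break-even rate each attempt costs runs on average, and above it each attempt adds them, which is why the league-wide success rate matters more than the raw steal total.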

If you are a regular reader of Redleg Nation, none of this sounds especially surprising. The next thing you would expect is a distribution of stolen base percentages over the last 5 years that shows teams are increasingly adopting this “break even point” and that this explains the drop off in stolen bases. This was supposed to be a nice, easy 500-word post for the last Friday of the year. I downloaded the data from FanGraphs. And that’s when this column started getting weird.

Here is the stolen base percentage (stolen bases/[stolen bases+caught stealing]) for all teams in MLB over the past five years (grouped by year):

[Chart: stolen base percentage by year, all MLB teams]

Green diamonds are 95% confidence intervals
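
For readers who want to reproduce this kind of chart, here is a minimal Python sketch of the stolen base percentage formula and one common way to put a 95% confidence interval around a group mean. The interval uses a normal approximation (1.96 standard errors), which is an assumption; the software behind the chart may use a different method, such as a t-based interval. The team totals in the example are made-up placeholders, not real data.

```python
import math
import statistics


def sb_percentage(stolen_bases, caught_stealing):
    """Stolen bases / (stolen bases + caught stealing)."""
    return stolen_bases / (stolen_bases + caught_stealing)


def mean_with_ci95(values):
    """Return (mean, lower, upper) using a normal approximation."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 1.96 * std_error
    return mean, mean - half_width, mean + half_width


# Hypothetical (SB, CS) totals for a handful of teams in one season.
placeholder_teams = [(100, 40), (120, 45), (80, 38), (95, 30)]
rates = [sb_percentage(sb, cs) for sb, cs in placeholder_teams]
mean, lower, upper = mean_with_ci95(rates)
print(f"mean SB% = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

If the intervals for two years overlap heavily, as they appear to in the chart, the data cannot distinguish those years from one another.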

Not only has the league never stolen at a 75% success rate, it doesn’t look like teams are making a discernible effort to become more successful than in the past. For all the talk about MLB front offices turning into computer labs, 2015 is the first time in the past five years that the stolen base percentage will fall below 70%. While, statistically, we cannot conclude that 2015 is any different from the other years, we also can’t say there is a trend in these data.

Yet maybe SABR teams are not trying to steal that many bases. Stolen base % is declining, so these analytically-oriented front offices are keeping runners parked at first; it’s a perfectly consistent argument. So here is the same data by team:

[Chart: stolen base attempts by team]

SBA is stolen base attempts, same calculation as above. Green triangles are the 95% confidence interval around the mean (green horizontal line). Five years of data 2010-2015.

That’s a lot of data to digest, but the five teams that attempt the most steals are the Rays, Astros, Rangers, Padres and Royals. All five were labeled either “All in” or “Believers” when ESPN ranked the most analytically oriented baseball teams. The bottom five: the Braves, Cubs, Cardinals, O’s, and Tigers. Strangely, this is a more mixed bag than the top of the chart, since the Cubs and Cardinals are “All-in” on analytics while the Braves and the Tigers are “skeptics”.

Keep in mind the limitations of these data: it could very well be that regardless of front office philosophy, some teams just have better base stealers than others. This is certainly true in the short run, but over a five-year span, player selection and development become more reflective of front office ideology than they are when talent is a fixed aspect of a club. Five years might not be enough for talent to be that fluid, so this is something to keep an eye on over the next few years. Another possibility could be that some teams are just better at getting on base and therefore create more opportunities for stolen bases. I tried to control for this by adding in team OBP over this span, but this is an imperfect control because high OBP players might be terrible base stealers (see: Dunn, Adam).

These numbers raise more questions than answers. On the one hand, it seems that analytically sophisticated teams do not steal more or less frequently than the “traditional” baseball clubs. Yet on the other hand, we see the Astros, Rays, and Padres all grouped together at the top of the chart. If anything, there are more analytically oriented teams in the top half of the distribution than there are in the bottom half. This poses an alternative narrative: traditional front offices are driving the decline in stolen bases. Why? Because traditional front offices have their “stolen base guys” and their “sluggers”. As the league OBP declines, “stolen base guys” get fewer opportunities to steal bases. Yet analytically oriented teams might make decisions about when to steal based on other factors, such as whether a pitcher-catcher combo has a low caught stealing percentage or the club’s ability to predict when pitchers are most likely to throw a breaking pitch.

I have no data to support this theory, but it’s becoming increasingly unlikely that we can say the decline in stolen bases is due to ‘SABR’ front offices dropping anchor at first. The overall trend is there: stolen bases are declining and we need to find a more nuanced explanation for this trend.

Rest assured, Nation, we know that the Reds’ analytically savvy front office will soon have an answer. That is, as soon as Walt gets off the phone so they can dial into their America Online account.

22 thoughts on “Mythbusters: ‘SABR-teams’ don’t steal bases”

  1. Clearly you can gain value by baserunning (both real and perceived) but SB numbers are not a good way to measure that. Some guys get cheap SB in blow outs or just when there is a situation with no real value added. SB% is OK, but there are clearly runners who do not steal much but are excellent at advancing and getting that extra base and not making baserunning gaffes. Heck, the coaching is a part too. I am certain the Reds gave up 3-5 games getting thrown out at home by Steve Smith!

  2. Mike – I agree that there are at least as many questions as answers in this data but I couldn’t get past some obvious errors here. Does it really seem right to you that you’d need to watch 25 games to see a successful stolen base? It’s not! Think about it – Hamilton stole one on average every three games this year. One guy. Where did the data come from? 62,000 games a year? How about 2,480 (give or take)? 30 teams, 81 home games a year. So that’s about 1.0 steal per game, right? I’m fascinated by this question of whether attempted steals add value but this data looks really off to me and as a result not very informative. Am I missing something? Email me at Cfdeblois on my yahoo account if you want to communicate directly. And either way thanks for delving into this interesting question.

    • It appears Mike was a victim of the FanGraphs “League” games calculation that summed games for each player. That really isn’t relevant to the rest of the post, though. Stolen bases down recently. The 75 percent rule. No correlation between stolen bases and SABR-ness of the organization.

  3. Let me offer this thought, The stolen base becomes a weapon when the opposing catcher doesn’t make an accurate throw. Knowing the number of times a runner actually advances past second base to third due to an errant throw, that ends up in centerfield, by the catcher might change the calculation and result. Just a thought.

    • While definitely a lucky advantage, my guess would be that advancing on a throwing error happens infrequently enough so as to be statistically insignificant compared to the chances of being thrown out.

  4. That is a stunning dropoff from 2011. Over a 25% reduction in just a few short years.
    The stolen base can still be an effective weapon. It seems now that teams have to be more choosy on when to send runners. If a pitcher is bad at holding runners, then run all day. Like Chapman not even looking at a runner on 2nd base. The same with bad arm catchers, run like gazelles on them. But with the likes of Molina and the Russell Martins, or pitchers with good moves, just put those SB’s in your pocket and save them for another day. Pick and choose wisely when employing the SB.

  5. Maybe WJ is a genius to go with America Online. It is so antiquated that the Cardinals, the Russians and the Chinese cannot hack into their system.

    • I love that idea… “I cannot stand to wade through this archaic nonsense to see what these fools have. It is an unbreakable barrier.” Little do they know that the system only has lunch orders in it.

  6. These are all interesting questions, but I really don’t see any trends in the data provided. It looks like stolen bases went up, then down, then up, then down again, which seems a lot like random variation. The point where we’re at now isn’t even the lowest it’s been in the last 15 years, so even the premise that stolen bases are going down seems suspect, considering they’ve gone up since 2004.

    Some of the variation in SB/game could be caused by the league having more and less baserunners, as pointed out in the article, which could make the variation presented look even less like a trend.

    But to address the idea of “SABR” teams, I think basically the idea that they would steal less came from Moneyball, which depicted Billy Beane as being staunchly anti-steal. But really, any analyst worth their pay would tell you that if you can steal above the break-even line (which is actually lower than 75% in most situations http://www.fangraphs.com/blogs/breaking-down-stolen-base-break-even-points/), then you should steal as much as possible.

    So what I would like to see is the break down of “SABR” teams not by stolen base attempts, but by stolen base percentage. What I would expect is that teams run by old-school front offices and managed by guys like Dusty Baker would pay less attention to their success rate than more analytical teams, and likely have a lower stolen base percentage because of it.

  7. There is so much noise in pure SB totals that I’m not sure what conclusions we can ever reach.

    The Astros, Rays, Royals, Rangers, and Padres were all singled out but those teams all went through major youth movements at some point during the past 5 years (well, Rays are perpetually in a youth movement), and/or have single players that were driving SBs totals.

    Elvis Andrus had 150 SBs,
    Jose Altuve had 169 SBs
    Dyson, Escobar, and Cain had 137, 130, and 80, respectively
    Desmond Jennings had 91
    Everth Cabrera & Will Venable had 101 and 94, respectively

    It’s pretty widely accepted that speed peaks early so it kind of makes sense that teams that went through the rebuilding process (or attempted to) have more stolen bases. I’ve seen people say speed starts declining as early as 22 years old so teams with veteran hitters aren’t going to be at the top of the charts. Often the teams with the “best” hitters are packed with guys in their prime years, 26-30. You look at a team like the Cardinals, for example, and their top guy over the past 5 seasons is Jon Jay with 41 total SBs. I don’t think it is organizational philosophy, necessarily, it’s just that their positions were filled with guys like Matt Holliday. As they start replacing some of the aging guys, you’ll start seeing the upward ticks in SBs, for example: Kolten Wong has 38 SBs in his short stint with the Cards.

    • That’s also not to say teams with more veterans won’t also show SB totals. The Reds will be showing up high on the leader boards due solely to one person.

  8. It seems like base stealing would be one of the easiest areas to improve upon using data and analytics. If you know the amount of time it takes a guy to get from 1st to 2nd base and you know the amount of time on average it takes a given pitcher catcher combo (from the beginning of the wind up to when the catcher’s throw reaches 2nd base) then you should be able to predict that guy’s success rate in stealing. There would also be a sliding scale of success based on the type of lead the runner is able to establish and the probability of an off-speed pitch, low pitch, inside, etc. So data, telling Brandon Phillips that with a 6 ft lead he will be able to steal 2nd base 90% of the time on an off-speed pitch against today’s pitcher or whatever….it just seems like saying that instead of using sabermetrics to ask whether or not to steal, clubs should be asking how they can use sabermetrics to decide when to steal.

  9. I think it primarily boils down to roster composition. If you have fast players you are more successful at stealing bases and therefore you steal more. If you have Billy Hamilton, you steal more. I think the post by CRAIG is spot on as to how sabermetrics should be used.

  10. A couple of things I didn’t see in there –

    – The SB can be useful or non-useful. It depends upon the situation in the game, something stats won’t show. For instance, last of the ninth, tie-game, first hitter gets on first, someone is going to try to say it’s better to just keep hitting or bunt the runner over rather than stealing second, if the runner “has some good wheels”? If that runner has a good chance getting to 2nd and in fact does it, the team has 3 chances to get the runner home. Shoot, I remember seeing Rickey Henderson getting on first, stealing 2nd and 3rd before the first out was made, where a sac fly would bring him in. Shoot, there were even people on here thinking the NL All Star managers should be considering taking Hamilton, just in case they got to that point in the game, situations where a SB could be meaningful.

    Or, for instance, with Hamilton on first and Votto batting next, if Hamilton steals 2nd, the other team is going to be “more likely than before the SB” to walk Votto to set up forces everywhere and the DP. But, if Hamilton stays on first, the other team would be more likely to still pitch to Votto, where Votto could possibly get a double or more and drive Hamilton in. A situation where a SB wouldn’t mean much.

    And, that’s the thing. Stats never tell the situations, the strategy being used. People see the stats showing something like SB’s are declining, so they must not be useful. When, in fact, they can be. It all depends upon what you are trying to achieve with them. I’ve said before about this team, I loved the running I see, much better than with that previous manager I felt. But, now, they need to learn to run with the heads and not just run. When do you run?

    – I didn’t notice anything in there about the SB runner having a good batting average. For, I will agree, just like with Stubbs and many others, if you can’t get on base, it doesn’t make that much of a difference how fast players are. Running the bases can be one of the 5 baseball skills (hitting, hitting with power, running, glove, and throwing), yes. But, if you can’t get on base, being able to run fast doesn’t matter much offensively. Where, like with Hamilton and Stubbs, I would tell them don’t worry about running at all. Just concentrate on hitting and getting on base. For, if you don’t, you’re not very useful to us. But, a SB threat who is also a good hitter, aka what Rickey Henderson was, being able to score a run almost on his own without having to hit a HR, I would think any team would look for that.

    • Your criticism of “stats” is really a criticism of basic stats. What you’re saying is that a simple stat of SB doesn’t tell the situation. One of the most important innovations in the sabermetric era is the creation of new stats that do break down events based on the importance of situations. Why would you think there couldn’t be a stat focused on important game situations? There are many of them. “Stats never tell the situation” is as wrong as could be. Anything you can observe and count, there’s a statistic for it.

      • OK, Steve. Then, I guess the stats would tell me “Why” the stolen base numbers have been dropping. So, tell me that, then, for none of the information above shows that. Has it been that teams have simply found it isn’t effective to “go after” at all? Or, has it been that teams have learned that there are better times to “go after” it? For, either one would cause a drop in the number of stolen bases. But, one still shows “stolen base” to be an effective offensive tool, one doesn’t.

        Rather than criticizing someone else about their opinion on stats, Steve, if there were numbers to prove them wrong, then you would be able to show them some numbers that prove them wrong. Given you haven’t been able to show it would seem to prove the point.

        I gave you good and bad situations where stolen base numbers would decline. Give me a stat that shows “why” stolen base numbers would be declining. What, because of the success rate? Again, Steve, as I stated, were they running against someone like Molina? For, that would prove ineffective, even for good baserunners. Were they running against pitchers with slow motions to the plate, aka the time it takes for them to start their motion to the time the catcher receives the ball, what most all would consider an advanced SABR number (except for probably you; since I mentioned it, you will probably deny it)? If unsuccessful then, then that player would possibly be considered a poor runner and shouldn’t be stealing bases. Show me some of your proof, Steve. If you are right, then you can show me “Why”. Until then, your posts like this pretty much prove my point.

        • I didn’t write the initial post about the decline of SB. It may simply be because of less opportunity due to declining OBP. That’s a theory presented in the article. I’ve done little work on the decline of SB. It may or may not be the reason you suggest. SB are worth the risk when they have a success rate in the neighborhood of 75% or better. Mike wasn’t asserting a definitive explanation. In fact, his research rejected his original hypothesis.

          I was only responding to your other narrow claim that “statistics can’t take circumstances into account.” You were making a vague, unsupported claim that stats can’t measure situation. I responded by saying there are many stats that show situations – leverage stats, for example. Like I said before, anything that can be observed – ex. how many SB attempts against Molina – can be counted.

          See, it’s one thing to offer an explanation that should be tested. It’s another to make inaccurate statements about what statistics can and can’t measure. I was just pointing out that the shortcoming of SB that you identified only points to needing more specific stats, not that stats can’t measure it.

          Whether that proves your point or not, I don’t know. I honestly can’t figure out what your point is.

        • Give me anything, Steve. Anything. Give me one stat that tells me “Why” something happened. It doesn’t have to be stolen bases. Anything. I could even show you a stat that says red-haired people having higher IQ’s. “Why” that would be true no one can answer (not without getting into the research of actually studying the genetic makeup of all people red-haired and not). And, why can’t they? Because it’s a “correlation”. They only tell that a relationship “seems” to exist. But, no one can tell why.

          • I’ve read this several times and I have no idea what you’re asking.

        • Then, you don’t understand the true meaning behind statistics, Steve. Anyone can run numbers, as deep/advanced as you want to go. It’s one thing to run the numbers. It’s another thing to tell what they mean, or “Why” the numbers are what they are.

          For, the question was simple. You dissed me for my posts, which was solely on numbers don’t tell you “why” something occurred, but provided no proof. So, I asked you to show me this. The request was simple, using what you would call even a simple statistic, tell “why” a statistic occurred, any statistic, your pick. For example, why was there a drop in the stolen base attempts? “Because teams tried less”. Yes, but “Why” did they try less? I gave you two examples why, one shows poor strategy, one shows good strategy. But, you failed to show any reasoning as to “why” for any statistic.

          And, that’s because statistics can’t show “why”. And, those who don’t understand this, what you just admitted to (“I have no idea”), don’t understand the true meaning behind the numbers. They think just running more and more “advanced numbers” makes them some numbers guru. It’s entirely another thing to tell “why” the numbers are what they are.

          Not even I can tell “why” the number of stolen base attempts dropped, which has been my point. Because, like I said, teams could simply have quit trying too much, or they could have determined what times they believe trying them are most beneficial. There could even be some more possibilities. The thing is, no one can tell straight from the numbers. You have to go in and do more research. Not just running more numbers. It could very well mean actually going in and interviewing the participants and asking them “why” they are doing something, which would possibly go into simple things like “it’s our strategy”.

          Which has been my point all along. Running a team with “only the numbers” isn’t a good thing, just like running a team only “the old fashioned way”. It has to be a mixture of the two. For, it’s well known that Jonny Gomes has rarely been a good “numbers” guy for teams; so, why even have him on the team? But, it’s also well known that he was an excellent addition to one of the Red Sox WS-winning teams. His attitude, his clubhouse demeanor, was one of the key things the Red Sox needed that season, something the numbers will never show about Gomes.

          • See, what I don’t understand is why you think it’s such a big revelation that we haven’t identified the reason for the decline in stolen bases. If you read it carefully, that was the conclusion of Mike’s post. He proposed a theory and then rejected it based on the data he found. Science.

            My only “dis” (it wasn’t personal, don’t be so sensitive) was when you broadened your claim to say that stats can’t show situations, which is obviously 100% wrong. Many stats are broken down based on situation and context. As I said, sabermetrics has pushed the idea of putting stats in context.

            As for your broader claim about the “why” behind the stats … I get it, you’re Mr. Need More Data guy, except then you go on to say that there isn’t any more data that can matter. (Except apparently you’re the only one who can discern “why” like when you state that Jonny Gomes helps teams in the locker room, etc. That entire paragraph is one piece of unsupported speculation after another, posed as truth.)

            Finding statistical connections and then speculating on the reason behind them leads to knowledge, even if partial. We know that elbow injuries are correlated with greater pitching velocity (the “why”). We know that the lower run scoring environment is correlated to PED testing (the “why”). Those are two of a million similar examples.

            Does correlation confirm truth? Of course not. But when the connection is sufficiently strong, it can indicate probable truth – like the link between cigarette smoking (the “why”) and lung cancer. Or the link between higher OBP (the “why”) and greater run scoring. Just because data doesn’t give us a 100% explanation doesn’t mean it can’t give us a partial explanation. And those are worth knowing.

            It’s fine to be Mr. Skeptical Guy, if that’s your thing, but your doubts are more useful when specific to the situation. We haven’t figured out why SB have declined, which is what Mike said in his post. (Although the explanation could be something as simple as a decline in opportunities as the major contributor. If that’s true, then we would know.) But uncertainty about SB doesn’t mean we don’t know that smoking cigarettes is harmful or that staring at the sun (the “why”) can hurt your eyes. Your broader philosophical point is absurd.

            By the way, Jonny Gomes has been an excellent “numbers guy” for clubs when he’s used in the right situation – pinch hitter vs LHP. That’s a “why” proven based on statistics that, you know, are about certain situations.

        • “He proposed a theory and then rejected it based on the data he found. Science.”

          Where did he reject it?

          “some teams just have better base stealers than others”

          ” Another possibility could be that some teams are just better at getting on base and therefore create more opportunities for stolen bases. I tried to control for this by adding in team OBP over this span, but this is an imperfect control”

          Other possibilities?

          Where? Where did he prove “Why” the number of stolen base attempts declined. Simply saying someone proved it isn’t the same as actually proving it. Where did he? I’m asking you, prove it to me. You don’t even have to use science. Use cut-n-paste. I don’t need more data. I need reasons. I need answers.

          Not being Mr. Skeptical Guy at all. You need to quit being Mr. Dissing-anyone-who-questions-anything-with-numbers-when-numbers-never-tell-why. I’ve said before and will say again. Of course I would look for the numbers. I just don’t let the numbers make decisions for me. Because, it’s fact, numbers never tell the entire truth. You have to look at numbers, the makeup of your players, team chemistry, your coaching staff, etc., many variables. Those who don’t consider numbers at all are just as foolish as those who consider only numbers, they are never working with a full deck. It’s people like you who, when anyone steps up and says, “I would decide against the numbers there”, have to come in and diss that person, not just me, simply because that person questions the numbers.

          And, your inability and/or inactivity to answer my simple question of showing me “why” the number of stolen base attempts has declined only proves my point. Thank you.

          Gomes, a great numbers guy? 13 seasons in the majors, 7 of those with negative WAR, only two seasons over 2 WAR, none over 2.7? That’s your definition of a good numbers guy? Alright. Low standard for a “good numbers guy” there. Oh, as a pinch hitter against LHP, which happens just how often? 300 times a season for 1 player? 200 times a season? So, let’s see, I guess that gets even better when it’s against the American League West teams, right? Probably only after the 7th inning? Also, you have the numbers to prove this, right? Oh, not there? You’re getting to a bit of minutia now. Not to leave out some more numbers, let’s see, you have said 1 WAR costs about $5 million. So, who’s going to pay that kind of money for a pinch hitter against LHP? Good numbers guy? Hardly. If he was a good numbers guy, why would so many teams have let him go, some of those you have said who endorse “the numbers”?

          And, that comes from using your own argument with numbers. Funny when the numbers work against a “numbers guy”.
