Warning: This post will have plenty of numbers and math and “advanced” things. I have to make up for this debacle.
Before the season started, I wrote a piece about lineups. If you want another affirmation for lineups not mattering that much, you can click on the link and read what I had to say.
When talking about lineups not mattering that much, no one ever puts a number on it. Probably because it’s a lot of estimation and speculation. If that’s the case, you may not be convinced of the veracity of the analysis. You’re in luck today, however. I happen to love estimation and speculation! I also happen to love trying to convince people of the veracity of my analysis. I might have my work cut out for me. Onward!
In order to try and determine the true cost of a sub-optimal lineup, I’m going to examine a specific case: Brandon Phillips batting 3rd behind Joey Votto. This is something of an outlier as examples go, given Votto’s prolific career at the plate. Their nearly 100-point spread in wOBA is about as big as you’ll ever see in the middle of a lineup.
For this analysis, we’ll use the RE24 framework. I also wrote a piece about RE24 earlier in the year. If you need a refresher, please check it out. I’ll write the rest of this piece assuming you’ve read the previous piece. It’s a pretty good piece, if I’m going to toot my own horn for a minute. One Mr. Tom Tango, the new Senior Database Architect for MLB Advanced Media, agreed:
Ok, enough about that! Let’s get into it!
In order to try and measure the true cost of a sub-optimal lineup, we’re going to look at every (or some approximation of “every”) situation that can occur following a Votto at-bat under two different lineups. The first lineup has Phillips, Bruce, Duvall, and Suarez following Votto. The second has Bruce, Duvall, Suarez, and then Phillips following Votto. For this analysis, it doesn’t really matter whether Votto is batting 2nd or 3rd.
The reason why we can look at this problem in this specific way is because of some work done by the fantastic Jonah Pemstein and Sean Dolinar. They developed what they’re calling the “Batter-Specific Run-Expectancy Tool.” Essentially, it modifies the “league average” tables we’ve grown accustomed to looking at using data and math so we can get a more accurate view of RE based on the hitting skill (measured in wOBA) of the batter at the plate.
Here are the players in question, as well as some wOBA information and the “wOBA bucket” I’ll be using on the Tool to determine their personal RE24 table:
On this table you can see the normal stratification you’d expect from these players. I’m slightly weighting the bucket I’ll use based on recent performance. So what does one of these player-specific run expectancy tables look like? I’ll show the extremes to give you an idea:
Votto, as you’d expect, gives a higher RE in every situation. That is the power of not making as many outs at the plate.
How should we use these tables to answer our question, though? Getting to a specific, quantifiable cost is a bit tricky and takes some brute force.
First, we need to determine how often each batter in the analysis will be put in each situation described by each base-out state. To start, here is a simple example. If Votto were to lead off the inning, we know he’ll get out about 60% of the time (this assumes an OBP of .400). So, we know that in this situation Phillips’ at-bat will occur with 1 out and the bases empty 60% of the time. Makes sense? Excellent! Here are the values I’ll be using. You can probably justifiably quibble with some of the numbers, but I assure you, they don’t change the magnitude of the final answer in any non-trivial way:
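As a minimal sketch of that lead-off example, the snippet below splits Votto’s plate appearance into three outcomes and maps each one to the base-out state Phillips inherits. The .400 OBP is the assumption stated above and the 4.2% home run rate is Votto’s HR rate from my values; the variable names are made up for illustration, and lumping every non-homer way of reaching base into a single “runner on” bucket is a simplification the real spreadsheet does not make.

```python
# Votto leads off an inning: what situation does Phillips walk into?
VOTTO_OBP = 0.400      # assumed OBP, as stated in the text
VOTTO_HR_RATE = 0.042  # Votto's home run rate per plate appearance

# Keys are (bases, outs); the "runner on" bucket is a simplification.
states_for_phillips = {
    ("bases empty", 1): 1 - VOTTO_OBP,            # Votto makes an out
    ("bases empty", 0): VOTTO_HR_RATE,            # Votto homers
    ("runner on", 0): VOTTO_OBP - VOTTO_HR_RATE,  # Votto reaches without a homer
}

for (bases, outs), prob in states_for_phillips.items():
    print(f"{bases}, {outs} out: {prob:.1%}")
```

The three probabilities add up to 100%, which is a handy sanity check on any row of the bigger sheet.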
Armed with these numbers, we can now begin a brute force attack on the question!
As a reminder, this analysis assumes a sequence beginning with Joey Votto batting, followed by some order of the other players. Votto is our constant. First thing we need to do is determine all the things that could happen after Votto bats with 0 outs, 1 out, and 2 outs.
Doing this is a series of probability calculations, which I did using a tabular method so I could keep everything straight. Here’s an example of one of those calculations straight from my spreadsheet:
The first chart shows the odds that Phillips comes to the plate in each given situation. The box in the top left filled with 4.2% represents what? Well, how many ways can Phillips come to the plate with the bases empty and 0 outs assuming Votto just batted? A home run, of course! You’ll notice the 4.2% in the box matches Votto’s HR rate from the previous table.
If we look at the same box in the 2nd chart, you’ll see it filled with 0.416. That is the player-specific run expectancy for a player with a .305 wOBA in that situation. Since BP is in that situation 4.2% of the time, we multiply 4.2% by 0.416 and we get 0.017, which you can see in the corresponding box in the 3rd chart. This process is repeated for every box and the results are then summed. The bold 0.413 you see on the side is the total of all RE for each situation Phillips can be in following a Votto at-bat which started with 0 outs. The way to interpret this number is “Phillips creates an average of 0.413 runs when he bats following Votto when Votto batted with 0 outs.”
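The multiply-and-sum step can be sketched in a few lines. This is only an illustration of the arithmetic, not my actual spreadsheet: the function name and dictionary layout are invented here, and I’m filling in only the one cell quoted above, since the rest of the values live in the charts.

```python
def expected_re(state_probs, re_table):
    """Sum (probability x run expectancy) over every base-out state."""
    return sum(prob * re_table[state] for state, prob in state_probs.items())

# Only the top-left cell is quoted in the text; the full sheet has one
# entry per base-out state Phillips can inherit.
state_probs = {("bases empty", 0): 0.042}  # Votto homered
re_table = {("bases empty", 0): 0.416}     # RE for a .305 wOBA hitter

cell = expected_re(state_probs, re_table)
print(round(cell, 3))  # 0.017
```

With every state filled in, the same function would return the 0.413 total for Phillips.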
Ok, so that was a lot of work for a single, tiny situation wasn’t it? Good thing for you I am a disturbed individual. The next chart you’ll see is a glimpse into the larger process. I can’t really explain it without showing it, so bear with me. We need to be able to account for each permutation of things that can happen in an inning and assign a somewhat accurate likelihood of each event occurring. Here’s how I did it:
On this larger slice of the same spreadsheet, you can see how I’ve laid out the rest of the batters. Bruce following BP has a lot more percentages filled in. This is because there are a lot more things that can happen after 2 at-bats (Votto then Phillips) than with just a single at-bat. I specifically highlighted the top-left Jay Bruce cell for a reason. It says 1.00%. If you look at the Excel formula bar, you see (0.042*0.025) + (0.358*0.025). This formula accounts for two things. First, the odds that Votto and BP will homer back-to-back. That would bring Bruce up in a bases empty, no out situation. Also, if Votto reaches in any non-homer manner ((0.400-0.042)=.358) and then BP homers (0.025), that would also get Bruce to this situation.
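Here is that one highlighted cell written out as a quick check, using the same three rates that appear in the formula bar. The variable names are mine, added for readability.

```python
VOTTO_OBP = 0.400
VOTTO_HR = 0.042
BP_HR = 0.025

# Path 1: Votto homers, then BP homers.
back_to_back = VOTTO_HR * BP_HR
# Path 2: Votto reaches some other way, then BP homers and clears the bases.
reach_then_hr = (VOTTO_OBP - VOTTO_HR) * BP_HR

bruce_empty_0_out = back_to_back + reach_then_hr
print(f"{bruce_empty_0_out:.2%}")  # 1.00%
```

Notice the two paths collapse to Votto’s OBP times BP’s HR rate (0.400 × 0.025): Votto not making an out and BP homering is the only way Bruce sees bases empty with nobody out.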
Without belaboring the method anymore, I recorded as many of these permutations as I could and then summed them up to see how close I got. In each case, I hit more than 98% of all possible things that can happen in an inning. The ones I missed are likely very fringe cases, like a player advancing from 1st to 3rd on a wild pitch that occurred on a steal attempt. Since things like that don’t happen often, missing them shouldn’t change our results.
I used the above method to work up BP batting 3rd, as well as BP batting 6th, thus moving all the other players up one slot. Remember, we’re still only working with the situation where Votto begins the sequence with 0 outs. Here are the results:
In the “BP 3rd” column, you see BP’s cumulative RE from the spreadsheets above of 0.413, followed by Bruce’s 0.459, Duvall’s 0.340, and Suarez’s 0.086. Seems like a large drop-off at the end, and it is. The odds that Suarez will even come to the plate in this situation are only 21%. Another way to say that is there exists a 79% chance that Votto, Phillips, Bruce, and Duvall will create 3 outs before Suarez can bat. The rest of the lineup can be ignored, since the odds of each extra player coming up to bat in this sequence drop very quickly.
Ok. We are going to use some estimation and math to finish out quickly here before I lose the 5% of folks who are still with me.
Since Votto will come up in varying situations, we need a way to account for that. The way I am going to estimate the rest of the situations is as follows. First, we’ll estimate the likelihood of Votto coming up in each of the 0, 1, and 2 out situations. Over a large sample, we’d expect the three to be fairly close, with 0 outs being the least likely since Votto doesn’t bat 1st in the order. I chose 30% for 0 outs, and 35% each for 1 out and 2 outs. Then, we need to estimate the decrease in overall run expectancy given that sometimes the sequence starts with 1 out and sometimes it starts with 2 outs. To do that, I used the league average RE tables and reduced the “BP 3rd” and “BP 6th” numbers by the same proportion from 0 outs to 1 out as the normal league average RE decreases when going from 0 out to 1 out.
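A rough sketch of that weighting step is below. Treat it as an illustration of the idea rather than my exact worksheet: the function name is invented, I’m assuming the four “BP 3rd” figures are simply added together, I’m assuming the 2-out case gets the same proportional treatment as the 1-out case, and the league-average RE values are placeholders I made up for the demo, not numbers from a real table.

```python
OUT_WEIGHTS = (0.30, 0.35, 0.35)  # chance Votto bats with 0, 1, or 2 outs

def per_sequence_re(re_from_0_outs, league_re_by_outs, weights=OUT_WEIGHTS):
    """Scale the 0-out total by league RE ratios, then weight by out state."""
    base = league_re_by_outs[0]
    return sum(
        weight * re_from_0_outs * (league_re / base)
        for weight, league_re in zip(weights, league_re_by_outs)
    )

# "BP 3rd" column for a sequence that starts with 0 outs (assumed to be summed).
bp_3rd_from_0_outs = 0.413 + 0.459 + 0.340 + 0.086

# HYPOTHETICAL league-average RE at 0, 1, and 2 outs. Placeholders only.
placeholder_league_re = (0.50, 0.25, 0.10)

print(per_sequence_re(bp_3rd_from_0_outs, placeholder_league_re))
```

Swap in the real league table and run it once for “BP 3rd” and once for “BP 6th”; the gap between the two outputs is what matters.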
We then sum all those figures together, which gives us a “per sequence” run expectancy. We can then assume we’ll see this sequence 4 times per game, so we multiply by 4, giving us a “per game” run expectancy. Then, we simply multiply by 162 for a “per season” run expectancy. That’s what we were looking for, right? I think so… Here it is in tabular form, in all its glory!
The final answer is 14.75 runs per season. That is the true cost of batting BP 3rd instead of somewhere like 6th. Given FanGraphs’ current measure of 9.64 runs per win, Bryan Price’s decision to construct the lineup in this manner is costing the Reds a whopping 1.53 wins per season.
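If you want to check the scaling yourself, here is a small sketch of the last two steps. The constants come straight from the text (4 sequences per game, 162 games, 9.64 runs per win); the function and variable names are mine, and the per-sequence gap is backed out of the 14.75 figure rather than read off the table.

```python
SEQUENCES_PER_GAME = 4
GAMES_PER_SEASON = 162
RUNS_PER_WIN = 9.64

def season_runs(per_sequence_runs):
    """Scale a per-sequence run figure up to a full season."""
    return per_sequence_runs * SEQUENCES_PER_GAME * GAMES_PER_SEASON

season_gap = 14.75  # runs per season lost by batting BP 3rd

# Work backwards to the implied per-sequence gap between the two lineups.
per_sequence_gap = season_gap / (SEQUENCES_PER_GAME * GAMES_PER_SEASON)
wins = season_gap / RUNS_PER_WIN

print(f"{per_sequence_gap:.4f} runs per sequence")  # 0.0228 runs per sequence
print(f"{wins:.2f} wins per season")                # 1.53 wins per season
```

A gap of about two hundredths of a run each time through this part of the order is what snowballs into a win and a half over 162 games.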
See? Lineups don’t really matter that much. We all knew that before this long and arduous Friday morning article. However, the point I’m trying to make is that sometime in the future, 1.5 wins will probably matter. If a simple decision like this that often gets shrugged off can result in a swing of more than a win, so can other decisions that occur during a season. A good manager (and front office) needs to identify things like this, low-hanging fruit, we’ll say, and try to eke out as much advantage as they can. As sports fans, we know one thing is true… good teams don’t leave wins on the table.
Note: wOBA data and tables generated from the wOBA Tool are courtesy of FanGraphs. The rest of this mess is courtesy of my noggin’.