You might remember last Saturday night, the fateful seventh inning against the Milwaukee Brewers. The Reds were holding on to a 1-0 lead after six brilliant innings by Mat Latos. Manager Bryan Price made the decision to lift Latos and rely on his bullpen to finish the game. Who would Price choose to pitch the high-leverage seventh inning, against Milwaukee’s 3-4-5 hitters?
Answer: Logan Ondrusek.
Bryan Price chose Logan Ondrusek. When asked after the game why he chose Ondrusek instead of Sam LeCure, Price said, “I kind of felt like it was a situation in that exact spot in the order, Ondrusek’s numbers against those guys are off the charts. There were just some really good matchups for him.” (Mark Sheldon)
Price has also used past matchup data to determine playing time. Against Bronson Arroyo, Price started Ramon Santiago because of “past success” Santiago had against Arroyo. Turns out that was three at-bats. Price also started Skip Schumaker that game because of his track record against Arroyo. At least in that case, there was a history of 51 plate appearances. Of course, all but six of those were from 2011 or earlier, stretching back a decade.
Price isn’t alone in using data from specific past matchups. Big league managers do it all the time. It makes sense, right? There’s all this specific data, right in one of those thick binders that Mitt Romney isn’t using any more. That matchup data is right in front of you, within easy reach. Use it! It sounds perfectly modern, even sabermetric. Also, the media, always looking for something to fill air time, spouts past matchup numbers as gospel truth.
The problem? That data has no predictive value. And that’s not abstract theorizing. It’s based on real-life major league baseball.
The theory that certain hitters “own” certain pitchers or vice versa has been disproven by numerous studies. For example, Colin Wyers (Baseball Prospectus 2011) studied hitter-pitcher matchup data over sixty years of baseball history. He found that “ten, fifty or even a hundred plate appearances aren’t enough to tell us whether there’s a special edge or sample-size fluke.”
Tom Tango and Mitchel Lichtman (The Book: Playing Percentages in Baseball, Chapter 3) studied major league hitter and pitcher data from 1999-2002. They used 1999-2001 as the “before” period and 2002 as the “after” period. They stipulated that the pitcher-hitter matchup had to take place at least 17 times before and at least nine times after. That let them identify 300 pairs of pitchers and hitters. Their findings:
“We found thirty hitters with fabulous hitting records against thirty pitchers. And yet, given the chance to prove this skill in subsequent confrontations, they fail miserably.” Looking at the pitchers who “owned” certain hitters, “once again, the identity of the opponent was irrelevant. These pitchers didn’t own these hitters.”
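If that result sounds too tidy, it helps to see how easily pure luck produces hitters who appear to “own” pitchers. What follows is a rough sketch, not a reconstruction of Tango and Lichtman’s method. It assumes every one of 300 hypothetical hitter-pitcher pairs has exactly the same true on-base rate, and the specific numbers (a .330 true rate, 30 plate appearances before, 15 after) are made up for illustration.

```python
import random

def simulate_ownership(n_pairs=300, before_pa=30, after_pa=15,
                       true_obp=0.330, top_n=30, seed=2014):
    """Simulate matchups in which nobody truly 'owns' anybody.

    Every pair has the same assumed true on-base rate, so any gap
    between pairs in the 'before' sample is luck by construction.
    """
    rng = random.Random(seed)

    def observed_obp(pa):
        # Each plate appearance is an independent coin flip at true_obp.
        times_on_base = sum(rng.random() < true_obp for _ in range(pa))
        return times_on_base / pa

    # One (before, after) pair of observed on-base rates per matchup.
    pairs = [(observed_obp(before_pa), observed_obp(after_pa))
             for _ in range(n_pairs)]

    # Pick the hitters with the best 'before' records, the way a
    # manager reading the matchup binder would.
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[:top_n]

    before_avg = sum(before for before, _ in top) / top_n
    after_avg = sum(after for _, after in top) / top_n
    return before_avg, after_avg

before_avg, after_avg = simulate_ownership()
print(f"top 30 'before': {before_avg:.3f}   same 30 'after': {after_avg:.3f}")
```

The thirty best “before” records look fabulous, and in the “after” sample the same thirty pairs drift back toward the .330 everyone shares. No skill was involved at any point, which is the same pattern the real data showed.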
That’s not to say all batters and pitchers have the same chance of succeeding. Certainly the overall career numbers for a hitter or pitcher matter. Joey Votto has a better chance of getting a hit than Willy Taveras. Aroldis Chapman has a better chance of getting an out than Jimmy Haynes. The handedness of the batter or pitcher can matter, too. Handedness has been demonstrated to influence outcomes. But that’s based on league-wide data over many years. Not on one hitter versus one pitcher.
Tango and Lichtman’s conclusion: “We’re not saying that it doesn’t matter which pitcher is facing which hitter. …However, you can’t tell by looking at the numbers from twenty-five or sixty plate appearances. There is simply too much noise masking the truth under those numbers.”
In other words, sixty highly specific plate appearances (same pitcher and hitter) are not enough evidence to outweigh what we know from 1,500 plate appearances by that hitter or pitcher against everyone else over the course of their careers.
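To put a rough number on that noise, here is a back-of-the-envelope sketch. It assumes a hitter whose true on-base rate is .320 (a made-up figure) and treats every plate appearance as an independent trial, which is a simplification. Under those assumptions the standard error of an observed on-base rate is sqrt(p(1 - p) / n).

```python
import math

def obp_standard_error(true_obp, plate_appearances):
    """Standard error of an observed on-base rate, treating each plate
    appearance as an independent trial (a simplifying assumption)."""
    return math.sqrt(true_obp * (1 - true_obp) / plate_appearances)

ASSUMED_OBP = 0.320  # hypothetical true talent, for illustration only

for pa in (10, 60, 1500):
    se = obp_standard_error(ASSUMED_OBP, pa)
    # About two-thirds of samples land within one standard error.
    print(f"{pa:>5} PA: one standard error is +/- {se:.3f}")
```

With ten plate appearances the standard error is around .15, wide enough that a .320 hitter can look like a .170 hitter or a .470 hitter without anything unusual happening. At sixty it is still about .06. Only at 1,500 does it shrink to roughly .012.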
Instead of basing his decision on Ondrusek’s matchup data against Lucroy, Gomez and Ramirez, which consisted of ten or fewer plate appearances per batter, Price should have been looking at Ondrusek’s career numbers or at least those from all of 2014. And by that criterion, of course, the choice was certainly Sam LeCure. If you want to push the #closerrulesstink envelope, bring in Chapman. But really, ABO.
[Interestingly, Lucroy had come to the plate ten times against Ondrusek and had only one hit. But he had also walked three times, giving him an on-base percentage of .400 against the Reds pitcher. (Please, please tell me that Price doesn’t look at batting average.) By the way, Lucroy led off the inning with a walk.]
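The gap between those two ways of reading Lucroy’s line is simple arithmetic. The sketch below assumes the ten trips were plate appearances and that none of them ended in a hit-by-pitch, sacrifice or anything else that would change the denominators. The post doesn’t say either way.

```python
# Lucroy vs. Ondrusek, using the figures quoted above.
plate_appearances = 10
hits = 1
walks = 3

# Assumption: no hit-by-pitch or sacrifices, so at-bats are simply
# plate appearances minus walks.
at_bats = plate_appearances - walks

batting_average = hits / at_bats
on_base_percentage = (hits + walks) / plate_appearances

print(f"AVG: {batting_average:.3f}")
print(f"OBP: {on_base_percentage:.3f}")
```

Same ten trips to the plate, but the batting average comes out in the .140s and makes Ondrusek look dominant, while the on-base percentage is .400 and makes him look hittable. Neither number means much at this sample size.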
Not only are the sample sizes too small, but over time, the hitters and pitchers change. I remember a case where Dusty Baker explained that he chose Edgar Renteria to face a certain pitcher based on matchup data that stretched back over ten years, as if Edgar Renteria at age 34 was the same hitter as Edgar Renteria at age 24. It’s an utter waste of time.
If the data is so conclusive about the futility of using specific hitter-pitcher data, why do managers do it? Beats me.
Playing time, pinch-hitting and bullpen decisions can be complicated and therefore hard. Maybe that itty bit of data seems like an oasis in a desert of indeterminacy. It’s ultimately a mirage. When you hear Bryan Price explain that he used specific matchup history for a decision, it should make you cringe a little. Because while he’s using data, that data is useless. And it’s important for managers to know which data matters and which doesn’t.
The matchup data does give managers a ready excuse when the beat reporters come around asking about a specific decision.
Last Saturday, it also gave us Logan Ondrusek in the seventh.