[This is the first official post for Patrick Jeter, long-time commenter here as JDX19, at Redleg Nation, but it won’t be his last. Patrick will write a regular column for us this season. Welcome aboard! – SPM]
Every baseball fan is likely to have a certain set of statistics they trust above all others. For some fans, the traditional grouping of batting average, home runs, and runs batted in provides solace in analysis. For other fans, rate stats like BB% and HR/FB% reign. Yet another group of fans stakes their reputation on indexed and valuation stats like wRC+ or WAR. Every statistic has its own merits. Knowing which tool to use in varying situations is half the battle. Today, however, I want to explore an idea on which we can all agree: being completely and utterly dominant in any particular stat is a good thing. This should not be a contentious statement.
With that in mind, let us look at the most accomplished player in each of a few selected statistics among current MLB players with at least 1000 plate appearances:
Many of these stats should be familiar to the loyal readers of Redleg Nation. A few, however, may need some explanation. Feel free to explore and then come back. You’ll be glad you did.
I chose these stats because they are the kind of stats I value. Batting average is not particularly important to me, but it was included because it is one of the most common stats and everyone understands it. I avoided counting stats (like HR) in favor of rate stats (like HR/FB) because playing time affects counting stats. I avoided WAR for the same reason.
What can we tell from looking at the chart alone? Well, not a lot. We see that players such as Miguel Cabrera, Albert Pujols, and Joey Votto have been very good over their careers in relation to their peers. We see that Ryan Howard and Giancarlo Stanton have a lot of power. We see that Mike Trout is, by one measure at least, the best active offensive player. None of these statements should come as a surprise.
Now, let’s take a look at a chart showing the league average of each of these statistics from 2015, as well as their standard deviation.
[Quick Primer for Standard Deviation: In any normally distributed population (baseball is fairly normal), about 68% of samples fall within 1 standard deviation of the mean (league average, in our case), 95% of samples fall within 2 standard deviations of the mean, and 99.7% of samples fall within 3 standard deviations of the mean. If you look both positively and negatively, the entire population, shown as a bell curve, encompasses about six standard deviations.]
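The 68/95/99.7 figures in the primer can be checked directly. Here is a rough sketch in Python, assuming a perfectly normal distribution (real baseball data only approximates one). The function name `share_within` is my own, chosen for illustration.

```python
import math

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Assumes an ideal bell curve; actual player populations will deviate somewhat.
for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

Running this prints roughly 68.3%, 95.4%, and 99.7%, matching the rule of thumb above.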
Now, onto the chart!
Given the idea introduced in the “quick primer for standard deviation” above, let’s look at what the range of each of these statistics becomes when we both subtract 3 standard deviations (low) and add 3 standard deviations (high) to the 2015 league average:
Given what we know about standard deviation and normal distributions, this information tells us that we can expect just about every major leaguer to finish his career with an OBP between .236 and .398, a wRC+ between 43 and 157, and a batting average no lower than .188 and no higher than .320. Based on these numbers, a theoretical “worst hitter of a generation” would be something like a .188/.236/.243 sort of hitter, also known as a pitcher. The “greatest hitter” would be something like a .320/.398/.567 hitter, also known as Miguel Cabrera. Seriously. His career line of .321/.399/.562 is eerily close to an exact match.
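The arithmetic behind these ranges is simply the league average minus and plus three standard deviations. Below is a minimal sketch in Python. Note the assumption: the averages and standard deviations here are not copied from the chart; they are back-calculated from the low and high figures quoted above (midpoint of the range, and one sixth of its width), so treat them as approximations. The names `STATS` and `three_sd_range` are illustrative.

```python
# Assumed inputs: (league average, standard deviation), back-calculated from
# the low/high figures quoted in the text, not taken from the original chart.
STATS = {
    "AVG": (0.254, 0.022),
    "OBP": (0.317, 0.027),
    "SLG": (0.405, 0.054),
    "wRC+": (100, 19),
}

def three_sd_range(mean, sd):
    """Return (low, high): the mean minus and plus three standard deviations."""
    return mean - 3 * sd, mean + 3 * sd

for name, (mean, sd) in STATS.items():
    low, high = three_sd_range(mean, sd)
    if name == "wRC+":
        print(f"{name}: {low:.0f} to {high:.0f}")
    else:
        print(f"{name}: {low:.3f} to {high:.3f}")
```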
But this post is not about Cabrera. By now some of you may have racked your brains and deduced it’s about Joey Votto. Other clever readers may have simply read the title of this article! I recommend the latter. Nevertheless, now that we have all the above established, let’s go back to our initial chart of the career leaders in each category and substitute in “standard deviations above average” for the actual measure. This measure is a proxy for what I’ll call “dominance.” The further you are above the average, the more dominant you must be. We will sort the chart in descending order with the highest standard deviations above average (SDAA) at the top:
How should we interpret this information? The statement “Joey Votto is better at getting on base than anyone else is at any other offensive baseball skill” is how I would put it.
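For anyone who wants to compute SDAA themselves, it is just a z-score: the player's number minus the league average, divided by the standard deviation. Here is a hedged sketch in Python using Cabrera's career OBP of .399, which is quoted earlier. The league OBP average and standard deviation below are assumptions, back-calculated from the .236 to .398 range rather than taken from the chart, and the function name `sdaa` is my own.

```python
def sdaa(value, league_mean, league_sd):
    """Standard deviations above average (a z-score)."""
    return (value - league_mean) / league_sd

# Assumed league OBP mean and SD, back-calculated from the .236 to .398 range.
OBP_MEAN, OBP_SD = 0.317, 0.027

# Miguel Cabrera's career OBP of .399 is the figure quoted in the text.
print(f"Cabrera OBP: {sdaa(0.399, OBP_MEAN, OBP_SD):.2f} SDAA")
```

Under those assumptions, Cabrera comes out at roughly 3 SDAA in OBP.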
At nearly 4 SDAA, Votto is what we like to call an “outlier.” You’ll recall that within 3 standard deviations, a normally distributed population will have accounted for around 99.7% of all samples. Votto is nearly a full standard deviation above that point. Here is a histogram as a visualization aid for his deviance:
The vertical axis represents the number of players in the group, while the horizontal axis represents the OBP of the group.
I don’t have to tell you who the single data point on the far right belongs to. His peers aren’t close, statistically speaking. For fun, the unfortunate soul on the left of the histogram is Mike Zunino, who should be on a short rope in Seattle this season.
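To put a number on how lonely that far-right data point is, here is a small sketch in Python of how rare it would be to sit 3 or 4 standard deviations above the mean. It assumes an ideal normal distribution, which career OBP only approximates, so read the results as ballpark figures. The function name `share_above` is illustrative.

```python
import math

def share_above(z):
    """One-sided share of a normal population more than z SDs above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumes a perfect bell curve; real tails may be fatter or thinner.
for z in (3, 4):
    p = share_above(z)
    print(f"above {z} SD: about 1 in {round(1 / p):,}")
```

On an ideal bell curve, that works out to somewhere around 1 in 740 at 3 standard deviations and on the order of 1 in 30,000 at 4.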
We have arrived at our final chart. Aren’t you glad you stuck with me?! No? OK, we’re almost done, then! Here are the same stats, but this time showing the gap, in standard deviations, between the player in first and the player in second, again sorted in descending order:
From this perspective, we see that Votto’s OBP is nearly a full standard deviation above that of the player in second, the aforementioned Cabrera. This represents the largest gap for any of these stats between the player in first and the player in second. Not only does Votto hold the mark for most SDAA of any statistical leader, but he also holds the mark for largest margin between the leader and the player in second place.
We know Joey Votto is a great hitter. Some of us may even realize he has been historically good. However, I hope we can now all realize there is another word that aptly describes Votto …