Thankfully, the charts and graphs kind of “speak for themselves” here, and we can show things with examples much better than text-heavy explanations.
And, all we are doing each time is comparing two sets of numbers to each other. Here’s what we’re driving at:
Are these two things generally heading down the same highway in the same direction? Maybe at different speeds and in different lanes, but generally heading down the same highway in the same direction?
In short: does more of one (hard-hit balls) mean more of the other (offense)? Or does less of one (strikeouts) mean more of the other (offense)?
Or, alternatively, is there not really much connection?
Those are the kinds of issues that we’re exploring.
Now, there is one scary-sounding math-y thing that we’ll introduce: the “R-squared.” Here’s what it’s about:
Any time we compare numbers we can use the math to create a “model” of the relationship. For example:
- A hitter will strike out 2.6 times for every one walk.
That’s a “model” of a relationship: strikeouts equals 2.6 times walks.
We can then “test” that “model” for what is called its “goodness of fit.” Does “2.6 K per 1 BB” actually explain anything or is it just the way the numbers work out? That’s the key for what we’re examining here. We want relationships that explain something.
And “goodness of fit” is another way of saying “the explanatory power of the model.”
We measure the “goodness of fit” or “explanatory power” of our “model” with the “R-squared.” A “fit” that is close to 0 means our model doesn’t really “explain” very much of the relationship between the numbers. A “fit” that is close to 1 means our model is strong. It appears to “explain” a relationship that is more than just “random occurrence” or “noise.”
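For anyone who wants to see the arithmetic, here is a minimal sketch of one common way to compute an R-squared for a straight-line “model”: fit the line by least squares, then compare what the line misses to how much the numbers vary overall. The function names (`fit_line`, `r_squared`) and the four tiny data points are made up for illustration; they are not the Batted Ball Project data, and this is not necessarily how the charts in this series were produced.

```python
def fit_line(xs, ys):
    """Least-squares straight line through the points: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variation in ys that the fitted line accounts for."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # What the line misses: squared gaps between actual and predicted values.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared gaps between actual values and their average.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up toy numbers, purely to show the calculation.
toy_x = [1, 2, 3, 4]
toy_y = [1, 3, 2, 4]
print(round(r_squared(toy_x, toy_y), 2))  # 0.64
```

The closer the line’s misses get to zero, the closer the result gets to 1. If the line does no better than simply guessing the average every time, the result is 0.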
When there is a limited amount of data (frequently casually referred to as “small sample size,” sometimes accurately and sometimes not), there is also the issue of “statistical significance.” But since I’m dealing with all of the batted-ball data from 2002 to 2014 for qualified hitters with at least 1,500 career plate appearances (which amounts to 1.9 million plate appearances!), we are not dealing with limited data. With that much data in the Batted Ball Project, when there is a relationship it will be statistically significant, so I’m skipping over the specifics of that.
OK … examples.
Let’s start with walks and strikeouts. We can “do the math” to conclude that the average hitter will strike out 2.6 times for every walk. But is there any “relationship” between one’s BB-rate and one’s K-rate?
Here’s what we get from those 1.9 million PAs:
We can see that the data is a “blob.” There is no indication that the two sets of numbers are related in an explanatory way.
And our “model” (the line) indicates, accurately of course, that BB-rate will be generally lower than K-rate, but that “model” explains nothing. The R-squared “goodness of fit,” as you can see, is 0.0067. Nowhere close to the 1.0 that we’re looking for.
Therefore, we can conclude that a hitter’s BB-rate and his K-rate are pretty much “independent” of each other. There’s no evidence that striking out less results in more walks; no evidence that more walks leads to fewer strikeouts.
We can’t tell anything of importance about a hitter’s K-rate just from knowing his BB-rate and vice versa.
Interesting to know.
Now let’s go the opposite direction.
Let’s compare OPS and wOBA. Both are measures of overall offensive production. In recent times, wOBA has become the preferred measure. So how much does knowing OPS allow us to explain wOBA?
No “blob” here. Instead, everything lines up in a neat line. OPS “explains” pretty much all of wOBA. Our “model” (that wOBA will essentially equal 0.4 times OPS) explains 99.2% of the variation (that’s the R-squared). Not even Barry Bonds is really an outlier.
In other words, statistically speaking, over 1.9 million plate appearances, there is no meaningful difference between OPS and wOBA. All of which doesn’t necessarily “prove” that wOBA isn’t “better,” particularly maybe in individual cases, but from 2002 to 2014, in the aggregate, there’s no real difference.
Unless I’m missing something.
But the point was not to indict wOBA (though maybe I did inadvertently). The point was the difference between “blob” relationships and “line” relationships. We’ll be looking for the latter. And they’ll be the ones with the higher R-squared.
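To make the “blob” versus “line” contrast concrete, here is a rough sketch using made-up numbers. Everything in it is an assumption for illustration: the 2,000 pretend hitters, the ranges the rates are drawn from, the 0.4 multiplier borrowed from the OPS example, and the amount of random noise. None of it is the real 2002 to 2014 data. It uses the fact that, for a straight-line fit, the R-squared equals the square of the correlation between the two sets of numbers.

```python
import random


def r_squared(xs, ys):
    """R-squared of a straight-line fit (the squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(2014)  # fixed seed so the made-up data is repeatable
n = 2000                   # number of pretend hitters (an arbitrary choice)

# "Blob": two rates drawn independently, so neither can explain the other.
blob_x = [rng.uniform(0.04, 0.16) for _ in range(n)]
blob_y = [rng.uniform(0.08, 0.30) for _ in range(n)]

# "Line": y is 0.4 times x plus a small amount of random noise.
line_x = [rng.uniform(0.600, 1.000) for _ in range(n)]
line_y = [0.4 * x + rng.gauss(0, 0.004) for x in line_x]

blob_r2 = r_squared(blob_x, blob_y)
line_r2 = r_squared(line_x, line_y)
print(f"blob R-squared: {blob_r2:.4f}")
print(f"line R-squared: {line_r2:.4f}")
```

The independent pair should land very close to 0 and the noisy-line pair very close to 1, which is the same pattern as the two charts above.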
Here’s part 2.