The Philosophy of the Stats — Details


What a great picture, by the way.

== Numbers Nuts & Bolts … with pie! ==

One of my “things” is that you shouldn’t have to have an advanced degree to understand what folks are saying about baseball.

So what I’ll do is just make up a whole bunch of complicated stuff and ask you to trust me.  And if you don’t get it, I’ll sneer at you and belittle your input.


I will try to explain what I’m getting at as directly as possible.  The starting point is my Mild Protest Against the “One-Number” Stats, in which I decry the tendency to try to cram everything into a single number.

I don’t recall the actual point at which I decried, but clearly that’s what happened.

Therein, in addition to decrying, I offered what I view to be an alternative approach.  Rather than adding more and more components into the World’s Tallest Thermometer …

and, yes, there it is … in Baker, California on the edge of Death Valley …

and not far, actually, from the home of the High Desert Mavericks…

which, after all, says volumes about why it’s not a great idea to play baseball in a place called “High Desert” …

But, yes, the serious point: Greatness lies at the overlap of the otherwise paradoxical.

In other words, instead of adding, adding, adding more and more factors to make our thermometer taller and taller, we need to think differently.  We need to examine the ability to do the seemingly self-contradictory.

And not just on occasion.  Baseball greatness entails the consistent ability to do the paradoxical.  So we reject the thermometer and replace it with the Venn Diagram.  And I found so many cool Venn Diagrams that I couldn’t pick just one.

And best of all …


== And back to baseball ==

Here’s what I’m trying to measure, from the prior article:

For hitters:

  • Don’t make outs
  • Produce offense
  • Do both at the same time

For pitchers:

  • Get outs
  • Deny offense
  • Do both at the same time

And the idea is:

  • A stat that measures ability to avoid non-random outs for hitters, and ability to get non-random outs for pitchers
  • A stat that measures ability to produce non-random offense for hitters, and ability to suppress non-random offense for pitchers.
  • A stat that combines the two — looking for the sweet spot in the Venn Diagram of Greatness (It appears to me to be blueberry + peach + custard.)

In addition, I added a “cross-check” — a (relatively) simple “one-number” stat that captures the same idea.

Also, I wanted stats that could be applied to pitchers and hitters (just in opposite directions, obviously), so everything is keyed to plate appearances or batters faced (same thing).  This is a bit different for those used to pitching stats based on innings pitched rather than batters faced, and I try to point out the differences where needed.

Finally, I wanted stats that could be applied to minor leagues and historical eras.  I started out examining minor leaguers, so a lot of the analysis entailed coming up with stats that could apply even where there was a lot of data missing, and the same thing goes for historical data.

There is an admitted loss of “perfect accuracy” in accommodating the need to find measures that apply to the low minors and to the 1930s.  That’s granted.

== So here we go ==

Plate Skills Advantage (PSA)

A measure of ability to avoid non-random outs (for hitters) or induce non-random outs (for pitchers)

On-base percentage is a “yes/no” or “on/off” stat.  Either you reach base or you make an out.

So, in OBP terms, a home run and a walk are equal.  And, in PSA analysis, they are worth the same.  They are both non-random avoided outs.

PSA uses the 10-year MLB average BABIP (batting average on balls in play), which is .298, as the “starting point.”  In other words, if every plate appearance were a “random-y” ball in play that fell for a single at that rate, then every hitter would have a slash line of .298/.298/.298.

If a hitter does nothing to change that equation, then PSA would be 100.  Hard-hit balls and walks move the dial upward from .298; strikeouts move the dial downward from .298.  If the positive movement outweighs the negative movement, then the hitter’s PSA will be over 100.

And everything is set so the 10-year MLB average PSA is 100.  Cumulatively, all the net positive equals all the net negative.

For pitchers, we just reverse the polarity, so that inducing outs moves the number above 100 and vice versa.

.900 Conversion (Conv)

A measure of ability to produce non-random offense (for hitters) or suppress non-random offense (for pitchers)

Unlike reaching base, offense is “weighted” rather than “binary”: a home run is worth a lot more than a walk.  SLG takes the weighting into account, of course.

But SLG doesn’t take into account the value of a walk (as an avoided out) or the cost of strikeouts (as both an out and a ball-in-play denied).  All of this is explained in my original “thought experiment”:  The Allegory of the Window.

I originally called this stat “Plausibility Index” because it measured the “plausibility” of a hitter’s success.  I’ve now changed it to “Conversion,” because it is a measure of the “necessary conversion rate” of “random-y” balls in play to “random-y” singles for a hitter to achieve a .900 OPS.

The 10-year MLB average Conversion is .370.  In other words, the average MLB hitter would need to convert balls-in-play to singles at a rate of .370 to attain a .900 OPS.  Therefore, it is not “plausible”  for an “average” hitter to obtain a .900 OPS.

The more a hitter produces hard-hit balls, draws walks and avoids strikeouts, the lower the “necessary conversion rate” will be, and the more “plausible” it becomes.  For a pitcher, the more he denies hard-hit balls, avoids walks and gets strikeouts, the higher he pushes the “necessary conversion rate” for his hitters.

Therefore, a .370 Conversion is set at 100, and a hitter will rank higher than 100 if he reduces the “necessary conversion rate” below that figure.  For pitchers, again, we reverse the polarity.
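The “necessary conversion rate” can be sketched in a few lines.  This is my reading of the description above, not a published formula: the function name, the argument names, and the simplifications (walks stand in for every non-hit time on base, at-bats are plate appearances minus walks, and anything that isn’t a walk, strikeout, or extra-base hit is a “random-y” ball in play) are all assumptions made for illustration.

```python
def required_conversion(pa, bb, k, doubles, triples, hr, target_ops=0.900):
    """Singles-per-ball-in-play rate needed to reach target_ops.

    Hypothetical reconstruction: assumes AB = PA - BB and that every plate
    appearance that is not a walk, strikeout, or extra-base hit is a ball
    in play that becomes either a single or an out.
    """
    xbh = doubles + triples + hr
    xbh_bases = 2 * doubles + 3 * triples + 4 * hr
    ab = pa - bb
    balls_in_play = pa - bb - k - xbh

    # OPS already banked from walks and extra-base hits, before any singles.
    fixed_ops = (bb + xbh) / pa + xbh_bases / ab

    # Each single adds 1/PA to OBP and 1/AB to SLG, so a conversion rate c
    # adds c * balls_in_play * (1/PA + 1/AB) to OPS.
    ops_per_unit_rate = balls_in_play / pa + balls_in_play / ab

    return (target_ops - fixed_ops) / ops_per_unit_rate


# Made-up, roughly league-average line over 600 plate appearances.
average_hitter = required_conversion(pa=600, bb=55, k=115, doubles=27, triples=3, hr=15)
# Made-up slugger: more walks, more power, fewer strikeouts.
slugger = required_conversion(pa=600, bb=90, k=100, doubles=35, triples=2, hr=40)

print(round(average_hitter, 3))  # about 0.377
print(slugger < average_hitter)  # True: the slugger needs far less luck
```

With those invented inputs the average-ish hitter lands in the neighborhood of the .370 quoted above, which is at least a sign the reconstruction is in the right ballpark.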


Composite

Combined ability to avoid non-random outs and produce non-random offense (for hitters) and get non-random outs and suppress non-random offense (for pitchers)

Composite combines PSA and Conv in the same way that OPS combines OBP and SLG.  You may not know this, but you can get OPS+ by taking OBP+ (OBP compared to league average) and adding SLG+ (same) and subtracting 100.

In other words, a perfectly average hitter would be 100-100-100 (the third number is 100 plus 100 minus 100).  A hitter who excels at OBP but without power could end up with an average OPS+ of 100, arrived at via 120 + 80 – 100 = 100.

Our Composite does the same thing with our first two stats, after they’ve been converted to the “100” scale.  So:

PSA + Conv - 100 = Composite

Again, an average hitter will end up at 100; below-average will be below 100; and the most valuable hitters will climb up well above 100.
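For anyone who wants to check the arithmetic, here is a minimal sketch of the “add the two plus-stats and subtract 100” idea.  The function names are mine, and the OPS+ version ignores park adjustments; it only illustrates the arithmetic described above.

```python
def plus_scale(value, league_average):
    """Rescale a raw rate so that league average equals 100."""
    return 100.0 * value / league_average


def ops_plus(obp, slg, league_obp, league_slg):
    """OPS+ as OBP+ plus SLG+ minus 100 (no park adjustment in this sketch)."""
    return plus_scale(obp, league_obp) + plus_scale(slg, league_slg) - 100.0


def composite(psa, conv):
    """Composite from PSA and Conv, both already on the 100 scale."""
    return psa + conv - 100.0


# The on-base specialist from the example above: 120 + 80 - 100.
print(composite(120, 80))  # 100
```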

Hope you got it.

== And the cross-check stat ==

Plate Value Index (PVI)

A different summary measure of ability to avoid non-random outs and produce non-random offense (for hitters) and get non-random outs and suppress non-random offense (for pitchers)

Obviously, the above analysis is a long way to go to explain how I get the “three numbers.”  Plus, I wanted a different measure that would “cross-check” the results.

So I came up with a simpler approach:

OPS from non-singles minus strikeout rate

This is Plate Value Index, my own “one-number” stat, but rather than being an attempt to solve all of baseball, it is just intended to be a simple and more-comprehensible means to “get at” the same concepts as the “three numbers.”

“Non-singles OPS” is just the combination of:

  • that portion of OBP derived from extra-base hits and walks (and HBP and a few other things) plus
  • that portion of SLG derived from extra-base hits

And the latter is nothing more than the familiar ISO (SLG minus BA).

So the final formula is:

  • XBH + BB (essentially) as a percentage of plate appearances (non-single OBP) plus
  • ISO minus
  • K as a percentage of plate appearances

Or, again, non-single OPS minus K-rate.

The 10-year MLB average for PVI is .147, so I set that as 100 on the re-scaled “PVI+” version.  And, of course, we reverse the polarity for pitchers.
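The formula is short enough to write out.  In this sketch the function and argument names are mine, HBP stands in for the “few other things,” and the re-scaling to PVI+ is assumed to be a simple ratio against the .147 average, which the text implies but doesn’t spell out.  The pitcher version (reversed polarity) is left out because its exact scaling isn’t described here.

```python
LEAGUE_PVI = 0.147  # 10-year MLB average quoted above


def pvi(pa, ab, doubles, triples, hr, bb, hbp, k):
    """Plate Value Index: non-single OBP plus ISO minus strikeout rate."""
    xbh = doubles + triples + hr
    non_single_obp = (xbh + bb + hbp) / pa
    # ISO is SLG minus BA, i.e. extra bases beyond first per at-bat.
    iso = (doubles + 2 * triples + 3 * hr) / ab
    k_rate = k / pa
    return non_single_obp + iso - k_rate


def pvi_plus(raw_pvi, league_pvi=LEAGUE_PVI):
    """Assumed re-scaling: league-average PVI maps to 100."""
    return 100.0 * raw_pvi / league_pvi


# Made-up batting line, purely for illustration.
raw = pvi(pa=600, ab=540, doubles=27, triples=3, hr=15, bb=50, hbp=5, k=110)
print(round(raw, 3), round(pvi_plus(raw)))
```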

PVI generally ranks players in much the same order as Composite, but it’s a useful cross-check, and it doesn’t require several paragraphs of background information or pies baked into Venn Diagrams.  But it’s getting at the same idea: avoid outs, produce offense, and do both at the same time.

== And on to the rankings … ==

All of which sets up our ranking of the most valuable MLB players of 2013, and eventual analysis of how the 2014 Mariners fit into that picture.

More to come …


2 thoughts on “The Philosophy of the Stats — Details

  1. Why is it so hard for baseball to come up with something the equivalent of, say, pro football’s QB rating? Some of us just like to sit down and watch without needing a computer and/or statistical dictionary to reference the analysis of how well a particular batter is going to perform in a 2-out appearance in the month of June with 2 on against 29-year-old right-handed 88-90 mph curveball pitchers who wear contact lenses during the day with the sun shining away from the plate in a natural-turf stadium. I mean, statistics are well and good, but baseball does seem to obsess over them. A batter with, say, a 90 rating (on a 100 scale) facing a pitcher with a 60 rating would make it so much easier to understand that your fielders had better be having a good day. It would also cut down on the endless drivel that sometimes comes out of the booth.

  2. Pingback: The 20 Most Valuable Hitters of 2013: 20 through 11 | Mariner Brainstorm
