Editor's note: On Tuesday, ESPN The Magazine released its Analytics Issue, with a cover story by Sam Miller about why baseball fans should embrace WAR (wins above replacement). We asked Keith Law to break down his "desert island stats," which are the metrics he can't live without when evaluating players.
Having a player's WAR (wins above replacement), even if you know which version of WAR it is, is not in and of itself terribly useful unless you know the breakdown of the numbers that went into it.
WAR is just the end of a process of normalizing different areas of a player's game -- for a hitter, that's offense, defense, and baserunning, with an adjustment for the offensive standards of his position -- so they can be added together into a single number.
Any system of valuing production should distill a player's contributions into a number that represents runs added/saved if it's a positive contribution and runs cost if it's a negative one. If you just have a player's WAR, you have no idea how he contributed to his team's success or lack thereof, and you can't assess the figure's reliability because you don't know how much came from, say, offense, which is one of the easiest things to measure accurately, and how much from defense, which is one of the hardest.
And because players create value in different ways, the shape of that value affects the way we evaluate them. For example, I'm a little less confident in Michael Bourn holding his value going forward because so much of his WAR over the last few years came from his legs -- great defense and value added on the bases. Not only are the defensive metrics a little less precise than the offensive ones (although they are a huge improvement over what we had 10 years ago), but having the breakdown allows us to see how much Bourn depends on his speed to be a valuable player. If his legs go with age, or he suffers one or more significant leg injuries, his value will drop quickly. If his value largely came from his bat, we might have other concerns, but not the same one about his speed.
Unless you're comparing two players whose WAR figures were so far apart that there is no question who was the more valuable player -- say, Mike Trout versus Miguel Cabrera last year -- having WAR by itself is nothing but a starting point without an end.
With that in mind, here's a glance at a few of the stats I always use when I pull up a player's stat page on FanGraphs or Baseball-Reference to at least get me started in thinking about the player's value:
wOBA
This is the best single metric I've seen so far for measuring a hitter's production on a rate basis. That is, it tells you how productive the hitter was when he played, but doesn't address how much he played, and thus is missing one component required to tell you exactly how much value he contributed.
Since I'm more often concerned with looking forward than with assessing past value, I spend far more time looking at rate stats than at cumulative stats. wOBA takes the seven ways a batter can reach base safely for which he should receive some credit -- singles, doubles, triples, homers, unintentional walks, times hit by pitch and times reaching on error -- weights each of them relative to their run-producing value and divides the weighted sum by plate appearances. The weights produce a ratio that usually sits in the .300-.450 range, making it similar to the scale for OBP and thus a little familiar to our eyes. If you want one number to tell you how good a hitter was, this is my choice.
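That calculation can be sketched in a few lines. The weights below are illustrative stand-ins, not the official season-specific linear weights, and reaching on error is omitted for brevity:

```python
# Illustrative wOBA sketch. The weights are hypothetical stand-ins for the
# season-specific linear weights; reaching on error is omitted for brevity.
WEIGHTS = {"uBB": 0.69, "HBP": 0.72, "1B": 0.89,
           "2B": 1.27, "3B": 1.62, "HR": 2.10}

def woba(events, plate_appearances):
    """Credit each way of reaching base by its approximate run value,
    then divide the weighted sum by plate appearances."""
    total = sum(WEIGHTS[k] * events.get(k, 0) for k in WEIGHTS)
    return total / plate_appearances

# A hypothetical 600-PA season:
line = {"uBB": 50, "HBP": 5, "1B": 100, "2B": 30, "3B": 3, "HR": 25}
print(round(woba(line, 600), 3))  # → 0.371
```

The result lands in the familiar .300-.450 band described above; in practice you'd look up the published weights for the season in question rather than use these round numbers.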
OBP and slugging
The two best basic indicators of what a hitter did -- incomplete, to be sure, but a solid starting point. OBP tells us how often the hitter reached base; its complement, 1 minus OBP, tells us how often he made an out. A hitter with a .300 OBP made an out in 70 percent of his plate appearances, which is not a desirable trait in anyone but a pitcher.
Slugging percentage, and its sibling isolated power (SLG minus AVG), give a quick and familiar measure of power production. Slugging is flawed because it weights each base achieved equally; the hardest base to reach is first, and the difference between a double and a triple is usually in the hitter's speed rather than the hit's ability to advance runners already on base. Despite that, slugging, like OBP, is a good starting point for further analysis.
But for the love of Pythagoras, please don't add the two things together and pretend the result means anything. It is a massive mathfail, something the Millennium Bridge engineers might understand. You have two fractions with different denominators -- OBP gives us a rate per plate appearance, while slugging gives us a rate per at-bat -- so you can't simply add the two without accounting for that.
OPS, the fauxbermetric stat that results from a straight addition of the two, ignores the difference in value between the two -- a point of extra OBP is worth a lot more to a team's run-scoring potential than a point of slugging.
Consider two players with an .800 OPS: One has a .350 OBP and a .450 slugging, and one has a .400 OBP and a .400 slugging. The second player is clearly more valuable: He makes fewer outs than the first player, and the number of additional times he's on base exceeds the number of extra bases added by the first player. OPS wouldn't tell you that, but wOBA would.
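To put numbers on that comparison, here are two hypothetical 600-PA stat lines constructed to produce exactly those OBP and slugging figures, scored with illustrative (not official) linear weights:

```python
# Two hypothetical hitters with identical .800 OPS lines over 600 PA.
# The weights are illustrative linear-weight values, not official ones.
W = {"uBB": 0.69, "1B": 0.89, "2B": 1.27, "HR": 2.10}

def woba(events, pa):
    return sum(W[k] * n for k, n in events.items()) / pa

# Player A: .350 OBP / .450 SLG (60 BB, 97 1B, 33 2B, 20 HR in 540 AB)
a = woba({"uBB": 60, "1B": 97, "2B": 33, "HR": 20}, 600)
# Player B: .400 OBP / .400 SLG (90 BB, 116 1B, 24 2B, 10 HR in 510 AB)
b = woba({"uBB": 90, "1B": 116, "2B": 24, "HR": 10}, 600)

print(round(a, 3), round(b, 3))  # → 0.353 0.361
```

Player B's extra times on base outweigh Player A's extra bases, so a weighted rate stat grades him higher even though OPS can't tell them apart.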
Moving from wOBA to OBP and slugging helps you further understand what made a player valuable or not valuable, without losing sight of just how good he was overall relative to his peers or forcing you to connect two pieces that just don't fit together.
Strikeout and walk percentage
There are two things a pitcher can do on his own that he can "control," in the vernacular of baseball analytics -- he can strike a guy out, and he can walk him. As the famous sabermetrician Captain Obvious once said, you want pitchers who do a lot of the former and not much of the latter.
These ratios are not subject to the noise present in pitcher stats that incorporate hit rates, balls in play or even home run rates -- those data are important, too, but they require further interpretation, including park effects and adjustments for defensive help, bullpen help or harm, and just plain old randomness. If a pitcher can miss bats, it'll show up in his strikeout rate -- and if a plus slider isn't missing bats, maybe it's not plus after all.
If a pitcher has plus control, he shouldn't walk guys. If he walks too many guys, it may be control, or it may be mechanical, or it may be approach, but whatever the reason, it is, to use the technical term, no bueno. As an aside, I always prefer to remove intentional walks from pitching ratios, since they're a manager's decision, not a reflection of pitcher skill.
It's more instructive to use strikeout and walk percentage -- as opposed to strikeouts and walks per nine innings -- because some pitchers face more batters per inning than other pitchers, which means they get more chances to strike out or walk them.
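A quick sketch of the distinction, using two hypothetical pitchers who post identical K/9 marks but face different numbers of batters per inning:

```python
# Why rate-per-batter beats rate-per-inning: two hypothetical pitchers
# with the same K/9 but different strikeout percentages.
def k_per_9(strikeouts, innings):
    return 9 * strikeouts / innings

def k_pct(strikeouts, batters_faced):
    return strikeouts / batters_faced

# Pitcher A: clean innings, ~3.8 batters per frame (760 BF in 200 IP)
a9, a_pct = k_per_9(180, 200), k_pct(180, 760)
# Pitcher B: baserunner-prone, ~4.5 batters per frame (900 BF in 200 IP)
b9, b_pct = k_per_9(180, 200), k_pct(180, 900)

print(round(a9, 1), f"{a_pct:.1%}", f"{b_pct:.1%}")  # → 8.1 23.7% 20.0%
```

Both pitchers show an 8.1 K/9, but Pitcher A retired a meaningfully larger share of the hitters he actually faced.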
Ground ball percentage
Again, this is best expressed as a share of a total -- ideally all balls put into play -- although we often make do with a ratio of field outs (ground ball/fly ball ratio) as a proxy. A pitcher's ability to keep the ball on the ground indicates two things -- that he might not be homer-prone (assuming he doesn't have a below-average fastball or a nasty habit of hanging curveballs) and that he might be able to generate double plays.
A ground ball in play is slightly more likely than a fly ball in play to become a hit, but less likely to go for extra bases; as you might imagine, a line drive put into play is the most likely type of batted ball to end up as a hit, but line drive rates don't appear to be within most pitchers' control and the data is rife with classification problems. Ground ball data is more reliable, and can help answer the question of whether that sinker actually sinks in meaningful terms.
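A minimal sketch of both measures, using hypothetical batted-ball counts:

```python
# Ground-ball rate as a share of all balls in play, alongside the cruder
# ground ball/fly ball ratio proxy. Counts below are hypothetical.
def gb_pct(grounders, flies, liners):
    return grounders / (grounders + flies + liners)

def gb_fb_ratio(grounders, flies):
    return grounders / flies

print(f"{gb_pct(250, 150, 100):.1%}", round(gb_fb_ratio(250, 150), 2))
# → 50.0% 1.67
```

Note that the two can diverge: the ratio ignores line drives entirely, which is one reason the share-of-total version is preferable when the data is available.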
BABIP
Batting average on balls in play (BABIP) is simply the rate at which a pitcher allowed a hit on balls put into play -- so we're deleting strikeouts, walks and typically home runs (although I think it's fair to ask whether HR should always be removed here), and just looking at balls that entered the field of play and whether they became hits or not.
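A sketch of the standard calculation, following the usual convention of removing home runs (the season totals here are hypothetical):

```python
# BABIP sketch: hits on balls in play divided by balls in play.
# Follows the standard convention of removing home runs from both
# numerator and denominator; season totals below are hypothetical.
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)

print(round(babip(hits=180, home_runs=20, at_bats=800, strikeouts=180), 3))
# → 0.267
```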
The central conceit is that pitchers have little or no control over this rate, given a large enough sample size, if we adjust for park and defense. Knuckleballers are an exception, and really awful pitchers are an exception in that they can only control how quickly they walk off the mound after eight or nine straight hits on balls in play.
One of the most interesting areas of research right now is into how much of the year-to-year variation in pitchers' BABIPs is noise, and whether there's any signal in there at all that might help teams make better decisions on pitcher transactions or usage.
The BABIP leaders from 2010 to 2012, for example, include some pitchers who have been helped by great defenses, like Jeremy Hellickson and Jered Weaver (a fly ball pitcher in a fly ball park who's had Peter Bourjos and Mike Trout behind him a lot), but also include guys like Matt Cain, Clayton Kershaw and Justin Verlander, who are all great power pitchers but don't have obvious explanations for low BABIPs besides general awesomeness. Is that just luck, or randomness, or are they able to cause small reductions in their BABIPs because of the type of contact they induce?
There are a lot of good pieces online that attack these questions, and I'm sure even more proprietary work done by teams' analytics departments (except the Phillies, who would like to remind you that they don't have one). In the meantime, though, I want to see a pitcher's BABIP, this year and in the past few years, when I start to think about how to assess his performance and look forward from it.