Talking Baseball

Your weekday baseball fix. Some days.

Posted by Jon on Wednesday, January 28, 2004

Class is in Session
Posted by Jonathan Shestakofsky

It’s almost February. Transaction lists are desolate. The comforting flow of baseball news has slowed to a barely-audible trickle. And the beginning of a brand new season is only a couple of weeks away. Now, during this calm before the glorious storm, is the perfect time to reacquaint ourselves with tools that will help us evaluate the upcoming season.

Since we began posting on this blog, all four writers have been using various complex statistics to support their assertions. Sometimes, even to us, the sheer volume of statistics can be overwhelming. What’s the difference between a hitter’s Batting Average and Equivalent Average? Is there a difference between ARP and AARP (not for Jesse Orosco)?

It’s not difficult to find numbers to prove almost anything. Consequently, we must understand the stats that we throw out there. So here’s a refresher course for both the readers and the writers of Talking Baseball. If Alan Schwarz’s recent article on ESPN.com can be considered “Statistics 101 – Baseball Stats for the Unindoctrinated”, then what follows is the next class: “Statistics 102 – Beyond OPS and WHIP”. Below are explanations of important statistics that don’t get a lot of attention in mainstream baseball media but nevertheless serve as useful predictive tools. There are many more stats out there than what I present today, but we’ll have to wait on those for the next class.

Win Shares
Bill James came up with his Win Shares system to “summarize each player’s value each season into a simple integer” (James’ Historical Baseball Abstract, 2001: pg. 331), thus providing a means for objective evaluation of players’ overall values. The Win Shares system allows for the comparison of players in different eras of baseball (comparisons between modern players and those who played in the 1800s), and allows players of every position to be compared to one another.

I won’t get into the specifics of Win Share calculation, but basically the number of wins a particular player contributed to his team can be determined based on the entire team’s marginal runs scored (runs scored above half the league average) and marginal runs saved (runs allowed below 1.5 times the league average). From these numbers, a team’s record can be predicted. Knowing the ratio of marginal runs to wins for the team, we can then determine the number of wins contributed by individual players. This system allows fair comparisons of wins contributed by starting pitchers, position players, fielders, base stealers, relief pitchers, and players on bad teams.
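
For anyone who wants to see the arithmetic, here is a rough sketch of that first step. It assumes the offensive baseline is half the league average and the defensive baseline is 1.5 times the league average, and it skips the park adjustments and finer constants in James’ full method. The function names and the sample team are invented for illustration.

```python
def marginal_runs(runs_scored, runs_allowed, league_avg_runs):
    """Return (marginal runs scored, marginal runs saved) for one team.

    Assumed baselines: offense is credited for runs above half the league
    average, defense for runs allowed below 1.5 times the league average.
    """
    scored = runs_scored - 0.5 * league_avg_runs
    saved = 1.5 * league_avg_runs - runs_allowed
    return scored, saved


def marginal_runs_per_win(runs_scored, runs_allowed, league_avg_runs, wins):
    """Ratio of a team's total marginal runs to its wins."""
    scored, saved = marginal_runs(runs_scored, runs_allowed, league_avg_runs)
    return (scored + saved) / wins


# Hypothetical team: 800 runs scored, 700 allowed, league average 750, 90 wins.
scored, saved = marginal_runs(800, 700, 750)
print(scored, saved)
print(round(marginal_runs_per_win(800, 700, 750, 90), 2))
```

Under these assumptions, a team that scores and allows exactly the league average ends up with equal marginal runs on offense and defense, which is a quick way to check the baselines.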

The Win Shares system is the predominant win contribution calculation system today. Essentially, due to the ratios involved in its computation, one win share equals a third of a win. For perspective, a 30-Win Share season (responsible for ten wins) usually puts a player in the running for MVP awards. Win shares are an invaluable tool for evaluating trades and players from different eras. My source for this information was James’ Historical Baseball Abstract, pages 331-339. For more information, I highly recommend consulting this book or James’ Win Shares, also published in 2001.
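
The three-to-one conversion is simple enough to sketch in a couple of lines. The helper names below are made up for this post and are not part of James’ system.

```python
WIN_SHARES_PER_WIN = 3  # one Win Share is worth a third of a win


def win_shares_to_wins(win_shares):
    """Convert a Win Shares total into the wins it represents."""
    return win_shares / WIN_SHARES_PER_WIN


def wins_to_win_shares(wins):
    """Convert wins into the equivalent number of Win Shares."""
    return wins * WIN_SHARES_PER_WIN


print(win_shares_to_wins(30))  # the MVP-caliber season mentioned above: 10.0
```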

Last season’s top producers, in terms of Win Shares:
1. Barry Bonds, 42
1. Albert Pujols, 42
3. Todd Helton, 36
4. Gary Sheffield, 35
5. Carlos Delgado, 33
5. Alex Rodriguez, 33

Notice that the top four players on this list all played in the National League last year. The same is not true for pitchers, who play in fewer games and thus affect fewer outcomes. The most Win Shares by a pitcher last season? Roy Halladay and Tim Hudson, both American Leaguers, each enjoyed a 24-Win Share season to lead all pitchers.

Notable fact: Milwaukee rookie Scott Podsednik ranked 8th among National League outfielders with 24 Win Shares in 2003. That’s more than Sammy Sosa, Jim Edmonds, and Andruw Jones – and a lot more than both NL Rookie of the Year winner Dontrelle Willis (14 Win Shares in 2003) and AL winner Angel Berroa (16 Win Shares).

Updated Win Share computations are available at baseballgraphs.com.

ERA+
For decades, a pitcher’s Earned Run Average was the basis for evaluating his talent. But ERA is a flawed statistic because it depends on, among other things, luck: whether batted balls are playable by the pitcher’s fielders is something the pitcher cannot always control. ERA measures a pitcher’s success at limiting runs allowed, but it cannot be relied upon as a measure of overall effectiveness, which makes it difficult to compare pitchers’ talent using ERA alone.

Wouldn’t it be great if there were a tool we could use to compare a pitcher’s ERA to the league average? ERA+ does just that, presenting the ratio of the league’s ERA, adjusted to the pitcher’s ballpark, to that of the pitcher, multiplied by 100. An ERA+ of 100 is league average, so an ERA+ greater than 100 is above average, and an ERA+ below 100 is sub-par. Because it is measured against the league, ERA+ compensates for both hitting- and pitching-friendly eras. The result is a better, but not perfect, measure of pitching effectiveness.
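
As a minimal sketch, the ratio can be written out like this. The park adjustment is modeled here as a simple multiplier on the league ERA (1.0 for a neutral park), which is an assumption made for illustration; the sources that publish ERA+ may handle park effects differently. The sample ERAs are invented.

```python
def era_plus(pitcher_era, league_era, park_factor=1.0):
    """League ERA, adjusted for park, divided by the pitcher's ERA, times 100.

    park_factor is a hypothetical multiplier: above 1.0 for a hitter-friendly
    home park, below 1.0 for a pitcher-friendly one.
    """
    if pitcher_era <= 0:
        raise ValueError("pitcher_era must be positive")
    return 100 * (league_era * park_factor) / pitcher_era


print(round(era_plus(4.00, 4.00)))  # league-average pitcher: 100
print(round(era_plus(3.00, 4.50)))  # well above average: 150
print(round(era_plus(5.00, 4.00)))  # sub-par: 80
```

Note that the league ERA sits in the numerator, so a lower pitcher ERA pushes the number up rather than down.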

ERA+ allows us to compare different pitchers over time and the same pitcher to his previous years, all relative to league averages. We can postulate that Tom Glavine’s career as an effective pitcher is over because his ERA+ dropped from 139 (well above average) in 2002, to 94 in 2003. Bob Gibson experienced a similar drop-off, falling into mediocrity before he called it quits. (Source: baseballprimer.com)

Adjusted Runs Prevented
Using ERA to evaluate relief pitching is even worse than using it to evaluate starting pitching. With a smaller sample size of innings pitched, luck is bound to rear its ugly head to an even greater extent, resulting in decent relievers posting poor ERAs and poor relievers posting decent ERAs. The solution is the Adjusted Runs Prevented statistic, which gives the value of a reliever by examining “the number of runs that the reliever prevented over an average pitcher, given the bases/outs situation when he entered and left each game, adjusted for league and park” (Source: baseballprospectus.com).

Using ERA to measure a reliever’s performance works about as well as judging a batter’s season solely on the number of home runs he cranked out of the park. A low ERA indicates nothing about effectiveness with runners on base who are officially charged to the previous pitcher(s). More prominent statistics are not necessarily the most useful.
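
To make the idea concrete, here is a simplified single-inning sketch of a runs-prevented calculation. It is not the Baseball Prospectus formula: it leaves out the league and park adjustments, and the run-expectancy numbers are placeholders rather than real league data. All names are invented for illustration.

```python
# Illustrative run-expectancy values: the average number of runs that score
# from a given (runners, outs) situation through the end of the inning.
# These numbers are placeholders, not real league data.
RUN_EXPECTANCY = {
    ("bases empty", 0): 0.5,
    ("bases loaded", 0): 2.3,
    ("bases loaded", 2): 0.8,
}


def runs_prevented(entry, runs_scored, exit_state=None):
    """Runs a reliever saved versus an average pitcher in one inning.

    entry:       (runners, outs) situation when he came in
    runs_scored: every run that scored while he was on the mound,
                 no matter which pitcher it was charged to
    exit_state:  situation when he left, or None if he finished the inning
    """
    expected_at_entry = RUN_EXPECTANCY[entry]
    expected_at_exit = 0.0 if exit_state is None else RUN_EXPECTANCY[exit_state]
    return expected_at_entry - runs_scored - expected_at_exit


# Two relievers enter with the bases loaded and nobody out.
escape = runs_prevented(("bases loaded", 0), runs_scored=0)
meltdown = runs_prevented(("bases loaded", 0), runs_scored=3)
print(round(escape, 1), round(meltdown, 1))
```

If all three runs in the second case belong to inherited runners, both relievers leave with an unchanged ERA, yet the sketch credits one with runs saved and debits the other.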

The top relievers in 2003 according to ARP:
1. Eric Gagne, 32.6
2. Rheal Cormier, 30.2
3. Billy Wagner, 29.8

Earned Run Average substantially skews a reliever’s perceived performance. The following table is a perfect example:

Pitcher.........................2003 ERA.............2003 ARP
Buddy Groom...................5.36...................4.7
Kent Mercker....................1.95...................5.5

By the look of their ERAs in 2003, Kent Mercker appears to be a substantially better reliever than Buddy Groom. But their 2003 ARPs show that Mercker was only slightly more effective than Groom in preventing inherited runners from scoring. A pitcher’s ERA cannot be trusted to adequately express performance, especially if the pitcher is a reliever.

Equivalent Average
I find batting statistics a bit easier to deal with than pitching statistics. There are more numbers available for hitters, from batting average to slugging percentage and on-base percentage. Most of us are familiar with these more conventional hitting stats. I use Equivalent Average (EqA) as merely one example of ‘adjusted’ statistics.

The term ‘adjusted’ merely refers to compensating for unequal factors across the league or across time. Each year, we crown the player with the highest batting average in each league as the batting champion. The batting ‘champion’, though, may not have been the best hitter in the league. After adjusting for home park differences, team pitching, and league offensive levels, an adjusted batting average, referred to as a hitter’s Equivalent Average, results in a better measure of total offensive production per out.

According to Baseball Prospectus, the EqA scale is deliberately set to approximate batting average, with the league-average hitter earning an EqA of .260. By setting a baseline of .260 for average production every season, we can again compare players across different eras of fluctuating offensive production, from the deadball era and the time of the raised mound, to the power-packed 1990s. Adjusted statistics are powerful tools that level the playing field and enable objective comparison.

As an example, let’s look at the AL batting title. Bill Mueller won in 2003, with his teammate Manny Ramirez ending up a close second. But their EqAs, a measure of total offensive production per out, are substantially different:

Player.........................2003 AVG.............2003 EqA
Bill Mueller....................326....................315
Manny Ramirez.............325....................340

Ramirez led the league in offensive production per out, which is more important than batting average. Hitting statistics are as variable as any other. A good ‘hitter’ (a player capable of getting hits at a high rate) may not be the best hitter (the player who produces the most per out). By adjusting for park factors and pitching, and valuing more productive offensive categories, a better idea of who may be the best hitter emerges.
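
A small sketch shows how the leader changes with the yardstick, using only the two hitters and the numbers quoted above. The helper names are invented, and the .260 baseline is the league-average EqA cited from Baseball Prospectus.

```python
LEAGUE_AVERAGE_EQA = 0.260  # baseline for an average hitter

# 2003 figures for the two Red Sox hitters discussed above.
hitters = [
    {"name": "Bill Mueller", "avg": 0.326, "eqa": 0.315},
    {"name": "Manny Ramirez", "avg": 0.325, "eqa": 0.340},
]


def leader(players, stat):
    """Name of the player with the highest value for the given stat."""
    return max(players, key=lambda player: player[stat])["name"]


def eqa_points_above_average(eqa):
    """EqA expressed in points (thousandths) above the .260 baseline."""
    return round((eqa - LEAGUE_AVERAGE_EQA) * 1000)


print(leader(hitters, "avg"))  # Bill Mueller
print(leader(hitters, "eqa"))  # Manny Ramirez
for hitter in hitters:
    print(hitter["name"], eqa_points_above_average(hitter["eqa"]))
```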

There are all sorts of other statistics, from VORP to WARP (I kid you not). What makes baseball so wondrous is its intricacies and peculiarities. My hope is that “Statistics 102 – Beyond OPS and WHIP” has laid down a new foundation for a further understanding of baseball. At the very least, I hope that I cleared up some of the vague and thorny issues concerning the computation of these complicated, yet valuable, advanced statistics. There is always more to learn about the game, and the ways in which we examine it.

Keep studying. We’re only a few weeks away from the return of baseball, as it roars back to life after its New Year’s lull of restless hibernation.
