Let me start by saying I have no love for David Wright. Too many times he has put me in Home Run Derby Hell, especially with 1 home run in his first 100 at bats this season and 3 in his first 150. Yes, I know that he had a great year when it was all said and done (.325 BA, .546 SLG, 30 HR, 107 RBI).

Good for him.

But as I said back on May 10th, he is dead to me, and in Greektown those designations are permanent.

Why do I bring this up? Well yesterday, Richie had a post on why Matt Holliday was going to win the MVP, which of course (in true Richie fashion) he got wrong. In the comment thread, a commenter named “Sky” said the following:

“You know who got screwed? David Wright. And Albert Pujols. Chipper Jones, and Chase Utley, too. Those four were clearly the four most productive players in the NL. All were huge on offense and added a lot of value defensively. If you want to include some sort of voodoo for playing on a playoff team, fine. But that doesn’t mean Holliday or Rollins were better players.”

Sky was kind enough to include a link to his blog, Skyking162, which has a great tag line, “baseball with a hint of lime.” His latest post, and the basis for his comment, uses the statistical measure, total runs above replacement value (TVAR) to determine who was most deserving of the MVP award (David Wright in this case) in the National League.

For those readers unfamiliar with the concept of replacement value in baseball, it tries to measure the production of a certain player, both offensively and defensively, when compared against the strawman “replacement player”, or average player. It comes from the sabermetric world, home to the likes of Bill James, Rob Neyer, Baseball Prospectus and many others who have made significant contributions to the game.

Let me state, I am a big fan of the Sabermetrics world and I really appreciate what several passionate baseball fans have done to help make the game even better for those who follow it. Let me also state, I haven’t taken a statistics class since I was starting my MBA at Kellogg in 1999, so it has been a while.

I would argue two things. First, the use of TVAR to determine the winner of the MVP is inappropriate, as the method is built on faulty assumptions. Second, statistics should only be one factor in determining someone’s value on the field.

So what’s wrong with TVAR? Start with the underlying assumption of a “replacement player.” It assumes that, in this case, if David Wright wasn’t playing third base for the Mets, the Mets would be able to pencil in an average amount of production from the player they chose to put at third. The problem is two-fold. First, by definition, half the league is below average, and half is above average (cut the old bell curve in two). What makes you think the Mets would automatically get someone who is, let’s say, 1 or 2 standard deviations from the median?

Which actually leads me to my next point. Financially speaking, not all teams are created equal. The Mets, with their major market status, have traditionally been in the top 5 or 10 in payroll every year since the financial discrepancy between teams in the league became a real issue (i.e. the 1990’s). Their payroll guarantees they would not have to settle for an average player in that position, if they so choose.

My feelings on statistics are not limited to TVAR, but extend to many sabermetric measurements. Again, the use of statistics is very helpful, and really, to me, makes the game more enjoyable. But there are some awfully big assumptions in play to make some measurements meaningful. Take OPS, for example: On Base Percentage + Slugging Percentage. It takes the measure of getting on base and adds it to the measure of power, but when you look at the components, it takes many of the same variables into account. Yes, the stat correlates to our very best hitters in the game, but is that surprising given that you are taking two metrics that measure two key components of hitting and adding them together? The measure is slightly redundant, but still a favorite of sabermetric purists.
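To make the overlap concrete, here is a rough sketch of how OPS is put together, using the standard definitions of OBP and SLG. The `BattingLine` class and the sample numbers are made up for illustration and are not any real player's line. Notice that hits and at bats feed both halves of the sum.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for one hitter's counting stats."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    def obp(self) -> float:
        # On Base Percentage: times on base divided by qualifying chances.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances

    def slg(self) -> float:
        # Slugging Percentage: total bases per at bat.
        singles = self.hits - self.doubles - self.triples - self.home_runs
        total_bases = (singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def ops(self) -> float:
        # OPS simply adds the two, so every hit is counted in both terms.
        return self.obp() + self.slg()


# Invented sample line, for illustration only.
line = BattingLine(at_bats=500, hits=150, doubles=30, triples=2,
                   home_runs=20, walks=60, hit_by_pitch=5,
                   sacrifice_flies=5)
print(f"OBP {line.obp():.3f}  SLG {line.slg():.3f}  OPS {line.ops():.3f}")
```

A single shows up once in the OBP numerator and once in the SLG numerator, which is the redundancy I am grumbling about.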

Sabermetrics is trying to measure, objectively, the production or value of individual players compared to their peers. However, when trying to statistically represent defensive range or a strawman replacement player, objectivity gets called into question. Is the assumption of measuring against the average reasonable? Or that +10 TVAR is the equivalent of 1 win? If that is true, David Wright’s 89 TVAR means his performance led to 9 more wins than if the Mets had the strawman at third. I can’t reconcile that with his performance in April and May or the team’s performance in September.
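For anyone who wants to check my arithmetic, the conversion is nothing more than a division. This is only a sketch of the 10-runs-per-win rule of thumb as I understand it; the function name and the constant are my own labels, not anything taken from Sky's post.

```python
# Rule of thumb quoted above: roughly 10 runs of value equals 1 win.
# The exact ratio is an assumption and is passed in as a parameter.
RUNS_PER_WIN = 10.0


def runs_to_wins(runs_above_replacement: float,
                 runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert runs above replacement into estimated extra wins."""
    return runs_above_replacement / runs_per_win


wright_wins = runs_to_wins(89)  # Wright's 89 TVAR from Sky's analysis
print(f"Estimated wins above the strawman: {wright_wins:.1f}")
```

That comes out to just under 9 wins, which is the figure I have trouble squaring with what I watched.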

The problem for me is we are not talking about measurable independent events with a set number of outcomes. The result of an at bat isn’t simply measured by either getting on base, or making an out. There are so many different outcomes to a single event in baseball, that the use of probabilities and statistics can not possibly capture the value creation of any one player.

That is why I don’t think you can base your decisions on the MVP award on stats alone. They can’t measure leadership. David Wright certainly had a very productive September (.352 BA, .602 SLG, 6 HR, 20 RBI in 108 ABs), much better than his April (.244 BA, .311 SLG, 0 HR, 6 RBI in 90 ABs), yet his team went 5-12 over the last three weeks of the season, and blew a 7 game lead over the Phillies during those 17 games. Yes, David Wright produced, but he did not make his team better, which is the hallmark of “Value”. I don’t know how you argue a player whose team had one of the greatest collapses of all time could be the MVP of the league.

Besides, did I mention, the man is dead to me?


16 Responses to “The Over-Statisticalization of Baseball (if that’s a word)”
  1. Sky says:

    Nick, thanks for the link and willingness to discuss the topic without using some combination of the words “computer”, “parents”, and “basement”. I’ll try to get around to a more thorough response, but here are some quick hits:

    - Do I think that baseball’s over-statisticalized? Probably not, although I’m sure some people don’t like stats. That’s fine, as baseball’s entertainment. But it doesn’t make statistics invalid.

    - Do I think statheads have it all figured out? No way, nor do they claim to. My article didn’t include baserunning (other than SB/CS), which is important. Defensive numbers have a long ways to go. And while leadership is overrated, I’m sure there are times that a player gave another guy a tip and helped him play better. Of course, better play should show up in the stats of that other guy. Can statistics perfectly account for every contribution of every player? No. But there’s no way I’d trust any one person to have a better grasp on those contributions via subjective means.

    - In September, David Wright most definitely did make his team better. With a scrub in his spot, the Mets likely win a game or two fewer. Now, in the grand scheme of things, that doesn’t affect the playoff picture. But it sure as hell reflects David Wright’s talent. MVP might not equate to “best player”, but I find it much more interesting and important going forward.

    - Replacement players are not average players. Not sure if that’s what you’re saying or if I’m reading you wrong. But the Mets definitely DO have to settle for average players at many positions: second base, right field, and first base (even if they didn’t expect it at first base, if they could have upgraded, they would have.) Even so, David Wright most definitely should be compared to a replacement level player. Let’s say the Mets could replace him with an average guy — Ronnie Belliard, maybe? Does that mean Ronnie Belliard has zero value? He’s worthless? No way. He’s still a bit better than the absolute minimum talent that any team should put on the field. That’s the baseline you should compare players to. When answering the question, “who did the most to help their team win”, it doesn’t make sense to compare players on different teams to different standards.

    - About half the PLATE APPEARANCES are taken by players who are below average. But there are many, many more PLAYERS who are below average than above average. That’s because the distribution of talent in the majors is the right tail of a bell curve. For every Miguel Cabrera, you’ve got five Miguel Cairos and ten guys sitting in AAA. This point has nothing to do with anything else, though : )

    - What do you mean that you can’t reconcile the statement that Wright was worth 9 more wins to the Mets than a scrub with his April? Do you think that a 9 win performance spread evenly over 6 months is better than 9 wins bunched up in hot streaks? I don’t. Given that the Mets won about 50 more games than a replacement-level team, how many of those would YOU give to Wright? That’s actually an interesting exercise: divide up 50 marginal wins among all the Mets players. Not many would go to the pitchers outside of John Maine and maybe Glavine, and only Wright, Reyes, and Beltran would get significant credit as position players. Alou and LoDuca would get some, but Alou’s defense is costly, and LoDuca’s pretty overrated.

    - Regarding OPS, that’s a strawman of your own. OPS has many flaws and is incomplete, as most statheads will tell you. It’s better than just using AVG or RBIs, but that’s not saying much. There are many other approaches and metrics that do a better job of measuring baseball talent, which is what we should be using.

  2. allonthefield says:

    Interesting post and counterpoints by Sky. For the most part, I agree with the sentiment that awards should not be distributed on the basis of statistics alone. But having said that, I’d like to see the MVP award “defined,” so to speak. Is it the most outstanding player? If so, you give it to the guy that puts up the huge numbers. Who cares what team he’s on? But most valuable to the success of the team, well, that’s a different story. I’d like to see the award be a blend of the two (David Eckstein might be THE most critical component of a team’s success, but he’s no MVP), but again, I think the fans deserve to know on what basis the award is being distributed.

  3. Nick the Greek says:

    Sky -

    I agree, stats are very valid. In fact, I fail to understand how anyone really can enjoy baseball without embracing the numbers that are such a big part of the game.

    As far as David Wright’s September, looking at those last 17 games, in the 5 wins the Mets had during that stretch he batted .500 (11-22), but only batted in 6 runs with no homers in games that the Mets won by a combined score of 44-18 (only 1 of those 5 games was a one run affair). So, I am not so sure the Mets were 1 or 2 wins better than if they had a scrub in his place.

    But back to my issue with the idea of David Wright being the MVP (put aside my “he’s dead to me” joke): while an MVP is certainly entitled to slumps, his April was atrocious and his May was decent but not spectacular. He was a very good hitter with men on and 2 out (.298, 27 RBI in 94 ABs).

    Compare to Matt Holliday whose only month with a BA under .300 was July at .287, and never slugged under .530 in any month (compared to Wright’s .311 slugging % in April). He was a great hitter in the clutch, men on and 2 out (.337, 34 RBI in 104 ABs).

    Then go to the basics: Matt Holliday had a higher BA (.340 to .325), more HR (36 to 30), more RBI (137 to 107), and a higher SLG (.607 to .546). Couple the Rockies’ charge to the playoffs with the Mets’ collapse, and there is the underlying reason for Matt Holliday being a much better MVP candidate than David Wright. Yet the TVAR analysis says it’s backwards, with Wright #1 and Holliday #6.

    Maybe my problem is that some composite stats like TVAR take away the ability to look at individual components of the calculation and weigh them in an appropriate manner. Sometimes the basics matter (which is the opposite of your sentiment when you say “OPS is better than AVG or RBI”).

    That then leads to allonthefield’s point: define the basis for the award.

    I must admit, I hastily put this together today (at work no less), so I certainly could have spent more time picking apart the statistical calculations (I got a bit lazy in my critique of OPS). Maybe I will do a followup.

    Thanks again for engaging. Have a Happy Thanksgiving.

  4. Sky says:

    Good stuff.

    I disagree that there’s any importance to consistency, beyond what you’d get from a WPA analysis. (If you hit seven HRs in one game, that’s really not very helpful past the first three.) For Wright to put up the numbers he did with a couple of poor months just means that his other months were disgustingly good. The flip side is that Holliday or Rollins or whoever never had a month that matched Wright’s. How can you be the MVP if you never performed at a stratospheric level? Morneau won the MVP last year precisely because his first couple of months were mediocre, and then he picked it up to “lead” the Twins to the playoffs.

    Don’t forget Holliday deserves a Coors penalty. It’s about a dozen runs of value. Rollins and the other Phillies deserve one, too, but it’s about half the magnitude of Holliday’s.

    A large part of the discussion has to deal with the definition of MVP. I totally agree that Rollins and Holliday were the biggest stories, for whatever reason. I’d rather give the stories an award, not the players. Many people will say right now that the MVP isn’t necessarily the best player award, but then will treat it like it is when looking back in a few years. I don’t necessarily think MVP means “who was the best player”, but that’s the question I’m tackling, because I actually care about it. I enjoy the storylines, but that’s different.

    No need to point out the flaws with various advanced statistics. They exist and anyone who pretends they don’t isn’t being honest. That being said, those stats are better than what used to be around and by addressing the flaws we’ll make them better. The positives outweigh the negatives and a flawed stat (even one as flawed as OPS) will be more accurate than a typical fan’s or sportswriter’s opinion almost every time.

    I’m not sure I get your point about wanting to look at the basics instead of a composite statistic. Specific stats will tell you about specific things, but when trying to look at the big picture, you need to combine all contributions in a way that gives each contribution appropriate importance. No?

  5. Sky says:

    Also, I’d like to apologize for the acronym TVAR being used outside of my blog. We really don’t need any more acronyms. I support simply using the generic phrase “total value”. Words are awesome.

  6. Nick the Greek says:

    No need to apologize, I was the one who brought it outside your blog and decided to use it.

  7. jonathan says:

    Ok that’s just terrible. I just wrote a novella of a comment and it was eaten because apparently I mistook a 0 for an O or something and am a bot. I hate the internet.

  8. jonathan says:

    Basically my point was we’re talking apples and oranges in terms of “value”. One is whose numbers you would most want on your team, which is pretty much nailed down by math and in the long run translates directly into wins. The other is who was the most involved in a memorable season, which is entirely subjective and subject to luck, situations, and how the drama unfolded, but no less valuable to a lot of fans. We need an “MMP” for Most Memorable Player that has nothing to do with who was the overall “best” at playing the game that year.

  9. Nick the Greek says:

    Jonathan,

    Sorry about the lost novella. As for your apples and oranges comment, I disagree, not all value can be captured by the math. Sure, most of it is, but sometimes context is important. That is why there is no way David Wright should be considered the MVP this year.

  10. Justin says:

    The problem with context, by which I assume you mean winning, is that you’re automatically discounting a lot of players who just don’t have the supporting cast. I guess it all boils down to your definition of “value.” If player A has a stellar season and, to the best possible estimates using the best statistical tools, adds the equivalent of a dozen wins to his team, but his teammates tank and the team winds up out of the playoff picture, is he more or less valuable than player B, who has a very good season worth about eight wins on a playoff team?

    In the last week of the season, for instance, the Mets went 1-6, but in five of those six losses, their pitchers gave up seven, eight, nine, 10 and 13 runs. Was Wright less “valuable” because Glavine crapped the bed in his last game? With the rest of his team effectively tanking, there’s no way he - or any one player - could have put up enough runs to win under those circumstances.

    The Phils, meanwhile, went 4-2 in the last week, thanks largely to a pitching staff that came around. Can you honestly argue that Rollins somehow “led” his pitchers to pitch better, whereas Wright’s lack of “leadership” led to his staff imploding? Things like leadership and chemistry are all fine and good, but curiously, you only hear about those things when a team’s winning. Look at this year’s Yankees - for the first two months, everyone was suggesting Torre had to be fired and that he had somehow lost his ability to “lead” the team. Once the Yanks righted the ship (as anyone who looked at the simple run differential stats predicted they would), he was being touted as Manager of the Year material and it was scandalous that the Yankees didn’t bring him back. In short, leadership doesn’t create a winning atmosphere, a winning atmosphere creates claims that a team has leaders, chemistry, blah blah blah.

    More than any other “team” sport, baseball relies on individual accomplishments, with each hitter having roughly the same opportunity to impact the outcome over the course of a season, assuming they all play a similar number of games.

    David Wright didn’t let the team down by contributing to (for instance) Delgado’s down year, or to the pitching woes. He only helps or hinders his team based on his own accomplishments (apart from the predominantly team-dependent statistics such as runs or RsBI). Based strictly on the numbers, he helped his team far more than Rollins, whose low OBP led to him making the highest number of outs by any NL MVP ever, or Mr. Coors Field.

    But hey, Rollins led the league in Smiles Above Replacement Player and in pre-season predictions that he was lucky enough to have come true, so it’s not surprising that the voters went with him.

  11. Richie Rich says:

    Jonathan -

    I too am saddened by the loss of your novella. Keep up the good work over at The Mockingbird and as always, GO JAYS!

    Nick will probably never forgive me for this, but David Wright certainly wasn’t to blame for the Mets’ demise the last 2+ weeks of the season.

    In 17 games and 73 AB during the fateful blowage of a 6.5 game lead, Wright had a .397 Avg, .451 OBP, .575 SLG, and a 1.027 OPS. All above his 2007 averages. He had a hit in every game and had 11 RBI and 15 Runs (only 2 HR however).

    I still wouldn’t have considered him for the 2007 NL MVP. I’m a real believer in the MVP going to a member of a playoff team. Hence the farcical Andre Dawson Awards.

  12. Nick the Greek says:

    No worries Richie, I forgive you. But please keep in mind the Mets were able to go 15-9 in April, despite David Wright’s heaping pile of crap month he had (.244 BA, .311 SLG with 0 HR and 6 RBI). Sorry people but Wright lost his chance at MVP with his April performance. Yet the Win-shares or VORP mentality would say we could forget that month.

    Justin,

    At the end of the day, statistical analysis is very important, but it is not the only factor in determining one’s value. It is important to realize that some of the more complex statistical methods are not as unbiased as many would suggest. In order for them to work, there need to be some underlying assumptions. Like:

    “each hitter having roughly the same opportunity to impact the outcome over the course of a season, assuming they all play a similar number of games”

    That’s a pretty big assumption to make, and in my opinion, not a given. Context is important to understand, as in how the players actually perform. That is something that cannot be found in a calculation that translates performance into theoretical wins. Whether or not the team really wins is more important.

  13. Jonathan says:

    This is getting dangerously close to the “clutch hitting” morass, but I think it’s very hard (and quite often misleading) to try and look at the context of when a player was hitting. Players go up and down, and teams seem to “need” them at different times. Would David Wright have been a better MVP candidate if he was great in April and terrible in September so the Mets crashed even harder, or totally consistent the entire year? The number of games a team wins in a year translates VERY directly into how many runs their players produce, so it’s hard to argue that it matters when they scored them. The wins may be “theoretical”, but they predict actual wins extremely well.

    A great example is Frank Thomas for the Jays. Last year he went on fire in September when Oakland was in the thick of it and carried them into the playoffs. He was a hero, came 4th in MVP voting, etc, etc. This season he did the exact same thing but the Jays were not even close by then, so it was widely suggested it was a BAD thing that he waited until the “season was over” to put up his numbers. If you’re looking at a context that is largely based on what situation the rest of the team has put a player in, it’s not really fair to talk about an individual’s value.

    And just my two cents on the “same opportunity to impact the outcome over the course of a season” thing: that’s not necessarily true, but the significance of the situations a player is put in over a season can be measured, and it’s pretty clear that no player performs any better in more important situations from one year to the next. But now we’re into the clutch hitting thing…

    Richie- thanks, will do! Are you a closet Jays fan, or just confident in wishing them well when they are stuck in third place for eternity? :) Those awards are great- I think there should be more farcical awards of all kinds. In fact, that gives me an idea…

    Haha! Copied my comment before submitting it this time! Boy I’m bad at getting those letters right…maybe I AM a bot.

  14. Nick the Greek says:

    I can attest to Richie’s life long affinity for the Jays. I can recall sitting in the bleachers at Comiskey (before it was the Cell) with him, with his Blue Jays hat (old school with the Maple Leaf on it) firmly planted on his head. Of course, they are second fiddle to his Cubbies.

  15. Justin says:

    Perhaps I was overly simplistic in my “same opportunity to impact the outcome” statement. What I meant to say is that two hitters who play the same number of games will have (roughly) a similar number of opportunities to help their team (to varying degrees by getting on base, getting an XBH, hitting a home run) or hinder their team (by making an out).

    They can find themselves in varying situations, yes, but circumstances such as number of men on base, number of outs and so on are largely team-dependent. Basically, a hitter only really affects what happens when he’s at bat, with what the players batting ahead of or behind him wholly out of his control. You could get into arguments over the nebulous concept of “protecting hitters,” but smarter men than I have concluded that that idea is much ado about nothing.

    I stand by my assertion that whether the team wins or not is also largely team-dependent. Better performances by individual players obviously help the team’s chances, but by most any conventional metric, Wright had a better season than Rollins. That one player’s team imploded around him down the stretch isn’t really a reflection on that player so much as it is a condemnation of that player’s team, particularly when the player in question performed well.

    If you simply switched the performances by the Mets’ and Phillies’ pitching staffs on the last week of the season, for instance, the Mets win the division. You can try to credit Rollins with some mythical ability to make his pitchers step up and deliver, but you’d have a hard time making that argument hold water with me.

    Conversely, if you swapped Rollins’ September (or season-long) numbers with Wright’s, I’d argue that the Phillies would have won the division by a few more games.

    Yes, Wright had a terrible April, but I think his struggles were magnified by the fact that it was April. In May, Rollins hit .250 and OBPed .279 (despite his noted slump, Wright got on base at a .370 clip in April). Rollins went homerless in May, but he had a bit of a buffer in that he had already enjoyed a stellar April, so his struggles didn’t seem so pronounced. If Wright had been terrible in July, but had already hit 20 bombs and was clipping along at .330, I don’t think it would have been nearly as noticeable. Similarly, if the Mets had slumped in July and roared back in September to finish a game out, Wright would have been touted as a major reason why, and not wrongly fingered as part of the problem. Say what you will about the timing of the slump, but you can’t possibly blame Wright for his team choking.

    As Jonathan said, talk of “context” can often lead to the “clutch hitting” red herring, with some players being credited with special powers that help them elevate their game when it’s all on the line. So-called “clutch” hitting has been shown to be as much a matter of fluke as hitting in any other situation. Though clutch numbers are not particularly predictive, I think they can be used retroactively: if a player performed extremely well in a limited sample size of “clutch” situations, you can point to that come awards time as a measure of how a player did in certain instances in the past. You just can’t say that player is definitively clutch, as those numbers will likely regress to the mean in the future.

    Even using those metrics, however, Wright has the edge:
    .310/.431/.544 with RISP
    .346/.447/.590 Close and Late

    Rollins:
    .272/.339/.538 w. RISP
    .255/.318/.490 C&L

    Wright did struggle to a .200/.366/.400 line with RISP and two outs, but Rollins only managed a .239/.302/.534 line.

    As I said before, it depends on your definition of “value.” If someone wants to say that a very good player on a playoff team is necessarily more “valuable” than a similar or better player on a non-playoff team because the objective is to make the playoffs, I’m not likely to change their mind. Hell, I thought that way myself in the not-too-distant past. I just think there’s a lot to be said for the impact a supporting cast has on a team’s playoff fortunes.

    In any case, I’m with Richie in the sad and Sisyphean world of Blue Jays fandom, so I’ve wasted waaaay too much of all of our time writing about players on a couple of teams I don’t particularly root for.

  16. Nick the Greek says:

    Please note, I am not suggesting Jimmy Rollins deserved the award.

    On the contrary, Matt Holliday, for reasons stated, really deserved the award.
