Graph of the Day: Big Papi on the World Series Legend Curve

Nothing fancy here folks, just a little perspective.

David Ortiz had a crazy-good (Matsui-esque) World Series, taking MVP honors and his third title overall. His 1.948 OPS is 7th all-time and tops many legendary championship campaigns, such as Reggie Jackson’s 4 HR in 4 straight appearances in 1977 (1.792 OPS for the series). Amazingly, both Babe Ruth (2.022) and Lou Gehrig (2.433!) posted higher numbers in 1928 (a Yankees sweep).
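For readers who want to see where a number like 1.948 comes from: OPS is on-base percentage plus slugging percentage. Here is a minimal sketch of that arithmetic. The function name `ops` and the sample batting line are illustrative assumptions, not figures quoted from this post.

```python
def ops(ab, h, doubles, triples, hr, bb, hbp=0, sf=0):
    """On-base plus slugging from a basic batting line."""
    # Every hit counts once; extra-base hits add their extra bases on top.
    total_bases = h + doubles + 2 * triples + 3 * hr
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return obp + slg

# Illustrative line (an assumption, not taken from the post):
# 11 hits in 16 at-bats, 2 doubles, 2 homers, 8 walks, 1 sacrifice fly.
example = ops(ab=16, h=11, doubles=2, triples=0, hr=2, bb=8, sf=1)
print(f"OPS: {example:.4f}")
```

With so few at-bats in a single series, one extra hit moves the total a lot, which is part of why single-series OPS figures get this extreme.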

Of course, Papi has been instrumental in all three of Boston’s post-Ruth championships, and his combined OPS of 1.372 is the highest for anyone with 50+ WS plate appearances:
WS Legends

By my reading, Ortiz fits clearly in an elite class of outliers that includes Bonds, Jackson, Gehrig and Ruth, though Ruth stands out the most. Note that this includes Ruth’s first three WS appearances with Boston, when he was still a pitcher. I.e.:

If we expand the purview to all postseason games, we see a considerable drop in Papi’s OPS, though he still maintains an elite position overall:

PS Legends

Of course there is no change for Ruth and Gehrig, since there were no divisional rounds when they played (though we wouldn’t expect too much of a drop for them anyway, since their playoff OPS aren’t substantial deviations from the rest of their careers). Otherwise, the outlier curve is dominated by the big names of the modern era, with Papi comfortably nestled in the middle.

Sports Geek Mecca: Recap and Thoughts, Part 2

This is part 2 of my “recap” of the Sloan Sports Analytics Conference that I attended in March (part 1 is here), mostly covering Day 2 of the event, but also featuring my petty way-too-long rant about Bill James (which I’ve moved to the end).

Day Two

First I attended the Football Analytics panel despite finding it disappointing last year, and, alas, it wasn’t any better. Eric Mangini must be the only former NFL coach willing to attend, b/c they keep bringing him back:

Overall, I spent more time in day 2 going to niche panels, research paper presentations and talking to people.

The last, in particular, was great. For example, I had a fun conversation with Henry Abbott about Kobe Bryant’s lack of “clutch.” This is one of Abbott’s pet issues, and I admit he makes a good case, particularly that the Lakers are net losers in “clutch” situations (yes, relative to other teams), even over the periods where they have been dominant otherwise.

Kobe is kind of a pivotal case in analytics, I think. First, I’m a big believer in “Count the Rings, Son” analysis: That is, leading a team to multiple championships is really hard, and only really great players do it. I also think he stands at a kind of nexus, in that stats like PER give spray shooters like him an unfair advantage, but more finely tuned advanced metrics probably over-punish the same. Part of the burden of Kobe’s role is that he has to take a lot of bad shots—the relevant question is how good he is at his job.

Abbott also mentioned that he liked one of my tweets, but didn’t know if he could retweet the non-family-friendly “WTF”:

I also had a fun conversation with Neil Paine of Basketball Reference. He seemed like a very smart guy, but this may be attributable to the fact that we seemed to be on the same page about so many things. Additionally, we discussed a very fun hypo: How far back in time would you have to go for the Charlotte Bobcats to be the odds-on favorites to win the NBA Championship?

As for the “sideshow” panels, they’re generally more fruitful and interesting than the ESPN-moderated super-panels, but they offer fewer easy targets for easy blog-griping. If you’re really interested in what went down, there is a ton of info at the SSAC website. The agenda can be found here. Information on the speakers is here. And, most importantly, videos of the various panels can be found here.

Box Score Rebooted

Featuring Dean Oliver, Bill James, and others.

This was a somewhat interesting, though I think slightly off-target, panel. They spent a lot of time talking about new data and metrics and pooh-poohing things like RBI (and even OPS), and the brave new world of play-by-play and video tracking, etc. But too much of this was discussing a different granularity of data than what can be improved in the current granularity levels. Or, in other words:

James acquitted himself a bit on this subject, arguing that boatloads of new data isn’t useful if it isn’t boiled down into useful metrics. But a more general way of looking at this is: If we were starting over from scratch, with a box-score-sized space to report a statistical game summary, and a similar degree of game-scoring resources, what kinds of things would we want to include (or not) that are different from what we have now?  I can think of a few:

  1. In basketball, it’s archaic that free-throws aren’t broken down into bonus free throws and shot-replacing free throws.
  2. In football, I’d like to see passing stats by down and distance, or at least in a few key categories like 3rd and long.
  3. In baseball, I’d like to see “runs relative to par” for pitchers (though this can be computed easily enough from existing box scores).

In this panel, Dean Oliver took the opportunity to plug ESPN’s bizarre proprietary Total Quarterback Rating. They actually had another panel devoted just to this topic, but I didn’t go, so I’ll put a couple of thoughts here.

First, I don’t understand why ESPN is pushing this as a proprietary stat. Sure, no-one knows how to calculate regular old-fashioned quarterback ratings, but there’s a certain comfort in at least knowing it’s a real thing. It’s a bit like Terms of Service agreements, which people regularly sign without reading: at least you know the terms are out there, so someone actually cares enough to read them, and presumably they would raise a stink if you had to sign away your soul.

As for what we do know, I may write more on this come football season, but I have a couple of problems:

One, I hate the “clutch effect.” TQBR makes a special adjustment to value clutch performance even more than its generic contribution to winning. If anything, clutch situations in football are so bizarre that they should count less. In fact, when I’ve done NFL analysis, I’ve often just cut the 4th quarter entirely, and I’ve found I get better results. That may sound crazy, but it’s a bit like how some very advanced Soccer analysts have cut goal-scoring from their models, instead just focusing on how well a player advances the ball toward his goal: even if the former matters more, its unreliability may make it less useful.

Two, I’m disappointed in the way they “assign credit” for play outcomes:

Division of credit is the next step. Dividing credit among teammates is one of the most difficult but important aspects of sports. Teammates rely upon each other and, as the cliché goes, a team might not be the sum of its parts. By dividing credit, we are forcing the parts to sum up to the team, understanding the limitations but knowing that it is the best way statistically for the rating.

I’m personally very interested in this topic (and have discussed it with various ESPN analytics guys since long before TQBR was released). This is basically an attempt to address the entanglement problem that permeates football statistics.  ESPN’s published explanation is pretty cryptic, and it didn’t seem clear to me whether they were profiling individual players and situations or had created credit-distribution algorithms league-wide.

At the conference, I had a chance to talk with their analytics guy who designed this part of the metric (his name escapes me), and I confirmed that they modeled credit distribution for the entire league and are applying it in a blanket way.  Technically, I guess this is a step in the right direction, but it’s purely a reduction of noise and doesn’t address the real issue.  What I’d really like to see is something like a recursive model that imputes how much credit various players deserve broadly, then uses those numbers to re-assign credit for particular outcomes (rinse and repeat).
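To make the "rinse and repeat" idea a little more concrete, here is one possible reading of it as a toy Python sketch. Everything in it is an assumption for illustration: the function name `reassign_credit`, the made-up plays, the restriction to non-negative play values, and the choice to split each play in proportion to current ratings. It is not ESPN's method, just a small fixed-point loop in the same spirit.

```python
from collections import defaultdict

def reassign_credit(plays, rounds=10):
    """plays: list of (value, [player names]); values assumed non-negative."""
    players = {name for _, names in plays for name in names}
    rating = {name: 1.0 for name in players}  # start with everyone equal
    credit = []
    for _ in range(rounds):
        totals = defaultdict(float)
        counts = defaultdict(int)
        credit = []
        for value, names in plays:
            # Split this play's value in proportion to current ratings.
            weight = sum(rating[name] for name in names)
            shares = {name: value * rating[name] / weight for name in names}
            credit.append(shares)
            for name, share in shares.items():
                totals[name] += share
                counts[name] += 1
        # New rating = average credit per play; floor it so weights stay positive.
        rating = {name: max(totals[name] / counts[name], 1e-9) for name in players}
    return rating, credit

# Hypothetical plays: (value of the outcome, players involved).
plays = [(10.0, ["QB", "WR1"]), (2.0, ["QB", "WR2"]), (6.0, ["QB", "WR1"])]
rating, credit = reassign_credit(plays)
for name in sorted(rating):
    print(name, round(rating[name], 3))
```

A proportional split like this tends to push credit toward whoever already looks good, so a real version would presumably need some regularization or a prior. The point here is only the loop structure: estimate broad credit, re-divide each outcome, repeat.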

Deconstructing the Rebound With Optical Tracking Data

Rajiv Maheswaran, and other nerds.

This presentation was so awesome that I offered them a hedge bet for the “Best Research Paper” award. That is, I would bet on them at even money, so that if they lost, at least they would receive a consolation prize. They declined. And won. Their findings are too numerous and interesting to list, so you should really check it out for yourself.

Obviously my work on the Dennis Rodman mystery makes me particularly interested in their theories of why certain players get more rebounds than others, as I tweeted in this insta-hypothesis:

Following the presentation, I got the chance to talk with Rajiv for quite a while, which was amazing. Obviously they don’t have any data on Dennis Rodman directly, but Rajiv was also interested in him and had watched a lot of Rodman video. Though anecdotal, he did say that his observations somewhat confirmed the theory that a big part of Rodman’s rebounding advantage seemed to come from handling space very well:

  1. Even when away from the basket, Rodman typically moved to the open space immediately following a shot. This is a bit different from how people often think about rebounding as aggressively attacking the ball (or as being able to near-psychically predict where the ball is going to come down).
  2. Also, rather than simply attacking the board directly, Rodman’s first inclination was to insert himself between the nearest opponent and the basket. In theory, this might slightly decrease the chances of getting the ball when it heads in toward his previous position, but would make up for it by dramatically increasing his chances of getting the ball when it went toward the other guy.
  3. Though a little less purely strategic, Rajiv also thought that Rodman was just incredibly good at #2. That is, he was just exceptionally good at jockeying for position.

To some extent, I guess this is just rebounding fundamentals, but I still think it’s very interesting to think about the indirect probabilistic side of the rebounding game.

Live B.S. Report with Bill James

Quick tangent: At one point, I thought Neil Paine summed me up pretty well as a “contrarian to the contrarians.”  Of course, I don’t think I’m contrary for the sake of contrariness, or that I’m a negative person (I don’t know how many times I’ve explained to my wife that just because I hated a movie doesn’t mean I didn’t enjoy it!); it’s just that my mind is naturally inclined toward considering the limitations of whatever is put in front of it. Sometimes that means criticizing the status quo, and sometimes that means criticizing its critics.

So, with that in mind, I thought Bill James’s showing at the conference was pretty disappointing, particularly his interview with Bill Simmons.

I have a lot of respect for James.  I read his Historical Baseball Abstract and enjoyed it considerably more than Moneyball.  He has a very intuitive and logical mind. He doesn’t say a bunch of shit that’s not true, and he sees beyond the obvious. In Saturday’s “Rebooting the Box-score” panel, he made an observation that having 3 of 5 people on the panel named John implied that the panel was [likely] older than the rest of the room.  This got a nice laugh from the attendees, but I don’t think he was kidding.  And whether he was or not, he still gets 10 kudos from me for making the closest thing to a Bayesian argument I heard all weekend.  And I dutifully snuck in for a pic with him:

James was somewhat ahead of his time, and perhaps he’s still one of the better sports analytic minds out there, but in this interview we didn’t really get to hear him analyze anything, you know, sportsy. This interview was all about Bill James and his bio and how awesome he was and how great he is and how hard it was for him to get recognized and how much he has changed the game and how, without him, the world would be a cold, dark place where ignorance reigned and nobody had ever heard of “win maximization.”

Bill Simmons going this route in a podcast interview doesn’t surprise me: his audience is obviously much broader than the geeks in the room, and Simmons knows his audience’s expectations better than anyone. What got to me was James’s willingness to play along, and everyone else’s willingness to eat it up. Here’s an example of both, from the conference’s official Twitter account:

Perhaps it’s because I never really liked baseball, and I didn’t really know anyone did any of this stuff until recently, but I’m pretty certain that Bill James had virtually zero impact on my own development as a sports data-cruncher.  When I made my first PRABS-style basketball formula in the early 1990’s (which was absolutely terrible, but is still more predictive than PER), I had no idea that any sports stats other than the box score even existed. By the time I first heard the word “sabermetrics,” I was deep into my own research, and didn’t bother really looking into it deeply until maybe a few months ago.

Which is not to say I had no guidance or inspiration.  For me, a big epiphanous turning point in my approach to the analysis of games did take place—after I read David Sklansky’s Theory of Poker. While ToP itself was published in 1994, Sklansky’s similar offerings date back to the 70s, so I don’t think any broader causal pictures are possible.

More broadly, I think the claim that sports analytics wouldn’t have developed without Bill James is preposterous, especially if, as I assume we do, we firmly believe we’re right.  This isn’t like L. Ron Hubbard and Incident II: being for sports analytics isn’t like having faith in a person or his religion. It simply means trying to think more rigorously about sports, and using all of the available analytical techniques we can to gain an advantage. Eventually, those who embrace the right will win out, as we’ve seen begin to happen in sports, and as has already happened in nearly every other discipline.

Indeed, by his own admission, James liked to stir controversy, piss people off, and talk down to the old guard whenever possible. As far as we know, he may have set the cause of sports analytics back, either by alienating the people who could have helped it gain acceptance, or by setting an arrogant and confrontational tone for his disciples (e.g., the uplifting “don’t feel the need to explain yourself” message in Moneyball). I’m not saying that this is the case or even a likely possibility, I’m just trying to illustrate that giving someone credit for all that follows—even a pioneer like James—is a dicey game that I’d rather not participate in, and that he definitely shouldn’t.

On a more technical note, one of his oft-quoted and re-tweeted pearls of wisdom goes as follows:

Sounds great, right? I mean, not really, I don’t get the metaphor: if the sea is full of ignorance, why are you collecting water from it with a bucket rather than some kind of filtration system? But more importantly, his argument in defense of this claim is amazingly weak. When Simmons asked what kinds of things he’s talking about, he repeatedly emphasized that we have no idea whether a college sophomore will turn out to be a great Major League pitcher.  True, but, um, we never will. There are too many variables, the input and outputs are too far apart in time, and the contexts are too different.  This isn’t the sea of ignorance, it’s a sea of unknowns.

Which gets at one of my big complaints about stats-types generally.  A lot of people seem to think that stats are all about making exciting discoveries and answering questions that were previously unanswerable. Yes, sometimes you get lucky and uncover some relationship that leads to a killer new strategy or to some game-altering new dynamic. But most of the time, you’ll find static. A good statistical thinker doesn’t try to reject the static, but tries to understand it: Figuring out what you can’t know is just as important as figuring out what you can know.

On Twitter I used this analogy:

Success comes with knowing more true things and fewer false things than the other guy.

Sports Geek Mecca: Recap and Thoughts, Part 1

So, over the weekend, I attended my second MIT Sloan Sports Analytics Conference. My experience was much different than in 2011: Last year, I went into this thing barely knowing that other people were into the same things I was. An anecdote: In late 2010, I was telling my dad how I was about to have a 6th or 7th round interview for a pretty sweet job in sports analysis, when he speculated, “How many people can there even be in that business? 10? 20?” A couple of months later, of course, I would learn.

A lot has happened in my life since then: I finished my Rodman series, won the ESPN Stat Geek Smackdown (which, though I am obviously happy to have won, is not really that big a deal—all told, the scope of the competition is about the same as picking a week’s worth of NFL games), my wife and I had a baby, and, oh yeah, I learned a ton about the breadth, depth, and nature of the sports analytics community.

For the most part, I used Twitter as sort of my de facto notebook for the conference.  Thus, I’m sorry if I’m missing a bunch of lengthier quotes and/or if I repeat a bunch of things you already saw in my live coverage, but I will try to explain a few things in a bit more detail.

For the most part, I’ll keep the recap chronological.  I’ve split this into two parts: Part 1 covers Friday, up to but not including the Bill Simmons/Bill James interview.  Part 2 covers that interview and all of Saturday.

Opening Remarks:

From the pregame tweets, John Hollinger observed that 28 NBA teams sent representatives (that we know of) this year.  I also noticed that the New England Revolution sent 2 people, while the New England Patriots sent none, so I’m not sure that number of official representatives reliably indicates much.

The conference started with some bland opening remarks by Dean David Schmittlein.  Tangent: I feel like political-speak (thank everybody and say nothing) seems to get more and more widespread every year. I blame it on fear of the internet. E.g., in this intro segment, somebody made yet another boring joke about how there were no women present (personally, I thought there were significantly more than last year), and was followed shortly thereafter by a female speaker, understandably creating a tiny bit of awkwardness. If that person had been more important (like, if I could remember his name to slam him), I doubt he would have made that joke, or any other joke. He would have just thanked everyone and said nothing.

The Evolution of Sports Leagues

Featuring Gary Bettman (NHL), Rob Manfred (MLB), Adam Silver (NBA), Steve Tisch (NYG) and Michael Wilbon moderating.

This panel really didn’t have much of a theme; it was mostly Wilbon creatively folding a bunch of predictable questions into arbitrary league issues.  E.g.: “What do you think about Jeremy Lin?!? And, you know, overseas expansion blah blah.”

I don’t get the massive cultural significance of Jeremy Lin, personally.  I mean, he’s not the first ethnically Chinese player to have NBA success (though he is perhaps the first short one).  The discussion of China, however, was interesting for other reasons. Adam Silver claimed that Basketball is already more popular in China than soccer, with over 300 million Chinese people playing it.  Those numbers, if true, are pretty mind-boggling.

Finally, there was a whole part about labor negotiations that was pretty well summed up by this tweet:

Hockey Analytics

Featuring Brian Burke, Peter Chiarelli, Mike Milbury and others.

The panel started with Peter Chiarelli being asked how the world champion Boston Bruins use analytics, and in an ominous sign, he rambled on for a while about how, when it comes to scouting, they’ve learned that weight is probably more important than height.

Overall, it was a bit like any scene from the Moneyball war room, with Michael Schuckers (the only pro-stats guy) playing the part of Jonah Hill, but without Brad Pitt to protect him.

When I think of Brian Burke, I usually think of Advanced NFL Stats, but apparently there’s one in Hockey as well.  Burke is GM/President of the Toronto Maple Leafs. At one point he was railing about how teams that use analytics have never won anything, which confused me since I haven’t seen Toronto hoisting any Stanley Cups recently, but apparently he did win a championship with the Mighty Ducks in 2007, so he clearly speaks with absolute authority.

This guy was a walking, talking quote machine for the old school. I didn’t take note of all the hilarious and/or nonsensical things he said, but for some examples, try searching Twitter for “#SSAC Brian Burke.” To give a sense of how extreme he was, someone tweeted this quote at me, and I have no idea if he actually said it or if this guy was kidding.

In other words, Burke was literally too over the top to effectively parody.

On the other hand, in the discussion of concussions, I thought Burke had sort of a folksy realism that seemed pretty accurate to me.  I think his general point is right, if a bit insensitive: If we really changed hockey so much as to eliminate concussions entirely, it would be a whole different sport (which he also claimed no one would watch, an assertion which is more debatable imo).  At the end of the day, I think professional sports mess people up, including in the head.  But, of course, we can’t ignore the problem, so we have to keep proceeding toward some nebulous goal.

Mike Milbury, presently a card-carrying member of the media, seemed to mostly embrace the alarmist media narrative, though he did raise at least one decent point about how the increase in concussions—which most people are attributing to an increase in diagnoses—may relate to recent rules changes that have sped up the game.

But for all that, the part that frustrated me the most was when Michael Schuckers, the legitimate hockey statistician at the table, was finally given the opportunity to talk.  90% of the things that came out of his mouth were various snarky ways of asserting that face-offs don’t matter.  I mean, I assume he’s 100% right, but he just had no clue how to talk to these guys.  Find common ground: you both care about scoring goals, defending goals, and winning.  Good face-off skills get you the puck more often in the right situations. The question is how many extra possessions you get and how valuable those possessions are. And finally, what’s the actual decision in question?

Baseball Analytics

Featuring Scott Boras, Scott Boras, Scott Boras, some other guys, Scott Boras, and, oh yeah, Bill James.

In stark contrast to the Hockey panel, the Baseball guys pretty much bent over backwards to embrace analytics as much as possible.  As I tweeted at the time:

Scott Boras seems to like hearing Scott Boras talk.  Which is not so bad, because Scott Boras actually did seem pretty smart and well informed: Among other things, Scott Boras apparently has a secret internal analytics team. To what end, I’m not entirely sure, since Scott Boras also seemed to say that most GM’s overvalue players relative to what Scott Boras’s people tell Scott Boras.

At this point, my mind wandered:

How awesome would that be, right?

Anyway, in between Scott Boras’s insights, someone asked this Bill James guy about his vision for the future of baseball analytics, and he gave two answers:

  1. Evaluating players from a variety of contexts other than the minor leagues (like college ball, overseas, Cubans, etc).
  2. Analytics will expand to look at the needs of the entire enterprise, not just individual players or teams.

Meh, I’m a bit underwhelmed.  He talked a bit about #1 in his one-on-one with Bill Simmons, so I’ll look at that a bit more in my review of that discussion. As for #2, I think he’s just way way off: The business side of sports is already doing tons of sophisticated analytics—almost certainly way more than the competition side—because, you know, it’s business.

E.g., in the first panel, there was a fair amount of discussion of how the NBA used “sophisticated modeling” for many different lockout-related analyses (I didn’t catch the Ticketing Analytics panel, but from its reputation, and from related discussions on other panels, it sounds like that discipline has some of the nerdiest analysis of all).

Scott Boras let Bill James talk about a few other things as well:  E.g., James is not a fan of new draft regulations, analogizing them to government regulations that “any economist would agree” inevitably lead to market distortions and bursting bubbles.  While I can’t say I entirely disagree, I’m going to go out on a limb and guess that his political leanings are probably a bit Libertarian?

Basketball Analytics

Featuring Jeff Van Gundy, Mike Zarren, John Hollinger, and Dean Oliver (in place of the scheduled Mark Cuban).

If every one of these panels was Mark Cuban + foil, it would be just about the most awesome weekend ever (though you might not learn the most about analytics). So I was excited about this one, which, unfortunately, Cuban missed. Filling in on zero/short notice was Dean Oliver.  Overall, here’s Nathan Walker’s take:

This panel actually had some pretty interesting discussions, but they flew by pretty fast and often followed predictable patterns, something like this:

  1. Hollinger says something pro-stats, though likely way out of his depth.
  2. Zarren brags about how they’re already doing that and more on the Celtics.
  3. Oliver says something smart and nuanced that attempts to get at the underlying issues and difficulties.
  4. Jeff Van Gundy uses forceful pronouncements and “common sense” to dismiss his strawman version of what the others have been saying.

E.g.:

Zarren talked about how there is practically more data these days than they know what to do with.  This seems true and I think it has interesting implications. I’ll discuss it a little more in Part 2 re: the “Rebooting the Box Score” talk.

There was also an interesting discussion of trades, and whether they’re more a result of information asymmetry (in other words, teams trying to fleece each other), or more a result of efficient trade opportunities (in other words, teams trying to help each other).  Though it really shouldn’t matter—you trade when you think it will help you, whether it helps your trade partner is mostly irrelevant—Oliver endorsed the latter.  He makes the point that, with such a broad universe of trade possibilities, looking for mutually beneficial situations is the easiest way to find actionable deals.  Fair enough.

Coaching Analytics

Featuring coaching superstars Jeff Van Gundy, Eric Mangini, and Bill Simmons.  Moderated by Daryl Morey.

OK, can I make the obvious point that Simmons and Morey apparently accidentally switched role cards?  As a result, this talk featured a lot of Simmons attacking coaches and Van Gundy defending them.  I honestly didn’t remember Mangini was on this panel until looking back at the book (which is saying something, b/c Mangini usually makes my blood boil).

There was almost nothing on, say, how to evaluate coaches by analyzing how well their various decisions comported with the tenets of win maximization.  There was a lengthy (and almost entirely non-analytical) discussion of that all-important question of whether an NBA coach should foul or not up by 3 with little time left.  Fouling probably has a tiny edge, but I think it’s too close and too infrequent to be very interesting (though obviously not as rare, it reminds me a bit of the impassioned debates you used to see on Poker forums about whether you should fast-play or slow-play flopped quads in limit hold’em).

There was what I thought was a funny moment when Bill Simmons was complaining about how teams seem to recycle mediocre older coaches rather than try out young, fresh talent. But when challenged by Van Gundy, Simmons drew a blank and couldn’t think of anyone.  So, Bill, this is for you.  Here’s a table of NBA coaches who have coached at least 1000 games for at least 3 different teams, while winning fewer than 60% of their games and without winning any championships:

[Table missing: NBA coaches with 1000+ games for 3+ teams, a sub-60% winning percentage, and no championships]

Note that I’m not necessarily agreeing with Simmons: Winning championships in the NBA is hard, especially if your team lacks uber-stars (you know, Michael Jordan, Magic Johnson, Dennis Rodman, et al).

Part 2 coming soon!

Honestly, I got a little carried away with my detailed analysis/screed on Bill James, and I may have to do a little revising. So due to some other pressing writing commitments, you can probably expect Part 2 to come out this Saturday (Friday at the earliest).

A Defense of Sudden Death Playoffs in Baseball

So despite my general antipathy toward America’s pastime, I’ve been looking into baseball a lot lately.  I’m working on a three part series that will “take on” Pythagorean Expectation.  But considering the sanctity of that metric, I’m taking my time to get it right.

For now, the big news is that Major League Baseball is finally going to have realignment, which will most likely lead to an extra playoff team, and a one game Wild Card series between the non–division winners.  I’m not normally one who tries to comment on current events in sports (though, out of pure frustration, I almost fired up WordPress today just to take shots at Tim Tebow—even with nothing original to say), but this issue has sort of a counter-intuitive angle to it that motivated me to dig a bit deeper.

Conventional wisdom on the one game playoff is pretty much that it’s, well, super crazy.  E.g., here’s Jayson Stark’s take at ESPN:

But now that the alternative to finishing first is a ONE-GAME playoff? Heck, you’d rather have an appendectomy than walk that tightrope. Wouldn’t you?

Though I think he actually likes the idea, precisely because of the loco factor:

So a one-game, October Madness survivor game is what we’re going to get. You should set your DVRs for that insanity right now.

In the meantime, we all know what the potential downside is to this format. Having your entire season come down to one game isn’t fair. Period.

I wouldn’t be too sure about that.  What is fair?  As I’ve noted, MLB playoffs are basically a crapshoot anyway.  In my view, any move that MLB can make toward having the more accomplished team win more often is a positive step.  And, as crazy as it sounds, that is likely exactly what a one game playoff will do.

The reason is simple: home field advantage.  While smaller than in other sports, the home team in baseball still wins around 55% of the time, and more games means a smaller percentage of your series games played at home.  While longer series eventually lead to better teams winning more often, the margins in baseball are so small that it takes a significant edge for a team to prefer to play ANY road games:

Note: I calculated these probabilities using my favorite binom.dist function in Excel. Specifically, where the number of games needed to win a series is k, this is the sum from x=0 to x=k of p(winning exactly x home games) times p(winning at least k-x road games).

So assuming each team is about as good as their records (which, regardless of the accuracy of the assumption, is how they deserve to be treated), a team needs about a 5.75% generic advantage (around 9-10 games) to prefer even a seven game series to a single home game.
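For anyone who would rather poke at this outside Excel, the binomial sum described in the note can be sketched in Python. Some of this is assumption rather than anything stated in the post: the helper names, the idea that a gap in winning percentage is split evenly (so the better team wins 50% plus half the gap on a neutral field), the additive 5-point home-field shift implied by the 55% figure, and the better team hosting the extra game. It also treats every scheduled game as played, which gives the same series odds as stopping once someone clinches.

```python
from math import comb

HOME_EDGE = 0.05      # equal teams: home side wins ~55% (figure from the post)
SEASON_GAMES = 162

def binom_pmf(n, x, p):
    """P(exactly x wins in n games at win probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_at_least(n, m, p):
    """P(at least m wins in n games); 1 if m <= 0, 0 if m > n."""
    return sum(binom_pmf(n, x, p) for x in range(max(m, 0), n + 1))

def series_win_prob(k, home_games, road_games, p_home, p_road):
    """Sum over x of P(exactly x home wins) * P(at least k - x road wins)."""
    return sum(
        binom_pmf(home_games, x, p_home) * binom_at_least(road_games, k - x, p_road)
        for x in range(home_games + 1)
    )

def compare(edge):
    """edge = gap in winning percentage (assumed split evenly between teams)."""
    neutral = 0.5 + edge / 2
    p_home, p_road = neutral + HOME_EDGE, neutral - HOME_EDGE
    return {
        "one home game": p_home,
        "best of 3": series_win_prob(2, 2, 1, p_home, p_road),
        "best of 5": series_win_prob(3, 3, 2, p_home, p_road),
        "best of 7": series_win_prob(4, 4, 3, p_home, p_road),
    }

# A few arbitrary gaps in the standings, converted to winning percentage.
for games in (0, 5, 10, 15):
    edge = games / SEASON_GAMES
    print(games, {name: round(p, 4) for name, p in compare(edge).items()})
```

Under these assumptions, a team with no edge does better in a single home game than in a best-of-seven, and the order flips once the edge gets large enough. Exactly where it flips depends on the modeling choices above.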

But what about the incredible injustice that could occur when a really good team is forced to play some scrub?  E.g., Stark continues:

It’s a lock that one of these years, a 98-win wild-card team is going to lose to an 86-win wild-card team. And that will really, really seem like a miscarriage of baseball justice. You’ll need a Richter Scale handy to listen to talk radio if that happens.

But you know what the answer to those complaints will be?

“You should have finished first. Then you wouldn’t have gotten yourself into that mess.”

Stark posits a 12 game edge between two wild card teams, and indeed, this could lead to a slightly worse spot for the better team than a longer series.  12 games corresponds to a 7.4% generic advantage, which means a 7-game series would improve the team’s chances by about 1% (oh, the humanity!).  But the alternative almost certainly wouldn’t be seven games anyway, considering the first round of the playoffs is already only five.  At that length, the “miscarriage of baseball justice” would be about 0.1% (and vs. 3 games, sudden death is still preferable).
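A quick sketch of the games-to-percentage conversion, assuming that a "generic advantage" is simply the gap in the standings divided by the 162-game schedule. That assumption matches the figures quoted here, but it is not spelled out above, and the helper names are made up for this example.

```python
GAMES_PER_SEASON = 162  # MLB regular-season length


def games_to_pct(games_gap):
    """Gap in the standings expressed as a difference in winning percentage."""
    return games_gap / GAMES_PER_SEASON


def pct_to_games(pct_gap):
    """Inverse conversion: winning-percentage gap back to games."""
    return pct_gap * GAMES_PER_SEASON


print(f"12 games  -> {games_to_pct(12):.1%}")
print(f"5.75% gap -> {pct_to_games(0.0575):.1f} games")
```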

If anything, consider the implications of the massive gap on the left side of the graph above: If anyone is getting screwed by the new setup, it’s not the team with the better record, it’s a better team with a worse record, who won’t get as good a chance to demonstrate their actual superiority (though that team’s chances are still around 50% better than they would have been under the current system).  And those are the teams that really did “[get themselves] into that mess.”

Also, the scenario Stark posits is extremely unlikely: basically, the difference between 4th and 5th place is never 12 games.  For comparison, this season the difference between the best record in the NL and the Wild Card Loser was only 13 games, and in the AL it was only seven.  Over the past ten seasons, each Wild Card team and their 5th place finisher were separated by an average of 3.5 games (about 2.2%):

Note that no cases over this span even rise above the seven game “injustice line” of 5.75%, much less to the nightmare scenario of 7.4% that Stark invokes.  The standard deviation is about 1.5%, and that’s with the present imbalance of teams (note that the AL is pretty consistently higher than the NL, as should be expected); after realignment, this plot should tighten even further.

Indeed, considering the typically small margins between contenders in baseball, on average, this “insane” sudden death series may end up being the fairest round of the playoffs.

From the Live Blog: Baseball Haterade (With NFL Regression Tangent)

[For ease of reference, and with apologies to those of you who sat through or otherwise already read my NFL Live Blog from this Sunday, I’m once again splitting a few of the topics I covered out into individual posts. I’ve mostly made only cosmetic adjustments (additional comments are in brackets or at the end), so apologies if these posts aren’t quite as clean or detailed as a regular article. For flavor and context, I still recommend reading the whole thing.]

In support of last night’s screed [Why Baseball and I are, Like, Unmixy Things], especially the claim that “[MLB] games are either not important enough to be interesting (98% of the regular season), or too important to be meaningful (100% of the playoffs),” here’s a graph I made to illustrate just how silly the MLB Playoffs are:

Not counting home-field advantage (which is weakest in baseball anyway), this represents the approximate binomial probability [thank you, again, binom.dist() function] of the team with the best record in the league [technically, a team that has an actual expectation against an average opponent equal to best record] winning a series of length X against the playoff team with the worst record [again, technically, a team that has an actual expectation equal to worst record] going in.  The chances of winning each game are approximated by taking .5 + better win percentage – worse win percentage (note, of course, the NFL curve is exaggerated b/c of regression to the mean: a team that goes 14-2 won’t actually win 88% of their games against an average opponent. But they won’t regress nearly enough for their expectation to drop anywhere near MLB levels).  The brighter and bigger data points represent the actual first round series lengths in each sport.

By this approximation, the best team against the worst team in a 1st round series (using the latest season’s standings as the benchmark) in MLB would win about 64% of the time, while in the NBA they would win ~95% of the time.  To win 2/3 of the time, MLB would need to switch to a 9 game series instead of 5; and to have the best team win 75% of the time, they would need to shift to 21 (for the record, in order to match the NBA’s 95% mark, they would have to move to a 123 game series.  I know, this isn’t perfectly calculated, but it’s ballpark accurate).  Personally, I like the fact that the NBA and NFL postseasons generally feature the best teams winning.
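Here is a rough Python sketch of the same kind of calculation, ignoring home field as the graph does. The per-game probability of 0.575 is a hypothetical value, picked only because it lands near the 64% figure for a five-game series; it is not taken from the actual standings, and the function names are invented for the example.

```python
from math import comb


def best_of_n_win_prob(p, n):
    """Chance of winning a majority of n games (n odd) at per-game probability p."""
    k = n // 2 + 1
    return sum(comb(n, w) * p**w * (1 - p) ** (n - w) for w in range(k, n + 1))


def min_series_length(p, target, max_games=501):
    """Shortest odd series length at which the favorite wins at least `target`."""
    for n in range(1, max_games + 1, 2):
        if best_of_n_win_prob(p, n) >= target:
            return n
    return None  # target not reachable within max_games


# Hypothetical per-game edge, picked only because it lands near the 64% figure.
p = 0.575
print(f"best of 5: {best_of_n_win_prob(p, 5):.3f}")
print(f"games needed for 2/3: {min_series_length(p, 2 / 3)}")
```

The same `min_series_length` call can be pointed at other targets (0.75, 0.95) to see how quickly the required series length grows when the per-game edge is small.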

Moreover, it also makes upsets more meaningful: since the math is against “true” upsets happening often, an apparent upset can be significant: it often indicates—Bayes-wise (ok, if that’s not a word, it should be)—that the upsetting team was actually better.  In baseball, an upset pretty much just means that the coin came up tails.

Adam asks:

In the MLB vs. NFL vs. NBA Playoffs graph, the chances of best beating worst in first round for NFL for a 1 game series is almost 95%.

Looking at the odds to this week’s NFL games, the biggest favorite was GB versus Denver and they were only an 88% chance of winning by the money line (-700). Denver is almost certainly not a playoff team, so it’s tough to imagine an even more lopsided playoff matchup that could get to 95%. What am I missing?
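For reference, the 88% in Adam's question is roughly what a -700 money line implies. A minimal sketch of the standard conversion from American odds follows; the function name is made up, and the result still includes the bookmaker's margin, so the "true" implied chance would be a bit lower.

```python
def implied_prob(american_odds):
    """Convert an American money line to an implied win probability (vig included)."""
    if american_odds < 0:  # favorite: risk |odds| to win 100
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)  # underdog: risk 100 to win odds


print(f"-700 -> {implied_prob(-700):.1%}")
```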

I sort of addressed this in my longer explanation, but he’s not missing anything: the football effect is exaggerated. First off, to your specific concern, this early in the season there is even more uncertainty than in the playoffs.  But second, and more importantly, this method for approximating a win percentage is less accurate in the extremes, especially when factoring in regression to the mean (which is a huge factor given the NFL’s very small sample sizes).

In fact, the regression to the mean effect in the NFL is SO strong, that I think it helps explain why so many Bye-teams lose against the Wild Card game winners (without having to resort to “momentum” or psychological factors for our explanation).  By virtue of having the best records in the league, they are the most likely teams to have significant regression effects.  That is, their true strength is likely to be lower than what their records indicate.  Conversely, the teams that win in the bye week (against other playoff-level competition), are, from a Bayesian perspective, more likely to be better than their records indicated.  Think of it like this: there’s a range of possible true strength for each playoff team: when you match two of those teams against each other (in the WC round), the one who wins might have just gotten lucky, but that particular result is more likely to occur when the winning team’s actual strength was closer to the top of their range and/or their opponent’s was closer to their bottom.
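To make the "range of possible true strength" idea concrete, here is a toy Bayesian update in Python. Everything in it is an assumption for illustration: the three strength values, the equal prior weights, and the reuse of the rough .5 + difference approximation for a single game. It is a sketch of the reasoning, not a model of real NFL teams.

```python
from itertools import product

# Hypothetical "true strength" values (win probability vs. an average team),
# each equally likely before the game for both teams.
STRENGTHS = [0.55, 0.65, 0.75]


def posterior_means_given_a_wins(strengths):
    """Posterior mean strength of winner A and loser B after one game.

    Uses the rough approximation P(A beats B) = 0.5 + a - b. With a uniform
    prior, each (a, b) pair is weighted by that likelihood alone.
    """
    weights = {(a, b): 0.5 + a - b for a, b in product(strengths, repeat=2)}
    total = sum(weights.values())
    mean_a = sum(a * w for (a, _), w in weights.items()) / total
    mean_b = sum(b * w for (_, b), w in weights.items()) / total
    return mean_a, mean_b


prior_mean = sum(STRENGTHS) / len(STRENGTHS)
winner, loser = posterior_means_given_a_wins(STRENGTHS)
print(f"prior {prior_mean:.3f}, winner {winner:.3f}, loser {loser:.3f}")
```

Even after a single game, the winner's expected strength moves above the prior and the loser's moves below it, which is the direction of the effect described above.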

I’ve looked at this before, and it’s very easy to construct scenarios where WC teams with worse records have a higher projected strength than Bye team opponents with better records.  Factor in the fact that home field advantage actually decreases in the playoffs (it’s a common misconception that HFA is more important in the post-season: adjusting for team quality, it’s actually significantly reduced—which probably has something to do with the post-season ref shuffle: see section on ref bias in this post), and you have a recipe for frequent upsets.

In retrospect, I probably should have just left the NFL out of that graph.  Basketball makes for a much better comparison [both aesthetically and analytically]:

Live Blog Tomorrow, Plus: Why Baseball and I Are, Like, Unmixy Things

As a friend of mine put it, “Posts merely announcing something are pretty lame,” so before I announce tomorrow’s event, let me explain why I will NOT be live-blogging any of tomorrow’s baseball games:

It’s a constant source of guilt for me that I don’t like baseball more, but I can’t help it: To me, the games are either not important enough to be interesting (98% of the regular season), or too important to be meaningful (100% of the playoffs).  That said, I still dabble in baseball analysis myself, and I certainly understand the statistical appeal: the data-sets are huge, the variables are mostly independent, and—even in a post-Moneyball world—the screw-ups are ample.

I have many baseball-loving friends, and talking to them about this subject almost always goes like this exchange from Buffy the Vampire Slayer (with appropriate substitutions, and minus the sexual undertones):

[Me]: [Baseball?]

[Every Baseball Fan Ever]: Yeah.

[Me]: You seriously [watch baseball] for fun?

[Fan]: Well, not [minor leagues] or anything, but yeah. Don’t you?

[Me]: Actually, [no leagues] is more my specialty. I’m an avid [non-baseball watcher].

[Fan]: You’re kidding, right? I mean, you know how to [watch baseball]..

[Me]: Well, I took the class.. [Baseball] and [I] are, like .. un-mixy things.

[Fan]: It’s just because you haven’t had a good experience yet. You can have the best time [watching baseball].  It’s not about getting somewhere.  You have to take your time.  Forget about everything.  Just.. relax.  Let it wash over you.  The air..  motion..  Just, let it roll.

[Me]: We are talking about [baseball], right?

I also don’t entirely believe the hype about it being such an integral part of our national heritage. I think that perception today has been influenced heavily by nostalgia from influential people like George Will and Ken Burns, and I posted a graph somewhat supportive of that a while back:

image

Note also that the NFL’s relative popularity vs. MLB is nothing new.  Here is a year-by-year plot showing the World Series ratings vs. Super Bowl ratings:

Since its inception, the Super Bowl has beaten even the highest-rated World Series game every single year (recently, it has even been beating the entire series combined for total viewers).

OK, so with that out of the way: Once again, I’ll be live-blogging NFL Sunday from 10am until the final whistle tomorrow—now powered by NFL Sunday Ticket!  Here’s the explanation, and here’s last week’s end product.

C.R.E.A.M. (Or, “How to Win a Championship in Any Sport”)

Does cash rule everything in professional sports?  Obviously it keeps the lights on, and it keeps the best athletes in fine bling, but what effect does the root of all evil have on the competitive bottom line—i.e., winning championships?

For this article, let’s consider “economically predictable” a synonym for “Cash Rules”:  I will use extremely basic economic reasoning and just two variables—presence of a salary cap and presence of a salary max in a sport’s labor agreement—to establish, ex ante, which fiscal strategies we should expect to be the most successful.  For each of the 3 major sports, I will then suggest (somewhat) testable hypotheses, and attempt to examine them.  If the hypotheses are confirmed, then Method Man is probably right—dollar dollar bill, etc.

Conveniently, on a basic yes/no grid of these two variables, our 3 major sports in the U.S. fall into 3 different categories:

image

So before treating those as anything but arbitrary arrangements of 3 letters, we should consider the dynamics each of these rules creates independently.  If your sport has a team salary cap, getting “bang for your buck” and ferreting out bargains is probably more important to winning than overall spending power.  And if your sport has a low maximum individual salary, your ability to obtain the best possible players—in a market where everyone knows their value but must offer the same amount—will also be crucial.  Considering permutations of thriftiness and non-economic acquisition ability, we end up with a simple ex ante strategy matrix that looks like this:

image

These one-word commandments may seem overly simple—and I will try to resolve any ambiguity looking at the individual sports below—but they are only meant to describe the most basic and obvious economic incentives that salary caps and salary maximums should be expected to create in competitive environments.

Major League Baseball: Spend

Hypothesis:  With free-agency, salary arbitration, and virtually no payroll restrictions, there is no strategic downside to spending extra money.  Combined with huge economic disparities between organizations, this means that teams that spend the most will win the most.

Analysis:  Let’s start with the New York Yankees (shocker!), who have been dominating baseball since 1920, when they got Babe Ruth from the Red Sox for straight cash, homey.  Note that I take no position on whether the Yankees’ filthy lucre is destroying the sport of Baseball, etc.  Also, I know very little about the Yankees’ payroll history prior to 1988 (the earliest the USA Today database goes).  But I did come across this article from several years ago, which looks back as far as 1977.  For a few reasons, I think the author understates the case.  First, the Yankees’ low-salary period came at the tail end of a 12 year playoff drought (I don’t have the older data to manipulate, but I took the liberty to doodle on his original graph):

image

Note: Smiley-faces are Championship seasons.  The question mark is for the 1994 season, which had no playoffs.

Also, as a quirk that I’ve discussed previously, I think including the Yankees in the sample from which the standard deviation is drawn can be misleading: they have frequently been such a massive outlier that they’ve set their own curve.  Comparing the Yankees to the rest of the league, from last season back to 1988, looks like this:

image

Note: Green are Championship seasons.  Red are missed playoffs.

In 2005 the rest-of-league average payroll was ~$68 million, and the Yankees’ was ~$208 million (the rest-of-league standard deviation was $23m, but including the Yankees, it would jump to $34m).

While they failed to win the World Series in some of their most expensive seasons, don’t let that distract you:  money can’t guarantee a championship, but it definitely improves your chances.  The Yankees have won roughly a quarter of the championships over the last 20 years (which is, astonishingly, below their average since the Ruth deal).  But it’s not just them.  Many teams have dramatically increased their payrolls in order to compete for a World Series title—and succeeded! Over the past 22 years, the top 3 payrolls (per season) have won a majority of titles:

image

As they make up only 10% of the league, this means that the most spendy teams improved their title chances, on average, by almost a factor of 6.
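The arithmetic behind that factor, as a small sketch. The 57% is the share shown in the graph above; using 30 teams for the whole span is a simplifying assumption (MLB had fewer teams before 1998), and the function name is invented for the example.

```python
def title_lift(title_share, group_size, league_size):
    """How many times more often a group wins than an even split would predict."""
    expected_share = group_size / league_size
    return title_share / expected_share


# 57% is the share reported in this post; 30 teams is the current league size
# (an assumption for the whole span, since MLB had fewer teams before 1998).
print(f"{title_lift(0.57, 3, 30):.1f}x")
```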

National Basketball Association: Recruit (Or: “Press Your Bet”)

Hypothesis:  A fairly strict salary cap reins in spending, but equally strict salary regulations mean many teams will enjoy massive surplus value by paying super-elite players “only” the max.  Teams that acquire multiple such players will enjoy a major championship advantage.

Analysis: First, in case you were thinking that the 57% in the graph above might be caused by something other than fiscal policy, let’s quickly observe how the salary cap kills the “spend” strategy:

image

Payroll information from USA Today’s NBA and NFL Salary Databases (incidentally, this symmetry is being threatened, as the Lakers, Magic, and Mavericks have the top payrolls this season).

I will grant there is a certain apples-to-oranges comparison going on here: the NFL and NBA salary-cap rules are complex and allow for many distortions.  In the NFL teams can “clump” their payroll by using pro-rated signing bonuses (essentially sacrificing future opportunities to exceed the cap in the present), and in the NBA giant contracts are frequently moved to bad teams that want to rebuild, etc.  But still: 5%.  Below expectation if championships were handed out randomly.
And basketball championships are NOT handed out randomly.  My hypothesis predicts that championship success will be determined by who gets the most windfall value from their star player(s).  Fifteen of the last 20 NBA championships have been won by Kobe Bryant, Tim Duncan, or Michael Jordan.  Clearly star-power matters in the NBA, but what role does salary play in this?

Prior to 1999, the NBA had no salary maximum, though salaries were regulated and limited in a variety of ways.  Teams had extreme advantages signing their own players (such as Bird rights), but lack of competition in the salary market mostly kept payrolls manageable.  Michael Jordan famously signed a lengthy $25 million contract extension basically just before star player salaries exploded, leaving the Bulls with the best player in the game for a song (note: Hakeem Olajuwon’s $55 million payday came after he won 2 championships as well).  By the time the Bulls were forced to pay Jordan his true value, they had already won 4 championships and built a team around him that included 2 other All-NBA caliber players (including one who also provided extreme surplus value).  Perhaps not coincidentally, year 6 in the graph below is their record-setting 72-10 season:
image

Note: Michael Jordan’s salary info found here.  Historical NBA salary cap found here.

The star player salary situation caught the NBA off-guard.  Here’s a story from Time magazine in 1996 that quotes league officials and executives:

“It’s a dramatic, strategic judgment by a few teams,” says N.B.A. deputy commissioner Russ Granik. .
Says one N.B.A. executive: “They’re going to end up with two players making about two-thirds of the salary cap, and another pair will make about 20%. So that means the rest of the players will be minimum-salary players that you just sign because no one else wants them.” . . .
Granik frets that the new salary structure will erode morale. “If it becomes something that was done across the league, I don’t think it would be good for the sport,” he says.

What these NBA insiders are explaining is basic economics:  Surprise!  Paying better players big money means less money for the other guys.  Among other factors, this led to 2 lockouts and the prototype that would eventually lead to the current CBA (for more information than you could ever want about the NBA salary cap, here is an amazing FAQ).

The fact that the best players in the NBA are now being underpaid relative to their value is certain.  As a back of the envelope calculation:  There are 5 players each year that are All-NBA 1st team, while 30+ players each season are paid roughly the maximum.  So how valuable are All-NBA 1st team players compared to the rest?  Let’s start with: How likely is an NBA team to win a championship without one?

image

In the past 20 seasons, only the 2003-2004 Detroit Pistons won the prize without a player who was a 1st-Team All-NBAer in their championship year.
To some extent, these findings are hard to apply strategically.  All but those same Pistons had at least one home-grown All-NBA (1st-3rd team) talent—to win, you basically need the good fortune to catch a superstar in the draft.  If there is an actionable take-home, however, it is that most (12/20) championship teams have also included a second All-NBA talent acquired through trade or free agency: the Rockets won after adding Clyde Drexler, the second Bulls 3-peat added Dennis Rodman (All-NBA 3rd team with both the Pistons and the Spurs), the Lakers and Heat won after adding Shaq, the Celtics won with Kevin Garnett, and the Lakers won again after adding Pau Gasol.

Each of these players was/is worth more than their market value, in most cases as a result of the league’s maximum salary constraints.  Also, in most of these cases, the value of the addition was well-known to the league, but the inability of teams to outbid each other meant that basketball money was not the determining factor in the players choosing their respective teams.  My “Recruit” strategy anticipated this – though it perhaps understates the relative importance of your best player being the very best.  This is more a failure of the “recruit” label than of the ex ante economic intuition, the whole point of which was that cap+max –> massive importance of star players.

National Football League: Economize (Or: “WWBBD?”)

Hypothesis:  The NFL’s strict salary cap and lack of contract restrictions should nullify both spending and recruiting strategies.  With elite players paid closer to what they are worth, surplus value is harder to identify.  We should expect the most successful franchises to demonstrate both cunning and wise fiscal policy.

Analysis: Having a cap and no max salaries is the most economically efficient fiscal design of any of the 3 major sports.  Thus, we should expect massively dominating strategies to be much harder to identify.  Indeed, the dominant strategies in the other sports are seemingly ineffective in the NFL: as demonstrated above, there seems to be little or no advantage to spending the most, and the abundant variance in year-to-year team success in the NFL would seem to rule out the kind of individual dominance seen in basketball.

Thus, to investigate whether cunning and fiscal sense are predominant factors, we should imagine what kinds of decisions a coach or GM would make if his primary qualities were cunning and fiscal sensibility.  In that spirit, I’ve come up with a short list of 5 strategies that I think are more or less sound, and that are based largely on classically “economic” considerations:

1.  Beg, borrow, or steal yourself a great quarterback:
Superstar quarterbacks are probably underpaid—even with their monster contracts—thus making them a good potential source for surplus value.  Compare this:

Note: WPA (wins added) stats from here.

With this:

The obvious caveat here is that the entanglement question is still empirically open:  How much do good QBs make their teams win vs. how much do winning teams make their QBs look good?  But really quarterbacks only need to be responsible for a fraction of the wins reflected in their stats to be worth more than what they are being paid. (An interesting converse, however, is this: the fact that great QBs don’t win championships with the same regularity as, say, great NBA players, suggests that a fairly large portion of the “value” reflected by their statistics is not their responsibility).

2. Plug your holes with the veteran free agents that nobody wants, not the ones that everybody wants:
If a popular free agent intends to go to the team that offers him the best salary, his market will act substantially like a “common value” auction.  Thus, beware the Winner’s Curse. In simple terms: If 1) a player’s value is unknown, 2) each team offers what they think the player is worth, and 3) each team is equally likely to be right; then: 1) The player’s expected value will correlate with the average bid, and 2) the “winning” bid probably overpaid.

Moreover, even if the winner’s bid is exactly right, that just means they will have successfully gained nothing from the transaction.  Assuming equivalent payrolls, the team with the most value (greatest chance of winning the championship) won’t be the one that pays the most correct amount for its players, it will—necessarily—be the one that pays the least per unit of value.  To accomplish this goal, you should avoid common value auctions as much as possible!  In free agency, look for the players with very small and inefficient markets (for which #3 above is least likely to be true), and then pay them as little as you can get away with.
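The Winner's Curse is easy to see in a quick simulation. The sketch below assumes a simple setup that goes beyond what is stated above: 8 hypothetical teams, a player truly worth 10 (in whatever units you like), and each team's estimate off by independent, unbiased, normally distributed noise. None of these numbers come from real contracts.

```python
import random


def simulate_auction(n_teams=8, true_value=10.0, noise_sd=2.0, trials=10_000, seed=0):
    """Each team bids its own noisy but unbiased estimate of the player's value."""
    rng = random.Random(seed)
    avg_bid_sum = win_bid_sum = overpaid = 0
    for _ in range(trials):
        bids = [rng.gauss(true_value, noise_sd) for _ in range(n_teams)]
        avg_bid_sum += sum(bids) / n_teams
        win_bid_sum += max(bids)
        overpaid += max(bids) > true_value
    return avg_bid_sum / trials, win_bid_sum / trials, overpaid / trials


avg_bid, winning_bid, overpay_rate = simulate_auction()
print(f"average bid {avg_bid:.2f}, winning bid {winning_bid:.2f}, "
      f"winner overpaid {overpay_rate:.0%} of the time")
```

The average bid tracks the true value closely, while the highest bid sits well above it in nearly every trial. Shrinking `n_teams` narrows that gap, which is the case for shopping in small markets.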

3. Treat your beloved veterans with cold indifference.
If a player is beloved, they will expect to be paid.  If they are not especially valuable, they will expect to be paid anyway, and if they are valuable, they are unlikely to settle for less than they are worth.  If winning is more important to you than short-term fan approval, you should be both willing and prepared to let your most beloved players go the moment they are no longer a good bargain.

4. Stock up on mid-round draft picks.
Given the high cost of signing 1st round draft picks, 2nd round draft picks may actually be more valuable.  Here is the crucial graph from the Massey-Thaler study of draft pick value (via Advanced NFL Stats):

image
The implications of this outcome are severe.  All else being equal, if someone offers you an early 2nd round draft pick for your early 1st round draft pick, they should be demanding compensation from you (of course, marginally valuable players have diminishing marginal value, because you can only have/play so many of them at a time).

5. When the price is right: Gamble.

This rule applies to fiscal decisions, just as it does to in-game ones.  NFL teams are notoriously risk-averse in a number of areas: they are afraid that someone after one down season is washed up, or that an outspoken player will ‘disrupt’ the locker room, or that a draft pick might have ‘character issues’.  These sorts of questions regularly lead to lengthy draft slides and dried-up free agent markets.  And teams are right to be concerned: these are valid possibilities that increase uncertainty.  Of course, there are other possibilities. Your free agent target simply may not be as good as you hope they are, or your draft pick may simply bust out.  Compare to late-game 4th-down decisions: Sometimes going for it on 4th down will cause you to lose immediately and face a maelstrom of criticism from fans and press, where punting or kicking may quietly lead to losing more often.  Similarly, when a team takes a high-profile personnel gamble and it fails, they may face a maelstrom of criticism from fans and press, where the less controversial choice might quietly lead to more failure.

The economizing strategy here is to favor risks when they are low cost but have high upsides.  In other words, don’t risk a huge chunk of your cap space on an uncertain free agent prospect, risk a tiny chunk of your cap space on an even more uncertain prospect that could work out like gangbusters.

Evaluation:

Now, if only there were a team and coach dedicated to these principles—or at least, for contrapositive’s sake, a team that seemed to embrace the opposite.

Oh wait, we have both!  In the last decade, Bill Belichick and the New England Patriots have practically embodied these principles, and in the process they’ve won 3 championships, had another 16-0/18-1 season, have set the overall NFL win-streak records, and are presently the #1 overall seed in this year’s playoffs. OTOH, the Redskins have practically embodied the opposite, and they have… um… not.
Note that the Patriots’ success has come despite a league fiscal system that allows teams to “load up” on individual seasons, distributing the cost onto future years (which, again, helps explain the extreme regression effect present in the NFL).  Considering the long odds of winning a Super Bowl—even with a solid contender—this seems like an unwise long-run strategy, and the most successful team of this era has cleverly taken the long view throughout.

Conclusions

The evidence in MLB and in the NBA is ironclad: Basic economic reasoning is extremely probative when predicting the underlying dynamics behind winning titles.  Over the last 20 years of pro baseball, the top 3 spenders in the league each year have won 57% of the championships.  Over a similar period in basketball, the 5 (or fewer) teams with 1st-Team All-NBA players have won 95%.

In the NFL, the evidence is more nuance and anecdote than absolute proof.  However, our ex ante musing does successfully predict that neither excessive spending nor recruiting star players at any cost (excepting possibly quarterbacks) is a dominant strategy.

On balance, I would say that the C.R.E.A.M. hypothesis is substantially more supported by the data than I would have guessed.

Why Not Balls and Strikes?

To expand a tiny bit on something I tweeted the other day, I swear there’s a rule (perhaps part of the standard licensing agreement with MLB), that any time anyone on television mentions the idea of expanding instant replay (or “use of technology”) in baseball, they are required to qualify their statement by assuring the audience that they do not mean for balls and strikes.  But why not?  If any reason is given, it is usually some variation of the following: 1) Balls and strikes are inherently too subjective, 2) It would slow the game down too much, or 3) The role of the umpire is too important.  None of these seems persuasive to me, at least when applied to the strike zone’s horizontal axis — i.e., the plate:

1. The plate is not subjective.

In little league, we were taught that the strike zone was “elbows to knees and over the plate,” and surprisingly enough, the official major league baseball definition is not that much more complicated (from the Official Baseball Rules 2010, page 22):

A STRIKE is a legal pitch when so called by the umpire, which . . . is not struck at, if any part of the ball passes through any part of the strike zone. . . .
The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap.  The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.

I can understand several reasons why there may be need for a human element in judging the vertical axis of the zone, such as to avoid gamesmanship like crouching or altering your stance while the ball is in the air, or to make reasonable exceptions in cases where someone has kneecaps on their stomach, etc.  But there is nothing subjective about “any part of the ball passes through any part of . . . the area over home plate.”
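To show how mechanical the horizontal check is, here is a toy version in Python. It assumes a 17-inch plate and a ball roughly 2.9 inches across, and it simplifies the plate to a 17-inch-wide strip, ignoring the pentagon shape and the ball's path over it. The input (the ball's horizontal offset from the center of the plate) is hypothetical, as is the function name.

```python
PLATE_WIDTH_IN = 17.0    # width of home plate, in inches
BALL_DIAMETER_IN = 2.9   # approximate diameter of a regulation baseball


def crosses_plate(ball_center_x_in):
    """True if any part of the ball is over the plate's width.

    ball_center_x_in: horizontal offset of the ball's center from the middle
    of the plate, in inches, as it crosses the front edge (hypothetical input).
    """
    reach = PLATE_WIDTH_IN / 2 + BALL_DIAMETER_IN / 2
    return abs(ball_center_x_in) <= reach


for x in (0.0, 9.5, 10.5):
    print(x, crosses_plate(x))
```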

2. The plate is not hard to check.

I mean, if they can photograph lightning:

lightning

They should be able to tell whether a solid ball passes over a small irregular pentagon.  Yes, replay takes a while when you have to look at 15 different angles to find the right one, or when you have to cognitively construct a 3-dimensional image from several 2-dimensional videos.  It even takes a little while when you have to monitor a long perimeter to see if oddly shaped objects have crossed it (like tennis balls on impact or players’ shoes in basketball).  But checking whether a baseball crossed the plate takes no time at all: they already do it virtually without delay on television, and that process could be sped up at virtually no cost with one dedicated camera: let it take a long-exposure picture of the plate for each pitch, then instantly beam it to an iPhone strapped to the umpire’s wrist.  He can check it in the course of whatever his natural motion for signaling a ball or strike would have been, and he’ll probably save time by not having players and managers up in his face every other pitch.

3. The plate is a waste of the umpire’s time, but not ours.

Umpires are great, they make entertaining gesticulating motions, and maybe in some extremely slight sense, people actually do go to the game to boo and hiss at them — I’m not suggesting MLB puts HAL back there.  But as much as people love officiating controversies generally, umpires are so inconsistent and error-prone about the strike zone (which, you know, only matters like 300 times per game) that fans are too jaded to even care.  There are enough actually subjective calls for umpires to blow, they don’t need to be spending their time and attention on something so objective, so easy to check, and so important.

(Photo Credit: “Lightning on the Columbia River” by phatman.)

On Nate Silver on ESPN Umpire Study

I was just watching the Phillies v. Mets game on TV, and the announcers were discussing this Outside the Lines study about MLB umpires, which found that 1 in 5 “close” calls were missed over their 184 game sample.  Interesting, right?

So I opened up my browser to find the details, and before even getting to ESPN, I came across this criticism of the ESPN story by Nate Silver of FiveThirtyEight, which knocks his sometimes employer for framing the story on “close calls,” which he sees as an arbitrary term, rather than something more objective like “calls per game.”  Nate is an excellent quantitative analyst, and I love when he ventures from the murky world of politics and polling to write about sports.  But, while the ESPN study is far from perfect, I think his criticism here is somewhat ill-conceived.

The main problem I have with Nate’s analysis is that the study’s definition of “close call” is not as “completely arbitrary” as Nate suggests.  Conversely, Nate’s suggested alternative metric – blown calls per game – is much more arbitrary than he seems to think.

First, in the main text of the ESPN.com article, the authors clearly state that the standard for “close” that they use is: “close enough to require replay review to determine whether an umpire had made the right call.”  Then in the 2nd sidebar, again, they explicitly define “close calls” as  “those for which instant replay was necessary to make a determination.”  That may sound somewhat arbitrary in the abstract, but let’s think for a moment about the context of this story: Given the number of high-profile blown calls this season, there are two questions on everyone’s mind: “Are these umps blind?” and “Should baseball have more instant replay?” Indeed, this article mentions “replay” 24 times.  So let me be explicit where ESPN is implicit:  This study is about instant replay.  They are trying to assess how many calls per game could use instant replay (their estimate: 1.3), and how many of those reviews would lead to calls being overturned (their estimate: 20%).
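To make the arithmetic behind those two estimates concrete, here is a minimal Python sketch. It uses only the rounded figures quoted above (1.3 close calls per game, 20% of them missed); the function names are hypothetical and are not taken from the ESPN study.

```python
# Rounded figures quoted from the ESPN study as described in this post.
CLOSE_CALLS_PER_GAME = 1.3       # calls per game that "could use instant replay"
MISS_RATE_ON_CLOSE_CALLS = 0.20  # share of those calls that were missed


def blown_calls_per_game(close_per_game: float, miss_rate: float) -> float:
    """Expected number of missed close calls in a single game."""
    return close_per_game * miss_rate


def games_per_blown_call(close_per_game: float, miss_rate: float) -> float:
    """Average number of games between missed close calls."""
    return 1.0 / blown_calls_per_game(close_per_game, miss_rate)


if __name__ == "__main__":
    per_game = blown_calls_per_game(CLOSE_CALLS_PER_GAME, MISS_RATE_ON_CLOSE_CALLS)
    gap = games_per_blown_call(CLOSE_CALLS_PER_GAME, MISS_RATE_ON_CLOSE_CALLS)
    print(f"Blown close calls per game: {per_game:.2f}")
    print(f"Games per blown close call: {gap:.1f}")
```

With these inputs the product is about a quarter of a blown call per game, or roughly one every four games, which is the same information as the "1 in 5 close calls" framing expressed per game.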

Second, what’s with a quantitative (sometimes) sports analyst suddenly being enamored with per-game rather than rate-based stats?  Sure, one blown call every 4 games sounds low, but without some kind of assessment of how many blown-call opportunities there are, how would we know?  In his post, Nate mentions that NBA insiders tell him that there were “15 or 20 ‘questionable’ calls” per game in their sport.  Assuming ‘questionable’ means ‘incorrect,’ does that mean NBA referees are 60 to 80 times worse than MLB umpires?  Certainly not.  NBA refs may or may not be terrible, but they have to make a double- or even triple-digit number of difficult calls every night.  If you used replay to assess every close call in an NBA game, it would never end.  Absent some massive longitudinal study comparing how often officials miss particular types of calls from year to year or era to era, there is going to be a subjective component when evaluating officiating.  Measuring by performance in “close” situations is about as good a method as any.
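The "60 to 80 times" figure, and why it misleads, can be sketched in a few lines of Python. The per-game numbers come from the text above; the NBA close-call volume of 100 per game is a purely hypothetical placeholder chosen for illustration (the post only says "double- or even triple-digit"), not a measured value.

```python
# Per-game figures from the text above.
MLB_BLOWN_PER_GAME = 1 / 4              # "one blown call every 4 games"
NBA_QUESTIONABLE_PER_GAME = (15, 20)    # range quoted from NBA insiders

# HYPOTHETICAL: an assumed count of difficult calls per NBA game,
# used only to illustrate the rate-based comparison.
ASSUMED_NBA_CLOSE_CALLS_PER_GAME = 100


def per_game_ratio(nba_per_game: float, mlb_per_game: float) -> float:
    """How many times more bad calls per game, ignoring opportunities."""
    return nba_per_game / mlb_per_game


def miss_rate(missed_per_game: float, close_calls_per_game: float) -> float:
    """Share of close calls that were missed (a rate, not a count)."""
    return missed_per_game / close_calls_per_game


if __name__ == "__main__":
    for nba in NBA_QUESTIONABLE_PER_GAME:
        ratio = per_game_ratio(nba, MLB_BLOWN_PER_GAME)
        rate = miss_rate(nba, ASSUMED_NBA_CLOSE_CALLS_PER_GAME)
        print(f"{nba} per game: {ratio:.0f}x the MLB count, "
              f"{rate:.0%} miss rate under the assumed volume")
```

Under that assumed volume, the NBA miss rate on close calls would land in the same neighborhood as the 20% ESPN found for MLB, even though the raw per-game counts differ by a factor of 60 to 80. Change the assumption and the rate changes with it, which is the point: a per-game count says little without the number of opportunities.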

Which is not to say that the ESPN metric couldn’t be improved:  I would certainly like to see their guidelines for figuring out whether a call is review-worthy or not.  In a perfect world, they might even break down the sets of calls by various proposals for replay implementation.  As a journalistic matter, maybe they should have spent more time discussing their finding that only 1.3 calls per game are “close,” as that seems like an important story in its own right.  On balance, however, when it comes to the two main issues that this study pertains to (the potential impact of further instant replay, and the relative quality of baseball officiating), I think ESPN’s analysis is far more probative than Nate’s.