I was just watching the Phillies v. Mets game on TV, and the announcers were discussing this Outside the Lines study about MLB umpires, which found that 1 in 5 “close” calls were missed over their 184-game sample. Interesting, right?
So I opened up my browser to find the details, and before even getting to ESPN, I came across this criticism of the ESPN story by Nate Silver of FiveThirtyEight, which knocks his sometime employer for framing the story around “close calls,” which he sees as an arbitrary term, rather than something more objective like “calls per game.” Nate is an excellent quantitative analyst, and I love when he ventures from the murky world of politics and polling to write about sports. But, while the ESPN study is far from perfect, I think his criticism here is somewhat ill-conceived.
The main problem I have with Nate’s analysis is that the study’s definition of “close call” is not as “completely arbitrary” as Nate suggests. Conversely, Nate’s suggested alternative metric – blown calls per game – is much more arbitrary than he seems to think.
First, in the main text of the ESPN.com article, the authors clearly state that the standard for “close” that they use is: “close enough to require replay review to determine whether an umpire had made the right call.” Then in the 2nd sidebar, again, they explicitly define “close calls” as “those for which instant replay was necessary to make a determination.” That may sound somewhat arbitrary in the abstract, but let’s think for a moment about the context of this story: Given the number of high-profile blown calls this season, there are two questions on everyone’s mind: “Are these umps blind?” and “Should baseball have more instant replay?” Indeed, this article mentions “replay” 24 times. So let me be explicit where ESPN is implicit: This study is about instant replay. They are trying to assess how many calls per game could use instant replay (their estimate: 1.3), and how many of those reviews would lead to calls being overturned (their estimate: 20%).
Second, what’s with a quantitative (sometimes) sports analyst suddenly being enamored with per-game rather than rate-based stats? Sure, one blown call every 4 games sounds low, but without some kind of assessment of how many blown call opportunities there are, how would we know? In his post, Nate mentions that NBA insiders tell him that there were “15 or 20 ‘questionable’ calls” per game in their sport. Assuming ‘questionable’ means ‘incorrect,’ does that mean NBA referees are 60 to 80 times worse than MLB umpires? Certainly not. NBA refs may or may not be terrible, but they have to make a double- or even triple-digit number of difficult calls every night. If you used replay to assess every close call in an NBA game, it would never end. Absent some massive longitudinal study comparing how often officials miss particular types of calls from year to year or era to era, there is going to be a subjective component when evaluating officiating. Measuring by performance in “close” situations is about as good a method as any.
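For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. The inputs are the figures quoted above (1.3 close calls per game, a 20% miss rate on those, and the 15 to 20 "questionable" NBA calls per game that Nate cites); the function and variable names are just my own, not anything from ESPN's study.

```python
# Figures quoted in the post; the names below are illustrative only.
MLB_CLOSE_CALLS_PER_GAME = 1.3        # ESPN's estimate of review-worthy calls per game
MLB_MISS_RATE_ON_CLOSE = 0.20         # ESPN's estimate: 1 in 5 close calls missed
NBA_QUESTIONABLE_PER_GAME = (15, 20)  # range Nate reports from NBA insiders


def blown_calls_per_game(close_calls_per_game, miss_rate):
    """Per-game count of missed calls implied by a miss rate on close calls."""
    return close_calls_per_game * miss_rate


def games_per_blown_call(close_calls_per_game, miss_rate):
    """How many games pass, on average, between missed calls."""
    return 1.0 / blown_calls_per_game(close_calls_per_game, miss_rate)


mlb_blown = blown_calls_per_game(MLB_CLOSE_CALLS_PER_GAME, MLB_MISS_RATE_ON_CLOSE)
mlb_gap = games_per_blown_call(MLB_CLOSE_CALLS_PER_GAME, MLB_MISS_RATE_ON_CLOSE)

print(f"MLB blown calls per game: {mlb_blown:.2f}")
print(f"Games per MLB blown call: {mlb_gap:.1f}")

# Naive per-game comparison, assuming (as above) that 'questionable' means 'incorrect'.
for nba_per_game in NBA_QUESTIONABLE_PER_GAME:
    ratio = nba_per_game / mlb_blown
    print(f"NBA at {nba_per_game} per game vs. MLB: about {ratio:.0f}x")
```

This works out to about 0.26 blown calls per game, or roughly one every four games. Using the unrounded figure, the naive NBA-to-MLB ratio comes out nearer 58 to 77 than 60 to 80, but the point is the same: a per-game count with no denominator of opportunities tells you very little about how good the officials are.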
Which is not to say that the ESPN metric couldn’t be improved: I would certainly like to see their guidelines for figuring out whether a call is review-worthy or not. In a perfect world, they might even break down the sets of calls by various proposals for replay implementation. As a journalistic matter, maybe they should have spent more time discussing their finding that only 1.3 calls per game are “close,” as that seems like an important story in its own right. On balance, however, when it comes to the two main issues that this study pertains to (the potential impact of further instant replay, and the relative quality of baseball officiating), I think ESPN’s analysis is far more probative than Nate’s.