umpires » Skeptical Sports Analysis

Why Not Balls and Strikes?

To expand a tiny bit on something I tweeted the other day, I swear there’s a rule (perhaps part of the standard licensing agreement with MLB) that any time anyone on television mentions the idea of expanding instant replay (or “use of technology”) in baseball, they are required to qualify their statement by assuring the audience that they do not mean for balls and strikes. But why not? If any reason is given, it is usually some variation of the following: 1) balls and strikes are inherently too subjective, 2) it would slow the game down too much, or 3) the role of the umpire is too important. None of these seems persuasive to me, at least when applied to the strike zone’s horizontal axis (i.e., the plate):

1. The plate is not subjective.

In Little League, we were taught that the strike zone was “elbows to knees and over the plate,” and surprisingly enough, the official Major League Baseball definition is not that much more complicated (from the Official Baseball Rules 2010, page 22):

A STRIKE is a legal pitch when so called by the umpire, which . . . is not struck at, if any part of the ball passes through any part of the strike zone. . . .
The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.

I can understand several reasons why there may be need for a human element in judging the vertical axis of the zone, such as to avoid gamesmanship like crouching or altering your stance while the ball is in the air, or to make reasonable exceptions in cases where someone has kneecaps on their stomach, etc. But there is nothing subjective about “any part of the ball passes through any part of . . . the area over home plate.”
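To make the "nothing subjective" point concrete, here is a minimal sketch of the horizontal test. It rests on assumptions that are mine, not the rulebook's: it collapses the pentagon to its 17-inch width, it uses an approximate ball diameter of 2.9 inches (a regulation ball is 9 to 9.25 inches around), and the function name and its input (the ball center's sideways distance from the plate's center line) are hypothetical.

```python
PLATE_WIDTH_IN = 17.0     # width of home plate, in inches
BALL_DIAMETER_IN = 2.9    # approximate; regulation circumference is 9 to 9.25 inches


def any_part_over_plate(center_offset_in: float) -> bool:
    """Return True if any part of the ball overlaps the plate's width.

    center_offset_in is a hypothetical measurement: the horizontal distance,
    in inches, from the plate's center line to the ball's center as the ball
    crosses the plate. Negative values are one side, positive the other.
    """
    half_plate = PLATE_WIDTH_IN / 2
    ball_radius = BALL_DIAMETER_IN / 2
    # The ball only has to clip the edge, so the zone is effectively
    # widened by one ball radius on each side.
    return abs(center_offset_in) < half_plate + ball_radius


if __name__ == "__main__":
    for offset in (0.0, 8.5, 9.5, 10.0):
        print(offset, any_part_over_plate(offset))
```

Whatever measurement system supplies the offset, the comparison itself is a single inequality, with no judgment call in it.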

2. The plate is not hard to check.

I mean, if they can photograph lightning:

They should be able to tell whether a solid ball passes over a small irregular pentagon. Yes, replay takes a while when you have to look at 15 different angles to find the right one, or when you have to cognitively construct a 3-dimensional image from several 2-dimensional videos. It even takes a little while when you have to monitor a long perimeter to see if oddly shaped objects have crossed it (like tennis balls on impact or players’ shoes in basketball). But checking whether a baseball crossed the plate takes no time at all: they already do it virtually without delay on television, and that process could be sped up at virtually no cost with one dedicated camera: let it take a long-exposure picture of the plate for each pitch, then instantly beam it to an iPhone strapped to the umpire’s wrist. He can check it in the course of whatever his natural motion for signaling a ball or strike would have been, and he’ll probably save time by not having players and managers up in his face every other pitch.

3. The plate is a waste of the umpire’s time, but not ours.

Umpires are great, they make entertaining gesticulating motions, and maybe in some extremely slight sense, people actually do go to the game to boo and hiss at them. I’m not suggesting MLB put HAL back there. But as much as people love officiating controversies generally, umpires are so inconsistent and error-prone about the strike zone (which, you know, only matters like 300 times per game) that fans are too jaded to even care. There are enough actually subjective calls for umpires to blow; they don’t need to be spending their time and attention on something so objective, so easy to check, and so important.

(Photo Credit: “Lightning on the Columbia River” by phatman.)

On Nate Silver on ESPN Umpire Study

I was just watching the Phillies v. Mets game on TV, and the announcers were discussing this Outside the Lines study about MLB umpires, which found that 1 in 5 “close” calls were missed over their 184-game sample. Interesting, right?

So I opened up my browser to find the details, and before even getting to ESPN, I came across this criticism of the ESPN story by Nate Silver of FiveThirtyEight, which knocks his sometimes employer for framing the story on “close calls,” which he sees as an arbitrary term, rather than something more objective like “calls per game.” Nate is an excellent quantitative analyst, and I love when he ventures from the murky world of politics and polling to write about sports. But, while the ESPN study is far from perfect, I think his criticism here is somewhat ill-conceived.

The main problem I have with Nate’s analysis is that the study’s definition of “close call” is not as “completely arbitrary” as Nate suggests. Conversely, Nate’s suggested alternative metric – blown calls per game – is much more arbitrary than he seems to think.

First, in the main text of the ESPN.com article, the authors clearly state that the standard for “close” that they use is: “close enough to require replay review to determine whether an umpire had made the right call.” Then in the 2nd sidebar, again, they explicitly define “close calls” as “those for which instant replay was necessary to make a determination.” That may sound somewhat arbitrary in the abstract, but let’s think for a moment about the context of this story: Given the number of high-profile blown calls this season, there are two questions on everyone’s mind: “Are these umps blind?” and “Should baseball have more instant replay?” Indeed, this article mentions “replay” 24 times. So let me be explicit where ESPN is implicit: This study is about instant replay. They are trying to assess how many calls per game could use instant replay (their estimate: 1.3), and how many of those reviews would lead to calls being overturned (their estimate: 20%).

Second, what’s with a quantitative (sometimes) sports analyst suddenly being enamored with per-game rather than rate-based stats? Sure, one blown call every 4 games sounds low, but without some kind of assessment of how many blown-call opportunities there are, how would we know? In his post, Nate mentions that NBA insiders tell him that there were “15 or 20 ‘questionable’ calls” per game in their sport. Assuming ‘questionable’ means ‘incorrect,’ does that mean NBA referees are 60 to 80 times worse than MLB umpires? Certainly not. NBA refs may or may not be terrible, but the difficult calls they have to make every night number in the double or even triple digits. If you used replay to assess every close call in an NBA game, it would never end. Absent some massive longitudinal study comparing how often officials miss particular types of calls from year to year or era to era, there is going to be a subjective component when evaluating officiating. Measuring by performance in “close” situations is about as good a method as any.
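For anyone who wants to check the arithmetic behind those figures, here is a rough sketch. The inputs are the numbers quoted above (1.3 close calls per game, 20% of them missed, and 15 to 20 "questionable" NBA calls per game); the function names are mine, and treating "questionable" as "incorrect" is the same deliberately generous assumption made in the text.

```python
def missed_per_game(close_calls_per_game: float, missed_share: float) -> float:
    """Missed calls per game = close calls per game x share of those missed."""
    return close_calls_per_game * missed_share


def times_as_many(count_a: float, count_b: float) -> float:
    """How many times larger one per-game count is than another."""
    return count_a / count_b


# ESPN's estimates as quoted above.
mlb_missed = missed_per_game(1.3, 0.20)   # roughly 0.26 per game
games_per_miss = 1 / mlb_missed           # a little under 4 games per miss

# Rounded to "one blown call every 4 games", as in the text.
mlb_rounded = 1 / 4

# NBA range, treating every 'questionable' call as incorrect.
nba_low, nba_high = 15, 20
ratio_low = times_as_many(nba_low, mlb_rounded)
ratio_high = times_as_many(nba_high, mlb_rounded)

if __name__ == "__main__":
    print(f"MLB missed calls per game: {mlb_missed:.2f}")
    print(f"Games per missed call: {games_per_miss:.2f}")
    print(f"NBA vs. MLB raw counts: {ratio_low:.0f}x to {ratio_high:.0f}x")
```

The ratio compares raw per-game counts only. Neither count is divided by the number of calls the officials actually had to make, and that denominator is exactly what a per-game stat leaves out.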

Which is not to say that the ESPN metric couldn’t be improved: I would certainly like to see their guidelines for figuring out whether a call is review-worthy or not. In a perfect world, they might even break down the sets of calls by various proposals for replay implementation. As a journalistic matter, maybe they should have spent more time discussing their finding that only 1.3 calls per game are “close,” as that seems like an important story in its own right. On balance, however, when it comes to the two main issues that this study pertains to (the potential impact of further instant replay, and the relative quality of baseball officiating), I think ESPN’s analysis is far more probative than Nate’s.