
Sports Geek Mecca: Recap and Thoughts, Part 2

This is part 2 of my “recap” of the Sloan Sports Analytics Conference that I attended in March (part 1 is here), mostly covering Day 2 of the event, but also featuring my petty way-too-long rant about Bill James (which I’ve moved to the end).

Day Two

First I attended the Football Analytics panel despite finding it disappointing last year, and, alas, it wasn’t any better. Eric Mangini must be the only former NFL coach willing to attend, b/c they keep bringing him back:

Just sat down for Football Analytics and I’m already bleh. In some ways, Mangini is worse than Brian Burke, b/c he acts like he cares. #SSAC

— Benjamin Morris (@skepticalsports) March 3, 2012

Overall, I spent more time in day 2 going to niche panels, research paper presentations and talking to people.

The last, in particular, was great. For example, I had a fun conversation with Henry Abbott about Kobe Bryant’s lack of “clutch.” This is one of Abbott’s pet issues, and I admit he makes a good case, particularly that the Lakers are net losers in “clutch” situations (yes, relative to other teams), even over the periods where they have been dominant otherwise.

Kobe is kind of a pivotal case in analytics, I think. First, I’m a big believer in “Count the Rings, Son” analysis: That is, leading a team to multiple championships is really hard, and only really great players do it. I also think he stands at a kind of nexus, in that stats like PER give spray shooters like him an unfair advantage, but more finely tuned advanced metrics probably over-punish the same. Part of the burden of Kobe’s role is that he has to take a lot of bad shots—the relevant question is how good he is at his job.

Abbott also mentioned that he liked one of my tweets, but didn’t know if he could retweet the non-family-friendly “WTF”:

Looking over the agenda, I don’t see “American Idol Analytics” anywhere. WTF? Competitive Singing is America’s 2nd favorite sport!#SSAC

— Benjamin Morris (@skepticalsports) March 2, 2012

I also had a fun conversation with Neil Paine of Basketball Reference. He seemed like a very smart guy, but this may be attributable to the fact that we seemed to be on the same page about so many things. Additionally, we discussed a very fun hypo: How far back in time would you have to go for the Charlotte Bobcats to be the odds-on favorites to win the NBA Championship?

As for the “sideshow” panels, they’re generally more fruitful and interesting than the ESPN-moderated super-panels, but they offer fewer easy targets for easy blog-griping. If you’re really interested in what went down, there is a ton of info at the SSAC website. The agenda can be found here. Information on the speakers is here. And, most importantly, videos of the various panels can be found here.

Box Score Rebooted

Featuring Dean Oliver, Bill James, and others.

This was a somewhat interesting, though I think slightly off-target, panel. They spent a lot of time talking about new data and metrics and pooh-poohing things like RBI (and even OPS), and the brave new world of play-by-play and video tracking, etc. But too much of this was about moving to a different granularity of data, rather than about what can be improved at the current level of granularity. Or, in other words:

Solving box score problems w/ PBP or video data is fundamentally not “rebooting the box score.” What should be in box score but isn’t? #ssac

— Benjamin Morris (@skepticalsports) March 3, 2012

James acquitted himself a bit on this subject, arguing that boatloads of new data isn’t useful if it isn’t boiled down into useful metrics. But a more general way of looking at this is: If we were starting over from scratch, with a box-score-sized space to report a statistical game summary, and a similar degree of game-scoring resources, what kinds of things would we want to include (or not) that are different from what we have now? I can think of a few:

In basketball, it’s archaic that free-throws aren’t broken down into bonus free throws and shot-replacing free throws.
In football, I’d like to see passing stats by down and distance, or at least in a few key categories like 3rd and long.
In baseball, I’d like to see “runs relative to par” for pitchers (though this can be computed easily enough from existing box scores).

In this panel, Dean Oliver took the opportunity to plug ESPN’s bizarre proprietary Total Quarterback Rating. They actually had another panel devoted just to this topic, but I didn’t go, so I’ll put a couple of thoughts here.

First, I don’t understand why ESPN is pushing this as a proprietary stat. Sure, no-one knows how to calculate regular old-fashioned quarterback ratings, but there’s a certain comfort in at least knowing it’s a real thing. It’s a bit like Terms of Service agreements, which people regularly sign without reading: at least you know the terms are out there, so someone actually cares enough to read them, and presumably they would raise a stink if you had to sign away your soul.

As for what we do know, I may write more on this come football season, but I have a couple of problems:

One, I hate the “clutch effect.” TQBR makes a special adjustment to value clutch performance even more than its generic contribution to winning. If anything, clutch situations in football are so bizarre that they should count less. In fact, when I’ve done NFL analysis, I’ve often just cut the 4th quarter entirely, and I’ve found I get better results. That may sound crazy, but it’s a bit like how some very advanced Soccer analysts have cut goal-scoring from their models, instead just focusing on how well a player advances the ball toward his goal: even if the former matters more, its unreliability may make it less useful.

Dean Oliver: You can criticize QBR, but nothing better to replace it. Hm. Try QBR minus the distorting clutch adjustment. #SSAC

— Benjamin Morris (@skepticalsports) March 3, 2012

Two, I’m disappointed in the way they “assign credit” for play outcomes:

Division of credit is the next step. Dividing credit among teammates is one of the most difficult but important aspects of sports. Teammates rely upon each other and, as the cliché goes, a team might not be the sum of its parts. By dividing credit, we are forcing the parts to sum up to the team, understanding the limitations but knowing that it is the best way statistically for the rating.

I’m personally very interested in this topic (and have discussed it with various ESPN analytics guys since long before TQBR was released). This is basically an attempt to address the entanglement problem that permeates football statistics. ESPN’s published explanation is pretty cryptic, and it didn’t seem clear to me whether they were profiling individual players and situations or had created credit-distribution algorithms league-wide.

At the conference, I had a chance to talk with their analytics guy who designed this part of the metric (his name escapes me), and I confirmed that they modeled credit distribution for the entire league and are applying it in a blanket way. Technically, I guess this is a step in the right direction, but it’s purely a reduction of noise and doesn’t address the real issue. What I’d really like to see is like a recursive model that imputes how much credit various players deserve broadly, then uses those numbers to re-assign credit for particular outcomes (rinse and repeat).
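One way to read the "rinse and repeat" idea is as a simple fixed-point loop. The sketch below is purely hypothetical and is not ESPN's method: the `reassign_credit` function, the player labels, and the play values are all made up for illustration, and it assumes every play has a positive value.

```python
def reassign_credit(plays, n_iter=25):
    """Iteratively split each play's value among the players involved.

    plays: list of (value, [player, ...]) pairs; values are assumed positive.
    Returns (rating, credited): average credit per play, and total credit.
    """
    players = sorted({p for _, involved in plays for p in involved})
    rating = {p: 1.0 for p in players}  # start with everyone equal

    for _ in range(n_iter):
        credited = {p: 0.0 for p in players}
        appearances = {p: 0 for p in players}
        for value, involved in plays:
            total = sum(rating[p] for p in involved)
            for p in involved:
                # a player's share of this play is proportional to his current rating
                credited[p] += value * rating[p] / total
                appearances[p] += 1
        # "broad" rating = average credit per play the player took part in
        rating = {p: credited[p] / appearances[p] for p in players}
    return rating, credited


# Hypothetical plays: (value of the outcome, players sharing the credit)
plays = [
    (10.0, ["QB", "WR1"]),
    (2.0, ["QB", "WR2"]),
    (6.0, ["QB", "WR1"]),
]
rating, credited = reassign_credit(plays)
```

In this toy version the credit handed out always sums to the total value of the plays. It also looks like a rich-get-richer scheme, so anything serious would presumably need some damping or a prior.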

Deconstructing the Rebound With Optical Tracking Data

Rajiv Maheswaran, and other nerds.

This presentation was so awesome that I offered them a hedge bet for the “Best Research Paper” award. That is, I would bet on them at even money, so that if they lost, at least they would receive a consolation prize. They declined. And won. Their findings are too numerous and interesting to list, so you should really check it out for yourself.

Obviously my work on the Dennis Rodman mystery makes me particularly interested in their theories of why certain players get more rebounds than others, as I tweeted in this insta-hypothesis:

So, upshot: Dennis Rodman’s incredible value could have come from him simply stepping into open spaces rather than following the ball. #SSAC

— Benjamin Morris (@skepticalsports) March 3, 2012

Following the presentation, I got the chance to talk with Rajiv for quite a while, which was amazing. Obviously they don’t have any data on Dennis Rodman directly, but Rajiv was also interested in him and had watched a lot of Rodman video. Though anecdotal, he did say that his observations somewhat confirmed the theory that a big part of Rodman’s rebounding advantage seemed to come from handling space very well:

1. Even when away from the basket, Rodman typically moved to the open space immediately following a shot. This is a bit different from how people often think about rebounding as aggressively attacking the ball (or as being able to near-psychically predict where the ball is going to come down).
2. Also, rather than simply attacking the board directly, Rodman’s first inclination was to insert himself between the nearest opponent and the basket. In theory, this might slightly decrease the chances of getting the ball when it heads in toward his previous position, but would make up for it by dramatically increasing his chances of getting the ball when it went toward the other guy.
3. Though a little less purely strategical, Rajiv also thought that Rodman was just incredibly good at #2. That is, he was just exceptionally good at jockeying for position.

To some extent, I guess this is just rebounding fundamentals, but I still think it’s very interesting to think about the indirect probabilistic side of the rebounding game.

Live B.S. Report with Bill James

Quick tangent: At one point, I thought Neil Paine summed me up pretty well as a “contrarian to the contrarians.” Of course, I don’t think I’m contrary for the sake of contrariness, or that I’m a negative person (I don’t know how many times I’ve explained to my wife that just because I hated a movie doesn’t mean I didn’t enjoy it!); it’s just that my mind is naturally inclined toward considering the limitations of whatever is put in front of it. Sometimes that means criticizing the status quo, and sometimes that means criticizing its critics.

So, with that in mind, I thought Bill James’s showing at the conference was pretty disappointing, particularly his interview with Bill Simmons.

I have a lot of respect for James. I read his Historical Baseball Abstract and enjoyed it considerably more than Moneyball. He has a very intuitive and logical mind. He doesn’t say a bunch of shit that’s not true, and he sees beyond the obvious. In Saturday’s “Rebooting the Box-score” panel, he made an observation that having 3 of 5 people on the panel named John implied that the panel was [likely] older than the rest of the room. This got a nice laugh from the attendees, but I don’t think he was kidding. And whether he was or not, he still gets 10 kudos from me for making the closest thing to a Bayesian argument I heard all weekend. And I dutifully snuck in for a pic with him:

James was somewhat ahead of his time, and perhaps he’s still one of the better sports analytic minds out there, but in this interview we didn’t really get to hear him analyze anything, you know, sportsy. This interview was all about Bill James and his bio and how awesome he was and how great he is and how hard it was for him to get recognized and how much he has changed the game and how, without him, the world would be a cold, dark place where ignorance reigned and nobody had ever heard of “win maximization.”

Bill Simmons going this route in a podcast interview doesn’t surprise me: his audience is obviously much broader than the geeks in the room, and Simmons knows his audience’s expectations better than anyone. What got to me was James’s willingness to play along, and everyone else’s willingness to eat it up. Here’s an example of both, from the conference’s official Twitter account:

Quote of the day RT @SloanSportsConf: “this conference is a culmination of 30 years of my work” — Bill James #SSAC

— MIT Sports Conf. (@SloanSportsConf) March 3, 2012

Perhaps it’s because I never really liked baseball, and I didn’t really know anyone did any of this stuff until recently, but I’m pretty certain that Bill James had virtually zero impact on my own development as a sports data-cruncher. When I made my first PRABS-style basketball formula in the early 1990’s (which was absolutely terrible, but is still more predictive than PER), I had no idea that any sports stats other than the box score even existed. By the time I first heard the word “sabermetrics,” I was deep into my own research, and didn’t bother really looking into it deeply until maybe a few months ago.

Which is not to say I had no guidance or inspiration. For me, a big epiphanous turning point in my approach to the analysis of games did take place—after I read David Sklansky’s Theory of Poker. While ToP itself was published in 1994, Sklansky’s similar offerings date back to the 70s, so I don’t think any broader causal pictures are possible.

More broadly, I think the claim that sports analytics wouldn’t have developed without Bill James is preposterous. Especially if, as I assume we do, we firmly believe we’re right. This isn’t like L. Ron Hubbard and Incident II: being for sports analytics isn’t like having faith in a person or his religion. It simply means trying to think more rigorously about sports, and using all of the available analytical techniques we can to gain an advantage. Eventually, those who embrace the right will win out, as we’ve seen begin to happen in sports, and as has already happened in nearly every other discipline.

Indeed, by his own admission, James liked to stir controversy, piss people off, and talk down to the old guard whenever possible. As far as we know, he may have set the cause of sports analytics back, either by alienating the people who could have helped it gain acceptance, or by setting an arrogant and confrontational tone for his disciples (e.g., the uplifting “don’t feel the need to explain yourself” message in Moneyball). I’m not saying that this is the case or even a likely possibility, I’m just trying to illustrate that giving someone credit for all that follows—even a pioneer like James—is a dicey game that I’d rather not participate in, and that he definitely shouldn’t.

On a more technical note, one of his oft-quoted and re-tweeted pearls of wisdom goes as follows:

Bill James on whether we’ve exhausted all baseball advanced stats: “We’ve only taken a bucket of knowledge from a sea of ignorance.” #ssac

— Gill Alexander (@beatingthebook) March 2, 2012

Sounds great, right? I mean, not really, I don’t get the metaphor: if the sea is full of ignorance, why are you collecting water from it with a bucket rather than some kind of filtration system? But more importantly, his argument in defense of this claim is amazingly weak. When Simmons asked what kinds of things he’s talking about, he repeatedly emphasized that we have no idea whether a college sophomore will turn out to be a great Major League pitcher. True, but, um, we never will. There are too many variables, the inputs and outputs are too far apart in time, and the contexts are too different. This isn’t the sea of ignorance, it’s a sea of unknowns.

Which gets at one of my big complaints about stats-types generally. A lot of people seem to think that stats are all about making exciting discoveries and answering questions that were previously unanswerable. Yes, sometimes you get lucky and uncover some relationship that leads to a killer new strategy or to some game-altering new dynamic. But most of the time, you’ll find static. A good statistical thinker doesn’t try to reject the static, but tries to understand it: Figuring out what you can’t know is just as important as figuring out what you can know.

On Twitter I used this analogy:

I also don’t know whether this coin will come up heads or tails, but that doesn’t mean I have a poor understanding of coin-flipping. #SSAC

— Benjamin Morris (@skepticalsports) March 2, 2012

Success comes with knowing more true things and fewer false things than the other guy.

Stat Geek Smackdown 2012, Round 1: Odds and Ends

So in case any of you haven’t been following, the 2012 edition of the ESPN True Hoop Stat Geek Smackdown is underway. Now, obviously this competition shouldn’t be taken too seriously, as it’s roughly the equivalent of picking a weekend’s worth of NFL games, and last year I won only after picking against my actual opinion in the Finals (with good reason, of course). That said, it’s still a lot of fun to track, and basketball is a deterministic-enough sport that I do think skill is relevant. At least enough that I will talk shit if I win again.

To that end, the first round is going pretty well for me so far. Like last year, the experts are mostly in agreement. While there is a fair amount of variation in the series length predictions, there are only two matchups that had any dissent as to the likely winner: the 6 actual stat geeks split 4-2 in favor of the Lakers over the Nuggets, and 3-3 between the Clippers and the Grizzlies. As it happens, I have both Los Angeles teams (yes, I am from Homer), as does Matthew Stahlhut (though my having the Lakers in 5 instead of 7 gives me a slight edge for the moment). No one has gained any points on anyone else yet, but here is my rough account of possible scenarios:

[table “9” not found /]

On to some odds and ends:

The Particular Challenges of Predicting 2012

Making picks this year was a bit harder than in years past. At one point I seriously considered picking Dallas against OKC (in part for strategic purposes), before reason got the better of me. Abbott only published part of my comment on the series, so here’s the full version I sent him:

Throughout NBA history, defending champions have massively over-performed in the playoffs relative to their regular season records, so I wouldn’t count Dallas out. In fact, the spot Dallas finds itself in is quite similar to Houston’s in 1995, and this season’s short lead-time and compressed schedule should make us particularly wary of the usual battery of predictive models.

Thus, if I had to pick which of these teams is more likely to win the championship, I might take Dallas (or at least it would be a closer call). But that’s a far different question from who is most likely to win this particular series: Oklahoma City is simply too solid and Dallas too shaky to justify an upset pick. E.g., my generic model makes OKC a >90% favorite, so even a 50:50 chance that Dallas really is the sleeping giant Mark Cuban dreams about probably wouldn’t put them over the top.

That last little bit is important: The “paper gap” between Dallas and OKC is so great that even if Dallas were considerably better than they appeared during the regular season, that would only make them competitive, while if they were about as good as they appeared, they would be a huge dog (this kind of situation should be very familiar to any serious poker players out there).
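The arithmetic behind that point can be sketched in a few lines. The 10% baseline is taken from the ">90% favorite" figure quoted above; the function name and the "sleeping giant" win probabilities are hypothetical inputs chosen only for illustration.

```python
def blended_win_prob(p_giant_scenario, win_if_giant, win_if_not):
    """Overall chance of winning the series, averaged over the two scenarios."""
    return p_giant_scenario * win_if_giant + (1 - p_giant_scenario) * win_if_not


# Dallas is 50:50 to be the "sleeping giant"; otherwise it is a 10% underdog.
for win_if_giant in (0.5, 0.7, 0.9):
    overall = blended_win_prob(0.5, win_if_giant, 0.10)
    print(f"wins {win_if_giant:.0%} of the time as giant -> {overall:.0%} overall")
```

Under these assumptions, Dallas would have to be roughly a 90% favorite in the "giant" scenario just to pull the series back to even.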

But why on earth would I think Dallas might be any good in the first place? Well, I’ll discuss more below why champions should never be ignored, but the “paper difference” this year should be particularly inscrutable. The normal methods for predicting playoff performance (both my own and others) are particularly ill-suited for the peculiar circumstances of this season:

Perhaps most obviously, fewer regular season games means smaller sample sizes. In turn, this means that sample-sensitive indicators (like regular season statistics) should have less persuasive value relative to non-sensitive ones (like championship pedigree). It also affects things like head to head record, which is probably more valuable than a lot of stats people think, though less valuable than a lot of non-stats people think. I’ve been working on some research about this, but for an example, look at this post about how I thought there seemed to be a market error w/r/t Dallas vs. Miami in game 6, partly b/c of the Bayesian value of Dallas’s head to head advantage.
Injuries are a bigger factor. This is not just that there are more of them (which is debatable), but there is less flexibility to effectively manage them: e.g., there’s obv less time to rehab players, but also less time to develop new line-ups and workarounds or make other necessary adjustments. In other words, a very good team might be hurt more by a role-player being injured than usual.
What is the most reliable data? Two things I discussed last year were that (contra unconventional wisdom) Win% is more reliable for post-season predictions than MOV-type stats, and that (contra conventional wisdom) early season performance is typically more predictive than late season performance. But both of these are undermined by the short season. The fundamental value of MOV is as a proxy for W% that is more accurate for smaller sample sizes. And the predictive power of early-season performance most likely stems from its being more representative of playoff basketball: e.g., players are more rested and everyone tries their hardest. However, not only are these playoffs not your normal playoffs, but this season was thrown together so quickly that a lot of teams had barely figured out their lineups by the quarter-pole. While late-season records have the same problems as usual, they may be more predictive just from being more similar to years past.
Finally, it’s not just the nature of the data, but the nature of the underlying game as well. For example, in a lockout year, teams concerned with injury may be quicker to pull starting players in less lopsided scenarios than usual, making MOV less useful, etc. I won’t go into every possible difference, but here’s a related Twitter exchange:

@skepticalsports Pop is the lockout-ball king. DNP-OLD motherf—er!

— Ignarus (@thegreatIgnarus) April 18, 2012
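To put a rough number on the sample-size point above, here is a back-of-the-envelope sketch using the ordinary binomial standard error. It treats games as independent with a fixed win probability, which is a simplifying assumption, and compares the 66-game lockout schedule with a normal 82-game one; the function name and the 60% team are hypothetical.

```python
import math


def win_pct_standard_error(true_win_pct, games):
    """Binomial standard error of an observed winning percentage."""
    return math.sqrt(true_win_pct * (1 - true_win_pct) / games)


for games in (82, 66):
    se = win_pct_standard_error(0.60, games)
    print(f"{games} games: +/- {se:.3f} in win%, about {se * games:.1f} wins")
```

On this simple view, the noise in a team's winning percentage grows by a factor of sqrt(82/66), or about 11%, in the shortened season.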

Which brings us to the next topic:

The Simplest Playoff Model You’ll Never Beat

The thing that Henry Abbott most highlighted from my Smackdown picks (which he quoted at least 3 times in 3 different places) was my little piece of dicta about the Spurs:

I have a ‘big pot’ playoff model (no matchups, no simulations, just stats and history for each playoff team as input) that produces some quirky results that have historically out-predicted my more conventional models. It currently puts San Antonio above 50 percent. Not just against Utah, but against the field. Not saying I believe it, but there you go.

I really didn’t mean for this to be taken so seriously: it’s just one model. And no, I’m not going to post it. It’s experimental, and it’s old and needs updating (e.g., I haven’t adjusted it to account for last season yet).

But I can explain why it loves the Spurs so much: it weights championship pedigree very strongly, and the Spurs this year are the only team near the top that has any.

Now some stats-loving people argue that the “has won a championship” variable is unreliable, but I think they are precisely wrong. Perhaps this will change going forward, but, historically, there are no two ways to cut it: No matter how awesomely designed and complicated your models/simulations are, if you don’t account for championship experience, you will lose to even the most rudimentary model that does.

So case in point, I came up with this 2-step method for picking NBA Champions:

1. If there are any teams within 5 games of the best record that have won a title within the past 5 years, pick the most recent.
2. Otherwise, pick the team with the best record.

Following this method, you would correctly pick the eventual NBA Champion in 64.3% of years since the league moved to a 16-team playoff in 1984 (with due respect to the slayer, I call this my “5-by-5” model).
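The two steps are mechanical enough to write down as code. A minimal sketch, assuming each team is described by its win total and the year of its most recent title. The `Team` fields, the reading of "within 5 games" as a gap of at most 5 wins, and the sample league below are assumptions for illustration, not real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Team:
    name: str
    wins: int
    last_title: Optional[int] = None  # year of most recent championship, if any


def five_by_five_pick(teams, season):
    """Step 1: most recent champion near the top; step 2: else best record."""
    best = max(teams, key=lambda t: t.wins)
    contenders = [
        t for t in teams
        if best.wins - t.wins <= 5            # within 5 games of the best record
        and t.last_title is not None
        and 0 < season - t.last_title <= 5    # title within the past 5 years
    ]
    if contenders:
        return max(contenders, key=lambda t: t.last_title)
    return best


# Made-up league, for illustration only
teams = [
    Team("A", 60),
    Team("B", 57, last_title=2009),
    Team("C", 56, last_title=2011),
    Team("D", 48, last_title=2012),
]
print(five_by_five_pick(teams, season=2013).name)  # C
```

Team D holds the most recent title but is too far off the best record to qualify, so the pick falls to C. Ties are broken arbitrarily here, since the two-step rule doesn't say how to handle them.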

Of course, thinking back, it seems like picking the winner is sometimes easy, as the league often has an obvious “best team” that is extremely unlikely to ever lose a 7 game series. So perhaps the better question to ask is: How much do you gain by including the championship test in step 1?

The answer is: a lot. Over the same period, the team with the league’s best record has won only 10/28 championships, or ~36%. So the 5-by-5 model almost doubles your hit rate.

And in case you’re wondering, using Margin of Victory, SRS, or any other advanced stat instead of W-L record doesn’t help: other methods vary from doing slightly worse to slightly better. While there may still be room to beef up the complexity of your predictive model (such as advanced stats, situational simulations, etc), your gains will be (comparatively) marginal at best. Moreover, there is also room for improvement on the other side: by setting up a more formal and balanced tradeoff between regular season performance and championship history, the macro-model can get up to 70+% without danger of significant over-fitting.

In fairness, I should note that the 5-by-5 model has had a bit of a rough patch recently—but, in its defense, so has every other model. The NBA has had some wacky results recently, but there is no indication that stats have supplanted history. Indeed, if you break the historical record into groups of more-predictable and less-predictable seasons, the 5-by-5 model trumps pure statistical models in all of them.

Uncertainty and Series Lengths

Finally, I’d like to quickly address the complete botching of series-length analysis that I put forward last year. Not only did I make a really elementary mistake in my explanation (that an emailer thankfully pointed out), but I’ve come to reject my ultimate conclusion as well.

Aside from strategic considerations, I’m now fairly certain that picking the home team in 5 or the away team in 6 is always right, no matter how close you think the series is. I first found this result when running playoff simulations that included margin for error (in other words, accounting for the fact that teams may be better or worse than their stats would indicate, or that they may match up more or less favorably than the underlying records would suggest), but I had some difficulty getting this result to comport with the empirical data, which still showed “home team in 6” as the most common outcome. But now I think I’ve figured this problem out, and it has to do with the fact that a lot of those outcomes came in spots where you should have picked the other team, etc. But despite the extremely simple-sounding outcome, it’s a rich and interesting topic, so I’ll save the bulk of it for another day.
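For anyone who wants to poke at this, here is a rough sketch of the baseline calculation with no margin for error at all: the exact probability of every winner-and-length combination in a best-of-7. It assumes the 2-2-1-1-1 home-court format and fixed, independent per-game win probabilities, both simplifications; the function and parameter names are hypothetical.

```python
# True where the team with home-court advantage hosts the game (2-2-1-1-1)
HOME_GAMES = (True, True, False, False, True, False, True)


def series_outcomes(p_home, p_road):
    """Exact odds of each (winner, length) result in a best-of-7.

    p_home: chance the home-court team wins a game it hosts.
    p_road: chance the home-court team wins a game on the road.
    """
    odds = {}

    def play(games_played, wins, losses, prob):
        if wins == 4 or losses == 4:
            key = ("home" if wins == 4 else "away", games_played)
            odds[key] = odds.get(key, 0.0) + prob
            return
        p = p_home if HOME_GAMES[games_played] else p_road
        play(games_played + 1, wins + 1, losses, prob * p)
        play(games_played + 1, wins, losses + 1, prob * (1 - p))

    play(0, 0, 0, 1.0)
    return odds


# Hypothetical inputs: list outcomes from most to least likely
example = series_outcomes(p_home=0.65, p_road=0.45)
for (winner, length), prob in sorted(example.items(), key=lambda kv: -kv[1]):
    print(f"{winner} in {length}: {prob:.3f}")
```

The margin for error described above could be layered on by averaging these odds over a range of plausible inputs rather than plugging in a single pair.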

Sports Geek Mecca: Recap and Thoughts, Part 1

So, over the weekend, I attended my second MIT Sloan Sports Analytics Conference. My experience was much different than in 2011: Last year, I went into this thing barely knowing that other people were into the same things I was. An anecdote: In late 2010, I was telling my dad how I was about to have a 6th or 7th round interview for a pretty sweet job in sports analysis, when he speculated, “How many people can there even be in that business? 10? 20?” A couple of months later, of course, I would learn.

A lot has happened in my life since then: I finished my Rodman series, won the ESPN Stat Geek Smackdown (which, though I am obviously happy to have won, is not really that big a deal—all told, the scope of the competition is about the same as picking a week’s worth of NFL games), my wife and I had a baby, and, oh yeah, I learned a ton about the breadth, depth, and nature of the sports analytics community.

For the most part, I used Twitter as sort of my de facto notebook for the conference. Thus, I’m sorry if I’m missing a bunch of lengthier quotes and/or if I repeat a bunch of things you already saw in my live coverage, but I will try to explain a few things in a bit more detail.

For the most part, I’ll keep the recap chronological. I’ve split this into two parts: Part 1 covers Friday, up to but not including the Bill Simmons/Bill James interview. Part 2 covers that interview and all of Saturday.

Opening Remarks:

From the pregame tweets, John Hollinger observed that 28 NBA teams sent representatives (that we know of) this year. I also noticed that the New England Revolution sent 2 people, while the New England Patriots sent none, so I’m not sure that the number of official representatives reliably indicates much.

The conference started with some bland opening remarks by Dean David Schmittlein. Tangent: I feel like political-speak (thank everybody and say nothing) seems to get more and more widespread every year. I blame it on fear of the internet. E.g., in this intro segment, somebody made yet another boring joke about how there were no women present (personally, I thought there were significantly more than last year), and was followed shortly thereafter by a female speaker, understandably creating a tiny bit of awkwardness. If that person had been more important (like, if I could remember his name to slam him), I doubt he would have made that joke, or any other joke. He would have just thanked everyone and said nothing.

The Evolution of Sports Leagues

Featuring Gary Bettman (NHL), Rob Manfred (MLB), Adam Silver (NBA), Steve Tisch (NYG) and Michael Wilbon moderating.

This panel really didn’t have much of a theme, it was mostly Wilbon creatively folding a bunch of predictable questions into arbitrary league issues. E.g.: “What do you think about Jeremy Lin?!? (And, you know, overseas expansion blah blah).”

I don’t get the massive cultural significance of Jeremy Lin, personally. I mean, he’s not the first ethnically Chinese player to have NBA success (though he is perhaps the first short one). The discussion of China, however, was interesting for other reasons. Adam Silver claimed that Basketball is already more popular in China than soccer, with over 300 million Chinese people playing it. Those numbers, if true, are pretty mind-boggling.

Finally, there was a whole part about labor negotiations that was pretty well summed up by this tweet:

Opening panel summary: league execs are very smart and have done a great job with labor negotiations according to league execs. #ssac

— Jeremy Schmidt (@Bucksketball) March 2, 2012

Hockey Analytics

Featuring Brian Burke, Peter Chiarelli, Mike Milbury and others.

The panel started with Peter Chiarelli being asked how the world champion Boston Bruins use analytics, and in an ominous sign, he rambled on for a while about how, when it comes to scouting, they’ve learned that weight is probably more important than height.

Overall, it was a bit like any scene from the Moneyball war room, with Michael Schuckers (the only pro-stats guy) playing the part of Jonah Hill, but without Brad Pitt to protect him.

When I think of Brian Burke, I usually think of Advanced NFL Stats, but apparently there’s one in Hockey as well. Burke is GM/President of the Toronto Maple Leafs. At one point he was railing about how teams that use analytics have never won anything, which confused me since I haven’t seen Toronto hoisting any Stanley Cups recently, but apparently he did win a championship with the Mighty Ducks in 2007, so he clearly speaks with absolute authority.

This guy was a walking talking quote machine for the old school. I didn’t take note of all the hilarious and/or non-sensical things he said, but for some examples, try searching Twitter for “#SSAC Brian Burke.” To give a sense of how extreme he was, someone tweeted this quote at me, and I have no idea if he actually said it or if this guy was kidding.

@skepticalsports ‘Hockey is played with a stick and a puck, not your little calculator and your spreadsheets, Poindexter.’ – Brian Burke

— Brian Woodburn (@MustRockTheRed) March 2, 2012

In other words, Burke was literally too over the top to effectively parody.

On the other hand, in the discussion of concussions, I thought Burke had sort of a folksy realism that seemed pretty accurate to me. I think his general point is right, if a bit insensitive: If we really changed hockey so much as to eliminate concussions entirely, it would be a whole different sport (which he also claimed no one would watch, an assertion which is more debatable imo). At the end of the day, I think professional sports mess people up, including in the head. But, of course, we can’t ignore the problem, so we have to keep proceeding toward some nebulous goal.

Mike Milbury, presently a card-carrying member of the media, seemed to mostly embrace the alarmist media narrative, though he did raise at least one decent point about how the increase in concussions—which most people are attributing to an increase in diagnoses—may relate to recent rules changes that have sped up the game.

But for all that, the part that frustrated me the most was when Michael Schuckers, the legitimate hockey statistician at the table, was finally given the opportunity to talk. 90% of the things that came out of his mouth were various snarky ways of asserting that face-offs don’t matter. I mean, I assume he’s 100% right, but he just had no clue how to talk to these guys. Find common ground: you both care about scoring goals, defending goals, and winning. Good face-off skill gets you the puck more often in the right situations. The question is how many extra possessions you get and how valuable those possessions are. And finally, what’s the actual decision in question?

Baseball Analytics

Featuring Scott Boras, Scott Boras, Scott Boras, some other guys, Scott Boras, and, oh yeah, Bill James.

In stark contrast to the Hockey panel, the Baseball guys pretty much bent over backwards to embrace analytics as much as possible. As I tweeted at the time:

Watching Hockey Analytics panel before Baseball Analytics panel is like watching Wheel of Fortune before Jeopardy. #SSAC #oldjoke

— Benjamin Morris (@skepticalsports) March 2, 2012

Scott Boras seems to like hearing Scott Boras talk. Which is not so bad, because Scott Boras actually did seem pretty smart and well informed: Among other things, Scott Boras apparently has a secret internal analytics team. To what end, I’m not entirely sure, since Scott Boras also seemed to say that most GMs overvalue players relative to what Scott Boras’s people tell Scott Boras.

At this point, my mind wandered:

Fantasizing about Belichick being on one of these panels, but answering every question “I just try to give us the best chance to win.” #SSAC

— Benjamin Morris (@skepticalsports) March 2, 2012

How awesome would that be, right?

Anyway, in between Scott Boras’s insights, someone asked this Bill James guy about his vision for the future of baseball analytics, and he gave two answers:

1. Evaluating players from a variety of contexts other than the minor leagues (like college ball, overseas, Cubans, etc).
2. Analytics will expand to look at the needs of the entire enterprise, not just individual players or teams.

Meh, I’m a bit underwhelmed. He talked a bit about #1 in his one-on-one with Bill Simmons, so I’ll look at that a bit more in my review of that discussion. As for #2, I think he’s just way way off: The business side of sports is already doing tons of sophisticated analytics—almost certainly way more than the competition side—because, you know, it’s business.

E.g., in the first panel, there was a fair amount of discussion of how the NBA used “sophisticated modeling” for many different lockout-related analyses (I didn’t catch the Ticketing Analytics panel, but from its reputation, and from related discussions on other panels, it sounds like that discipline has some of the nerdiest analysis of all).

Scott Boras let Bill James talk about a few other things as well: E.g., James is not a fan of new draft regulations, analogizing them to government regulations that “any economist would agree” inevitably lead to market distortions and bursting bubbles. While I can’t say I entirely disagree, I’m going to go out on a limb and guess that his political leanings are probably a bit Libertarian?

Basketball Analytics

Featuring Jeff Van Gundy, Mike Zarren, John Hollinger, and ~~Mark Cuban~~ Dean Oliver.

If every one of these panels was Mark Cuban + foil, it would be just about the most awesome weekend ever (though you might not learn the most about analytics). So I was excited about this one, which, unfortunately, Cuban missed. Filling in on zero/short notice was Dean Oliver. Overall, here’s Nathan Walker’s take:

Basketball Panel Summary: “Too many variables. Too much noise. Stats are hard.” #ssac

— Nathan Walker (@bbstats) March 2, 2012

This panel actually had some pretty interesting discussions, but they flew by pretty fast and often followed predictable patterns, something like this:

Hollinger says something pro-stats, though likely way out of his depth.
Zarren brags about how they’re already doing that and more on the Celtics.
Oliver says something smart and nuanced that attempts to get at the underlying issues and difficulties.
Jeff Van Gundy uses forceful pronouncements and “common sense” to dismiss his strawman version of what the others have been saying.

E.g.:

“Michael Jordan was pretty good. That’s revolutionary.” <- Van Gundy to Oliver (but kind of took Dean out of context). #SSAC

— Benjamin Morris (@skepticalsports) March 2, 2012

Zarren talked about how there is practically more data these days than they know what to do with. This seems true and I think it has interesting implications. I’ll discuss it a little more in Part 2 re: the “Rebooting the Box Score” talk.

There was also an interesting discussion of trades, and whether they’re more a result of information asymmetry (in other words, teams trying to fleece each other), or more a result of efficient trade opportunities (in other words, teams trying to help each other). Though it really shouldn’t matter—you trade when you think it will help you, whether it helps your trade partner is mostly irrelevant—Oliver endorsed the latter. He makes the point that, with such a broad universe of trade possibilities, looking for mutually beneficial situations is the easiest way to find actionable deals. Fair enough.

Coaching Analytics

Featuring coaching superstars Jeff Van Gundy, Eric Mangini, and Bill Simmons. Moderated by Daryl Morey.

OK, can I make the obvious point that Simmons and Morey apparently accidentally switched role cards? As a result, this talk featured a lot of Simmons attacking coaches and Van Gundy defending them. I honestly didn’t remember Mangini was on this panel until looking back at the book (which is saying something, b/c Mangini usually makes my blood boil).

There was almost nothing on how to evaluate coaches, say, by analyzing how well their various decisions comported with the tenets of win maximization. There was a lengthy (and almost entirely non-analytical) discussion of that all-important question of whether an NBA coach should foul or not up by 3 with little time left. Fouling probably has a tiny edge, but I think it’s too close and too infrequent to be very interesting (though obviously not as rare, it reminds me a bit of the impassioned debates you used to see on Poker forums about whether you should fast-play or slow-play flopped quads in limit hold’em).

There was what I thought was a funny moment when Bill Simmons was complaining about how teams seem to recycle mediocre older coaches rather than try out young, fresh talent. But when challenged by Van Gundy, Simmons drew a blank and couldn’t think of anyone. So, Bill, this is for you. Here’s a table of NBA coaches who have coached at least 1000 games for at least 3 different teams, while winning fewer than 60% of their games and without winning any championships:

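[Table not recovered: NBA coaches with at least 1000 games coached for at least 3 different teams, a winning percentage under 60%, and no championships]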

Note that I’m not necessarily agreeing with Simmons: Winning championships in the NBA is hard, especially if your team lacks uber-stars (you know, Michael Jordan, Magic Johnson, Dennis Rodman, et al).

Part 2 coming soon!

Honestly, I got a little carried away with my detailed analysis/screed on Bill James, and I may have to do a little revising. So due to some other pressing writing commitments, you can probably expect Part 2 to come out this Saturday (Friday at the earliest).

A Defense of Sudden Death Playoffs in Baseball

So despite my general antipathy toward America’s pastime, I’ve been looking into baseball a lot lately. I’m working on a three part series that will “take on” Pythagorean Expectation. But considering the sanctity of that metric, I’m taking my time to get it right.

For now, the big news is that Major League Baseball is finally going to have realignment, which will most likely lead to an extra playoff team, and a one game Wild Card series between the non–division winners. I’m not normally one who tries to comment on current events in sports (though, out of pure frustration, I almost fired up WordPress today just to take shots at Tim Tebow—even with nothing original to say), but this issue has sort of a counter-intuitive angle to it that motivated me to dig a bit deeper.

Conventional wisdom on the one game playoff is pretty much that it’s, well, super crazy. E.g., here’s Jayson Stark’s take at ESPN:

But now that the alternative to finishing first is a ONE-GAME playoff? Heck, you’d rather have an appendectomy than walk that tightrope. Wouldn’t you?

Though I think he actually likes the idea, precisely because of the loco factor:

So a one-game, October Madness survivor game is what we’re going to get. You should set your DVRs for that insanity right now.

In the meantime, we all know what the potential downside is to this format. Having your entire season come down to one game isn’t fair. Period.

I wouldn’t be too sure about that. What is fair? As I’ve noted, MLB playoffs are basically a crapshoot anyway. In my view, any move that MLB can make toward having the more accomplished team win more often is a positive step. And, as crazy as it sounds, that is likely exactly what a one game playoff will do.

The reason is simple: home field advantage. While smaller than in other sports, the home team in baseball still wins around 55% of the time, and more games means a smaller percentage of your series games played at home. While longer series eventually lead to better teams winning more often, the margins in baseball are so small that it takes a significant edge for a team to prefer to play ANY road games:

^{Note: I calculated these probabilities using my favorite binom.dist function in Excel. Specifically, where the number of games needed to win a series is k, this is the sum from x=0 to x=k of the p(winning x home games) times p(winning at least k-x road games).}

So assuming each team is about as good as their records (which, regardless of the accuracy of the assumption, is how they deserve to be treated), a team needs about a 5.75% generic advantage (around 9-10 games) to prefer even a seven game series to a single home game.
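For anyone who wants to poke at this without Excel, here is a minimal Python sketch of the sum described in the note above. Several things in it are assumptions rather than the exact setup behind the numbers in this post: the names `series_win_prob` and `compare` are made up for illustration, home field is modeled as a flat 5 points added at home and subtracted on the road, and `edge` is a head-to-head per-game edge, which may not be defined the same way as the "generic advantage" above. The break-even point it finds can therefore differ from 5.75%.

```python
from math import comb

HOME_EDGE = 0.05  # assumption: between equal teams, the home team wins about 55%


def binom_pmf(n: int, x: int, p: float) -> float:
    """Probability of exactly x wins in n independent games, each won with probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def series_win_prob(k: int, home_games: int, road_games: int,
                    p_home: float, p_road: float) -> float:
    """Probability of winning at least k games if every scheduled game were played.

    For a best-of-(2k-1) series this equals the chance of being first to k wins.
    It sums, over x home wins, P(exactly x home wins) * P(at least k - x road wins).
    """
    total = 0.0
    for x in range(home_games + 1):
        need = max(k - x, 0)  # road wins still required
        p_enough_road = sum(binom_pmf(road_games, y, p_road)
                            for y in range(need, road_games + 1))
        total += binom_pmf(home_games, x, p_home) * p_enough_road
    return total


def compare(edge: float) -> tuple[float, float]:
    """Return (single home game, best-of-7 with 4 home games) for a given per-game edge."""
    p_home = 0.5 + HOME_EDGE + edge
    p_road = 0.5 - HOME_EDGE + edge
    return p_home, series_win_prob(4, 4, 3, p_home, p_road)


if __name__ == "__main__":
    for edge in (0.0, 0.01, 0.02, 0.03, 0.04, 0.05):
        single, best_of_7 = compare(edge)
        print(f"edge={edge:.2f}  single home game={single:.4f}  best-of-7={best_of_7:.4f}")
```

With no edge at all, the single home game is worth the full home-field 55%, while the seven-game series dilutes it with road games. Raising `edge` shows where the longer series finally overtakes the single game under these assumptions.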

But what about the incredible injustice that could occur when a really good team is forced to play some scrub? E.g., Stark continues:

It’s a lock that one of these years, a 98-win wild-card team is going to lose to an 86-win wild-card team. And that will really, really seem like a miscarriage of baseball justice. You’ll need a Richter Scale handy to listen to talk radio if that happens.

But you know what the answer to those complaints will be?

“You should have finished first. Then you wouldn’t have gotten yourself into that mess.”

Stark posits a 12 game edge between two wild card teams, and indeed, this could lead to a slightly worse spot for the better team than a longer series. 12 games corresponds to a 7.4% generic advantage, which means a 7-game series would improve the team’s chances by about 1% (oh, the humanity!). But the alternative almost certainly wouldn’t be seven games anyway, considering the first round of the playoffs is already only five. At that length, the “miscarriage of baseball justice” would be about 0.1% (and vs. 3 games, sudden death is still preferable).

If anything, consider the implications of the massive gap on the left side of the graph above: If anyone is getting screwed by the new setup, it’s not the team with the better record, it’s a better team with a worse record, who won’t get as good a chance to demonstrate their actual superiority (though that team’s chances are still around 50% better than they would have been under the current system). And those are the teams that really did “[get themselves] into that mess.”

Also, the scenario Stark posits is extremely unlikely: basically, the difference between 4th and 5th place is never 12 games. For comparison, this season the difference between the best record in the NL and the Wild Card Loser was only 13 games, and in the AL it was only seven. Over the past ten seasons, each Wild Card team and their 5th place finisher were separated by an average of 3.5 games (about 2.2%):

Note that no cases over this span even rise above the seven game “injustice line” of 5.75%, much less to the nightmare scenario of 7.4% that Stark invokes. The standard deviation is about 1.5%, and that’s with the present imbalance of teams (note that the AL is pretty consistently higher than the NL, as should be expected). After realignment, this plot should tighten even further.
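The games-to-percentage conversions in this section look like plain division by a 162-game season. That reading is an assumption, and the helper names below are hypothetical, but a quick sketch under it lines up with the figures quoted:

```python
SEASON_GAMES = 162  # MLB regular-season length


def games_to_pct(games: float) -> float:
    """Convert a gap in the standings (in games) to a win-percentage gap."""
    return 100 * games / SEASON_GAMES


def pct_to_games(pct: float) -> float:
    """Convert a win-percentage gap back to games over a full season."""
    return pct / 100 * SEASON_GAMES


if __name__ == "__main__":
    print(round(games_to_pct(12), 1))    # Stark's 12-game gap
    print(round(games_to_pct(3.5), 1))   # average Wild Card vs. 5th-place gap
    print(round(pct_to_games(5.75), 1))  # the seven-game break-even, in games
```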

Indeed, considering the typically small margins between contenders in baseball, on average, this “insane” sudden death series may end up being the fairest round of the playoffs.

The Aesthetic Case Against 18 Games

By most accounts, the NFL’s plan to expand the regular season from 16 to 18 games is a done deal. Indulge me for a moment as I take off my Bill-James-Wannabe cap and put on my dusty old Aristotle-Wannabe kausia: In addition to various practical drawbacks, moving to 18 games risks disturbing the aesthetic harmony—grounded in powerful mathematics—inherent in the 16 game season.
Analytically, it is easy to appreciate the convenience of having the season break down cleanly into 8-game halves and 4-game quarters. Powers of 2 like this are useful and aesthetically attractive: after all, we are symmetrical creatures who appreciate divisibility. But we have a possibly even more powerful aesthetic attachment to certain types of asymmetrical relationships: Mozart’s piano concertos aren’t divided into equally-sized beginnings, middles and ends. Rather, they are broken into exposition, development, and recapitulation—each progressively shorter than the last.

Similarly, the 16 game season can fairly cleanly be broken into 3 or 4 progressively shorter but more important sections. Using roughly the same proportions that Mozart would, the first 10 games (“exposition”) would set the stage and reveal who we should be paying attention to; the next 3-4 games (“development”) would be where the race for playoff positioning really begins in earnest, and the final 2-3 weeks (“recapitulation”) are where hopes are realized and hearts are broken—including the final weekend when post-season fates are settled. Now, let’s represent the season as a rectangle with sides 16 (length of the season) and 10 (length of the “exposition”), broken down into consecutively smaller squares representing each section:

^{Note: The “last” game gets the leftover space, though if the season were longer we could obviously keep going.}

At this point many of you probably know where this is going: The ratio of each square to all of the smaller pieces combined is roughly equal, corresponding to the “divine proportion,” which is practically ubiquitous in classical music, as well as in everything from book and movie plots to art and architecture to fractal geometry to unifying theories of “all animate and inanimate systems.” Here it is again (incredibly clumsily-sketched) in the more recognizable spiral form:

The golden ratio is represented in mathematics by the irrational constant phi, which is:

1.6180339887…

Which, when divided into 1 gets you:

.6180339887…

Beautiful, right? So the roughly 10/4/1/1 breakdown above is really just 16 multiplied by 1/phi, with the remainder multiplied by 1/phi, etc—9.9, 3.8, 1.4, .9—rounded to the nearest game. Whether this corresponds to your thinking about the relative significance of each portion of the season is admittedly subjective. But this is an inescapably powerful force in aesthetics (along with symmetricality and symbols of virility and fertility), and can be found in places most people would never suspect, including in professional sports. Let’s consider some anecdotal supporting evidence:

The length of a Major League Baseball season is 162 games. Not 160, but 162. That should look familiar.
Both NBA basketball and NHL hockey have 82-game seasons, or roughly half-phi. Note 81 games would be impractical, because of the need for an equal number of home and road games (but bonus points if you’ve ever felt like the NBA season was exactly 1 game too long).
The “exposition” portion of a half-phi season would be 50 games. The NHL and NBA All-Star breaks both take place right around game 50, or a little later, each year.
Though still solidly in between 1/2 and 2/3 of the way through the season, MLB’s “Midsummer Classic” usually takes place slightly earlier, around game 90 (though I might submit that the postseason crunch doesn’t really start until after teams build a post-All Star record for people to talk about).
The NFL bye weeks typically end after week 10.
Fans and even professional sports analysts are typically inclined to value “clutch” players—i.e., those who make their bones in the “Last” quadrant above—way more than a non-aesthetic analytical approach would warrant.

Etc.
So fine, say you accept this argument about how people observe sports. Your next question may be: Well, what’s wrong with 18 games? Any number of games can be divided into phi-sized quadrants, right? Well, the answer is basically yes, it can, but it’s not pretty:

The numbers 162, 82, and 16 all share a couple of nice qualities: first they are all roughly divisible by 4, so you have nice clean quarter-seasons. Second, they each have aesthetically pleasing “exposition” periods: 100 games in MLB, 50 in the NBA and NHL, and 10 in the NFL. The “exposition” period in an 18-game season would be 11 games. Yuck! These season-lengths balance our competing aesthetic desires for the harmony of symmetry and excitement of asymmetry. We like our numbers round, but not too round. We want them dynamic, but workable.
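If you want to check the arithmetic, the breakdown described earlier (multiply the season by 1/phi, then keep multiplying the remainder by 1/phi) can be sketched in a few lines of Python. The function name `golden_sections` and the choice of four sections are just for illustration; as in the diagram, the last section gets whatever is left over.

```python
PHI = (1 + 5 ** 0.5) / 2  # the golden ratio, about 1.618


def golden_sections(season_length: float, sections: int = 4) -> list[float]:
    """Split a season into progressively shorter sections.

    Each section is 1/phi of whatever remains; the final section
    simply takes the leftover.
    """
    remaining = float(season_length)
    parts = []
    for _ in range(sections - 1):
        part = remaining / PHI
        parts.append(part)
        remaining -= part
    parts.append(remaining)
    return parts


if __name__ == "__main__":
    print([round(x, 1) for x in golden_sections(16)])  # [9.9, 3.8, 1.4, 0.9]
    print(round(golden_sections(18)[0], 1))            # 11.1
    print(round(golden_sections(162)[0], 1))           # 100.1
```

The first section is the “exposition”: about 10 games out of 16, about 100 out of 162, and the ugly 11 out of 18.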

Finally, as to why the NFL should care about vague aesthetic concerns that it takes a mathematician to identify, I can only say: I don’t think these patterns would be so pervasive in science, art, and in broader culture if they weren’t really important to us, whether we know it or not. Human beings are symmetrical down the middle, but as some guy in Italy noticed, golden rectangles are not only woven into our design, but into the design of the things we love. Please, NFL, don’t take that away from us.

Why Not Balls and Strikes?

To expand a tiny bit on something I tweeted the other day, I swear there’s a rule (perhaps part of the standard licensing agreement with MLB), that any time anyone on television mentions the idea of expanding instant replay (or “use of technology”) in baseball, they are required to qualify their statement by assuring the audience that they do not mean for balls and strikes. But why not? If any reason is given, it is usually some variation of the following: 1) Balls and strikes are inherently too subjective, 2) It would slow the game down too much, or 3) The role of the umpire is too important. None of these seems persuasive to me, at least when applied to the strike zone’s horizontal axis — i.e., the plate:

1. The plate is not subjective.

In little league, we were taught that the strike zone was “elbows to knees and over the plate,” and surprisingly enough, the official major league baseball definition is not that much more complicated (from the Official Baseball Rules 2010, page 22):

A STRIKE is a legal pitch when so called by the umpire, which . . . is not struck at, if any part of the ball passes through any part of the strike zone. . . .
The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.

I can understand several reasons why there may be need for a human element in judging the vertical axis of the zone, such as to avoid gamesmanship like crouching or altering your stance while the ball is in the air, or to make reasonable exceptions in cases where someone has kneecaps on their stomach, etc. But there is nothing subjective about “any part of the ball passes through any part of . . . the area over home plate.”

2. The plate is not hard to check.

I mean, if they can photograph lightning:

They should be able to tell whether a solid ball passes over a small irregular pentagon. Yes, replay takes a while when you have to look at 15 different angles to find the right one, or when you have to cognitively construct a 3-dimensional image from several 2-dimensional videos. It even takes a little while when you have to monitor a long perimeter to see if oddly shaped objects have crossed it (like tennis balls on impact or players’ shoes in basketball). But checking whether a baseball crossed the plate takes no time at all: they already do it virtually without delay on television, and that process could be sped up at virtually no cost with one dedicated camera: let it take a long-exposure picture of the plate for each pitch, then instantly beam it to an iPhone strapped to the umpire’s wrist. He can check it in the course of whatever his natural motion for signaling a ball or strike would have been, and he’ll probably save time by not having players and managers up in his face every other pitch.

3. The plate is a waste of the umpire’s time, but not ours.

Umpires are great, they make entertaining gesticulating motions, and maybe in some extremely slight sense, people actually do go to the game to boo and hiss at them. I’m not suggesting MLB put HAL back there. But as much as people love officiating controversies generally, umpires are so inconsistent and error-prone about the strike zone (which, you know, only matters like 300 times per game) that fans are too jaded to even care. There are enough actually subjective calls for umpires to blow; they don’t need to be spending their time and attention on something so objective, so easy to check, and so important.

(Photo Credit: “Lightning on the Columbia River” by phatman.)