I’m Joining FiveThirtyEight

The title pretty much says it all. In January I’ll be starting a real j-o-b as a “Senior Writer, Sports” for the new ESPN-backed FiveThirtyEight, due to launch in February. So I thought I’d better say some quick goodbyes and hellos.

For old readers:

While I’m admittedly a little sad that this blog won’t be coming back any time soon, this should obviously be great news for people who enjoy my work: Backed by ESPN/FiveThirtyEight data and resources, it will be better and there will be more of it. My responsibilities at FiveThirtyEight will be similar to what I’d been doing here already: conducting original research, writing articles, and blogging. Except full time. And paid.

(Yeah, it’s basically my dream job.)

For new readers:

Of course, for many of you reading this, this is probably your first time visiting this site. In which case: welcome!  For a primer on who the hell I am, you might want to read the “about Ben” and “about this blog” pages, or you can skip those and just read some of my articles. My best known work is undoubtedly The Case For Dennis Rodman, which is incredibly long—


—but has a guide, which can be found here. And in case you’ve heard rumors, yes, it speculates that Rodman—in a very specific way—may have been more valuable than Michael Jordan.

However, if I had to pick just a handful of articles to best represent my ideas and interests, it might look something like this:


Quantum Randy Moss—An Introduction to Entanglement

The Aesthetic Case Against 18 Games


The Case for Dennis Rodman, Part 4/4(a): All-Hall?

Bayes’ Theorem, Small Samples, and WTF is Up With NBA Finals Markets?


A Defense of Sudden-Death Playoffs in Baseball

Why Not Balls and Strikes?


C.R.E.A.M. (Or, “How to Win a Championship in Any Sport”)

Applied Epistemology in Politics and the Playoffs

Sports Geek Mecca: Recap and Thoughts, Part 2

This is part 2 of my “recap” of the Sloan Sports Analytics Conference that I attended in March (part 1 is here), mostly covering Day 2 of the event, but also featuring my petty way-too-long rant about Bill James (which I’ve moved to the end).

Day Two

First I attended the Football Analytics panel despite finding it disappointing last year, and, alas, it wasn’t any better. Eric Mangini must be the only former NFL coach willing to attend, b/c they keep bringing him back:

Overall, I spent more time in day 2 going to niche panels, research paper presentations and talking to people.

The last, in particular, was great. For example, I had a fun conversation with Henry Abbott about Kobe Bryant’s lack of “clutch.” This is one of Abbott’s pet issues, and I admit he makes a good case, particularly that the Lakers are net losers in “clutch” situations (yes, relative to other teams), even over the periods where they have been dominant otherwise.

Kobe is kind of a pivotal case in analytics, I think. First, I’m a big believer in “Count the Rings, Son” analysis: That is, leading a team to multiple championships is really hard, and only really great players do it. I also think he stands at a kind of nexus, in that stats like PER give spray shooters like him an unfair advantage, but more finely tuned advanced metrics probably over-punish the same. Part of the burden of Kobe’s role is that he has to take a lot of bad shots—the relevant question is how good he is at his job.

Abbott also mentioned that he liked one of my tweets, but didn’t know if he could retweet the non-family-friendly “WTF”:

I also had a fun conversation with Neil Paine of Basketball Reference. He seemed like a very smart guy, but this may be attributable to the fact that we seemed to be on the same page about so many things. Additionally, we discussed a very fun hypo: How far back in time would you have to go for the Charlotte Bobcats to be the odds-on favorites to win the NBA Championship?

As for the “sideshow” panels, they’re generally more fruitful and interesting than the ESPN-moderated super-panels, but they offer fewer easy targets for easy blog-griping. If you’re really interested in what went down, there is a ton of info at the SSAC website. The agenda can be found here. Information on the speakers is here. And, most importantly, videos of the various panels can be found here.

Box Score Rebooted

Featuring Dean Oliver, Bill James, and others.

This was a somewhat interesting, though I think slightly off-target, panel. They spent a lot of time talking about new data and metrics, pooh-poohing things like RBI (and even OPS), and hailing the brave new world of play-by-play and video tracking, etc. But too much of this was about a different granularity of data, rather than about what can be improved at the current granularity. Or, in other words:

James acquitted himself a bit on this subject, arguing that boatloads of new data isn’t useful if it isn’t boiled down into useful metrics. But a more general way of looking at this is: If we were starting over from scratch, with a box-score-sized space to report a statistical game summary, and a similar degree of game-scoring resources, what kinds of things would we want to include (or not) that are different from what we have now?  I can think of a few:

  1. In basketball, it’s archaic that free-throws aren’t broken down into bonus free throws and shot-replacing free throws.
  2. In football, I’d like to see passing stats by down and distance, or at least in a few key categories like 3rd and long.
  3. In baseball, I’d like to see “runs relative to par” for pitchers (though this can be computed easily enough from existing box scores).

In this panel, Dean Oliver took the opportunity to plug ESPN’s bizarre proprietary Total Quarterback Rating. They actually had another panel devoted just to this topic, but I didn’t go, so I’ll put a couple of thoughts here.

First, I don’t understand why ESPN is pushing this as a proprietary stat. Sure, no one knows how to calculate regular old-fashioned quarterback ratings, but there’s a certain comfort in at least knowing it’s a real thing. It’s a bit like Terms of Service agreements, which people regularly sign without reading: at least you know the terms are out there, so someone actually cares enough to read them, and presumably they would raise a stink if you had to sign away your soul.

As for what we do know, I may write more on this come football season, but I have a couple of problems:

One, I hate the “clutch effect.” TQBR makes a special adjustment to value clutch performance even more than its generic contribution to winning. If anything, clutch situations in football are so bizarre that they should count less. In fact, when I’ve done NFL analysis, I’ve often just cut the 4th quarter entirely, and I’ve found I get better results. That may sound crazy, but it’s a bit like how some very advanced Soccer analysts have cut goal-scoring from their models, instead just focusing on how well a player advances the ball toward his goal: even if the former matters more, its unreliability may make it less useful.

Two, I’m disappointed in the way they “assign credit” for play outcomes:

Division of credit is the next step. Dividing credit among teammates is one of the most difficult but important aspects of sports. Teammates rely upon each other and, as the cliché goes, a team might not be the sum of its parts. By dividing credit, we are forcing the parts to sum up to the team, understanding the limitations but knowing that it is the best way statistically for the rating.

I’m personally very interested in this topic (and have discussed it with various ESPN analytics guys since long before TQBR was released). This is basically an attempt to address the entanglement problem that permeates football statistics.  ESPN’s published explanation is pretty cryptic, and it didn’t seem clear to me whether they were profiling individual players and situations or had created credit-distribution algorithms league-wide.

At the conference, I had a chance to talk with their analytics guy who designed this part of the metric (his name escapes me), and I confirmed that they modeled credit distribution for the entire league and are applying it in a blanket way. Technically, I guess this is a step in the right direction, but it’s purely a reduction of noise and doesn’t address the real issue. What I’d really like to see is something like a recursive model that imputes how much credit various players deserve broadly, then uses those numbers to re-assign credit for particular outcomes (rinse and repeat).
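To make the "rinse and repeat" idea concrete, here is a minimal Python sketch. It is not ESPN's method, and it is only one of many ways such a model could be built: the function name `iterate_credit`, the `(participants, value)` play format, and the rule of splitting each play's value in proportion to current ratings are all assumptions made for illustration.

```python
def iterate_credit(plays, n_iter=50):
    """Toy recursive credit model (illustrative assumption, not TQBR).

    plays: list of (participants, value) pairs, where participants is a
    tuple of player names and value is a non-negative play value.
    Returns a dict mapping each player to average credit per play.
    """
    players = {p for parts, _ in plays for p in parts}
    # Start by assuming everyone deserves equal credit.
    rating = {p: 1.0 for p in players}

    for _ in range(n_iter):
        credit = {p: 0.0 for p in players}
        count = {p: 0 for p in players}
        for parts, value in plays:
            total = sum(rating[p] for p in parts)
            for p in parts:
                # Split this play's value by current ratings;
                # fall back to an even split if all ratings are zero.
                share = rating[p] / total if total > 0 else 1.0 / len(parts)
                credit[p] += value * share
                count[p] += 1
        # Re-impute each player's rating from the credit just assigned.
        rating = {p: credit[p] / count[p] for p in players}
    return rating


# Hypothetical data: one QB throwing to two receivers.
plays = [(("QB", "A"), 10.0), (("QB", "B"), 2.0)]
ratings = iterate_credit(plays)
```

Note that a scheme like this can settle on different answers depending on the starting ratings, which is really just the entanglement problem showing up again: the plays alone don't pin down who deserves what.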

Deconstructing the Rebound With Optical Tracking Data

Rajiv Maheswaran, and other nerds.

This presentation was so awesome that I offered them a hedge bet for the “Best Research Paper” award. That is, I would bet on them at even money, so that if they lost, at least they would receive a consolation prize. They declined. And won. Their findings are too numerous and interesting to list, so you should really check it out for yourself.

Obviously my work on the Dennis Rodman mystery makes me particularly interested in their theories of why certain players get more rebounds than others, as I tweeted in this insta-hypothesis:

Following the presentation, I got the chance to talk with Rajiv for quite a while, which was amazing. Obviously they don’t have any data on Dennis Rodman directly, but Rajiv was also interested in him and had watched a lot of Rodman video. Though anecdotal, he did say that his observations somewhat confirmed the theory that a big part of Rodman’s rebounding advantage seemed to come from handling space very well:

  1. Even when away from the basket, Rodman typically moved to the open space immediately following a shot. This is a bit different from how people often think about rebounding as aggressively attacking the ball (or as being able to near-psychically predict where the ball is going to come down).
  2. Also rather than simply attacking the board directly, Rodman’s first inclination was to insert himself between the nearest opponent and the basket. In theory, this might slightly decrease the chances of getting the ball when it heads in toward his previous position, but would make up for it by dramatically increasing his chances of getting the ball when it went toward the other guy.
  3. Though a little less purely strategical, Rajiv also thought that Rodman was just incredibly good at #2. That is, he was just exceptionally good at jockeying for position.

To some extent, I guess this is just rebounding fundamentals, but I still think it’s very interesting to think about the indirect probabilistic side of the rebounding game.

Live B.S. Report with Bill James

Quick tangent: At one point, I thought Neil Paine summed me up pretty well as a “contrarian to the contrarians.” Of course, I don’t think I’m contrary for the sake of contrariness, or that I’m a negative person (I don’t know how many times I’ve explained to my wife that just because I hated a movie doesn’t mean I didn’t enjoy it!), it’s just that my mind is naturally inclined toward considering the limitations of whatever is put in front of it. Sometimes that means criticizing the status quo, and sometimes that means criticizing its critics.

So, with that in mind, I thought Bill James’s showing at the conference was pretty disappointing, particularly his interview with Bill Simmons.

I have a lot of respect for James.  I read his Historical Baseball Abstract and enjoyed it considerably more than Moneyball.  He has a very intuitive and logical mind. He doesn’t say a bunch of shit that’s not true, and he sees beyond the obvious. In Saturday’s “Rebooting the Box-score” panel, he made an observation that having 3 of 5 people on the panel named John implied that the panel was [likely] older than the rest of the room.  This got a nice laugh from the attendees, but I don’t think he was kidding.  And whether he was or not, he still gets 10 kudos from me for making the closest thing to a Bayesian argument I heard all weekend.  And I dutifully snuck in for a pic with him:

James was somewhat ahead of his time, and perhaps he’s still one of the better sports analytic minds out there, but in this interview we didn’t really get to hear him analyze anything, you know, sportsy. This interview was all about Bill James and his bio and how awesome he was and how great he is and how hard it was for him to get recognized and how much he has changed the game and how, without him, the world would be a cold, dark place where ignorance reigned and nobody had ever heard of “win maximization.”

Bill Simmons going this route in a podcast interview doesn’t surprise me: his audience is obviously much broader than the geeks in the room, and Simmons knows his audience’s expectations better than anyone. What got to me was James’s willingness to play along, and everyone else’s willingness to eat it up. Here’s an example of both, from the conference’s official Twitter account:

Perhaps it’s because I never really liked baseball, and I didn’t really know anyone did any of this stuff until recently, but I’m pretty certain that Bill James had virtually zero impact on my own development as a sports data-cruncher.  When I made my first PRABS-style basketball formula in the early 1990’s (which was absolutely terrible, but is still more predictive than PER), I had no idea that any sports stats other than the box score even existed. By the time I first heard the word “sabermetrics,” I was deep into my own research, and didn’t bother really looking into it deeply until maybe a few months ago.

Which is not to say I had no guidance or inspiration.  For me, a big epiphanous turning point in my approach to the analysis of games did take place—after I read David Sklansky’s Theory of Poker. While ToP itself was published in 1994, Sklansky’s similar offerings date back to the 70s, so I don’t think any broader causal pictures are possible.

More broadly, I think the claim that sports analytics wouldn’t have developed without Bill James is preposterous. Especially if, as I assume we do, we firmly believe we’re right. This isn’t like L. Ron Hubbard and Incident II: being for sports analytics isn’t like having faith in a person or his religion. It simply means trying to think more rigorously about sports, and using all of the available analytical techniques we can to gain an advantage. Eventually, those who embrace the right will win out, as we’ve seen begin to happen in sports, and as has already happened in nearly every other discipline.

Indeed, by his own admission, James liked to stir controversy, piss people off, and talk down to the old guard whenever possible. As far as we know, he may have set the cause of sports analytics back, either by alienating the people who could have helped it gain acceptance, or by setting an arrogant and confrontational tone for his disciples (e.g., the uplifting “don’t feel the need to explain yourself” message in Moneyball). I’m not saying that this is the case or even a likely possibility, I’m just trying to illustrate that giving someone credit for all that follows—even a pioneer like James—is a dicey game that I’d rather not participate in, and that he definitely shouldn’t.

On a more technical note, one of his oft-quoted and re-tweeted pearls of wisdom goes as follows:

Sounds great, right? I mean, not really, I don’t get the metaphor: if the sea is full of ignorance, why are you collecting water from it with a bucket rather than some kind of filtration system? But more importantly, his argument in defense of this claim is amazingly weak. When Simmons asked what kinds of things he’s talking about, he repeatedly emphasized that we have no idea whether a college sophomore will turn out to be a great Major League pitcher. True, but, um, we never will. There are too many variables, the inputs and outputs are too far apart in time, and the contexts are too different. This isn’t the sea of ignorance, it’s a sea of unknowns.

Which gets at one of my big complaints about stats-types generally.  A lot of people seem to think that stats are all about making exciting discoveries and answering questions that were previously unanswerable. Yes, sometimes you get lucky and uncover some relationship that leads to a killer new strategy or to some game-altering new dynamic. But most of the time, you’ll find static. A good statistical thinker doesn’t try to reject the static, but tries to understand it: Figuring out what you can’t know is just as important as figuring out what you can know.

On Twitter I used this analogy:

Success comes with knowing more true things and fewer false things than the other guy.

Graphs of the Day: Bird vs. Bron

One of my favorite stat-nuggets ever is that “Larry Bird never had a losing month.” So, yesterday, I figured it was about time to check whether or not it’s, you know, true.

To do this, I first had to figure out which Celtics games Bird actually played in. The problem there is that his career began well before 1986, meaning the box score data aren’t in Basketball Reference’s database. But they do have images of the actual box scores, like so:

Fortunately, Bird played in every game in his first two seasons, so figuring this out was just a matter of poring through 4 years of these pics: Easy peasy! (I’ve done more grueling work for even more trivial questions, to be sure.) But results on that later.

Independently, I was trying to come up with a fun way to illustrate the fact that LeBron James won a lot more games in his last two seasons on the lowly Cleveland Cavaliers than he has so far on the perma-hyped Miami Heat:

So that graph reflects every game of LeBron’s career, including the regular season and playoffs (through last night). It’s pretty straightforward: With LeBron an 18-year-old rookie, the Cavs (though much improved) were still pretty shaky, and they pretty much got better and better each year. After a slight decline from their soaring 2008 performance, LeBron left to join the latest Big 3—which is a solid contender, but no threat to the greatest Big 3. (BTW, I would like to thank the Heat for becoming Exhibit A for my long-time contention that having multiple “primary” options is less valuable than having a well-designed supporting cast—even one with considerably less talent.)

But with Mr. Trifecta on my mind (not to mention overloading my browser history), I thought it might be fun to compare the two leading contenders for the small forward spot on any NBA GOAT team. So here’s Larry:

Wow, pretty crazy consistent, yes? Keep in mind that, despite the Celtics’ long winning tradition, they only won 29 games the year before Bird’s arrival. Note the practically opposite gradient from LeBron’s: Bird started out hot, and basically stayed hot until injuries cooled him down.

As for the results of the original inquiry: It turns out Bird’s Celtics started the season 2-4 in November 1988, just before Bird had season-ending ankle surgery (of course, Bird’s 1988 games ARE in my database, so this was a bit of a “Doh!” finding). And, of course, he also had losing months in the playoffs.

His worst full month in the regular season, however, was indeed exactly .500: He went 8-8 in March of 1982. So, properly qualified (like, “In the regular season, Bird never had a losing month in which he played more than 6 games”), the claim holds up. If I were a political fact-checker, I would deem it “Mostly True.”
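For anyone who wants to run the same check on their own game logs, here is a rough Python sketch of the tally. The function names and the `(year, month, won)` tuple format are my own assumptions, and the toy log below is built only from the two monthly records quoted above (2-4 in November 1988, 8-8 in March 1982), not from a full game log.

```python
from collections import defaultdict


def monthly_records(games):
    """Tally (wins, losses) per (year, month).

    games: iterable of (year, month, won) tuples, one per game played.
    """
    tally = defaultdict(lambda: [0, 0])
    for year, month, won in games:
        tally[(year, month)][0 if won else 1] += 1
    return {key: (w, l) for key, (w, l) in tally.items()}


def losing_months(records, min_games=1):
    """Months with more losses than wins and at least min_games played."""
    return sorted(
        key for key, (w, l) in records.items()
        if w < l and w + l >= min_games
    )


# Toy log reconstructed from the two records mentioned in the post.
toy_log = (
    [(1988, 11, True)] * 2 + [(1988, 11, False)] * 4
    + [(1982, 3, True)] * 8 + [(1982, 3, False)] * 8
)
records = monthly_records(toy_log)
unqualified = losing_months(records)             # no minimum-games filter
qualified = losing_months(records, min_games=7)  # "more than 6 games"
```

On this toy log, the unqualified check flags November 1988, while the "more than 6 games" version flags nothing, which is the whole difference between "False" and "Mostly True."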

In case you’re interested, here is the complete list of months in Larry Bird’s career:


Sports Geek Mecca: Recap and Thoughts, Part 1

So, over the weekend, I attended my second MIT Sloan Sports Analytics Conference. My experience was much different than in 2011: Last year, I went into this thing barely knowing that other people were into the same things I was. An anecdote: In late 2010, I was telling my dad how I was about to have a 6th or 7th round interview for a pretty sweet job in sports analysis, when he speculated, “How many people can there even be in that business? 10? 20?” A couple of months later, of course, I would learn.

A lot has happened in my life since then: I finished my Rodman series, won the ESPN Stat Geek Smackdown (which, though I am obviously happy to have won, is not really that big a deal—all told, the scope of the competition is about the same as picking a week’s worth of NFL games), my wife and I had a baby, and, oh yeah, I learned a ton about the breadth, depth, and nature of the sports analytics community.

For the most part, I used Twitter as sort of my de facto notebook for the conference.  Thus, I’m sorry if I’m missing a bunch of lengthier quotes and/or if I repeat a bunch of things you already saw in my live coverage, but I will try to explain a few things in a bit more detail.

For the most part, I’ll keep the recap chronological.  I’ve split this into two parts: Part 1 covers Friday, up to but not including the Bill Simmons/Bill James interview.  Part 2 covers that interview and all of Saturday.

Opening Remarks:

From the pregame tweets, John Hollinger observed that 28 NBA teams sent representatives (that we know of) this year.  I also noticed that the New England Revolution sent 2 people, while the New England Patriots sent none, so I’m not sure that number of official representatives reliably indicates much.

The conference started with some bland opening remarks by Dean David Schmittlein.  Tangent: I feel like political-speak (thank everybody and say nothing) seems to get more and more widespread every year. I blame it on fear of the internet. E.g., in this intro segment, somebody made yet another boring joke about how there were no women present (personally, I thought there were significantly more than last year), and was followed shortly thereafter by a female speaker, understandably creating a tiny bit of awkwardness. If that person had been more important (like, if I could remember his name to slam him), I doubt he would have made that joke, or any other joke. He would have just thanked everyone and said nothing.

The Evolution of Sports Leagues

Featuring Gary Bettman (NHL), Rob Manfred (MLB), Adam Silver (NBA), Steve Tisch (NYG) and Michael Wilbon moderating.

This panel really didn’t have much of a theme; it was mostly Wilbon creatively folding a bunch of predictable questions into arbitrary league issues. E.g.: “What do you think about Jeremy Lin?!? And, you know, overseas expansion blah blah.”

I don’t get the massive cultural significance of Jeremy Lin, personally.  I mean, he’s not the first ethnically Chinese player to have NBA success (though he is perhaps the first short one).  The discussion of China, however, was interesting for other reasons. Adam Silver claimed that Basketball is already more popular in China than soccer, with over 300 million Chinese people playing it.  Those numbers, if true, are pretty mind-boggling.

Finally, there was a whole part about labor negotiations that was pretty well summed up by this tweet:

Hockey Analytics

Featuring Brian Burke, Peter Chiarelli, Mike Milbury and others.

The panel started with Peter Chiarelli being asked how the world champion Boston Bruins use analytics, and in an ominous sign, he rambled on for a while about how, when it comes to scouting, they’ve learned that weight is probably more important than height.

Overall, it was a bit like any scene from the Moneyball war room, with Michael Schuckers (the only pro-stats guy) playing the part of Jonah Hill, but without Brad Pitt to protect him.

When I think of Brian Burke, I usually think of Advanced NFL Stats, but apparently there’s one in Hockey as well.  Burke is GM/President of the Toronto Maple Leafs. At one point he was railing about how teams that use analytics have never won anything, which confused me since I haven’t seen Toronto hoisting any Stanley Cups recently, but apparently he did win a championship with the Mighty Ducks in 2007, so he clearly speaks with absolute authority.

This guy was a walking talking quote machine for the old school. I didn’t take note of all the hilarious and/or non-sensical things he said, but for some examples, try searching Twitter for “#SSAC Brian Burke.” To give a sense of how extreme he was, someone tweeted this quote at me, and I have no idea if Burke actually said it or if this guy was kidding.

In other words, Burke was literally too over the top to effectively parody.

On the other hand, in the discussion of concussions, I thought Burke had sort of a folksy realism that seemed pretty accurate to me.  I think his general point is right, if a bit insensitive: If we really changed hockey so much as to eliminate concussions entirely, it would be a whole different sport (which he also claimed no one would watch, an assertion which is more debatable imo).  At the end of the day, I think professional sports mess people up, including in the head.  But, of course, we can’t ignore the problem, so we have to keep proceeding toward some nebulous goal.

Mike Milbury, presently a card-carrying member of the media, seemed to mostly embrace the alarmist media narrative, though he did raise at least one decent point about how the increase in concussions—which most people are attributing to an increase in diagnoses—may relate to recent rules changes that have sped up the game.

But for all that, the part that frustrated me the most was when Michael Schuckers, the legitimate hockey statistician at the table, was finally given the opportunity to talk. 90% of the things that came out of his mouth were various snarky ways of asserting that face-offs don’t matter. I mean, I assume he’s 100% right, but he just had no clue how to talk to these guys. Find common ground: you both care about scoring goals, defending goals, and winning. Good face-off skills get you the puck more often in the right situations. The question is how many extra possessions you get and how valuable those possessions are. And finally, what’s the actual decision in question?

Baseball Analytics

Featuring Scott Boras, Scott Boras, Scott Boras, some other guys, Scott Boras, and, oh yeah, Bill James.

In stark contrast to the Hockey panel, the Baseball guys pretty much bent over backwards to embrace analytics as much as possible. As I tweeted at the time:

Scott Boras seems to like hearing Scott Boras talk.  Which is not so bad, because Scott Boras actually did seem pretty smart and well informed: Among other things, Scott Boras apparently has a secret internal analytics team. To what end, I’m not entirely sure, since Scott Boras also seemed to say that most GM’s overvalue players relative to what Scott Boras’s people tell Scott Boras.

At this point, my mind wandered:

How awesome would that be, right?

Anyway, in between Scott Boras’s insights, someone asked this Bill James guy about his vision for the future of baseball analytics, and he gave two answers:

  1. Evaluating players from a variety of contexts other than the minor leagues (like college ball, overseas, Cubans, etc).
  2. Analytics will expand to look at the needs of the entire enterprise, not just individual players or teams.

Meh, I’m a bit underwhelmed.  He talked a bit about #1 in his one-on-one with Bill Simmons, so I’ll look at that a bit more in my review of that discussion. As for #2, I think he’s just way way off: The business side of sports is already doing tons of sophisticated analytics—almost certainly way more than the competition side—because, you know, it’s business.

E.g., in the first panel, there was a fair amount of discussion of how the NBA used “sophisticated modeling” for many different lockout-related analyses (I didn’t catch the Ticketing Analytics panel, but from its reputation, and from related discussions on other panels, it sounds like that discipline has some of the nerdiest analysis of all).

Scott Boras let Bill James talk about a few other things as well:  E.g., James is not a fan of new draft regulations, analogizing them to government regulations that “any economist would agree” inevitably lead to market distortions and bursting bubbles.  While I can’t say I entirely disagree, I’m going to go out on a limb and guess that his political leanings are probably a bit Libertarian?

Basketball Analytics

Featuring Jeff Van Gundy, Mike Zarren, John Hollinger, and (in place of Mark Cuban) Dean Oliver.

If every one of these panels was Mark Cuban + foil, it would be just about the most awesome weekend ever (though you might not learn the most about analytics). So I was excited about this one, which, unfortunately, Cuban missed. Filling in on zero/short notice was Dean Oliver.  Overall, here’s Nathan Walker’s take:

This panel actually had some pretty interesting discussions, but they flew by pretty fast and often followed predictable patterns, something like this:

  1. Hollinger says something pro-stats, though likely way out of his depth.
  2. Zarren brags about how they’re already doing that and more on the Celtics.
  3. Oliver says something smart and nuanced that attempts to get at the underlying issues and difficulties.
  4. Jeff Van Gundy uses forceful pronouncements and “common sense” to dismiss his strawman version of what the others have been saying.


Zarren talked about how there is practically more data these days than they know what to do with.  This seems true and I think it has interesting implications. I’ll discuss it a little more in Part 2 re: the “Rebooting the Box Score” talk.

There was also an interesting discussion of trades, and whether they’re more a result of information asymmetry (in other words, teams trying to fleece each other), or more a result of efficient trade opportunities (in other words, teams trying to help each other).  Though it really shouldn’t matter—you trade when you think it will help you, whether it helps your trade partner is mostly irrelevant—Oliver endorsed the latter.  He makes the point that, with such a broad universe of trade possibilities, looking for mutually beneficial situations is the easiest way to find actionable deals.  Fair enough.

Coaching Analytics

Featuring coaching superstars Jeff Van Gundy, Eric Mangini, and Bill Simmons.  Moderated by Daryl Morey.

OK, can I make the obvious point that Simmons and Morey apparently accidentally switched role cards?  As a result, this talk featured a lot of Simmons attacking coaches and Van Gundy defending them.  I honestly didn’t remember Mangini was on this panel until looking back at the book (which is saying something, b/c Mangini usually makes my blood boil).

There was almost nothing on how to evaluate coaches, say, by analyzing how well their various decisions comported with the tenets of win maximization. There was a lengthy (and almost entirely non-analytical) discussion of that all-important question of whether an NBA coach should foul or not up by 3 with little time left. Fouling probably has a tiny edge, but I think it’s too close and too infrequent to be very interesting (though obviously not as rare, it reminds me a bit of the impassioned debates you used to see on Poker forums about whether you should fast-play or slow-play flopped quads in limit hold’em).

There was what I thought was a funny moment when Bill Simmons was complaining about how teams seem to recycle mediocre older coaches rather than try out young, fresh talent. But when challenged by Van Gundy, Simmons drew a blank and couldn’t think of anyone.  So, Bill, this is for you.  Here’s a table of NBA coaches who have coached at least 1000 games for at least 3 different teams, while winning fewer than 60% of their games and without winning any championships:


Note that I’m not necessarily agreeing with Simmons: Winning championships in the NBA is hard, especially if your team lacks uber-stars (you know, Michael Jordan, Magic Johnson, Dennis Rodman, et al).

Part 2 coming soon!

Honestly, I got a little carried away with my detailed analysis/screed on Bill James, and I may have to do a little revising. So due to some other pressing writing commitments, you can probably expect Part 2 to come out this Saturday (Friday at the earliest).

A Defense of Sudden Death Playoffs in Baseball

So despite my general antipathy toward America’s pastime, I’ve been looking into baseball a lot lately.  I’m working on a three part series that will “take on” Pythagorean Expectation.  But considering the sanctity of that metric, I’m taking my time to get it right.

For now, the big news is that Major League Baseball is finally going to have realignment, which will most likely lead to an extra playoff team, and a one game Wild Card series between the non–division winners.  I’m not normally one who tries to comment on current events in sports (though, out of pure frustration, I almost fired up WordPress today just to take shots at Tim Tebow—even with nothing original to say), but this issue has sort of a counter-intuitive angle to it that motivated me to dig a bit deeper.

Conventional wisdom on the one game playoff is pretty much that it’s, well, super crazy.  E.g., here’s Jayson Stark’s take at ESPN:

But now that the alternative to finishing first is a ONE-GAME playoff? Heck, you’d rather have an appendectomy than walk that tightrope. Wouldn’t you?

Though I think he actually likes the idea, precisely because of the loco factor:

So a one-game, October Madness survivor game is what we’re going to get. You should set your DVRs for that insanity right now.

In the meantime, we all know what the potential downside is to this format. Having your entire season come down to one game isn’t fair. Period.

I wouldn’t be too sure about that.  What is fair?  As I’ve noted, MLB playoffs are basically a crapshoot anyway.  In my view, any move that MLB can make toward having the more accomplished team win more often is a positive step.  And, as crazy as it sounds, that is likely exactly what a one game playoff will do.

The reason is simple: home field advantage.  While smaller than in other sports, the home team in baseball still wins around 55% of the time, and more games means a smaller percentage of your series games played at home.  While longer series eventually lead to better teams winning more often, the margins in baseball are so small that it takes a significant edge for a team to prefer to play ANY road games:

Note: I calculated these probabilities using my favorite binom.dist function in Excel. Specifically, where the number of games needed to win a series is k, this is the sum from x=0 to x=k of the p(winning x home games) times p(winning at least k-x road games).

So assuming each team is about as good as their records (which, regardless of the accuracy of the assumption, is how they deserve to be treated), a team needs about a 5.75% generic advantage (around 9-10 games) to prefer even a seven game series to a single home game.
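For anyone who wants to poke at these numbers, here is a minimal Python sketch of the calculation described in the note. It is a reconstruction under stated assumptions, not the original Excel sheet: it assumes the "generic advantage" is the gap in season win percentage, that half of that gap carries over to a neutral head-to-head game, and that home field moves the result by 5 points in either direction. The names `series_win_prob`, `edge`, and `home_boost` are made up for the example.

```python
from math import comb

GAMES_PER_SEASON = 162


def binom_pmf(successes, trials, p):
    """Probability of exactly `successes` wins in `trials` independent games."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def binom_at_least(successes, trials, p):
    """Probability of at least `successes` wins in `trials` independent games."""
    if successes <= 0:
        return 1.0
    return sum(binom_pmf(i, trials, p) for i in range(successes, trials + 1))


def series_win_prob(edge, wins_needed, home_boost=0.05):
    """Chance the team with home field wins a series needing `wins_needed` wins.

    Assumptions (not from the post): `edge` is the gap in season win
    percentage, half of it carries over to a neutral head-to-head game,
    and home field shifts that by `home_boost` in either direction.
    The favored team gets `wins_needed` home games and one fewer road game.
    """
    neutral = 0.5 + edge / 2
    p_home = neutral + home_boost
    p_road = neutral - home_boost
    home_games, road_games = wins_needed, wins_needed - 1
    # Sum over x home wins of P(x home wins) * P(at least k - x road wins).
    return sum(
        binom_pmf(x, home_games, p_home)
        * binom_at_least(wins_needed - x, road_games, p_road)
        for x in range(home_games + 1)
    )


if __name__ == "__main__":
    for edge in (0.0, 0.0575, 0.074):
        print(
            f"edge {edge:.2%} (~{edge * GAMES_PER_SEASON:.1f} games): "
            f"1 home game {series_win_prob(edge, 1):.4f}, "
            f"best-of-5 {series_win_prob(edge, 3):.4f}, "
            f"best-of-7 {series_win_prob(edge, 4):.4f}"
        )
```

The sum treats every scheduled game as if it were played, which gives the same series winner as stopping once a team clinches. Under these assumptions, the single home game and the seven game series should come out nearly even at an edge of about 5.75%; a different mapping from record gap to head-to-head odds would move that break-even point.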

But what about the incredible injustice that could occur when a really good team is forced to play some scrub?  E.g., Stark continues:

It’s a lock that one of these years, a 98-win wild-card team is going to lose to an 86-win wild-card team. And that will really, really seem like a miscarriage of baseball justice. You’ll need a Richter Scale handy to listen to talk radio if that happens.

But you know what the answer to those complaints will be?

“You should have finished first. Then you wouldn’t have gotten yourself into that mess.”

Stark posits a 12 game edge between two wild card teams, and indeed, this could lead to a slightly worse spot for the better team than a longer series.  12 games corresponds to a 7.4% generic advantage, which means a 7-game series would improve the team’s chances by about 1% (oh, the humanity!).  But the alternative almost certainly wouldn’t be seven games anyway, considering the first round of the playoffs is already only five.  At that length, the “miscarriage of baseball justice” would be about 0.1% (and vs. 3 games, sudden death is still preferable).

If anything, consider the implications of the massive gap on the left side of the graph above: If anyone is getting screwed by the new setup, it’s not the team with the better record, it’s a better team with a worse record, who won’t get as good a chance to demonstrate their actual superiority (though that team’s chances are still around 50% better than they would have been under the current system).  And those are the teams that really did “[get themselves] into that mess.”

Also, the scenario Stark posits is extremely unlikely: basically, the difference between 4th and 5th place is never 12 games.  For comparison, this season the difference between the best record in the NL and the Wild Card Loser was only 13 games, and in the AL it was only seven.  Over the past ten seasons, each Wild Card team and their 5th place finisher were separated by an average of 3.5 games (about 2.2%):

Note that no cases over this span even rise above the seven game “injustice line” of 5.75%, much less to the nightmare scenario of 7.4% that Stark invokes.  The standard deviation is about 1.5%, and that’s with the present imbalance of teams (note that the AL is pretty consistently higher than the NL, as should be expected)—after realignment, this plot should tighten even further.

Indeed, considering the typically small margins between contenders in baseball, on average, this “insane” sudden death series may end up being the fairest round of the playoffs.

9/25 NFL Sunday Live Blog

As promised, I’ll be live-blogging all day.  Details here.

I’ll finally be getting NFL Sunday Ticket later this week (DirecTV is cheaper than cable, who knew?), but for now I’m stuck with what the networks give me.  As of last night, I thought the early game was going to be New England against Buffalo, but now my channel guide is saying Philadelphia/Giants.  I’ll find out in a few minutes.

This may not be the most controversial statement, but I think the two most powerful forces in the NFL over the last decade have been Peyton Manning and Bill Belichick (check out the 2nd tab as well):

I’m sure I’ll have more to say about the Great Hoodied One over the course of the day, so, for now, on with the show:

9:55 am: Watching pre-game.  Strahan is taking “overreaction” to a new level, not only declaring that maybe the NFL isn’t even ready for Cam Newton, but that this has taught him to stop being critical of rookie QB’s in the future.

10:00 am: CBS pregame show over and now it’s a paid advertisement for the Genie Bra (I’m so tempted).  And, yep, Fox has the Philly game.

10:10: In case you haven’t seen it, the old “Graph of the Day” that I tweaked for the above is here.

10:15: Nothing wrong with that interception by Vick.  Ugh, commercials.  I hate Live TV, especially when there’s only one thing on.

10:20: Belichick, of course, is known for winning Super Bowls, going for it on 4th down, and:

Good thing he doesn’t have to worry about potential employers Googling him.

10:24: Manning gets first blood in this battle of “#1 draft picks who everyone was ready to give up on but then performed miracles.”

10:30: True story: Yesterday, my wife needed a T-shirt, and ended up borrowing my souvenir shirt from SSAC (MIT/Sloan Sports Analytics Conference). She was still wearing it when we went to see Moneyball last night, and, sure enough, she ended up liking it (nerd!) and I thought it was pretty dull.

10:35: David: Wins in season n against wins in season n+1.  Sorry, maybe should have explained that.

10:39: Aaron Schatz tweeted:


Bills go for it on fourth-and-14 from NE 35… and Fitzpatrick throws his second pick (first that is his fault)

4th and 14 is a situation where I think more quarterbacks throw too few interceptions than throw too many.

10:48: Tom Coughlin thinks LeSean McCoy is the fastest running back in the NFL? Which would make him faster than Chris Johnson? Who thinks he’s faster than Usain Bolt?  What is he, a neutrino?

11:00: O.K., per Matt’s request, I’ve added a “latest updates” section to the top.  Let me know if you like this better.

11:05: And sorry about the neutrino joke.  Incidentally, it probably goes without saying, but CJ is not as fast as Bolt in the 40. Bolt is the fastest man on earth at any distance from 50 meters to 300.  I’ll post graphs in a bit.

11:17: Man, I am having all kinds of browser problems.  May have to switch computers.  Anyway, here’s a Bolt graph:

Since we know his split time (minus reaction) over 32.8 yards was 3.64 seconds, using the curve above we can nail down his time at 40 yards pretty accurately: it’s around 4.19 to 4.21. (Note, the first 50 Meters of Bolt’s 100M world record were faster than the record for 50 meters indoors.)
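A rough way to sanity-check that range without the curve: take the 3.64 second split at 30 meters (32.8 yards) and cover the remaining distance to 40 yards at an assumed constant speed. This Python sketch swaps the fitted curve for that simpler extrapolation, and the 11.5 to 12 m/s speeds are hypothetical inputs chosen for illustration, not figures from the graph.

```python
YARD_IN_METERS = 0.9144


def forty_yard_time(split_30m, speed_after_30m):
    """Extrapolate a 40-yard time from a 30 m split.

    `speed_after_30m` (m/s) is an assumed constant speed over the last
    stretch; it is an illustrative input, not a measured value.
    """
    remaining_m = 40 * YARD_IN_METERS - 30.0  # about 6.6 m left to cover
    return split_30m + remaining_m / speed_after_30m


if __name__ == "__main__":
    for speed in (11.5, 12.0):  # hypothetical speeds
        print(f"{speed} m/s -> {forty_yard_time(3.64, speed):.2f} s")
```

With those assumed speeds the estimate lands at roughly 4.19 to 4.21 seconds, so the answer is not very sensitive to the exact shape of the curve over the last few meters.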

11:25: Incidentally, Chris Johnson’s 40 time of 4.25 is bogus.  I won’t go into all the details, but I’ve calculated his likely 40 time (for purposes of comparison with Bolt), and it’s more like 4.5.  Of course, that’s a bit of “apples to oranges” while combine times vs. each other are “apples to apples,” but the point is that Bolt’s advantage over CJ is much bigger than .05.

11:28: Ok, halftime.  Unremarkable game so far, though Eli got a good stat boost from a couple of nice catch-and-runs.  I would love to see how Vick performs under pressure, so I’m glad they’ve gotten close again.  Going to grab a snack.

11:44: Matt: I’ve seen a few things, but I don’t have the links in front of me. Prior to Berlin, Usain was definitely known as a slow starter with a crazy top speed that made up for it, but in his 9.58 WR run he was pretty much textbook and led wire to wire, posting the fastest splits ever at every point.

11:58: Lol, everyone loves when linemen advance the ball.  Until they fumble.  Then they’re pariahs.

12:07: I haven’t really used Advanced NFL Stats WPA Calculator much, as I’ve been (very slowly) trying to build my own model.  But I just noticed it doesn’t take time outs into account.  I’m curious whether that’s the same for his internal model or if that’s just the calculator.  Obv timeouts make a huge difference in endgame or even end-half scenarios (and accounting for them properly is one of the toughest things to figure out).

12:11: Man, I was just thinking how old I must be that I remember the Simpsons’ origins on the Tracey Ullman Show, but the Fox promo department made me feel all better by viscerally reminding me that they’re still on.

12:14: Google Search Leading to My Blog of the Day: “what sport does dennis rodman play”

12:22: So both Donovan McNabb and Michael Vick have been considerably better QB’s in Philadelphia than elsewhere.  At some point, does Andy Reid get some credit?  Without a Super Bowl ring, he’s generally respected but not revered in the mainstream, and he’s such a poor tactician that he’s dismissed by most analytics types.  But he may be one of the best offensive schemers in the modern era.

12:34: Moneyball nit-picking: The Athletics won their last game of the season in 2004, 2005, 2007, and 2010. (It’s not that hard when you don’t make the playoffs).

12:40: I kind of feel the same way about Vick that I felt about Stephen Strasburg after he hurt his arm last year: their physical skills are so unprecedented that, unfortunately, Bayesian inference suggests that their injury-proneness isn’t a coincidence.

12:45: David: I just mean that he has notoriously bad time management skills, makes ridiculous 4th down decisions, and generally seems clueless about win maximization, esp. in end-game scenarios.

12:48: So if the Eagles go on to lose, does this make Vick 1-0 with 2 “no decisions” for the year?

12:54: Wow, Tom Brady has as many. . . Crap, Aaron Schatz beat me to it:


Tom Brady has as many INT in this game as he had all last year. Egads.

12:57: Dangit, exciting New England end-game and I’m stuck watching the Giants beat Vick’s backup.  Argh!

1:03: Really Moneyball is all about money, not statistics. Belichick would be such a better subject for a sports-analytics movie than Billy Beane.  It’s dramatic how Belichick has been willing to do whatever it takes to win—whether it be breaking the rules or breaking with convention—plus, you know, with more success.

1:06: “Bonus Coverage” on Fox is Detroit v. Minnesota.  CBS just started KC/San Diego.

1:14: Top-notch analysis from Arturo:


Holy effin Christ. Bills/Pats

1:19: Nate: Are you referring to the uber-exciting Pats/Bills game that I can’t watch?  I’ll check the p-b-p.

1:27: Congrats Nate and Lions fans everywhere!

1:30: Ok, I’m going to take a short lunch break, I’ll be back @ 2ish PST.

2:05: Nate asks:

Any thoughts on the Lions kicking a 32-yard FG in overtime from the left hash on first down?

I’ve thought about this situation a bit, and I don’t hate it.  Let me pull up this old graph:

So a kneel in the center is maybe slightly better: generically, they lose a percentage or two, but I’m pretty sure that even from that distance you lose a percentage or two for being on the hash.  Kickers are good enough at that length that going for extra yards or a TD isn’t really worth it, plus you’re not DOA even if you miss (while you might be if you turn the ball over).

2:08: Btw, I’ve got Green Bay/Chicago on Fox to go with aforementioned KC/SD on CBS.

2:12: Also from that post where the graph came from, the “OT Choke Factor” for kicks of that length is negligible.

2:35: So this Neutrino [measured as faster-than-light, in case you’ve been living under a rock or aren’t a total dork] situation is pretty fascinating to me.  What’s amazing is that, even days later, no one has been able to posit a good theory for either the result-as-good OR where the error might be coming from.

Note this wasn’t like some random crackpot scientist, this was a massive team at CERN, which is like the Supreme Court of the particle physics world.

It’s a bit like if you brought the world’s best mathematicians and computer scientists together to design a simple and effective way to calculate Pi, only to have it spit out 3.15.  It just can’t possibly be, yet no one has a good explanation for how they screwed up.

To complicate things further, you have previous, “statistically insignificant” results at MINOS that also clocked neutrinos as FTL.  Independently, this should be irrelevant, but as a Bayesian matter, a prior consistent result — even an “insignificant” one — can exponentially increase the likelihood of the latter being valid.  If it had been any other discovery, this would be iron-clad evidence, and it would probably be scientific “consensus” by now.

So, as a second-order observation, assuming they eventually do find whatever the error may be in this case, doesn’t it suggest that there may be other “consensus” issues with similarly difficult-to-find errors underlying them that were simply never challenged b/c they weren’t claiming that 2+2=5?
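To make the Bayesian point about the MINOS result concrete, here is a small Python sketch of how independent pieces of evidence stack up in odds form. Every number in it is invented for illustration (the prior and both likelihood ratios are hypothetical), and the function name `posterior` is just a label for the example.

```python
def posterior(prior, likelihood_ratios):
    """Update a prior probability with independent likelihood ratios.

    Each ratio is P(evidence | claim true) / P(evidence | claim false).
    Independent evidence multiplies in odds space.
    """
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


if __name__ == "__main__":
    prior = 1e-6            # hypothetical: FTL neutrinos start out wildly unlikely
    strong_result = 1000.0  # hypothetical ratio for a highly significant measurement
    weak_result = 3.0       # hypothetical ratio for an "insignificant" earlier hint

    print(posterior(prior, [strong_result]))
    print(posterior(prior, [strong_result, weak_result]))
```

Because the ratios multiply, even a weak earlier result scales the odds by its full factor, though with a prior this small the claim still ends up far from likely.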

2:36: Ok, I think I’m required by Nerd Law to post the XKCD comic on the topic:

2:41: Added links.

2:43: Argh.  NFL Live Blog, I know.  Sorry.

2:52: News says that a 3.3 earthquake “hit” Los Angeles today.  Um, 3.3.  I’m pretty sure that’s also known as “not an earthquake.”

3:05: OK, this has little to do with the game I’m “watching” (something about the non-Detroit NFC central bores me now with Favre gone and Lovie/Martz coaching in Chicago), but here’s a brand new (10 minutes old) bar chart from my salary study:

Obv this stuff becomes more meaningful in a regression context, but it’s interesting even at this level.  A little interpretation to follow.

3:20: “Overspending” is the total amount spent above the sum of cap values for all your active players, like “loading up” on one year by paying a lot of pro-rated signing bonuses.  For position players, their cap value is the best (salary-based) predictor of their value, so, unsurprisingly, teams with high immediate cap values tend to have the better teams (while total money spent also correlates positively, it’s entirely because it also correlates with total cap value).

What’s interesting about running backs is that RB cap value correlates positively, but signing bonus correlates negatively.  My unconsidered interpretation is that RB’s are valuable enough to spend money on when they’re actually good, but they’re too hard to evaluate to try to buy yourself one.

3:36: Matt Glassman asks:

Question re: field goals — What percentage are you looking for your kicker to have at the longest range you are willing to regularly (i.e. throughout the game) use him?

I’ll use a static example: if your kicker was a known 50% from 52 yards, would you regularly take that over a punt? What about 40%, etc. Then make it dynamic, where the kicker has some shrinking probability as he moves back, and the coach has a decision about whether to kick/punt from a given distance. At what maximum distance/percentage do you regularly kick, rather than regularly punt.

This is a good question and topic, but it’s extremely hard to generalize. It depends on your game situation and what your alternatives are. Long kicks, for example, are generally bad—even with a relatively good long-range kicker.  But in late-game or late-half scenarios, clearly being able to take long kicks can be very valuable.

It is demonstrable, however, that NFL kickers have gotten incredibly good compared to past kickers.  Aside from end-game scenarios, kicking FG’s used to be almost universally dominated by going for it (or sometimes punting). But since kickers have become so accurate, the balance has gotten more delicate.

Also [sort of contra Brian Burke, I’m thinking of a link but can’t find it], I think individual team considerations are a much bigger factor in these decisions than just raw WPA.  It depends a lot on how good your offense is, how good it is at converting particular distances, how good your defense is, etc.  While the percentage differences may be fairly small for the instant decision, they pile up on each other in these types of multi-pronged calculations.
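One way to frame Matt's static example is as a break-even calculation: given your win probability after a make, after a miss, and after a punt, what make percentage leaves you indifferent? The Python sketch below does that arithmetic. The win probabilities are placeholders, not output from any real model, and as noted above they would shift with the game situation and the particular team.

```python
def break_even_fg_pct(wp_make, wp_miss, wp_punt):
    """Make percentage at which kicking and punting are equally good.

    Solves p * wp_make + (1 - p) * wp_miss = wp_punt for p.
    All three inputs are win probabilities supplied by the caller.
    """
    if wp_make <= wp_miss:
        raise ValueError("a made kick must be worth more than a miss")
    return (wp_punt - wp_miss) / (wp_make - wp_miss)


if __name__ == "__main__":
    # Hypothetical win probabilities for a single game situation.
    pct = break_even_fg_pct(wp_make=0.60, wp_miss=0.45, wp_punt=0.52)
    print(f"kick if the kicker makes this more than {pct:.1%} of the time")
```

A result above 1 means punting wins even with a perfect kicker, and a result below 0 means kicking wins even with a hopeless one.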

3:50: I have to admit, Aaron Rodgers is a great QB who seems to defy my “Show me a QB who doesn’t throw interceptions, and I’ll show you a sucky quarterback” rule of thumb.  And it’s not like Tom Brady, who throws INT’s when his team is struggling and doesn’t throw them when his team is awesome (which, ofc, I have NO problem with): Rodgers has a crazy-low INT rate on a team that has been mediocre (2008), good-but-not-great (2009), or all over the place (2010) during his 3 years as a starter.

4:05: Ok, purely for fun, let’s compare the all-time single-season leaders in (low) Int% (from Pro Football Reference):

With the all-time leaders for most INT thrown (also from Pro Football Reference):

Not drawing any conclusions or doing any scientific comparisons, but both lists seem to have plenty of studs as well as plenty of duds. (Actually, when I first made this comparison a couple of years ago, the “Most” list had a much better resume than the “Least” list.  But since then, the ‘good’ list has added several quality new members.)

4:11: O.K., I think I’m switching to Football Night in America.  Peter King! He was the first sports columnist I ever read regularly (though eventually I stopped).  I mean, he talks and writes completely out of his ass, but there’s a kind of refreshing sincerity about him.

4:24: So should I be more or less excited about Cam Newton after his win today?  He had a much more “rookie-like” box of 18/34 for 158.  Here’s how to break that down for rookies: Low yards = bad. High attempts = good.  Completion percentage = completely irrelevant. Win = Highly predictive of length of career, not particularly predictive of quality (likely b/c a winning rookie season gets you a lot of mileage whether you’re actually good or not). Oh, and he’s still tall:  Height is also a significant indicator (all else being equal).

Short break, back in a few.

4:43: Detroit is currently 3-0 and leading the league in Point Differential at +55, and unlikely to be passed by anyone any time soon [by which I mean, this weekend].

4:58: That +55 would be the 16th best since 2000.  Combined with their 3-0 record, they project to win ~11 games, though with lots of variance:

Yes, this can be calculated more precisely, but it will be around 11 games regardless.

5:12: The teams who led in MOV after 3 weeks since 2000 were:

  • 2010: Pittsburgh, +39, Lost Super Bowl
  • 2009: New Orleans, +64, Won Super Bowl
  • 2008: Tennessee, +43, Lost Divisional
  • 2007: New England, +79, Lost Super Bowl
  • 2006: San Diego, +57, Lost Divisional
  • 2005: Cincinnati, +60, Lost Wild Card
  • 2004: Seattle, +52, Lost Wild Card
  • 2003: Denver, +65, Lost Wild Card
  • 2002: Miami, +63, Missed Playoffs
  • 2001: Green Bay, +80, Lost Divisional
  • 2000: Tampa Bay, +67, Lost Wild Card

Not bad.  Only Miami missed the playoffs, and they were in a 3 way tie atop the AFC East at 9-7.

5:23: I hate to keep going back to Schatz, but he posts so much and so fast that he’s dominating my Twitter feed.  Anyway, the latest:

Weird week for FO Premium picks. 8-6 vs. spread (4-0 Green/Yellow) but 5-9 straight up.

In 2002, I picked better against the spread than straight up over the entire season (picking every game).

5:34: Shout-out to Matt Glassman for plugging my live blog on his:

One look at his blog will convince you that he’s not only a killer sports statistician, but he’s also an engaging and humorous writer.

Though, at best, this generous praise is a game of “Two Truths and a Lie.”  [I’m not even remotely a statistician.]

5:39: If I were more clever, I’d think of some riff off Jay-Z’s “99 Problems” line:

Nah, I ain’t pass the bar but i know a little bit

Enough that you won’t illegally search my shit

Incidentally, love the Rap Genius annotation for that lyric (also apt to my situation):

If you represent yourself (pro se), Bar admission is not required, actually

5:55: Since I’m obv watching the Indy game, a few things Peyton Manning coming up.  First, a quick over/under: .5, for number of Super Bowls won by Peyton Manning as a coach?

I mean, I’d take the under obv just b/c of the sheer difficulty of winning Super Bowls, but I’d be curious about the moneyline.

6:20: Sorry, was looking at something completely new to me.  Not sure exactly what to make of it, but it’s interesting:

This is QB’s with 7+ seasons of 8+ games who averaged 200+ yards per game (n=42).  These are their standard deviations, from season to season (counting only the 8+ gm years), for Yards per Game vs. Adjusted Net Yards Per Attempt.

The red dot is our absentee superstar, Peyton Manning, and the green is Johnny Unitas.  The orange is Randall Cunningham, but his numbers I think are skewed a bit because of the Randy Moss effect.  The dot at the far left of the trend-line is Jim Kelly.

6:28: So what to make of it?  I’ve been mildly critical of Adjusted Net Yards Per Attempt for the same reasons I’ve been critical of Win Percentage Added: Since the QB is involved in basically every offensive play, both of these tend to track two things: 1) Their raw offensive quality, plus (or multiplied by) 2) The amount which the team relies on the passing game.  Neither is particularly indicative of a QB doing the best with what he can, as it is literally impossible to put up good numbers in these stats on a bad team.

So it’s interesting to me that Peyton — who most would agree is one of the most consistent QB’s in football — would have such a high ANY/A standard dev (he also has a larger sample than some of the other qualifiers).

6:35: An incredibly superficial interpretation might be that Peyton sacrifices efficiency in order to “get his yards.” OTOH, this may be counter-intuitive, but I wonder if it’s not actually the opposite: Peyton was an extremely consistent winner.  Is it possible that the ANY/A to some extent reflected the quality of his supporting cast, but the yards sort of indirectly reflect his ability to transfer whatever he had into actual production? Obv I’d have to think about it more.

6:42: I think when this is over, maybe I should split it into separate parts, roughly grouped by content?  Getting unwieldy, but kind of too late to split it now.

7:02: So, according to the commentators, Mike Wallace is now the fastest player in the NFL, which makes him faster than LeSean McCoy, who (as the fastest RB) is faster than Chris Johnson, who (by proclamation) is faster than Usain Bolt, who is the fastest man on the planet.  So either someone is an alien (or a neutrino! [sorry, can’t help myself]), or something’s got to give.

7:17: David asks:

Q: The Bills for real? What do they project to over a season?

Um, I don’t know.  Generically, being 3-0 and +40 projects to 10 or 11 wins, but there’s a lot of variance in there.  The previous season’s results are still fairly significant, as are the million other things most fans could tick off.  Another statistically significant factor that most people prob wouldn’t think of is preseason results.  The Bills scored 24 and 35 points in games 2 and 3 of the preseason.  There’s a ton of work behind this result, but basically I’ve found that points scored plus points scored in games 2 and 3 of the preseason (counting backwards) is approximately as predictive as points scored minus points allowed in one game from the regular season.  So, loosely speaking, in this case, you might say that the Bills are more like a 4-0 team, with the extra game worth of data being the equivalent of a fairly quality win over a Denver/Jacksonville Hybrid.

7:27: I’d also note that it’s difficult to take strength of schedule into account at this point, at least in a quantitative way.  You can make projections about the quality of a team’s opponents, but the error in those projections is so large at this point that they add more variance to your target team’s projections than they are worth.  Or, maybe a simpler way to put it: it’s hard enough to adjust for quality of opponent when you *know* how good they were, and we don’t even know, we just have educated guesses.  (Even at the END of the season, I think a lot of ranking models and such don’t sufficiently account for the variance in SoS: that is, when a team beats x number of teams with good records, they can do very well in those rankings, even though some of the teams they beat overperformed in their other games.  In fact, given regression to the mean, this will almost always be the case.  Of course, a clever enough model should account for this uncertainty.)

7:28: Man, that was a seriously ramble-y answer to a simple question.

7:44: I remember Mike Wallace being a valuable backup on my fantasy team in 2009, otherwise, meh.  Seems to talk a lot of crap that these announcers eat up.  Ironically, though, if a rookie or a complete unknown starts a season super-hot, commentary is usually that they’re already the next big thing, while a quality-but-not-superstar veteran with a hot start is often just credited with a hot start.  But, in reality, I think the vet, despite being more of a known quantity, is still more likely to take off.  In this case, they’re busting out the hyperbole regardless.

8:03: Speaking of which, does anyone remember Ryan Moats?  A stringer for Houston in 2009, he ended up starting (briefly) after a rash of injuries to his teammates. In his first start (against Buffalo), he had 150 yards and 3 touchdowns, and some fantasy contestants were falling over each other to pick him up.  After that, he had 2 touchdowns the rest of the season, and then was out of football.

8:07: Polamalu to the rescue, of course.  He’s so good that I think he improves the Steelers’ offense.  (And no, not kidding.)

8:13: So, with Sunday Ticket’s streaming content, instead of watching Monday Night Football, I could watch several whole games instead.

8:20: So I always think of Kerry Collins as a pretty bad QB, but damn: he’s the last man standing from the entire 1995 draft:

And, you know, he’s not dead.  So I guess he won that rivalry.

8:25: Oooh, depending on the time out situation, that might have been a spot where dropping just short of the first down would have been better than making it.  Too bad Burke’s WPA Calculator doesn’t factor in time outs!

8:31: So before this is over, one more fun fact about Usain Bolt: In his 100M record run, he maintained a minimum speed over a 40 meter stretch that no other man has ever achieved over 10.

8:32: CC just said kickers prefer being on the left hash. [Though the justification was kind of weak.]

8:35: Congrats Steelers, and condolences to Colts fans.  With their schedule, Indy may be eliminated from playoff contention before Manning even starts thinking about a return.  Could be good for them next year, though:  San Antonio Gambit, anyone?

8:38: No Post Game Show for me.  Peace out, y’all.

8:59: Okay, one last thought:  In this post, Brian Burke estimates Manning’s worth to that team, and uses the team’s total offensive WPA as a sort of “ceiling” for how valuable Manning could be:

In this case, it can tell us how many wins the Peyton Manning passing game can account for. Although we can’t really separate Manning from his blockers and receivers, we can nail down a hard number for the Colts passing game as a whole, of which Manning has been the central fixture.

The analysis, while perfectly good, does ignore two possibilities: First, the Indianapolis offense minus Manning may be below average (negative WPA), in which case the “Colts passing game” as a whole would understate Manning’s value: E.g., he could be taking it from -1 to +2.5, such that he’s actually worth 3.5, etc.  Second, even if you could get a proper measure of how much the offense would suffer without Manning, that still may not account for the degree to which the Indianapolis offense bolstered their defense’s stats.  When you’re ahead a lot, you force the other team to make sub-optimal plays that increase variance to give themselves some opportunity to catch up: this makes your defense look good. In such a scenario, I would imagine hearing things like, “Oh, the Indianapolis defense is so opportunistic!” Hmmm.

New! This Sunday: Wire-to-Wire NFL Live-Blog

With a nice vacation under my belt and the NFL season underway, I figure it’s a good time to shift some of my attention back to the blog.  I’m working on finishing and writing up some of the research and analysis I’ve been doing for a number of different sports and contexts (even baseball), so I should have some pretty interesting and diverse things to post about in the coming weeks.  But I’d also like to try some new things content-wise, and one of those that I’m very excited about is doing a regular NFL live-blog:  So, for the first time this Sunday, I’ll be conducting an all-day live blog—starting from the first kick-off and continuing all the way through the night game.

Obv I’ll be kind of making up the format as I go along, but I expect it to be a little different from your usual play-by-play with instant reactions.  We’ll see what works and what doesn’t, but my intention is for it to be a bit more of a window into how I watch the NFL, and the kinds of things I think about and explore in the process, like:

  • Random thoughts and observations related to the games and coverage that I’m watching.
  • Quick and dirty analysis (I’ve got my databases locked and loaded, and there will be graphs).
  • Relevant tidbits from or previews of some of my ongoing research.
  • Links and/or brief discussions of relevant articles, tweets, blog posts or other things that I’m reading.
  • Other random ideas (sports related or not) that grab me and won’t let go.

Additionally, if there are any reader questions, criticisms, or comments that come up, I’ll be monitoring and responding to them throughout the day (and these don’t necessarily have to be on topic: so if you have the urge to pick my brain, challenge my ideas, or point out any of my stupid mistakes, this will be a good opportunity to get an immediate response).

I’ll be starting just before the first kickoff, around 10am PST.  So, you know, be there, drop on by, I’ll make it worth your while, see you then, etc.

Graph of the Day: Alanis Loves Rookie Quarterbacks

Last season I did some analysis of rookie starting quarterbacks and which of their stats are most predictive of future NFL success. One of the most fun and interesting results I found is that rookie interception % is a statistically significant positive indicator—that is, all else being equal, QB’s who throw more interceptions as rookies tend to have more successful careers.  I’ve been going back over this work recently with an eye towards posting something on the blog (coming soon!), and while playing around with examples I stumbled into this:

Note: Data points are QB’s in the Super Bowl era who were drafted #1 overall and started at least half of their team’s games as rookies (excluding Matthew Stafford and Sam Bradford for lack of ripeness). Peyton Manning and Jim Plunkett each threw 4.9% interceptions and won one Super Bowl, so I slightly adjusted their numbers to make them both visible, though the R-squared value of .7287 is accurate to the original (a linear trend actually performs slightly better—with an R-squared of .7411—but I prefer the logarithmic one aesthetically).

Notice the relationship is almost perfectly ironic: Excluding Steve Bartkowski (5.9%), no QB with a lower interception percentage has won more Super Bowls than any QB with a higher one. Overall (including Steve B.), the seven QB’s with the highest rates have 12 Super Bowl rings, or an average of 1.7 per (and obv the remaining six have none).  And it’s not just Super Bowls: those seven also have 36 career Pro Bowl selections between them (average of 5.1), to just seven for the remainder (average of 1.2).

As for significance, obviously the sample is tiny, but it’s large enough that it would be an astounding statistical artifact if there were actually nothing behind it (though I should note that the symmetricality of the result would be remarkable even with an adequate explanation for its “ironic” nature).  I have some broader ideas about the underlying dynamics and implications at play, but I’ll wait to examine those in a more robust context. Besides, rank speculation is fun, so here are a few possible factors that spring to mind:

  1. Potential for selection effect: Most rookie QB’s who throw a lot of interceptions get benched.  Teams may be more likely to let their QB continue playing when they have more confidence in his abilities—and presumably such confidence correlates (at least to some degree) with actually having greater abilities.
  2. The San Antonio gambit: Famously, David Robinson missed most of the ’96-97 NBA season with back and foot injuries, allowing the Spurs to bomb their way into getting Tim Duncan, sending the most coveted draft pick in many years to a team that, when healthy, was already somewhat of a contender (also preventing a drool-worthy Iverson/Duncan duo in Philadelphia).  Similarly, if a quality QB prospect bombs out in his rookie campaign—for whatever reason, including just “running bad”—his team may get all of the structural and competitive advantages of a true bottom-feeder (such as higher draft position), despite actually having 1/3 of a quality team (i.e., a good quarterback) in place.
  3. Gunslingers are just better:  This is my favorite possible explanation, natch.  There are a lot of variations, but the most basic idea goes like this: While ultimately a good QB on a good team will end up having lower interception rates, interceptions are not necessarily bad.  Much like going for it on 4th down, often the best win-maximizing choice that a QB can make is to “gamble”—that is, to risk turning the ball over when the reward is appropriate. This can be play-dependent (like deep passes with high upsides and low downsides), or situation-dependent (like when you’re way behind and need to give yourself the chance to get lucky to have a chance to win).  E.g.: In defense of Brett Favre—who, in crunch time, could basically be counted on to deliver you either a win or multiple “ugly” INT’s—I’ve quipped: If a QB loses a game without throwing 4 interceptions, he probably isn’t trying hard enough.  And, of course, this latter scenario should come up a lot for the crappy teams that just drafted #1 overall:  I.e., when your rookie QB is going 4-12 and isn’t throwing 20 interceptions, he’s probably doing something wrong.

[Edit (9/24/2011) to add: Considering David Meyer’s comment below, I thought I should make clear that, while my interests and tastes lie with #3 above, I don’t mean to suggest that I endorse it as the most likely or most significant factor contributing to this particular phenomenon (or even the broader one regarding predictivity of rookie INT%).  While I do find it meaningful and relevant that this result is consistent with and supportive of some of my wilder thoughts about interceptions, risk-taking, and quarterbacking, overall I think that macroscopic factors are more likely to be the driving force in this instance.]

For the record, here are the 13 QB’s and their relevant stats:

[Table of the 13 QB’s and their stats not available.]

ESPN Stat Geek Smackdown 2011 Champion

. . . is me.

Final Standings:

  1. Benjamin Morris (68)
  2. Stephen Ilardi (65)
  3. Matthew Stahlhut (56)
  4. (Tie) Haralabos Voulgaris (54)
  5. (Tie) John Hollinger (54)
  6. David Berri (52)
  7. Neil Paine (49)
  8. Henry Abbott’s Mom (46)

To go totally obscure, I feel like Packattack must have felt when he pulled off this strat (the greatest in the history of Super Monkey Ball):

That is, he couldn’t have done it without a lot of luck, but it still feels better than just getting lucky.

As for the result, I don’t have any awesome gloating comments prepared: Like all the other “Stat Geeks,” I thought Miami was a favorite going into the Finals—and given what we knew then, I would think that again.  But at this point I definitely feel like the better team won.

For as far as they went, Miami’s experiment of putting 3 league-class primary options on the same team was essentially a failure.  I’m sure the narrative will be about how they were “in disarray” or needed more time together, but ultimately it’s a design flaw.  Without major changes, I think they’ll be in a similar spot every year: that is, they’ll be very good, and maybe even contenders, but they won’t ever be the dominant team so many imagined.

As for Dallas, they played beautiful basketball throughout the playoffs, and I personally love seeing a long-range shooting team take it down for a change.  It’s noteworthy that they defied two of the patterns I identified in my “How to Win a Championship in Any Sport” article: They became only the second NBA team since 2000 with a top-3 payroll to win it all, and they’re only the second champion in 21 years without a first-team All-NBA player.

Game Theory in Practice: Smackdown Meta-Strategy

Going into the final round of ESPN’s Stat Geek Smackdown, I found myself 4 points behind leader Stephen Ilardi, with only 7 points left on the table: 5 for picking the final series correctly, and a bonus 2 for also picking the correct number of games.  The bottom line being, the only way I could win is if the two of us picked opposite sides.  Thus, with Miami being a clear (though not insurmountable) favorite in the Finals, I picked Dallas.  As noted in the ESPN write-up:

“The Heat,” says Morris, “have a better record, home-court advantage, a better MOV [margin of victory], better SRS [simple rating system], more star power, more championship experience, and had a tougher road to the Finals. Plus Miami’s poor early-season performance can be fairly discounted, and it has important players back from injury. Thus, my model heavily favors Miami in five or six games.

“But I’m sure Ilardi knows all this, so, since I’m playing to win, I’ll take Dallas. Of course, I’m gambling that Ilardi will play it safe and stick with Miami himself since I’m the only person close enough to catch him. If he assumes I will switch, he could also switch to Dallas and sew this thing up right now. Game-theoretically, there’s a mixed-strategy Nash equilibrium solution to the situation, but without knowing any more about the guy, I have to assume he’ll play it like most people would. If he’s tricky enough to level me, congrats.”

Since I actually bothered to work out the equilibrium solution, I thought some of you might be interested in seeing it. Also, the situation is well-suited to illustrate a couple of practical points about how and when you should incorporate game-theoretic strategies in real life (or at least in real games).

Some Game Theory Basics

Certainly many of my readers are intimately familiar with game theory already (some probably much more than I am), but for those who are less so, I thought I should explain what a “mixed-strategy Nash equilibrium solution” is, before getting into the details on the Smackdown version (really, it’s not as complicated as it sounds).

A set of strategies and outcomes for a game is an “equilibrium” (often called a “Nash equilibrium”) if no player has any reason to deviate from it.  One of the most basic and most famous examples is the “prisoner’s dilemma” (I won’t get into the details, but if you’re not familiar with it already, you can read more at the link): the incentive structure of that game sets up an equilibrium where both prisoners rat on each other, even though it would be better for them overall if they both kept quiet.  “Rat/Rat” is an equilibrium because an individual deviating from it will only hurt themselves.  Both prisoners staying silent is NOT an equilibrium, because either can improve their situation by switching strategies (note that games can also have multiple equilibriums, such as the “Which Side of the Road To Drive On” game: both “everybody drives on the left” and “everybody drives on the right” are perfectly good solutions).

But many games aren’t so simple.  Take “Rock-Paper-Scissors”:  If you pick “rock,” your opponent should pick “paper,” and if he picks “paper,” you should take “scissors,” and if you take “scissors,” he should take “rock,” etc, etc—at no point does the cycle stop with everyone happy.  Such games have equilibriums as well, but they involve “mixed” (as opposed to “pure”) strategies (trivia note: John Nash didn’t actually discover or invent the equilibrium named after him: his main contribution was proving that at least one existed for every game, using his own proposed definitions for “strategy,” “game,” etc).  Of course, the equilibrium solution to R-P-S is for each player to pick completely at random.

If you play the equilibrium strategy, it is impossible for opponents to gain any edge on you, and there is nothing they can do to improve their chances—even if they know exactly what you are going to do.  Thus, such a strategy is often called “unexploitable.”  The downside, however, is that you will also fail to punish your opponents for any “exploitable” strategies they may employ: For example, they can pick “rock” every time, and will win just as often.
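
To make the “unexploitable” point concrete, here is a minimal Python sketch. The function names and the +1/0/-1 scoring are my own illustrative assumptions, not anything from a particular library. It computes the expected payoff when each player randomizes according to a given mix of moves:

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Assumed scoring: +1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_payoff(my_mix, their_mix):
    """Expected payoff to me when both players randomize independently."""
    return sum(
        my_mix[m] * their_mix[t] * payoff(m, t)
        for m in MOVES
        for t in MOVES
    )


def pure(move):
    """A 'mix' that always plays the same move."""
    return {m: Fraction(1 if m == move else 0) for m in MOVES}


uniform = {m: Fraction(1, 3) for m in MOVES}

# The uniform mix breaks even against anything, including all-rock.
print(expected_payoff(uniform, pure("rock")))
# Switching to all-paper exploits all-rock...
print(expected_payoff(pure("paper"), pure("rock")))
# ...but all-paper is itself wide open to all-scissors.
print(expected_payoff(pure("paper"), pure("scissors")))
```

The uniform mix never loses ground, but it also never gains any against the all-rock player, which is exactly the trade-off described above.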

The Smackdown Game

The situation between Ilardi and me going into our final Smackdown picks is just such a game: If Ilardi picked Miami, I should take Dallas, but if I picked Dallas, he should take Dallas, in which case I should take Miami, etc.  When you find yourself in one of these “loops,” generally it means that the equilibrium solution is a mixed strategy.

Again, the equilibrium solution is the set of strategies where neither of us has any incentive to deviate.  While finding such a thing may sound difficult in theory, for 2-player games it’s actually pretty simple intuitively, and only requires basic algebra to compute.

First, you start with one player, and find their “break-even” point: that is, the strategy their opponent would have to employ for them to be indifferent between their own strategic options.  In this case, this meant: How often would I have to pick Miami for Miami and Dallas to be equally good options for Ilardi, and vice versa.

So let’s formalize it a bit:  “EV” is the function “Expected Value.”  Let’s call Ilardi or me picking Miami “iM” and “bM,” and Ilardi or me picking Dallas “iD” and “bD,” respectively (the same symbols will also stand for how often each of those picks is made).   Ilardi will be indifferent between picking Miami and Dallas when the following is true:

EV(iM) = EV(iD)

Let’s say “WM” = the odds of the Heat winning the series.  So now we need to find EV(iM) in terms of bM and WM.  If Ilardi picks Miami, he wins every time I pick Miami, and every time Miami wins when I pick Dallas.  Thus his expected value for picking Miami is as follows:

EV(iM) = bM + (1 - bM) × WM

When he picks Dallas, he wins every time I don’t pick Miami, and every time Miami loses when I do:

EV(iD) = (1 - bM) + bM × (1 - WM)

Setting these two equations equal to each other, the point of indifference can be expressed as follows:

bM + (1 - bM) × WM = (1 - bM) + bM × (1 - WM)

Solving for bM, we get:

bM = 1 - WM

What this tells us is MY equilibrium strategy.  In other words, if I pick Miami exactly as often as we expect Miami to lose, it doesn’t matter whether Ilardi picks Miami or Dallas, he will win just as often either way.

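Now, to find HIS equilibrium strategy, we repeat the process to find the point where I would be indifferent between picking Miami or Dallas.  Since I’m the one trailing, I only win when we pick opposite sides and my team takes the series:

EV(bM) = (1 - iM) × WM

EV(bD) = iM × (1 - WM)

Setting these equal to each other and solving for iM, we get:

iM = WM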

In other words, if Ilardi picks Miami exactly as often as they are expected to win, it doesn’t matter which team I pick.

Note the elegance of the solution: Ilardi should pick each team exactly as often as they are expected to win, and I should pick each team exactly as often as they are expected to lose.  There are actually a lot of theorems and such that you’d learn in a Game Theory class that make identifying that kind of situation much easier, but I’m pretty rusty on that stuff myself.
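
As a sanity check on the algebra, here is a small Python sketch of the picking game. The function names and the sample values of WM are illustrative assumptions of mine. It encodes only the rule described above: the leader wins whenever the picks match, and otherwise whoever picked the series winner takes it.

```python
def ev_ilardi(i_m, b_m, w_m):
    """Ilardi's chance of winning the Smackdown.

    i_m: how often Ilardi picks Miami
    b_m: how often I pick Miami
    w_m: probability that Miami wins the series (WM)
    """
    same_pick = i_m * b_m + (1 - i_m) * (1 - b_m)       # leader wins outright
    ilardi_miami_wins = i_m * (1 - b_m) * w_m           # he has Miami, Miami wins
    ilardi_dallas_wins = (1 - i_m) * b_m * (1 - w_m)    # he has Dallas, Dallas wins
    return same_pick + ilardi_miami_wins + ilardi_dallas_wins


def ev_ben(i_m, b_m, w_m):
    """My chance of winning: whatever Ilardi doesn't get."""
    return 1 - ev_ilardi(i_m, b_m, w_m)


for w_m in (0.5, 0.63, 0.8):
    b_eq = 1 - w_m   # my equilibrium rate of picking Miami
    i_eq = w_m       # Ilardi's equilibrium rate of picking Miami

    # If I play b_eq, Ilardi does equally well with either pure pick.
    print(w_m, ev_ilardi(1, b_eq, w_m), ev_ilardi(0, b_eq, w_m))
    # If Ilardi plays i_eq, I do equally well with either pure pick.
    print(w_m, ev_ben(i_eq, 1, w_m), ev_ben(i_eq, 0, w_m))
```

For each value of WM, the two numbers on each printed line should agree (up to floating-point rounding), which is what “indifferent” means here.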

So how often would each of us win in the equilibrium solution?  To find this, we can just solve any of the EV equations above, substituting the opposing player’s optimal strategy for the variable representing the same.  So let’s use the EV(iM) equation, substituting (1-WM) anywhere bM appears:

EV(iEq) = (1 - WM) + (1 - (1 - WM)) × WM

EV(iEq) = 1 - WM + WM^2

Here’s a graph of the function:

Obviously, it doesn’t matter which team is favored: Ilardi’s edge is weakest when the series is a tossup, where he should win 75% of the time.  The bigger a favorite one team is, the bigger the leader’s advantage.

Now let’s assume Miami was expected to win 63% of the time (approximately the consensus).  The Nash equilibrium strategy would then give Ilardi a 76.7% chance of winning, which is obviously considerably better than the 63% chance that he ended up with by choosing Miami to my Dallas—so the actual picks were a favorable outcome for me. Of course, that’s not to say his decision was wrong from his perspective: Either of us could have other preferences that come into play—for example, we might intrinsically value picking the Finals correctly, or someone in my spot (though probably not me) might care more about securing their 2nd-place finish than about having a chance to overtake the leader, or Ilardi might want to avoid looking bad if he “outsmarted himself” by picking Dallas while I played straight-up and stuck with Miami.
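
For anyone who wants to poke at these numbers, here is a short Python sketch that tabulates the leader’s equilibrium win probability, 1 - WM + WM^2, across a range of series odds. The function name is my own, and the 63% figure is just the approximate consensus mentioned above.

```python
def leader_equilibrium_win_prob(w_m):
    """Leader's win probability when both sides play the equilibrium mix.

    w_m is the probability that Miami wins the series (WM).
    """
    return 1 - w_m + w_m ** 2


# Rough text version of the curve: lowest at a tossup, symmetric around it.
for pct in range(0, 101, 10):
    w_m = pct / 100
    print(f"WM = {w_m:.1f}   EV(iEq) = {leader_equilibrium_win_prob(w_m):.3f}")

# The actual Finals, using the approximate consensus odds.
consensus = 0.63
equilibrium = leader_equilibrium_win_prob(consensus)
actual = consensus  # Ilardi took Miami and I took Dallas, so he needed Miami to win
print(f"equilibrium: {equilibrium:.3f}   actual picks: {actual:.3f}")
```

The gap between those last two numbers is the edge Ilardi gave up, at least on paper, once the picks actually came out as Miami versus Dallas.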

But even assuming we both wanted to maximize our chances of winning the competition, picking Miami may still have been Ilardi’s best strategy given what he knew at the time, and it would have been a fairly common outcome if we had both played game-theoretically anyway.  Which brings me to the main purpose for this post:

A Little Meta-Strategy

In reality, neither of us played our equilibrium strategies.  I believed Ilardi would pick Miami more than 63% of the time, and thus the correct choice for me was to pick Dallas.  Assuming Ilardi believed I would pick Dallas less than 63% of the time, his best choice was to pick Miami.  Indeed, it might seem almost foolhardy to actually play a mixed strategy: what are the chances that your opponent ever actually makes a certain choice exactly 37% of the time?  Whatever your estimation, you should go with whichever gives you the better expected value, right?
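
The “go with your estimate” logic can be sketched in a few lines of Python. The function names and the sample beliefs are hypothetical; the only inputs are your guess at how often the other guy picks Miami and the odds of Miami winning.

```python
def ben_best_response(p_ilardi_miami, w_m):
    """My best pick, given my belief about how often Ilardi picks Miami.

    I only win when we pick opposite sides and my team wins the series.
    """
    ev_miami = (1 - p_ilardi_miami) * w_m
    ev_dallas = p_ilardi_miami * (1 - w_m)
    pick = "Miami" if ev_miami > ev_dallas else "Dallas"
    return pick, max(ev_miami, ev_dallas)


def ilardi_best_response(p_ben_miami, w_m):
    """Ilardi's best pick, given his belief about how often I pick Miami.

    He wins when our picks match, or when they differ and his team wins.
    """
    ev_miami = p_ben_miami + (1 - p_ben_miami) * w_m
    ev_dallas = (1 - p_ben_miami) + p_ben_miami * (1 - w_m)
    pick = "Miami" if ev_miami > ev_dallas else "Dallas"
    return pick, max(ev_miami, ev_dallas)


W_M = 0.63  # approximate consensus odds of Miami winning

# I think Ilardi sticks with Miami 90% of the time (more than 63%).
print(ben_best_response(0.90, W_M))
# Ilardi thinks I take Miami half the time (i.e., Dallas less than 63%).
print(ilardi_best_response(0.50, W_M))
```

With those sample beliefs the sketch lands on the picks we actually made: Dallas for me, Miami for him. At the exact equilibrium frequencies the two options tie, so which one the function returns there is arbitrary.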

This is a conundrum that should be familiar to any serious poker players out there. E.g., at the end of the hand, you will frequently find yourself in an “is he bluffing or not?” (or “should I bluff or not?”) situation.  You can work out the game-theoretically optimal calling (or bluffing) rate and then roll a die in your head.  But really, what are the chances that your opponent is bluffing exactly the correct percentage of the time?  To maximize your expected value, you gauge your opponent’s chances of bluffing, and if you have the correct pot odds, you call or fold (or raise) as appropriate.
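
For the poker version, here is a toy Python sketch. It assumes a heavily simplified river spot that is my own framing rather than anything worked out above: your opponent bets into a pot, your hand beats only his bluffs, and raising isn’t an option. The names are hypothetical.

```python
from fractions import Fraction


def break_even_bluff_rate(pot, bet):
    """How often the bettor must be bluffing for a call to break even.

    pot is the pot size before the bet. Calling risks `bet` to win
    `pot + bet`, so you need to be good bet / (pot + 2 * bet) of the time.
    """
    return Fraction(bet, pot + 2 * bet)


def equilibrium_call_rate(pot, bet):
    """Calling frequency that makes the bettor indifferent to bluffing.

    A bluff risks `bet` to win `pot`, so it breaks even when the caller
    folds bet / (pot + bet) of the time, i.e. calls pot / (pot + bet).
    """
    return Fraction(pot, pot + bet)


def exploitive_decision(estimated_bluff_rate, pot, bet):
    """Call or fold based purely on your read of the opponent."""
    if estimated_bluff_rate > break_even_bluff_rate(pot, bet):
        return "call"
    return "fold"


# A pot-sized bet of 100 into a pot of 100.
print(break_even_bluff_rate(100, 100))
print(equilibrium_call_rate(100, 100))
print(exploitive_decision(0.40, 100, 100))
print(exploitive_decision(0.25, 100, 100))
```

The first two functions are the “roll a die in your head” numbers; the third is the “gauge your opponent” approach, and it is only as good as the estimate you feed it.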

So why would you ever play the game-theoretical strategy, rather than just making your best guess about what your opponent is doing and responding to that?  There are a couple of answers to this. First, in a repeating game, there can be strategic advantages to having your opponent know (or at least believe) that you are playing such a strategy.  But the slightly trickier—and for most people, more important—answer is that your estimation might be wrong: playing the “unexploitable” strategy is a defensive maneuver that ensures your opponent isn’t outsmarting you.

The key is that playing any “exploiting” strategy opens you up to be exploited yourself.  Think again of Rock-Paper-Scissors:  If you’re pretty sure your opponent is playing “rock” too often, you can try to exploit them by playing “paper” instead of randomizing—but this opens you up for the deadly “scissors” counterattack.  And if your opponent is a step ahead of you (or a level above you), he may have anticipated (or even set up) your new strategy, and has already prepared to take advantage.

Though it may be a bit of an oversimplification, I think a good meta-strategy for this kind of situation—where you have an equilibrium or “unexploitable” strategy available, but are tempted to play an exploiting but vulnerable strategy instead—is to step back and ask yourself the following question:  For this particular spot, if you get into a leveling contest with your opponent, who is more likely to win? If you believe you are, then, by all means, exploit away.  But if you’re unsure about his approach, and there’s a decent chance he may anticipate yours—that is, if he’s more likely to be inside your head than you are to be inside his—your best choice may be to avoid the leveling game altogether.  There’s no shame in falling back on the “unexploitable” solution, confident that he can’t possibly gain an advantage on you.

Back in Smackdown-land: Given the consensus view of the series, again, the equilibrium strategy would have given Ilardi about a 77% chance of winning.  And he could have announced this strategy to the world—it wouldn’t matter, as there’s nothing I could have done about it.  As noted above, when the actual picks came out, his new probability (63%) was significantly lower.  Of course, we shouldn’t read too much into this: it’s only a single result, and doesn’t prove that either one of us had an advantage.  On the other hand, I did make that pick in part because I felt that Ilardi was unlikely to “outlevel” me.  To be clear, this was not based on any specific assessment about Ilardi personally, but based on my general beliefs about people’s tendencies in that kind of situation.

Was I right? The outcome and reasoning given in the final “picking game” has given me no reason to believe otherwise, though I think that the reciprocal lack of information this time around was a major part of that advantage.  If Ilardi and I find ourselves in a similar spot in the future (perhaps in next year’s Smackdown), I’d guess the considerations on both sides would be quite different.