
Stat Geek Smackdown, Round 1: Leftovers

I’ve been invited to participate in TrueHoop’s “Stat Geek Smackdown 2011” on ESPN.com. Unfortunately, I won’t actually get to smack any stat geeks, but I will get to pick NBA playoff series and compete with the likes of John Hollinger, David Berri, and Henry Abbott’s mom.

The rules are simple: each “expert” calls the winner of each series and the number of games (e.g., Spurs in 6). Each correct winner is worth 5 points, with an additional 2 points for getting the length right as well.

Most of the first round matchups have heavy favorites, so there isn’t too much disagreement on the panel about outcomes. But while researching my picks on Thursday night, I had some interesting findings that seemed a bit at odds with a lot of the others’ comments. So rather than going into the nitty-gritty of each series, I thought I’d summarize a few of these broader instances of divergence. Beware, a lot of this is preliminary stuff. I do think it is all on pretty solid footing, but there is much more to be done:

1. Form is overrated

At one point or another, nearly every expert quoted in this article cites a team’s recent good or bad performance as evidence that the team may be better or worse than its overall record would indicate. I’ve been interested in this question for a long time, and have looked at it from many different angles. Ultimately, I’ve concluded that there is no special correlation between late-season performance and playoff success. In fact, the opposite is far more likely.

To examine this issue, I took the last 20 years of regular and post-season data, and broke the seasons down into 20-game quarters. I excluded the last 2 games of each season, which is mathematically more convenient and reduces a lot of tactical distortion (I also excluded games from the lockout-shortened 1998-99 season). I then ran a number of regressions comparing regular and post-season performances of playoff teams. There are a lot of different ways to design this regression (should it be run on a game-by-game or series-by-series basis? etc.), but literally no permutation I could think of offered any significant support for the conventional approach of favoring recent results. For example, here are the results of a linear regression from wins by quarter-season to playoff series won (taller bars mean more predictive):

Aesthetically pleasing, no? As to why the later part of the season performs so poorly in these tests, it has been suggested that resting players and various other strategic incentives not to maximize winning may be the cause. That is almost certainly true to some extent, but I suspect it also has to do with the playoff structure itself: because of the drawn-out schedule, unvarying opponents, and high stakes, teams are better rested, better prepared, and more psychologically focused, much as they are at the beginning of each season.
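
For anyone who wants to try the quarter-season regression described above, here is a rough sketch of one way to set it up in Python. This is not the code behind the chart. The function names (`quarter_wins`, `fit_quarters`) and the input layout (one row of 0/1 game results per playoff team-season, plus a count of playoff series won) are my own assumptions for illustration. It returns plain least-squares coefficients, which may or may not match whatever measure of "predictive" the bars show.

```python
import numpy as np

GAMES_PER_QUARTER = 20
QUARTERS = 4


def quarter_wins(results):
    """Count wins in each 20-game quarter of one season.

    `results` is a sequence of 0/1 game outcomes in schedule order
    (hypothetical input format). Only the first 80 games are used,
    so the last 2 games of an 82-game season are dropped.
    """
    used = np.asarray(results[:GAMES_PER_QUARTER * QUARTERS])
    if used.size != GAMES_PER_QUARTER * QUARTERS:
        raise ValueError("need at least 80 games")
    return used.reshape(QUARTERS, GAMES_PER_QUARTER).sum(axis=1)


def fit_quarters(wins_by_quarter, series_won):
    """Ordinary least squares of playoff series won on wins per quarter.

    `wins_by_quarter` has one row per playoff team-season and one
    column per quarter. Returns (intercept, coefficients), with one
    coefficient per quarter in schedule order.
    """
    X = np.asarray(wins_by_quarter, dtype=float)
    y = np.asarray(series_won, dtype=float)
    # Prepend a column of ones so the fit includes an intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]
```

With real data, you would build one row per playoff team by calling `quarter_wins` on its game log, stack the rows, and pass them to `fit_quarters` along with each team's series wins. If recent form mattered most, the fourth-quarter coefficient would stand out from the other three.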
