Favre’s Not-So-Bad Interception

This post on Advanced NFL Stats (generally my favorite NFL blog), which quantifies the badness of Brett Favre’s interception near the end of regulation, reveals a subtle problem I’ve noticed with simple win-share analysis of football plays.  To be sure, Favre’s interception “cost” the Vikings a chance to win the game in regulation and, after a decent return, even left a small chance of the Saints winning before overtime.  So in an absolute sense, it was a “bad” play, which is reflected by Brian’s conclusion that it cost the Vikings .38 wins.  But I think there are a couple of issues with that figure that are worth noting:

First, while it may have cost .38 wins versus the start of that play, a more important question might be how bad it was on the spectrum of possible outcomes.  For example, an incomplete pass still would not have left the Vikings in a great position, as they were outside of field goal range with enough time on the clock to run probably only one more play before making a FG attempt.  Likewise, if they had run the ball instead (with the Saints seemingly keyed up for the run), it is unlikely that they would have picked up the necessary yards to end the game there either.  It is important to keep in mind that many other negative outcomes, like a sack or a run for negative yardage, would be nearly as disastrous as the interception.  In fact, by the nature of the position the Vikings were in, most “bad” outcomes would be hugely bad (in terms of win-shares), and most “good” outcomes would be hugely good.

The formal point here is that while Favre’s play was bad in absolute terms, it wasn’t much worse than a large percentage of other possible outcomes.  For an extreme comparison, imagine a team with 4th and goal at the 1 with 1 second left in the game, needing a touchdown to win, and the quarterback throws an incomplete pass.  The win-shares system would grade this as a terrible mistake!  I would suggest that a better way to quantify this type of result might be to ask the question: how many standard deviations worse than the mean was the outcome?  In the 4th down case, I think it’s hard to make either a terrible mistake or an incredible play, because practically any outcome is essentially normal.  Similarly, in the Favre case, while the interception was a highly unfavorable outcome, it wasn’t nearly as drastic as the basic win-shares analysis might make it seem.
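To make the standard-deviation idea concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the function name outcome_z_score, the 50% conversion rate, and the win probabilities are hypothetical, not figures from Advanced NFL Stats.

```python
from math import sqrt


def outcome_z_score(outcomes, actual_wp):
    """Return how many standard deviations actual_wp sits from the mean.

    outcomes: list of (probability, win_probability_after_play) pairs
    describing the assumed distribution of results for one play.
    """
    total = sum(p for p, _ in outcomes)
    mean = sum(p * wp for p, wp in outcomes) / total
    variance = sum(p * (wp - mean) ** 2 for p, wp in outcomes) / total
    sd = sqrt(variance)
    if sd == 0:
        return 0.0  # every outcome is identical, so nothing stands out
    return (actual_wp - mean) / sd


# Hypothetical 4th-and-goal with 1 second left: score and win, or fail and lose.
fourth_and_goal = [(0.5, 1.0), (0.5, 0.0)]  # assumed 50% conversion rate
print(outcome_z_score(fourth_and_goal, actual_wp=0.0))
```

Under these assumed numbers the incompletion costs half a win, yet it lands exactly one standard deviation below the mean, which is about as ordinary as a bad result can be.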

Second, to rate this play based on the actual result is, shall we say, a little results-oriented.  A completion of that length would have been an almost sure victory for the Vikings, so it’s unclear whether Favre’s throw was even a bad decision.  Considering they were out of field goal range at the start of the play, if the distribution of outcomes of the pass were 40% completions, 40% incompletions, and 20% interceptions, it would easily have been a win-maximizing gamble.  Regardless of the exact distribution ex ante, the -.38 wins outcome is way on the low end of the possible outcomes, especially considering that it reflects a longer than average return on the pick.  As should be obvious, many interceptions are the product of good quarterbacking decisions (I may write separately at a later point on the topic “Show me a quarterback who doesn’t throw interceptions, and I’ll show you a sucky quarterback”), and in this case it is not clear to me which type this was.
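The 40/40/20 gamble can be sketched as an expected-value calculation.  The outcome probabilities below are the ones from the paragraph above; the win probability attached to each outcome, and the value of the hypothetical "safe play" alternative, are my own guesses for illustration and do not come from Brian's model.

```python
def expected_win_prob(outcomes):
    """Probability-weighted average of win probability over (prob, wp) pairs."""
    return sum(p * wp for p, wp in outcomes)


# Outcome probabilities are the 40/40/20 split from the text.
# The win probabilities attached to each outcome are ASSUMED for illustration.
aggressive_pass = [
    (0.40, 0.90),  # completion: "almost sure victory" (0.90 is a guess)
    (0.40, 0.50),  # incompletion: still out of range, roughly a coin flip (guess)
    (0.20, 0.45),  # interception: overtime likely, small Saints chance (guess)
]
conservative_wp = 0.55  # assumed value of a safe play; also a guess

gamble_wp = expected_win_prob(aggressive_pass)
print(f"pass: {gamble_wp:.2f}  safe play: {conservative_wp:.2f}")
```

With these made-up inputs the pass is worth about 0.65 wins in expectation, comfortably ahead of the assumed safe play, even though one time in five it produces the worst result on the board.  Different inputs could of course flip the conclusion, which is exactly why the decision has to be judged on the distribution rather than on the single result.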

This should not be taken as a criticism of Advanced NFL Stats’ methodology. I’m certain Brian understands the difference between the resulting win-shares a play produces and the question of whether that result was the product of a poor decision.  When it comes to 4th downs, for example, everyone with even an inkling of analytical skill understands that Belichick’s infamously going for it against the Colts was definitely the win-maximizing play, even though it had a terrible result.  It doesn’t take a very big leap from there to realize that the same reasoning applies equally to players’ decisions.

These issues relate in part to a broader agenda of mine (which I hope to expand on significantly in the future): while I believe win-share analysis is the best, and in some sense the only, way to evaluate football decisions, I am also concerned with the many complications that arise when attempting to expand its purview to player evaluation.

A Decade of Hot Teams in the Playoffs

San Diego and Dallas were the Super Bowl-pick darlings of many sports writers and commentators heading into this postseason, in no small part because they were the two “hottest” teams in the NFL, having finished the regular season with the two longest winning streaks of any contenders (at 11 and 3, respectively).  I think that the prediction-makers in the media routinely overvalue season-ending rushes, year after year.  My reasons for believing this include:

  1. The seeding of many teams is frequently sealed or near-sealed weeks before the playoffs begin, leaving them with little incentive to compete fully.
  2. Teams that are eliminated from playoff contention may be dispirited, and/or players may not be giving 100% effort to winning, instead focusing on padding statistics or avoiding injury.
  3. When non-contenders do give maximum effort, it may more often be to play the role of “spoiler,” or to save face for their season by trying to beat the most high-profile contenders.
  4. Variance.

So the broader question to ask is “does late-season success correlate any more strongly with postseason performance than middle or early season success?”  But in this case, I’m interested only in winning streaks — i.e., the “hottest” teams, for which any relevant sample would probably be too small to draw any meaningful conclusions.  However, I thought it might be interesting to look at how the teams with the longest winning streaks have performed in the last decade:

2009:
AFC: San Diego: Won 11, lost divisional
NFC: Dallas: Won 3, lost divisional

2008:
AFC: Indianapolis: Won 9, lost wildcard
NFC: Atlanta: Won 3, lost wildcard

2007:
AFC: New England: Won 16, lost Super Bowl
NFC: Washington: Won 4, lost wildcard

2006:
AFC: San Diego:  Won 10, lost divisional
NFC: Philadelphia: Won 5, lost divisional

2005:
AFC: Tie: Won 4: Denver: lost AFC championship; Pittsburgh: won Super Bowl
NFC: Washington: Won 5, lost divisional
(the hottest team overall, Miami, won 6 but didn’t make the playoffs)

2004:
AFC: Pittsburgh: Won 14, lost AFC championship
NFC: Tie: Won 2: Seattle: lost wildcard; St. Louis: lost divisional; Green Bay: lost wildcard

2003:
AFC: New England: Won 12, won Super Bowl
NFC: Green Bay: Won 4, lost divisional

2002:
AFC: Tennessee: Won 5, lost AFC championship
NFC: NY Giants: Won 4, lost wildcard

2001:
AFC: New England: Won 6, won Super Bowl
NFC: St. Louis: Won 6, lost Super Bowl

2000:
AFC: Baltimore: Won 7, won Super Bowl
NFC: NY Giants: Won 5, lost Super Bowl

From 2006 on, the hottest teams have obviously done terribly, with the undefeated Patriots being the only team to make it out of the divisional round.  Prior to that, the results seem more normal:  In 2005, Pittsburgh won the Super Bowl after tying for the longest winning streak among AFC playoff teams (though they trailed Washington in the NFC and Miami, who didn’t make the playoffs).  New England won the Super Bowl as the hottest team twice, in 2001 and 2003, although both times they were one of the top seeds in their conference as well.  The last hottest team to play on wildcard weekend AND win the Super Bowl was the Baltimore Ravens in 2000.

So what does that tell us?  Well, a decent anecdote, and not much more.  The sample is small and the numbers inconclusive.  On the one hand, the particular species of Cinderella team that some predict to win the Super Bowl year after year, one that starts the season weakly but catches fire late and rides its momentum to the championship, has been a rarity (and going back further, it doesn’t get any more common).  On the other hand, if you simply picked the hottest team to win the Super Bowl every year in this decade, you would have correctly picked 3 winners out of 10, which would not be a terrible record.
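For anyone who wants to check the 3-out-of-10 tally, here is a small Python sketch.  The table is transcribed by hand from the list above, keeping only the playoff team (or teams, in the tied 2001 season) with the longest streak overall; the names HOTTEST and seasons_with_hot_champion are just labels I made up for this example.

```python
# Overall hottest playoff team(s) per season, transcribed from the list above.
# Each entry is (team, winning streak, playoff result).
HOTTEST = {
    2009: [("San Diego", 11, "lost divisional")],
    2008: [("Indianapolis", 9, "lost wildcard")],
    2007: [("New England", 16, "lost Super Bowl")],
    2006: [("San Diego", 10, "lost divisional")],
    2005: [("Washington", 5, "lost divisional")],
    2004: [("Pittsburgh", 14, "lost AFC championship")],
    2003: [("New England", 12, "won Super Bowl")],
    2002: [("Tennessee", 5, "lost AFC championship")],
    2001: [("New England", 6, "won Super Bowl"),
           ("St. Louis", 6, "lost Super Bowl")],  # tied at 6
    2000: [("Baltimore", 7, "won Super Bowl")],
}


def seasons_with_hot_champion(table):
    """Return the seasons in which a hottest playoff team won the Super Bowl."""
    return sorted(
        year
        for year, teams in table.items()
        if any(result == "won Super Bowl" for _, _, result in teams)
    )


winners = seasons_with_hot_champion(HOTTEST)
print(f"{len(winners)} of {len(HOTTEST)} seasons: {winners}")
```

Note that 2001 only counts as a hit if you give yourself the right side of the tie between New England and St. Louis, which is one more reason to treat the 3-for-10 figure as an anecdote rather than a finding.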