
Punts are Turnovers Too (Introducing PUPTO!)

[For ease of reference (with apologies to those of you who sat through or otherwise already read my NFL Live Blog from this Sunday), I’m once again splitting a few of the topics I covered out into individual posts. I’ve mostly made only cosmetic adjustments (additional comments are in brackets or at the end), so apologies if these posts aren’t quite as clean or detailed as a regular article. For flavor and context, I still recommend reading the whole thing.]

[I removed the “From the Live Blog” tag from the title of this post, because 1) I added a bit more explanation in my Addendum below, 2) The original discussion was at the very end of my very long live-blog post and a lot of people prob didn’t get to it anyway, and 3) I just think it’s an important issue and I don’t want to scare people away.]

With all the turnovers in [the Jets and Ravens] game, there’s about a 100% chance that commentators later will talk about the importance of “turnover differential.” People always rattle off a bunch of stats about how the team that wins the “turnover battle” almost always wins the game (like, duh), with the intention of reminding everyone how terrible it is to take the kinds of risks that lead to turnovers.

But this causation goes both ways: Turnovers can obviously cause teams to lose, but teams losing also cause turnovers. When you’re behind, you have to take risks to have any chance of winning. Citing the “turnover battle” stats without context is about as ridiculous as [the also way-overused] “Team X is 43-1 when having a 100 yard rusher.”

What goes unmentioned in all of this is “punt differential.” But punts also involve turning the ball over, and guess what? This stat is ALSO highly predictive of game outcomes, but without as much causation baggage: When teams are behind, they are actually forced to punt less. Despite the completely routine nature of punts vs. the “extreme” nature of turnovers, “punt differential” holds its own with “turnover differential” in a logistic regression to Win % (n=5308):

If you run this as a linear regression to point differential, it gets even closer (I should also note, if you do your regression to “outside” games, punt differential is actually more predictive, because it is much more reliable).
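For anyone who wants to see the shape of that regression, here is a minimal sketch. It is not the code behind the n=5308 result: `fit_logistic` is a hypothetical helper that uses plain gradient descent, the rows in `TOY_GAMES` are made up purely so the script runs, and I'm assuming each differential is "my count minus the opponent's" (so positive is bad). A real analysis would use a stats package and actual game logs.

```python
import math

def fit_logistic(rows, wins, lr=0.1, epochs=2000):
    """Fit P(win) = sigmoid(b + w . x) by batch gradient descent."""
    n, k = len(rows), len(rows[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, wins):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted minus actual
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy rows, NOT real data: (punt differential, turnover differential, won?)
TOY_GAMES = [
    (-3, -1, 1), (-2, 1, 1), (-2, -2, 1), (-1, 0, 1), (-1, 2, 0),
    (0, -1, 1), (0, 1, 0), (1, -2, 1), (1, 0, 0), (2, -1, 0),
    (2, 2, 0), (3, 1, 0), (-1, -1, 0),
]

rows = [(punts, turnovers) for punts, turnovers, _ in TOY_GAMES]
wins = [won for _, _, won in TOY_GAMES]
weights, intercept = fit_logistic(rows, wins)
print("punt diff coef: %.2f, turnover diff coef: %.2f" % tuple(weights))
```

On real data, the thing to compare is the size of the two coefficients (after standardizing the inputs), which is roughly what "holds its own" refers to.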

A fun metric that I love (and believe to be very useful) is “punts plus turnovers,” or PUPTO [Make it big, people!]:

A pretty interesting thing to note in this chart is the difference between the correlation to win % of interception differential vs. fumble differential. From a pure “Turnovers=Bad” perspective, this should be counter-intuitive: After all, many interceptions take place down-field, while fumbles typically happen at the line of scrimmage (also, I haven’t checked, but I feel like a disproportionate number of fumbles are returned for touchdowns). My suspicion is that this difference is at least partly [on reflection, probably mostly] explained by what I described earlier: When teams are losing, they have to take a lot of risks that lead to more interceptions, [i.e., passing a lot] but they don’t take a lot of risks [or at least as many] that lead to more fumbles.

Addendum:

To make this a bit more clear (I hope), the point is that the difference in fumbles lost should be a pretty “pure” metric for representing the consequences of turnovers. This is because they happen much more randomly than interceptions, which (both rationally and empirically) increase in frequency significantly when a team is already behind. So imagine this effect didn’t exist, and interceptions were distributed more like fumbles: we would expect the “INT Diff” bar in the chart above to drop closer to where the “Fum Diff” bar is, and consequently the “TO Diff” bar would drop as well. The “PUPTO” bar—though obviously dropping a little itself—would comparatively tower over the rest. So I don’t just love PUPTO for the way it sounds: I think it’s actually a powerful metric.
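For concreteness, PUPTO is just punts plus turnovers, and the differential is one team's total minus the other's. A minimal sketch, where the `TeamBox` class, its field names, the sign convention (mine minus theirs, so negative is good), and the box-score numbers are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TeamBox:
    """One team's relevant box-score counts for a single game."""
    punts: int
    interceptions_thrown: int
    fumbles_lost: int

    @property
    def turnovers(self) -> int:
        return self.interceptions_thrown + self.fumbles_lost

    @property
    def pupto(self) -> int:
        # Punts plus turnovers: every way of handing the ball over.
        return self.punts + self.turnovers

def pupto_diff(team: TeamBox, opponent: TeamBox) -> int:
    """Team PUPTO minus opponent PUPTO (assumed convention: negative is good)."""
    return team.pupto - opponent.pupto

# Made-up example game, not from any real box score.
home = TeamBox(punts=4, interceptions_thrown=1, fumbles_lost=0)
away = TeamBox(punts=6, interceptions_thrown=2, fumbles_lost=1)
print(pupto_diff(home, away))
```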

Not to mention, if I somehow had the power to instantly mainstream it, it might dampen a little bit of the stigma against “going for it” on 4th down: one of the things sports commentators and talking heads constantly seem to forget is that punts are bad too.

From the Live Blog: Drive Outcomes From Deep in Your Territory

[See also my Addendum below.]

Random stat from my PBP database: For home teams, 19% of drives starting with a kickoff end in a touchdown, for away teams, just under 17%. But on the first drive of a game, home teams score TD’s on 22%, away teams just 16%.

Any time I start wading through PBP stuff, I get easily distracted. There’s something new and fascinating around every corner! [E.g.], here’s what should be considered a pretty basic graph, but it has some interesting subtleties to it:

One of the most interesting parts is what’s going on in the first 20 yards:

So what’s interesting about this? Well, that aside from safeties, these particular results are very linear. I think many people would expect that being backed into your endzone makes executing your offense a lot harder — but aside from the occasional safety, the outcomes are really no worse than what you would expect from just being more yards back (turnovers aren’t shown b/c of data mashing issues, but there’s not a massive jump for them either).

Of course, another important factor [is the effect on punting]:

And, the corresponding graph limited to your own 20:

So the take-homes from the above graphs are that the situation gets significantly better/worse within the 5 yard line, accelerating as you approach the goal line. [Though the effect may not be as apparently strong as some probably thought,] this is why kicking field goals from the 1 is terrible even in situations where it has some tactical benefit. Obv this is nothing new to anyone even slightly informed about “expected value” in football (it’s basically the prototypical example), but to break it down clearly: If you don’t score on your 4th down play, trapping your opponent in that spot is valuable 4 ways:

1) Natural field position advantage vs. giving your opponent the ball on the 20 after a made field goal.
2) Significantly increased chance of a safety.
3) Increased chance of good field position b/c of short opponent punts.
4) Your subsequent field position also starts to hit the increasing part of your Touchdown/Expected Points curve (i.e., it has value in addition to the generic value of better expected field position).

Though it should be noted that the last 3 effects are much stronger on the 1 than on the 5.

Matt notes:

I was just able to use the drive expectancy chart to check on a Cris Collinsworth comment, love all the graphs and tools. BTW the comment was about the value of getting the ball on the 2-3 vs the half yard line; if I read the graph right, the odds of a safety double at the 1 yard line vs the three yard line. Still only 6% but a big enough difference to care.

Yes, there’s a significant difference between the two, and it gets more and more dramatic the closer you get to the plane: there’s a pretty significant difference between being at the 1 and being at the 1/2, etc. It’s also true on the other side of the field: all kinds of wacky things happen as you approach the Endzone, and they’re not all intuitive.

In fact, one of the big difficulties with building a [Win Percentage Added] model is accounting for these kinds of situations empirically, because 1) they behave abnormally, and 2) they’re either rare (e.g., being right at your own end zone), or extremely specific (e.g., some of the things that happen around the 11-12 yard line in the Red Zone), and thus have some of the smallest sample sizes for observation.

Addendum:

David Myers (of Code and Football) also comments:

Why ask? I think the result is important, and I was curious how reliable the data set was. 9 years is a lot of data [note: it’s actually 10 years].

A further explanation would be this: I’m curious why the expected points curves of, say, Keith Goldner and Bill Connelly and Romer/Burke are different. I’ve speculated on the difference here. Your plot suggests that different drive scoring has to be at the root of those differences, as safeties alone can’t account for the first and 10 expected points curves I’ve seen.

Yes, this is what I was getting at in that last paragraph. It’s a bit like physics: it’s easy to build models that explain all the common and relatively simple situations. But it gets much more difficult in the extremes, which can be more complicated, often have less data to analyze, and what data is available is often less reliable.

From the Live Blog: Baseball Haterade (With NFL Regression Tangent)

In support of last night’s screed [Why Baseball and I are, Like, Unmixy Things], especially the claim that “[MLB] games are either not important enough to be interesting (98% of the regular season), or too important to be meaningful (100% of the playoffs),” here’s a graph I made to illustrate just how silly the MLB Playoffs are:

Not counting home-field advantage (which is weakest in baseball anyway), this represents the approximate binomial probability [thank you, again, binom.dist() function] of the team with the best record in the league [technically, a team that has an actual expectation against an average opponent equal to best record] winning a series of length X against the playoff team with the worst record [again, technically, a team that has an actual expectation equal to worst record] going in. The chances of winning each game are approximated by taking .5 + better win percentage – worse win percentage (note, of course, the NFL curve is exaggerated b/c of regression to the mean: a team that goes 14-2 won’t actually win 88% of their games against an average opponent. But they won’t regress nearly enough for their expectation to drop anywhere near MLB levels). The brighter and bigger data points represent the actual first round series lengths in each sport.

By this approximation, the best team against the worst team in a 1st round series (using the latest season’s standings as the benchmark) in MLB would win about 64% of the time, while in the NBA they would win ~95% of the time. To win 2/3 of the time, MLB would need to switch to a 9 game series instead of 5; and to have the best team win 75% of the time, they would need to shift to 21 (for the record, in order to match the NBA’s 95% mark, they would have to move to a 123 game series. I know, this isn’t perfectly calculated, but it’s ballpark accurate). Personally, I like the fact that the NBA and NFL postseasons generally feature the best teams winning.
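If you want to play with this yourself, the calculation can be sketched in a few lines. This is my reconstruction of the approach described above, not the original spreadsheet: the function names are made up, the records passed in at the bottom are hypothetical placeholders rather than real standings, and I'm assuming odd series lengths where winning means taking a majority of the games.

```python
from math import comb

def game_win_prob(better_pct, worse_pct):
    """The approximation from the text: .5 + better win % - worse win %."""
    return 0.5 + better_pct - worse_pct

def series_win_prob(p, length):
    """Chance of winning a majority of `length` games, each won with probability p.

    Counting every game as played gives the same answer as stopping once
    the series is clinched, so a plain binomial tail is enough.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    need = length // 2 + 1
    return sum(comb(length, k) * p**k * (1 - p)**(length - k)
               for k in range(need, length + 1))

def shortest_series(p, target, max_length=999):
    """Smallest odd series length at which the favorite wins at least `target`."""
    for length in range(1, max_length + 1, 2):
        if series_win_prob(p, length) >= target:
            return length
    return None  # not reachable within max_length

# Hypothetical records, only to show the calls.
p = game_win_prob(0.60, 0.55)
print(series_win_prob(p, 5), shortest_series(p, 2 / 3))
```

Feeding in the actual best and worst playoff records for each league is what produces curves like the ones in the graph.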

Moreover, it also makes upsets more meaningful: since the math is against “true” upsets happening often, an apparent upset can be significant: it often indicates—Bayes-wise (ok, if that’s not a word, it should be)—that the upsetting team was actually better. In baseball, an upset pretty much just means that the coin came up tails.

Adam asks:

In the MLB vs. NFL vs. NBA Playoffs graph, the chances of best beating worst in first round for NFL for a 1 game series is almost 95%.

Looking at the odds to this week’s NFL games, the biggest favorite was GB versus Denver and they were only an 88% chance of winning by the money line (-700). Denver is almost certainly not a playoff team, so it’s tough to imagine an even more lopsided playoff matchup that could get to 95%. What am I missing?

I sort of addressed this in my longer explanation, but he’s not missing anything: the football effect is exaggerated. First off, to your specific concern, this early in the season there is even more uncertainty than in the playoffs. But second, and more importantly, this method for approximating a win percentage is less accurate in the extremes, especially when factoring in regression to the mean (which is a huge factor given the NFL’s very small sample sizes).

In fact, the regression to the mean effect in the NFL is SO strong, that I think it helps explain why so many Bye-teams lose against the Wild Card game winners (without having to resort to “momentum” or psychological factors for our explanation). By virtue of having the best records in the league, they are the most likely teams to have significant regression effects. That is, their true strength is likely to be lower than what their records indicate. Conversely, the teams that win in the bye week (against other playoff-level competition), are, from a Bayesian perspective, more likely to be better than their records indicated. Think of it like this: there’s a range of possible true strength for each playoff team: when you match two of those teams against each other (in the WC round), the one who wins might have just gotten lucky, but that particular result is more likely to occur when the winning team’s actual strength was closer to the top of their range and/or their opponent’s was closer to their bottom.

I’ve looked at this before, and it’s very easy to construct scenarios where WC teams with worse records have a higher projected strength than Bye team opponents with better records. Factor in the fact that home field advantage actually decreases in the playoffs (it’s a common misconception that HFA is more important in the post-season: adjusting for team quality, it’s actually significantly reduced—which probably has something to do with the post-season ref shuffle: see section on ref bias in this post), and you have a recipe for frequent upsets.
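One simple way to picture the regression effect is to shrink a team's record toward .500 by mixing in some number of imaginary average games. This is only a sketch of the idea, not the model I actually used: `shrunk_win_pct` is a hypothetical helper, and the default `prior_games=16` is an arbitrary assumption, not an estimated value.

```python
def shrunk_win_pct(wins, losses, prior_games=16.0, prior_pct=0.5):
    """Estimate true win % by blending the record with `prior_games` of average play.

    Equivalent to the posterior mean under a Beta prior centered on
    `prior_pct`; `prior_games` controls how hard the record is pulled
    toward the mean and is an assumption here, not a fitted number.
    """
    games = wins + losses
    return (wins + prior_games * prior_pct) / (games + prior_games)

# A 14-2 team looks like an 87.5% team on paper...
print(14 / 16)
# ...but the shrunk estimate is noticeably lower.
print(shrunk_win_pct(14, 2))
```

The shorter the season, the bigger the pull, which is why this matters so much more in the NFL than in the NBA or MLB. A wild-card win can be folded in the same way, as one more observed game against strong competition.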

In retrospect, I probably should have just left the NFL out of that graph. Basketball makes for a much better comparison [both aesthetically and analytically]:

From the Live Blog: More on Cam Newton

[This is sort of following up on last week’s live blog, where I discussed Cam Newton’s hot start a little, so I’ll include that snip first.]

Last Week, On Cam Newton:

Watching pre-game. Strahan is taking “overreaction” to a new level, not only declaring that maybe the NFL isn’t even ready for Cam Newton, but that this has taught him to stop being critical of rookie QB’s in the future.

But should I be more or less excited about Cam Newton after his win today? He had a much more “rookie-like” box of 18/34 for 158. Here’s how to break that down for rookies: Low yards = bad. High attempts = good. Completion percentage = completely irrelevant. Win = Highly predictive of length of career, not particularly predictive of quality (likely b/c a winning rookie season gets you a lot of mileage whether you’re actually good or not). Oh, and he’s still tall: Height is also a significant indicator (all else being equal).

This week:

Google search that just led to the blog: “should I start Michael Vick over Cam Newton today.” Welcome, fantasy footballer! And sorry, I have no idea. I don’t play fantasy football anymore: it’s pleasantly time consuming and has near-infinite depths for analysis, but the overlap with analysis of things that matter is way too small. [This was a bit harshly put, but I mean something serious: I’m increasingly convinced that NFL box score accomplishments have little relation to actual player values].

Here’s rookie QB’s YPG over their first 3 starts [actually, it’s games played, not started—my bad] vs. their YPG for the rest of the season (min 8 starts total):

And here’s the table of rookies through Dan Marino (who I thought would have been higher):

Cam Newton, of course, has 1012 through his first 3. And in game 4 he has already passed Vinny Testaverde’s production for the rest of his rookie season.

[At the conclusion of the game] Newton now has 1386 yards, which is the record for a rookie through his first four games (previously held by Marc Bulger with 1149). The record through five is 1496, so he’s likely to break that. Through six is 1815, so that’s not a sure thing, but through seven is also 1815 (Bulger only played 6 games, and the next highest total through seven is 1699). But there’s still a lot of variance to be navigated between now and [Peyton] Manning’s full-season record of 3739.

10/2 NFL Sunday Live Blog

Here we go again. Details here. Please leave comments about anything you want and I’ll do my best to give thorough responses.

Getting DirecTV Monday but having to wait until today to use Sunday Ticket has been like waiting to open a Christmas present I bought myself.

10:00: Tough to pick a game with the Panthers, Lions, Steelers, and Bills games all having surprisingly interesting story-lines, so I’m going to stick with the Red Zone Channel for a few.

10:07: OK, I didn’t know this was humanly possible, but I think RZC is too ADD for me, so I’m going to switch to the Panthers game.

10:11: I wonder what qualities on teams are most likely to lead to successful rookie campaigns. And given that successful rookie campaigns tend to fizzle out in future years, do those same things perhaps correlate negatively with long-term success?

10:18: Ugh, breast cancer awareness. Am I the only one that finds the breast cancer pandering to be a little insidious? It’s politically an easy cancer to go with, and has cross-gender appeal, but it’s already over-funded relative to its prevalence and mortality rates. See this slightly out-of-date NYTimes article: Lung Cancer receives $1,630 research dollars per death, and Breast Cancer receives $13,452.

Maybe lung cancer is politically toxic b/c it’s seen as the smoker’s curse, but Colon cancer kills more people and gets 1/3 the funding. But it wouldn’t be nearly as sexy to wear brown “Colon Cancer Awareness” ribbons.

10:25: Ok, 20 minutes and no graphs, I know, derelict.

10:35: Ok, this was an idea that turned out approximately how I expected. This is Win Percentage Added per Game on the X-axis and Attempts and Completion percentage on the Y-axes:

The interesting part is that attempts has a higher relative slope, but completion percentage has a better R-squared. In other words, completion percentage is more reliable, but less indicative.
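To make the slope vs. R-squared distinction concrete, here's a toy sketch. The numbers are invented to show the pattern and have nothing to do with the actual QB data; `slope_and_r2` is just ordinary least squares for one variable.

```python
def slope_and_r2(xs, ys):
    """Slope and R-squared of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Invented series: one moves a lot but noisily, one moves a little but cleanly.
xs = [0, 1, 2, 3]
steep_but_noisy = [0, 4, 2, 6]
shallow_but_clean = [0, 0.5, 1, 1.5]

print(slope_and_r2(xs, steep_but_noisy))    # bigger slope, lower R-squared
print(slope_and_r2(xs, shallow_but_clean))  # smaller slope, higher R-squared
```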

10:42: Google search that just led to the blog: “should I start Michael Vick over Cam Newton today.” Welcome, fantasy footballer! And sorry, I have no idea. I don’t play fantasy football anymore: it’s pleasantly time consuming and has near-infinite depths for analysis, but the overlap with analysis of things that matter is way too small.

11:02: More on Cam Newton. Here’s rookie QB’s YPG over their first 3 starts vs. their YPG for the rest of the season (min 8 starts total):

11:18: Here’s the table of rookies through Dan Marino (who I thought would have been higher):

Cam Newton, of course, has 1012 through his first 3. And in game 4 he has already passed Vinny Testaverde’s production for the rest of his rookie season.

11:22: Oops, I apologize. The above counts games played by rookie quarterbacks, not just games started.

11:30: Reliable blog-fan Matt comments:

I’m watching Redskins-Rams right now, and they just called a personal foul on a punt coverage player for “hitting a defense-less receiver,” as he pummeled the punt returner just as the returner caught the ball. He did not hit him prior to the ball being caught.

Now, I’m sure the officials are properly applying the rule. But this has to be a dumb rule, right? This is why we have the fair-catch rule. To me, as long as the hit is clean (i.e. not helmet-helmet, etc.) and you don’t hit him prior to him touching the ball, this rule is unnecessary. (Note that this is different than hitting a defenseless receiver on a pass play over the middle; he can’t declare (or not declare) a fair catch.)

I didn’t see the play, but I’m not sure I entirely agree, at least as a theoretical matter. I can imagine there being a scale for how “unnecessarily rough” a hit is relative to how defenseless the target is. But, yes, I agree that there is a bit of discord between this approach and the fair catch rule, which is basically a completely objective but much more crude way of addressing the same issue.

11:35: Broadly, I think the league is moving away from the “brutality” of violence and more toward the “finesse” of passing. I imagine that (in addition to protecting the players, blah, blah) they think they’re giving the fans what they want. But I don’t know. As an aesthetic matter, I think people are more mesmerized by the “finesse” side of the game, but possibly without realizing that a good chunk of its beauty comes from the contrast.

11:39: Ok, getting long enough that I’m switching to the split format that seemed to work last week.

11:50: A friend on IM:

Friend [11:40 AM]: i think people realize they like the contrast

Friend [11:40 AM]: i doubt there are numbers for it, but i’ve heard anecdotes about training camp attendance

Friend [11:41 AM]: and how it sharply increases when full pads

I at least partly agree. The typical fan knows that they like the violence, though I’m not sure they understand how the two sides enhance each other. And I’m not sure that the league understands this either. I think they probably see it as a matter of addition: for example, say it used to be 40 utils of pleasure from finesse and 40 utils of pleasure from violence, for 80 total. They believe that if they can sacrifice 20 utils of violence for 25 of finesse, they’ll have upped the total utility to 85. But if the total aesthetic utility corresponds to the product of the two values, they would have actually reduced the value from 1600 (40*40) to 1300 (65*20).
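The arithmetic, spelled out. The util numbers are the made-up ones from the example above, and the two functions are just the two candidate ways of combining them:

```python
def additive_utility(finesse, violence):
    """The league's assumed view: enjoyment is the sum of the parts."""
    return finesse + violence

def multiplicative_utility(finesse, violence):
    """The alternative: each side makes the other more enjoyable."""
    return finesse * violence

before = (40, 40)            # (finesse, violence)
after = (40 + 25, 40 - 20)   # trade 20 utils of violence for 25 of finesse

print(additive_utility(*before), additive_utility(*after))
print(multiplicative_utility(*before), multiplicative_utility(*after))
```

Same trade, opposite verdicts: it looks like a gain if you add and a loss if you multiply.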

12:09: Matt asks:

Anyway, I wanted to pose you a question regarding time management. Skins complete pass on 2nd down, setting up 3rd and 2, from their own 28, with about 1:20 to play. Neither team calls a timeout. That could very well be correct, but I assume similar situations are ripe for poor decision making. (i.e. if it’s 3rd and 17, Rams definitely call TO; if the ball is at the 48, Skins probably call timeout). Any thoughts / empirical work you’ve done/seen on this?

Yes, I’ve looked into it some. The situational analysis is sticky and hard to generalize, but the main thing to note is that any time it has a significant effect, one of the two teams will usually have the incentive to take the timeout, at least somewhere within the given sequence of downs (unless the value of keeping it is greater).

In this situation, the Redskins gain little by taking the timeout: with 1:20 left, letting the clock run down and keeping the timeout for later is virtually equivalent, the marginal difference to their scoring chances isn’t huge, and the bigger risk to them is giving St. Louis too much time. Conversely, the Rams prob should take it. Take a look at this old link about two minute drills for some relevant stats.

12:29: Random stat from my PBP database: For home teams, 19% of drives starting with a kickoff end in a touchdown, for away teams, just under 17%.

12:39: But on the first drive of a game, home teams score TD’s on 22%, away teams just 16%.

12:56: Sorry I’m so slow this morning. It’s not slacking, I’ve been playing around with some PBP data relating to Matt’s question above, and any time I start wading through PBP stuff, I get easily distracted. There’s something new and fascinating around every corner!

1:11: OK, here’s what should be considered a pretty basic graph, but it has some interesting subtleties to it:

Comment to follow.

1:24: One of the interesting things about this graph is what’s going on in the first 20 yards:

1:35: Of course, the other important factor:

And, the corresponding graph limited to your own 20:

1:41: Grabbing lunch, back in a few.

1:44: Bill Simmons tweets:

sportsguy33

The Cowboys really need to fire Wade Phillips. This has dragged on far too long.

20 minutes ago

Ha! I think the NFL Coach with the longest tenure at the moment is Jerry Jones Puppet X.

2:09: So the take-homes from the above graphs are that the situation gets significantly better/worse within the 5 yard line, accelerating as you approach the goal line. This is why kicking field goals from the 1 is terrible even in situations where it has some tactical benefit. Obv this is nothing new to anyone even slightly informed about “expected value” in football (it’s basically the prototypical example), but to break it down clearly: If you don’t score on your 4th down play, trapping your opponent in that spot is valuable 4 ways:

1) Natural field position advantage vs. giving your opponent the ball on the 20 after a made field goal.
2) Significantly increased chance of a safety.
3) Increased chance of good field position b/c of short opponent punts.
4) Your subsequent field position also starts to hit the increasing part of your Touchdown/Expected Points curve (i.e., it has value in addition to the generic value of better expected field position).

Though it should be noted that the last 3 effects are much stronger on the 1 than on the 5.

2:16: After Red Zone Channel-ing it for a while, I’m switching to the New England game.

2:17: This is odd: Traffic for the live blog is up from last week, but commenting is down. Come on, people!

2:25: So these Sunday Ticket “Short Cuts” are pretty sweet: They’re 30 minute versions of each game that include every play and no filler. But their utility is limited by the fact that they just cut up the original sound track and don’t have any new commentary or any other way of identifying who is involved in each play: About half the time, the commentators get cut off before saying who the runner or receiver was. At the very least, they should scroll the play-by-play along the bottom.

Also, while it might make the vids slightly longer, they should take at least a moment to dwell on the particularly big or important plays: It’s weird when there’s like a 60 yard touchdown pass, then bam!, extra point, kickoff, etc. Show a few replays! Not only is it more fun that way, but they contain important information.

2:51: Random thought: There’s a pretty simple pattern of where analytics has and hasn’t been adopted in sports: it has been adopted in spots where decision-makers can blame other people for having been wrong all along, and it hasn’t been adopted in places where decision-makers would have to blame themselves.

2:59: Congrats to the Lions, but with apologies to Nate, there’s at least one good reason to root against Detroit: If they end up going to—or, god forbid, winning—the Super Bowl, the stories about the recession/auto industry recovery and the Lions being the main source of hope for a struggling community, etc., may be even more insufferable than the litany of identical stories we had with New Orleans.

3:12: More on the “random thought” above: So the sport of baseball has no problem with the Moneyball movie, I’m sure, b/c who are portrayed as the stupid ones: The nameless, decrepit, curmudgeonly old scouts in the basement? The faceless, nameless, obnoxious talk radio hosts and listeners? The only baseball execs we see are 1) the Oakland ownership, who tell Beane to do what he’s got to do, 2) the Cleveland management, who, for some unknown reason, listen to this lackey analyst that they supposedly disrespect, and 3) the glorious, heartwarming Red Sox. Even the “clueless” manager really only has one major dispute: whether to start Peña (an All-Star) or Hatteberg at 1st base, which is ultimately portrayed as a near-push anyway.

3:19: And within the sport of Baseball, look at where analytics has been most embraced: 1) Player evaluation—something it is easy to blame others for (as above). And 2) High-granularity pitching and batting stats (location, etc), which are just an advancement in technology, and not having them before was no-one’s fault. And where has it been rejected and ignored? Strategic decisions! Teams still steal too much, bunt too much, they’re still stuck on archaic pitching rotation strategy (e.g., Mariano Rivera should either be pitching more innings or more important innings: the way he is used is demonstrably inefficient), etc. And who is responsible for having gotten those wrong for so long? The same people who are in position to accept or reject new approaches!

3:43: So the most exciting game going atm is Giants/Cardinals—ugh. I did mention I’ll try to answer almost any question, right? Aren’t there any Dennis Rodman haters out there who can give me something more interesting to talk about than Eli Manning’s mediocrity?

3:46: Even with New England’s defensive problems, I think my generic “Patriots vs. Whoever Was The Best Team in the NFC Last Season” Super Bowl pick is looking pretty decent right now.

4:16: In support of last night’s screed, especially the claim that “[MLB] games are either not important enough to be interesting (98% of the regular season), or too important to be meaningful (100% of the playoffs),” here’s a graph I made to illustrate just how silly the MLB Playoffs are:

4:30: Not counting home-field advantage (which is weakest in baseball anyway), this represents the approximate binomial probability of the team with the best record in the league winning a series of length X against the playoff team with the worst record going in. The chances of winning a game are approximated by taking .5 + better win percentage – worse win percentage (note, of course, the NFL curve is exaggerated b/c of regression to the mean: a team that goes 14-2 won’t actually win 88% of their games against an average opponent. But they won’t regress nearly enough for their expectation to drop anywhere near MLB). The brighter and bigger data points represent the actual 1st round series lengths in each sport.

By this approximation, the best team against the worst team in a 1st round series (using the latest season’s standings as the benchmark) in MLB would win about 64% of the time, while in the NBA they would win ~95% of the time. To win 2/3 of the time, MLB would need to switch to a 9 game series instead of 5; and to have the best team win 75% of the time, they would need to shift to 21 (for the record, in order to match the NBA’s 95% mark, they would have to move to a 123 game series. I know, this isn’t perfectly calculated, but it’s ballpark accurate). I like the fact that the NBA and NFL postseasons generally feature the best teams winning.

Moreover, in a sense, it also makes upsets more meaningful: since the math is against “true” upsets happening often, an apparent upset can be significant: it often indicates—Bayes-wise (ok, if that’s not a word, it should be)—that the upsetting team was actually better. In baseball, an upset pretty much just means that the coin came up tails.

4:42: Adam asks:

In the MLB vs. NFL vs. NBA Playoffs graph, the chances of best beating worst in first round for NFL for a 1 game series is almost 95%.

Looking at the odds to this week’s NFL games, the biggest favorite was GB versus Denver and they were only an 88% chance of winning by the money line (-700). Denver is almost certainly not a playoff team, so it’s tough to imagine an even more lopsided playoff matchup that could get to 95%. What am I missing?

I sort of addressed this in my longer explanation, but you’re not missing anything: the football effect is exaggerated (but the reality still doesn’t drop anywhere near baseball). First off, to your specific concern, this early in the season there is even more uncertainty than in the playoffs. But second, and more importantly, this method for approximating a win percentage is less accurate in the extremes, especially when factoring in regression to the mean (which is a huge factor given the NFL’s very small sample sizes). Maybe I’ll update the math tonight or tomorrow, but even this crude demonstration should be sufficient proof of the point.
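For reference, here's where Adam's 88% comes from: a money line converts to an implied win probability with simple arithmetic. This sketch uses the standard American-odds conversion and ignores the bookmaker's cut, so treat the result as a rough upper bound on the market's real estimate; `implied_prob` is just my name for the helper.

```python
def implied_prob(moneyline):
    """Implied win probability from an American money line (ignoring the vig)."""
    if moneyline == 0:
        raise ValueError("a money line of 0 is not meaningful")
    if moneyline < 0:
        # Favorite: risk |line| to win 100.
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win the line.
    return 100 / (moneyline + 100)

print(implied_prob(-700))  # the GB line quoted above
```

That works out to 700/800, or 87.5%, which rounds to the 88% in the question.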

4:52: In fact, the regression to the mean effect in the NFL is SO strong, that I think it helps explain why so many Bye-teams lose against the wild-card game winners (without having to resort to “momentum” or psychological factors for our explanation). By virtue of having the best records in the league, they are the most likely teams to have significant regression effects. That is, their true strength is likely to be lower than what their records indicate. Conversely, the teams that win in the bye week (against other playoff-level competition), are, from a Bayesian perspective, more likely to be better than their records indicated. Think of it like this: there’s a range of possible true strength for each playoff team: when you match two of those teams against each other (in the WC round), the one who wins might have just gotten lucky, but that particular result is more likely to occur when the winning team’s actual strength was closer to the top of their range and/or their opponent’s was closer to their bottom.

In fact, I’ve looked at this before, and it’s very easy to construct scenarios where WC teams with worse records have a better projected strength than Bye teams with better records. Factor in the fact that home field advantage actually decreases in the playoffs (it’s a common misconception that HFA is more important in the post-season: adjusting for team quality, it’s actually significantly reduced—which probably has something to do with the post-season ref shuffle: see section on ref bias in this post), and you have a recipe for frequent upsets.
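One common way to put rough numbers on this kind of regression effect (a sketch only, and not necessarily the method behind my graphs) is to pad a team's record with some number of imaginary .500 games. The `prior_games` value of 12 below is purely an illustrative assumption, as are the function name and the sample records:

```python
def regressed_win_pct(wins: int, games: int, prior_games: float = 12.0) -> float:
    """Shrink an observed record toward .500 by adding `prior_games` of .500 play.

    `prior_games` is a hypothetical tuning value, not a figure from this post.
    """
    return (wins + prior_games / 2) / (games + prior_games)


# A 13-3 bye team is .813 on paper; a 10-6 wild-card team is .625.
bye_team = regressed_win_pct(13, 16)   # 19 / 28
wild_card = regressed_win_pct(10, 16)  # 16 / 28
print(round(bye_team, 3), round(wild_card, 3))  # 0.679 0.571
```

Under that assumption the gap between the two teams shrinks from about .188 in the standings to about .107 in estimated strength. Shrinkage alone never flips the order; it takes extra evidence, like a win over another playoff-level team, to do that.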

4:55: In retrospect, I probably should have just left the NFL out of that graph. Basketball makes for a much better comparison:

4:59: Arturo tweets:

ArturoGalletti

Safe to say at this point, Phillip Rivers is << than Drew Brees

Man, it’s like I KNOW Brees is awesome, sort of, but I still have lingering doubts that a 6-foot tall QB can really be that good. I mean, I feel like a bigot, but I’ve looked at the math over and over again, and he’s such a massive outlier that I can’t let it go.

5:11: WTF? With the latest Firefox update, I can no longer drag tabs left and right. Did Mozilla hire the Netflix web design team or something?

5:12: “Don’t kick it to Devin Hester.” Man, can we please retire this stupid observation? He has 12 return touchdowns. In 6 years.

5:32: Ok, I’m not finding the 12 number they’re mentioning, but Hester has 11 TD on 182 punt returns and has averaged 12.6 yards per return. This is obviously excellent, but please: the average punt return is 8.1 yards. What is the cost of punting away from Hester? It’s not like this is a free proposition. Hint: it’s probably more than 4.5 yards. And you know who else scores touchdowns? Teams with better field position.
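The arithmetic behind that hint, as a crude yards-only sketch. The numbers are the ones quoted above; the helper function and its name are my own illustration, and it deliberately ignores everything except average return yardage:

```python
hester_avg_return = 12.6  # yards per punt return, as quoted above
league_avg_return = 8.1   # average punt return, as quoted above
hester_tds, hester_returns = 11, 182

# Hester's edge over an average returner, in yards per return.
extra_yards_per_return = round(hester_avg_return - league_avg_return, 1)
td_rate = hester_tds / hester_returns


def worth_kicking_away(net_yards_lost: float) -> bool:
    """True if the punt distance given up is smaller than Hester's per-return edge."""
    return net_yards_lost < extra_yards_per_return


print(extra_yards_per_return)  # 4.5
print(round(td_rate, 3))       # 0.06
```

In other words, kicking away from him only pays (on yards alone) if it costs you less than 4.5 net yards per punt.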

5:37: I hate the new Facebook. I mean, I wasn’t the biggest fan before. But it’s like they took everything I hated about it and multiplied each thing by 10. Unrelatedly, don’t forget to sign up for updates on the Skeptical Sports Analysis Facebook page!

5:41: I think I’m addicted to scatterplots. <— new contender for nerdiest thing I’ve ever thought in earnest.

5:47: Earlier, from IM, re: rule changes protecting “defenseless” players:

Friend [11:45 AM]: probably don’t do very much to actually protect players

Friend [11:47 AM]: wouldn’t it be much more effective just to mandate equipment changes?

Friend [5:44 PM]: also, it would eliminate the alleged bias in those calls

So, the answer to this is so obviously “yes” that I’m interested in the second-order questions: 1) WTF is going on behind the scenes that has prevented this from happening? 2) What are the ulterior motives that we don’t know about—both on the owners’ and players’ sides?

5:52: Collinsworth just said that we’ve seen more returns from deep in the endzone this season than ever before. First, I don’t know if that’s just his observation or if it’s based on actual stats. Second, even if it is based on real stats, I don’t know if he means in raw numbers or percentages (obv the first would be expected from the kickoff change). But, and it’s a big but, if teams actually were returning a higher percentage of kicks from deep in the endzone, wouldn’t THAT be interesting?

6:36: Can I interrupt this blog to say that I love my wife? Not only is she a great person, and a successful Harvard/Stanford educated lawyer, but she makes great spaghetti. Anyway, back from dinner. Working on a couple more graphs.

6:42: By the way, please email me any suggestions, ideas or opinions (positive or negative) you have about this live blog. Would you like to see more in-game analysis instead of game-inspired musing? More strict football and fewer tangents? More and broader tangents? More frequent and shorter updates or less frequent and more detailed? Etc. I kind of love doing it, so I’m set on it being a regular feature, and I’d obv like to do it the right way.

7:03: Cam Newton now has 1386 yards, which is the record for a rookie through his first 4 games (previously held by Marc Bulger with 1149). The record through 5 is 1496, so he’s likely to break that one. Through 6 is 1815, so that’s not a sure thing, but through 7 is also 1815 (Bulger only played 6 games, and the next highest at 7 is 1699). But there’s still a lot of variance to be navigated between now and Manning’s full-season record of 3739.
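To make "a lot of variance to be navigated" concrete, here is a quick sketch of what he needs the rest of the way. It assumes a 16-game season and that he starts every remaining game; the function name is just for illustration:

```python
SEASON_GAMES = 16  # assumption: full 16-game season, no missed starts


def required_per_game(current_yards: int, games_played: int,
                      target_yards: int, season_games: int = SEASON_GAMES) -> float:
    """Average yards per remaining game needed to reach `target_yards`."""
    remaining_games = season_games - games_played
    if remaining_games <= 0:
        raise ValueError("no games remaining")
    return (target_yards - current_yards) / remaining_games


pace = 1386 / 4                            # current yards per game
needed = required_per_game(1386, 4, 3739)  # to match the full-season rookie record
print(pace, round(needed, 1))  # 346.5 196.1
```

So he can fall off his current pace by more than 40% and still get there, provided he stays on the field.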

7:10: Aaron Schatz tweets:

FO_ASchatz

With all the wacky comebacks in the NFL this year, does anyone really want to write off the Jets this early?

I’m sure he’s kidding, but I’m not sure “wacky comebacks” correlate much with “wacky comebacks.” Of course, a higher league-wide emphasis on passing will prob lead to more “comebacks” by previous standards: that is, games may be extended, there may be more turnovers, and trailing teams may have greater capability to catch up. But eventually this should just lead to a stasis where our perception of “wacky” changes: If a team has 10% chance of winning, they have a 10% chance of winning, whether that 10% means being behind 10 or being behind 20.

That said, under current league conditions, I’d say a 13 point lead definitely isn’t sufficient to “write off” any team that wasn’t DOA already.

7:33: Why does everyone call Romo “electrifying”? He had some “big” numbers on an offensive team, but what’s the evidence that any of that was his doing? Has he ever given the impression that he was doing more with his team than other quarterbacks would have?

7:39: Crap, my Excel just crashed and killed at least 15 minutes of work. Maybe not the best time to write my “Ode to spreadsheet software.” Though seriously, present resentments aside, isn’t Excel pretty much the best program ever?

7:42: And what the heck is it doing when it says “Excel is trying to recover your information. This may take several minutes.” This is a new(ish) computer. Nothing takes several minutes.

7:46: Matt notes:

I was just able to use the drive expectancy chart to check on a Chris Collinsworth comment, love all the graphs and tools. BTW the comment was about the value of getting the ball on the 2-3 vs the half yard line, if I read the graph right the odds of a safety double at the 1 yard line vs the three yard line. Still only 6% but a big enough difference to care.

In fact, one of the big difficulties with building a WPA model is accounting for these kinds of situations empirically, because 1) they behave abnormally, and 2) they’re either rare (e.g., being right at your own end zone), or extremely specific (e.g., some of the things that happen around the 11-12 yard line in the Red Zone), and thus have some of the smallest sample sizes for observation.

7:51: Matt also asks:

[I]n your salary research were there any trends dealing with the distribution of the players salary? Do winning teams generally have alot of average-paid players(The old New England Patriot teams) or alot of high paid elite players(Maybe the Colts? Cowboys? Not even sure what teams what qualify under this distinction.)

Yeah, this is actually closer to the main thing I’m interested in: E.g., what’s the most successful salary distribution profile? If you didn’t notice, the salary graph I posted last week actually has “Salary Standard Deviation” as a variable, so that gets at your question a little: Yes, it’s one of the most predictive indicators. However, a lot of that comes from quarterbacks: Since a few great quarterbacks (or at least quarterbacks on already-great teams) get enormous salaries, they’re kind of like automatic outliers.

8:20: So, back to what I was working on before my Excel crashed: With all the turnovers in this game, there’s about a 100% chance that commentators later will talk about the importance of “turnover differential.” People always rattle off a bunch of stats about how the team that wins the “turnover battle” almost always wins the game (like, duh), with the intention of reminding everyone how terrible it is to take the kinds of risks that lead to turnovers.

But this causation goes both ways: Turnovers can obviously cause teams to lose, but losing also causes turnovers. When you’re behind, you have to take risks to have any chance of winning. Citing the “turnover battle” stats without context is about as ridiculous as citing the “Team X is 43-1 when having a 100-yard rusher” stat.

8:28: What goes unmentioned in all of this is “punt differential.” [Punts also involve turning the ball over.] Guess what? This stat is ALSO highly predictive of game outcomes, but without as much causation baggage: When teams are behind, they are actually forced to punt less. Despite the completely routine nature of punts vs. the extreme nature of turnovers, “punt differential” holds its own with “turnover differential” in a logistic regression to Win% (n=5308):

8:31: If you do run this as a linear regression to point differential, it gets even closer (I should also note, if you do your regression to “outside” games, punt differential is actually more predictive, but this is because it is much more reliable).

8:41: A fun metric that I love (and believe to be very useful) is “punts plus turnovers,” or PUPTO:

8:45: A pretty interesting thing to note in this chart is the difference between the predictivity of Interception differential vs. Fumble differential: from a pure “Turnovers=Bad” perspective, this is counter-intuitive: After all, many interceptions take place down-field, while fumbles typically happen at the line of scrimmage (also, I haven’t checked, but I feel like a disproportionate number of fumbles are returned for touchdowns). My suspicion is that this difference is at least partly explained by what I described earlier: When teams are losing, they have to take a lot of risks that lead to more interceptions, but they don’t take a lot of risks that lead to more fumbles.
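For anyone who wants to play with PUPTO from 8:41, it is trivial to compute from a box score. A minimal sketch: the class, the field names, the sign convention (team minus opponent, so negative is good), and the sample numbers are all my own hypothetical choices, not data from any real game:

```python
from dataclasses import dataclass


@dataclass
class TeamGame:
    """One team's possession-ending events in a single game (hypothetical fields)."""
    punts: int
    interceptions: int
    fumbles_lost: int

    @property
    def turnovers(self) -> int:
        return self.interceptions + self.fumbles_lost

    @property
    def pupto(self) -> int:
        # PUPTO = punts plus turnovers
        return self.punts + self.turnovers


def pupto_differential(team: TeamGame, opponent: TeamGame) -> int:
    """Team PUPTO minus opponent PUPTO; negative means the team gave the ball away less."""
    return team.pupto - opponent.pupto


# Made-up box score, for illustration only.
home = TeamGame(punts=4, interceptions=1, fumbles_lost=0)
away = TeamGame(punts=6, interceptions=2, fumbles_lost=1)
print(home.pupto, away.pupto, pupto_differential(home, away))  # 5 9 -4
```

Keeping interceptions and fumbles as separate fields also makes it easy to look at the interception vs. fumble split discussed above.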

8:51: Anyway, not seeing any more questions, I’m going to take off a few minutes early. Hope you enjoyed my blogging today, and I apologize for the technical difficulties. Cya.

Live Blog Tomorrow, Plus: Why Baseball and I Are, Like, Unmixy Things

As a friend of mine put it, “Posts merely announcing something are pretty lame,” so before I announce tomorrow’s event, let me explain why I will NOT be live-blogging any of tomorrow’s baseball games:

It’s a constant source of guilt for me that I don’t like baseball more, but I can’t help it: To me, the games are either not important enough to be interesting (98% of the regular season), or too important to be meaningful (100% of the playoffs). That said, I still dabble in baseball analysis myself, and I certainly understand the statistical appeal: the data-sets are huge, the variables are mostly independent, and—even in a post-Moneyball world—the screw-ups are ample.

I have many baseball-loving friends, and talking to them about this subject almost always goes like this exchange from Buffy the Vampire Slayer (with appropriate substitutions, and minus the sexual undertones):

[Me]: [Baseball?]

[Every Baseball Fan Ever]: Yeah.

[Me]: You seriously [watch baseball] for fun?

[Fan]: Well, not [minor leagues] or anything, but yeah. Don’t you?

[Me]: Actually, [no leagues] is more my specialty. I’m an avid [non-baseball watcher].

[Fan]: You’re kidding, right? I mean, you know how to [watch baseball]..

[Me]: Well, I took the class.. [Baseball] and [I] are, like .. un-mixy things.

[Fan]: It’s just because you haven’t had a good experience yet. You can have the best time [watching baseball]. It’s not about getting somewhere. You have to take your time. Forget about everything. Just.. relax. Let it wash over you. The air.. motion.. Just, let it roll.

[Me]: We are talking about [baseball], right?

I also don’t entirely believe the hype about it being such an integral part of our national heritage. I think that perception today has been influenced heavily by nostalgia from influential people like George Will and Ken Burns. I posted a graph somewhat supportive of that a while back:

Note also that the NFL’s relative popularity vs. MLB is nothing new. Here is a year-by-year plot showing the World Series ratings vs. Super Bowl ratings:

Since its inception, the Super Bowl has beaten even the highest-rated World Series game every single year (recently, it has even been beating the entire series combined for total viewers).

OK, so with that out of the way: Once again, I’ll be live-blogging NFL Sunday from 10am until the final whistle tomorrow—now powered by NFL Sunday Ticket! Here’s the explanation, and here’s last week’s end product.

From the Live Blog: Odds and Ends (On Moneyball, Michael Vick, Cam Newton, Kerry Collins, etc.)

[Preface: With apologies to those of you who sat through or otherwise already read my NFL Live Blog from Sunday, it is incredibly long, so I thought maybe I should split out individual posts for some of the individual topics I covered. I’ve removed the time-stamps and re-organized a bit, but this is all original, so it obviously may not be as clean or detailed as a regular article (any additional comments I’ll put in brackets or at the end). If this sort of stuff interests you, I will be live-blogging again this Sunday.]

On Moneyball:

True story: Yesterday, my wife needed a T-shirt, and ended up borrowing my souvenir shirt from SSAC (MIT/Sloan Sports Analytics Conference). She was still wearing it when we went to see Moneyball last night, and, sure enough, she ended up liking it (nerd!) and I thought it was pretty dull.

Nit-picking: The Athletics won their last game of the season in 2004, 2005, 2007, and 2010. (It’s not that hard when you don’t make the playoffs). [If you haven’t seen it, a major theme in the whole affair is how Billy Beane really wants to win the last game of the season.]

Really Moneyball is all about money, not statistics. Belichick would be such a better subject for a sports-analytics movie than Billy Beane. It’s dramatic how Belichick has been willing to do whatever it takes to win—whether it be breaking the rules or breaking with convention—plus, you know, with more success.

On Advanced NFL Stats’ WPA Calculator:

I haven’t really used Advanced NFL Stats WPA Calculator much, as I’ve been (very slowly) trying to build my own model. But I just noticed it doesn’t take time outs into account. I’m curious whether that’s the same for his internal model or if that’s just the calculator. Obv timeouts make a huge difference in endgame or even end-half scenarios (and accounting for them properly is one of the toughest things to figure out).

[This came up in the Pitt/Indy game, I believe on the play where Roethlisberger scrambled for a 1st down in Indy territory.] Oooh, depending on the time out situation, that might have been a spot where dropping just short of the first down would have been better than making it. Too bad Burke’s WPA Calculator doesn’t factor in time outs!

On Andy Reid:

So both Donovan McNabb and Michael Vick have been considerably better QB’s in Philadelphia than elsewhere. At some point, does Andy Reid get some credit? Without a Super Bowl ring, he’s generally respected but not revered in the mainstream, and he’s such a poor tactician that he’s dismissed by most analytics types. But he may be one of the best offensive schemers in the modern era.

[David Meyers asked what I meant by “poor tactician”]: I just mean that he has notoriously bad time management skills, makes ridiculous 4th down decisions, and generally seems clueless about win maximization, esp. in end-game scenarios.

On Michael Vick:

I kind of feel the same way about Vick that I felt about Stephen Strasburg after he hurt his arm last year: their physical skills are so unprecedented that, unfortunately, Bayesian inference suggests that their injury-proneness isn’t a coincidence.

So if the Eagles go on to lose, does this make Vick 1-0 with 2 “no decisions” for the year?

On Cam Newton:

So should I be more or less excited about Cam Newton after his win today? He had a much more “rookie-like” box of 18/34 for 158. Here’s how to break that down for rookies: Low yards = bad. High attempts = good. Completion percentage = completely irrelevant. Win = Highly predictive of length of career, not particularly predictive of quality (likely b/c a winning rookie season gets you a lot of mileage whether you’re actually good or not). Oh, and he’s still tall: Height is also a significant indicator (all else being equal).

On Breakouts:

I remember Mike Wallace being a valuable backup on my fantasy team in 2009, otherwise, meh. Seems to talk a lot of crap that these announcers eat up. Ironically, though, if a rookie or a complete unknown starts a season super-hot, commentary is usually that they’re already the next big thing, while a quality-but-not-superstar veteran with a hot start is often just credited with a hot start. But, in reality, I think the vet, despite being more of a known quantity, is still more likely to take off. In this case, they’re busting out the hyperbole regardless.

Speaking of which, does anyone remember Ryan Moats? A stringer for Houston in 2009, he ended up starting (briefly) after a rash of injuries to his teammates. In his first start (against Buffalo), he had 150 yards and 3 touchdowns, and some fantasy contestants were falling over each other to pick him up. After that, he had 2 touchdowns the rest of the season, and then was out of football.

On Troy Polamalu:

Polamalu to the rescue, of course. He’s so good that I think he improves the Steelers’ offense. (And no, not kidding.)

On Skeptical Sports Analysis:

Google Search Leading to My Blog of the Day: “what sport does dennis rodman play”

Shout-out to Matt Glassman for plugging my live blog on his:

One look at his blog will convince you that he’s not only a killer sports statistician, but he’s also an engaging and humorous writer.

Though, at best, this generous praise is a game of “Two Truths and a Lie.” (I’m not even remotely a statistician.)

If I were more clever, I’d think of some riff off Jay-Z’s “99 Problems” line:

Nah, I ain’t pass the bar but i know a little bit

Enough that you won’t illegally search my shit

Incidentally, love the Rap Genius annotation for that lyric (also apt to my situation):

If you represent yourself (pro se), Bar admission is not required, actually

On Kerry Collins:

So I always think of Kerry Collins as a pretty bad QB, but damn: he’s the last man standing from the entire 1995 draft:

And, you know, he’s not dead. So I guess he won that rivalry.

From the Live Blog: On Detroit, Quick Field Goals, and Buffalo

On Detroit’s OT 1st-Down Field Goal:

Nate asks:

Any thoughts on the Lions kicking a 32-yard FG in overtime from the left hash on first down?

I’ve thought about this situation a bit, and I don’t hate it. Let me pull up this old graph:

So a kneel in the center is maybe slightly better: generically, they lose a percentage or two, but I’m pretty sure that even from that distance you lose a percentage or two for being on the hash. Kickers are good enough at that length that going for extra yards or a TD isn’t really worth it, plus you’re not DOA even if you miss (while you might be if you turn the ball over).

Also from that post where the graph came from, the “OT Choke Factor” for kicks of that length is negligible.

Matt Glassman asks:

Question re: field goals — What percentage are you looking for your kicker to have at the longest range you are willing to regularly (i.e. throughout the game) use him?

I’ll use a static example: if your kicker was a known 50% from 52 yards, would you regularly take that over a punt? What about 40%, etc. Then make it dynamic, where the kicker has some shrinking probability as he moves back, and the coach has a decision about whether to kick/punt from a given distance. At what maximum distance/percentage do you regularly kick, rather than regularly punt.

This is a good question and topic, but it’s extremely hard to generalize. It depends on your game situation and what your alternatives are. Long kicks, for example, are generally bad—even with a relatively good long-range kicker. But in late-game or late-half scenarios, clearly being able to take long kicks can be very valuable.

It is demonstrable, however, that NFL kickers have gotten incredibly good compared to past kickers. Aside from end-game scenarios, kicking FG’s used to be almost universally dominated by going for it (or sometimes punting). But since kickers have become so accurate, the balance has gotten more delicate.

Also [sort of contra Brian Burke, I’m thinking of a link but can’t find it], I think individual team considerations are a much bigger factor in these decisions than just raw WPA. It depends a lot on how good your offense is, how good it is at converting particular distances, how good your defense is, etc. While the percentage differences may be fairly small for the instant decision, they pile up on each other in these types of multi-pronged calculations.

[In the later game] Cris Collinsworth said kickers prefer being on the left hash (though the justification was kind of weak). [Obv this would make the 1st down kick even better.]

On Detroit and Buffalo’s Chances:

Detroit is currently 3-0 and leading the league in Point Differential at +55, and unlikely to be passed by anyone any time soon [by which I mean, this weekend].

That +55 would be the 16th best since 2000. Combined with their 3-0 record, they project to win ~11 games, though with lots of variance:

Yes, this can be calculated more precisely, but it will be around 11 games regardless.

The teams who led in MOV after 3 weeks since 2000 were:

2010: Pittsburgh, +39, Lost Super Bowl
2009: New Orleans, +64, Won Super Bowl
2008: Tennessee, +43, Lost Divisional
2007: New England, +79, Lost Super Bowl
2006: San Diego, +57, Lost Divisional
2005: Cincinnati, +60, Lost Wild Card
2004: Seattle, +52, Lost Wild Card
2003: Denver, +65, Lost Wild Card
2002: Miami, +63, Missed Playoffs
2001: Green Bay, +80, Lost Divisional
2000: Tampa Bay, +67, Lost Wild Card

Not bad. Only Miami missed the playoffs, and they were in a 3 way tie atop AFC East at 9-7.

David asks:

Q: The Bills for real? What do they project to over a season?

Um, I don’t know. Generically, being 3-0 and +40 projects to 10 or 11 wins, but there’s a lot of variance in there. The previous season’s results are still fairly significant, as are the million other things most fans could tick off. Another statistically significant factor that most people prob wouldn’t think of is preseason results. The Bills scored 24 and 35 points in games 2 and 3 of the preseason. There’s a ton of work behind this result, but basically I’ve found that points scored plus points scored in games 2 and 3 of the preseason (counting backwards) is approximately as predictive as points scored minus points allowed in one game from the regular season. So, loosely speaking, in this case, you might say that the Bills are more like a 4-0 team, with the extra game worth of data being the equivalent of a fairly quality win over a Denver/Jacksonville Hybrid.

I’d also note that it’s difficult to take strength of schedule into account at this point, at least in a quantitative way. You can make projections about the quality of a team’s opponents, but the error in those projections is so large at this point that they add more variance to your target team’s projections than they are worth. Or, maybe a simpler way to put it: it’s hard enough to adjust for quality of opponent when you *know* how good they were, and we don’t even know, we just have educated guesses. (Even at the END of the season, I think a lot of ranking models and such don’t sufficiently account for the variance in SoS: that is, when a team beats x number of teams with good records, they can do very well in those rankings, even though some of the teams they beat overperformed in their other games. In fact, given regression to the mean, this will almost always be the case. Of course, a clever enough model should account for this uncertainty.)

From the Live Blog: More on Interceptions

Aaron Schatz tweeted:

FO_ASchatz

Bills go for it on fourth-and-14 from NE 35… and Fitzpatrick throws his second pick (first that is his fault)

4th and 14 is a situation where I think more quarterbacks throw too few interceptions than throw too many.

[As you can note from this post, an issue I’m very interested in is how to judge interceptions more fairly. As I said in the comments, “I’m conceptually drawn to the similarity between the stigma against interceptions and the stigma against going for it on 4th down.” E.g., a coach who played optimal 4th down strategy would easily lead the league in 4th down turnovers.]

Though, I have to admit, Aaron Rodgers is a great QB who seems to defy my “Show me a QB who doesn’t throw interceptions, and I’ll show you a sucky quarterback” rule of thumb. And it’s not like Tom Brady, who throws INT’s when his team is struggling and doesn’t throw them when his team is awesome (which, ofc, I have NO problem with): Rodgers has a crazy-low INT rate on a team that has been mediocre (2008), good-but-not-great (2009), or all over the place (2010) during his 3 years as a starter.

Ok, purely for fun, let’s compare the all-time single-season leaders in (low) Int% (from Pro Football Reference):

With the all-time leaders for most INT thrown (also from Pro Football Reference):

Not drawing any conclusions or doing any scientific comparisons, but both lists seem to have plenty of studs as well as plenty of duds. (Actually, when I first made this comparison a couple of years ago, the “Most” list had a much better resume than the “Least” list. But since then, the ‘good’ list has added several quality new members.)

From the Live Blog: On Bill Belichick and Peyton Manning

This may not be the most controversial statement, but I think the two most powerful forces in the NFL over the last decade have been Peyton Manning and Bill Belichick (check out the 2nd tab as well):

[Main axes are] wins in season n against wins in season n+1. [Note (May 28, 2012): Updated through 2011 season.]

In case you haven’t seen it, the old “Graph of the Day” that I tweaked for the above is here.

Belichick, of course, is known for winning Super Bowls, going for it on 4th down, and:

Good thing he doesn’t have to worry about potential employers Googling him.

Since I [was] watching the Indy game, a few things about Peyton Manning:

First, a quick over/under: .5, for number of Super Bowls won by Peyton Manning as a coach? I mean, I’d take the under obv just b/c of the sheer difficulty of winning Super Bowls, but I’d be curious about the moneyline.

[Here’s] something completely new to me. Not sure exactly what to make of it, but it’s interesting:

This is QB’s with 7+ seasons of 8+ games who averaged 200+ yards per game (n=42). These are their standard deviations, from season to season (counting only the 8+ gm years), for Yards per Game vs. Adjusted Net Yards Per Attempt.

The red dot is our absentee superstar, Peyton Manning, and the green is Johnny Unitas. The orange is Randall Cunningham, but his numbers I think are skewed a bit because of the Randy Moss effect. The dot at the far left of the trend-line is Jim Kelly.

So what to make of it? I’ve been mildly critical of Adjusted Net Yards Per Attempt for the same reasons I’ve been critical of Win Percentage Added: Since the QB is involved in basically every offensive play, both of these tend to track two things:

Their raw offensive quality, plus (or multiplied by)
The amount which the team relies on the passing game.

Neither is particularly indicative of a QB doing the best with what he can, as it is literally impossible to put up good numbers in these stats on a bad team.
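For anyone unfamiliar with the stat, here is ANY/A as I understand the Pro Football Reference definition, plus the kind of season-to-season standard deviation used in the graph above. Treat it as a sketch: the function names and the sample inputs are hypothetical, and using the sample (n-1) standard deviation is an assumption on my part:

```python
from statistics import stdev


def any_a(pass_yards: int, pass_tds: int, interceptions: int,
          sack_yards: int, attempts: int, sacks: int) -> float:
    """Adjusted Net Yards per Attempt: TDs are worth +20 yards, INTs -45, sacks count as dropbacks."""
    adjusted_yards = pass_yards + 20 * pass_tds - 45 * interceptions - sack_yards
    return adjusted_yards / (attempts + sacks)


def season_to_season_sd(values: list[float]) -> float:
    """Sample standard deviation across a QB's qualifying seasons (needs 2+ seasons)."""
    return stdev(values)


# Made-up stat line and made-up season values, for illustration only.
print(round(any_a(4000, 30, 10, 150, 500, 20), 2))  # 7.69
print(season_to_season_sd([6.0, 7.0, 8.0]))         # 1.0
```

Note how the numerator is driven by team-dependent things like touchdowns and sacks, which is part of why I don't think it isolates the quarterback.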

So it’s interesting to me that Peyton — who most would agree is one of the most consistent QB’s in football — would have such a high ANY/A standard dev (he also has a larger sample than some of the other qualifiers).

An incredibly superficial interpretation might be that Peyton sacrifices efficiency in order to “get his yards.” OTOH, this may be counter-intuitive, but I wonder if it’s not actually the opposite: Peyton was an extremely consistent winner. Is it possible that the ANY/A to some extent reflected the quality of his supporting cast, but the yards sort of indirectly reflect his ability to transfer whatever he had into actual production? Obv I’d have to think about it more.

With their schedule, Indy may be eliminated from playoff contention before Manning even starts thinking about a return. Could be good for them next year, though: San Antonio Gambit, anyone?

Okay, one last thought: In this post, Brian Burke estimates Manning’s worth to that team, and uses the team’s total offensive WPA as a sort of “ceiling” for how valuable Manning could be:

In this case, it can tell us how many wins the Peyton Manning passing game can account for. Although we can’t really separate Manning from his blockers and receivers, we can nail down a hard number for the Colts passing game as a whole, of which Manning has been the central fixture.

The analysis, while perfectly good, does ignore two possibilities: First, the Indianapolis offense minus Manning may be below average (negative WPA), in which case the “Colts passing game” as a whole would understate Manning’s value: E.g., he could be taking it from -1 to +2.5, such that he’s actually worth 3.5, etc. Second, even if you could get a proper measure of how much the offense would suffer without Manning, that still may not account for the degree to which the Indianapolis offense bolstered their defense’s stats. When you’re ahead a lot, you force the other team to make sub-optimal plays that increase variance to give themselves some opportunity to catch up: this makes your defense look good. In such a scenario, I would imagine hearing things like, “Oh, the Indianapolis defense is so opportunistic!” Hmmm.
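The first point is just the difference between a total and a marginal value. A tiny sketch using the example numbers above (the function name is mine, and the "without Manning" figure is of course unobservable, which is the whole problem):

```python
def marginal_wpa(offense_wpa_with: float, offense_wpa_without: float) -> float:
    """Wins added by the QB = offense WPA with him minus offense WPA without him."""
    return offense_wpa_with - offense_wpa_without


total_offense_wpa = 2.5  # the proposed "ceiling"
without_manning = -1.0   # hypothetical below-average offense without him
print(marginal_wpa(total_offense_wpa, without_manning))  # 3.5
```

Whenever the "without" number is negative, the marginal value comes out above the supposed ceiling.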