I’m Joining FiveThirtyEight

The title pretty much says it all. In January I’ll be starting a real j-o-b as a “Senior Writer, Sports” for the new ESPN-backed FiveThirtyEight, due to launch in February. So I thought I’d better say some quick goodbyes and hellos.

For old readers:

While I’m admittedly a little sad that this blog won’t be coming back any time soon, this should obviously be great news for people who enjoy my work: Backed by ESPN/FiveThirtyEight data and resources, it will be better and there will be more of it. My responsibilities at FiveThirtyEight will be similar to what I’d been doing here already: conducting original research, writing articles, and blogging. Except full time. And paid.

(Yeah, it’s basically my dream job.)

For new readers:

Of course, for many of you reading this, this is probably your first time visiting this site. In which case: welcome!  For a primer on who the hell I am, you might want to read the “about Ben” and “about this blog” pages, or you can skip those and just read some of my articles. My best known work is undoubtedly The Case For Dennis Rodman, which is incredibly long—

TCFDR

—but has a guide, which can be found here. And in case you’ve heard rumors, yes, it speculates that Rodman—in a very specific way—may have been more valuable than Michael Jordan.

However, if I had to pick just a handful of articles to best represent my ideas and interests, it might look something like this:

Football:

Quantum Randy Moss—An Introduction to Entanglement

The Aesthetic Case Against 18 Games

Basketball:

The Case for Dennis Rodman, Part 4/4(a): All-Hall?

Bayes’ Theorem, Small Samples, and WTF is Up With NBA Finals Markets?

Baseball:

A Defense of Sudden-Death Playoffs in Baseball

Why Not Balls and Strikes?

General:

C.R.E.A.M. (Or, “How to Win a Championship in Any Sport”)

Applied Epistemology in Politics and the Playoffs

Graph of the Day: When Do Undefeated Teams Lose?

The Kansas City Chiefs beat Buffalo today to push their record to 9-0, with Alex Smith once again putting up numbers only his mother or Terry Bradshaw could love (too bad he doesn’t have Randy Moss). There has been a lot of grumbling about this Chiefs team, and for a serious-ish treatment see “The Worst 8-0 Team of All Time?” at Advanced NFL Stats.

However, their victory is perfectly consistent with the long and storied history of 8-0 teams, who have a 19-2 record in game 9 (through today). Indeed, of all “x-0” teams (where x is less than 15), 8-0 teams are the least likely to lose their next game:

undefeated

And if you counted all the remaining games of 15-0 teams, they would still only be 5-1 (the ’72 Dolphins went undefeated, and the 2007 Patriots won 3 more before losing the Super Bowl).

What does this mean? Probably not that much, though late regular-season NFL gets wacky for a number of reasons (injuries, strategic incentives, motivational changes, etc), so it’s unsurprising to me that the “drop rate” increases even as the teams should theoretically be getting stronger.

In bad news for the Chiefs, however, their odds of winning the Super Bowl decreased slightly: while 40% of 8-0 teams had gone on to a championship, only 39% of 9-0 teams have done the same (classic DUCY).
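For anyone who wants to reproduce this kind of tally, here is a minimal sketch of the counting. It assumes each team-season is available as a string of game results in order; the function name and the sample strings are made up for illustration and are not the data behind the graph. Ties are treated as ending the streak.

```python
from collections import defaultdict

def next_game_records(seasons):
    """For each x, tally how teams that started x-0 fared in game x+1.

    `seasons` is an iterable of strings such as "WWLW", one character per
    game in order ("W" = win, anything else counts as a non-win).
    Returns {x: (wins, losses)}.
    """
    tally = defaultdict(lambda: [0, 0])
    for results in seasons:
        streak = 0
        for game in results:
            if streak > 0:
                # The team entered this game with a streak-0 record.
                tally[streak][0 if game == "W" else 1] += 1
            if game != "W":
                break  # no longer undefeated, stop tracking this season
            streak += 1
    return {x: tuple(rec) for x, rec in sorted(tally.items())}

# Hypothetical toy input, not real NFL results.
sample = ["WWL", "WL", "LWW", "WWW"]
records = next_game_records(sample)
```

With real data, `records[8]` would hold the game-9 record of 8-0 teams, which is the number quoted above.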

Graph of the Day: Big Papi on the World Series Legend Curve

Nothing fancy here folks, just a little perspective.

David Ortiz had a crazy-good (Matsui-esque) World Series, taking MVP honors and his third title overall. His 1.948 OPS is 7th all-time and tops many legendary championship campaigns, such as Reggie Jackson’s 4 HR in 4 straight appearances in 1977 (1.792 OPS for the series). Amazingly, both Babe Ruth (2.022) and Lou Gehrig (2.433!) posted higher numbers in 1928 (a Yankees sweep).

Of course, Papi has been instrumental in all three of Boston’s post-Ruth championships, and his combined OPS of 1.372 is the highest for anyone with 50+ WS plate appearances:
WS Legends

By my reading, Ortiz fits clearly in an elite class of outliers that includes Bonds, Jackson, Gehrig and Ruth, though Ruth stands out the most. Note that this includes Ruth’s first three WS appearances with Boston, when he was still a pitcher. I.e.:

If we expand the purview to all postseason games, we see a considerable drop in Papi’s OPS, though he still maintains an elite position overall:

PS Legends

Of course there is no change for Ruth and Gehrig, since there were no divisional rounds when they played (though we wouldn’t expect too much of a drop for them anyway, since their playoff OPS aren’t substantial deviations from the rest of their careers). Otherwise, the outlier curve is dominated by the big names of the modern era, with Papi comfortably nestled in the middle.

LeBron’s High-Usage Shooting Efficiency (Featuring Adrian Dantley)

As anyone (statistically-inclined or not) can tell you, LeBron James is having a pretty good year. His 26.8 points, 8 rebounds and 7.3 assists per game (through 81) make for another entry in his already stunning portfolio of versatile seasons: This will be his 6th time hitting 25/7/7+, a feat that has only been accomplished 8 times since the merger:

Rk  Player           Season   Age  Tm   G   FGA   FG%   3P%   FT%   PTS   TRB  AST  TS%
1   LeBron James     2012-13  28   MIA  76  1354  .565  .406  .753  26.8  8.0  7.3  .640
2   Michael Jordan*  1988-89  25   CHI  81  1795  .538  .276  .850  32.5  8.0  8.0  .614
3   Larry Bird*      1986-87  30   BOS  74  1497  .525  .400  .910  28.1  9.2  7.6  .612
4   LeBron James     2009-10  25   CLE  76  1528  .503  .333  .767  29.7  7.3  8.6  .604
5   LeBron James     2010-11  26   MIA  79  1485  .510  .330  .759  26.7  7.5  7.0  .594
6   LeBron James     2008-09  24   CLE  81  1613  .489  .344  .780  28.4  7.6  7.2  .591
7   LeBron James     2007-08  23   CLE  75  1642  .484  .315  .712  30.0  7.9  7.2  .568
8   LeBron James     2004-05  20   CLE  80  1684  .472  .351  .750  27.2  7.4  7.2  .554
Provided by Basketball-Reference.com
(Generated 4/17/2013.)

But the thing that sticks out (which stat-heads have been going berserk about) is his shooting, which has been by far the most efficient of his career.  Indeed, it may be one of the greatest shooting efficiency seasons of all time.

While his raw shooting % wouldn’t break the top 100 seasons, and his “true” shooting % (adjusted for free throws and 3 point shots made) would still only rank about 60th, the key here is that James’ shooting efficiency is remarkable for someone with his role as both a primary option and a shooter of last resort.  Generally, when you increase a player’s shot-taking responsibilities, it comes at the cost of marginal shot efficiency. This doesn’t mean this is a bad decision or that the player is doing anything wrong—what may be a bad shot “for them” may be a great shot under the circumstances in which they are asked to take it (like when the shot clock is running down, etc).

While there’s no simple stat that describes the degree to which someone is a “shot creator,” we can use usage rate as a decent (though obviously imperfect) proxy. There have been around 150 seasons in which one player “used” >=30% of their team’s possessions:

Usage >30% vs. TS%

All player seasons with USG% >= 30. LeBron’s in red.

As we would expect, the best shooting percentages decline as the players’ usage rates get larger and larger. The red points are LeBron’s seasons (which are pretty excellent across the board) and as we can see from this scatter, his 2012-13 campaign is about to set the record for this group (though we should note that it’s NOT a Rodman-esque outlier).

Amazingly, the previous record-holder was Adrian Dantley! Dantley is a Hall of Famer who I had practically never heard of until his name kept popping up in my historical research as possibly one of the most underrated players ever.

Dantley never made an All-NBA first team or won an NBA championship, but he does extremely well in a variety of plus-minus and statistical plus-minus style metrics. While he didn’t have the all-around game of a LeBron James (though he did average a respectable 6-7 rebounds and 3-4 assists in his prime), Dantley was an extremely efficient high-usage shooter. For example, if we look at the top True Shooting seasons among players with a Usage Rate of greater than 27.5%, guess who occupies fully 5 of the top 10 spots:

Rk  Player            Season   Age  Tm   G   FG   FGA   PTS   FG%   TS%   USG%
1   Amare Stoudemire  2007-08  25   PHO  79  714  1211  1989  .590  .656  28.2
2   Adrian Dantley*   1983-84  27   UTA  79  802  1438  2418  .558  .652  28.2
3   Kevin Durant      2012-13  24   OKC  81  731  1433  2280  .510  .647  29.8
4   LeBron James      2012-13  28   MIA  76  765  1354  2036  .565  .640  30.1
5   Charles Barkley*  1990-91  27   PHI  67  665  1167  1849  .570  .635  29.1
6   Adrian Dantley*   1979-80  23   UTA  68  730  1267  1903  .576  .635  27.8
7   Adrian Dantley*   1981-82  25   UTA  81  904  1586  2457  .570  .631  27.9
8   Adrian Dantley*   1985-86  29   UTA  76  818  1453  2267  .563  .629  30.0
9   Karl Malone*      1989-90  26   UTA  82  914  1627  2540  .562  .626  32.6
10  Adrian Dantley*   1980-81  24   UTA  80  909  1627  2452  .559  .622  28.4
Provided by Basketball-Reference.com
(Generated 4/17/2013.)
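As a rough illustration of the filter-and-rank step behind a list like this, here is a minimal sketch. The `true_shooting` function uses the standard formula, PTS / (2 * (FGA + 0.44 * FTA)); free throw attempts are not in the table above, so the example call uses made-up round numbers. The `SEASONS` rows are copied from the table, and `top_high_usage` is a hypothetical helper, not Basketball-Reference's query tool.

```python
from collections import Counter

def true_shooting(pts, fga, fta):
    """Standard True Shooting %: PTS / (2 * (FGA + 0.44 * FTA))."""
    return pts / (2 * (fga + 0.44 * fta))

# (player, season, TS%, USG%) copied from the table above.
SEASONS = [
    ("Amare Stoudemire", "2007-08", 0.656, 28.2),
    ("Adrian Dantley", "1983-84", 0.652, 28.2),
    ("Kevin Durant", "2012-13", 0.647, 29.8),
    ("LeBron James", "2012-13", 0.640, 30.1),
    ("Charles Barkley", "1990-91", 0.635, 29.1),
    ("Adrian Dantley", "1979-80", 0.635, 27.8),
    ("Adrian Dantley", "1981-82", 0.631, 27.9),
    ("Adrian Dantley", "1985-86", 0.629, 30.0),
    ("Karl Malone", "1989-90", 0.626, 32.6),
    ("Adrian Dantley", "1980-81", 0.622, 28.4),
]

def top_high_usage(seasons, min_usg=27.5, n=10):
    """Keep seasons with USG% above min_usg, then rank by TS% (best first)."""
    eligible = [s for s in seasons if s[3] > min_usg]
    return sorted(eligible, key=lambda s: s[2], reverse=True)[:n]

# How many of the top spots does each player hold?
counts = Counter(player for player, *_ in top_high_usage(SEASONS))

# Made-up inputs, only to show the formula in action.
example_ts = true_shooting(pts=2000, fga=1400, fta=500)
```

Run against a full set of player seasons rather than these ten rows, the same two steps would reproduce the ranking; here `counts["Adrian Dantley"]` simply confirms the five appearances.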

Dantley was also in the news a bit last month for working part-time as a crossing guard:


Key quotes from that story:

“It’s not a big thing to me … I just do it. I have a routine. I exercise, I go to work, I go home. I have a spring break next week. I have a summer off, just like when I was a basketball player.”

“I just did it for the kids … I just didn’t want to sit around the house all day.”

“I’ve definitely saved two lives. I’ve almost gotten hit by a car twice. And I would say 70 percent of the people who go across my route are on their telephone or on their BlackBerry, text-messaging. I never would have seen that if I had not been on the post.”

What a character!

Graph of the Day: Second Look at Stan Van?

Granted, “of the Day” isn’t really accurate considering how often I post, but I found it amusing enough to share:


Win % in games played by Dwight Howard. Red years were with Stan Van Gundy coaching.

This came up in a discussion about the possibility that Dwight Howard might not be leveraged optimally on teams that aren’t composed mostly of small 3 point shooters. That would have interesting implications.

Is Randy Moss the Greatest?

So apparently San Francisco backup wide receiver Randy Moss made some headlines at Super Bowl media day by expressing the opinion that he is the greatest receiver of all time.

Much of the response I’ve seen on Twitter has looked like this:

The ESPN article similarly emphasizes Jerry Rice’s superior numbers:

[Moss] has 982 catches for 15,292 yards and 156 touchdowns in his 14-season career.

Hall of Famer Jerry Rice, who now is an ESPN NFL analyst, leads the all-time lists in those three categories with 1,549 receptions, 22,895 yards and 197 touchdown receptions.

Elsewhere, they do note that Jerry Rice played 20 seasons.

Mike Sando has some analysis and a round-up of analyst and fan reactions, including several similar points under the heading “The Stats”, and this slightly snarky caption:

Randy Moss says he’s the greatest WR of all time. @JerryRice: “Put my numbers up against his numbers.” We did –>

So when I first saw this story, I kind of laughed it off (generally I’m against claims of greatness that don’t come with 150-page proofs), but then I saw what Randy Moss actually said:

“I don’t really live on numbers, I really live on impact and what you’re able to do out on the field,” he said Tuesday. “I really think I’m the greatest receiver to ever play this game.”

From this, I think the only logical conclusion is that Randy Moss clearly reads this blog.

As any of my ultra-long-time readers know, I’ve written about Randy Moss before. “Quantum Randy Moss—An Introduction to Entanglement” was one of my earliest posts (and probably my first ever to be read by anyone other than friends and family).

Cliff’s Notes version: I think Moss is right that yards and touchdowns and other production “numbers” don’t matter as much as “impact”, or what a player’s actual effect is on his team’s ability to move the ball, score points, and ultimately win games. Unfortunately, isolating a player’s “true value” can be virtually impossible in the NFL, since everyone’s stats are highly “entangled.” However, Randy Moss may come the closest to having a robust data set that’s actually on point, since, for a variety of reasons, he has played with a LOT of different quarterbacks. When I wrote that article, it was clear that all of them played *much* better with Moss than without him.

Given this latest “controversy,” I thought I’d take a quick chance to update my old data. After all, Tom Brady and Matt Cassel have played some more seasons since I did my original analysis. Also, while it may or may not be relevant given Moss’s more limited role and lower statistical production, Alex Smith now actually qualifies under my original criteria (playing at least 9 games with Randy Moss in a single season). So, for what it’s worth, I’ve included him as well. Here’s the updated comparison of seasons with Randy Moss vs. career without him (for more details, read the original article):

Note: I calculated these numbers a tiny bit differently than before. Specifically, I cut out all performance stats from seasons in which a QB didn’t play at least 4 games.

Of course, Alex Smith had a much better season last year than he has previously in his career, so that got me thinking it might be worth trying to make a slightly more apples-to-apples comparison for all 7 quarterbacks. So I filtered the data to compare seasons with Randy Moss only against “Bookend” seasons—that is, each quarterback’s seasons immediately before or after playing with Moss (if applicable):

Here we can see a little bit more variability, as we would expect considering the smaller sample of seasons for comparison, but the bottom line is unchanged. On average, the “Moss effect” even appears to be slightly larger overall. Adjusted Net Yards Per Attempt is probably the best single metric for measuring QB/passing game efficiency, and a difference of 1.77 is about what separates QB’s like Aaron Rodgers from Shaun Hill (7.54 v. 5.68), or a Peyton Manning from a Gus Frerotte (7.11 v. 5.27).
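For readers who want to see the mechanics, here is a minimal sketch of a with/without comparison using Adjusted Net Yards Per Attempt. The ANY/A formula is the standard one: (yards + 20 * TD - 45 * INT - sack yards) / (attempts + sacks). Everything else is an assumption for illustration: the `QBSeason` record, the `moss_effect` helper, pooling seasons rather than averaging per quarterback, and the sample numbers, which are invented and not any real quarterback's stats. The 4-game cutoff follows the note above.

```python
from dataclasses import dataclass

@dataclass
class QBSeason:
    games: int
    attempts: int
    yards: int
    tds: int
    ints: int
    sacks: int
    sack_yards: int
    with_moss: bool

def any_a(seasons):
    """Pooled Adjusted Net Yards per Attempt over a list of seasons."""
    num = sum(s.yards + 20 * s.tds - 45 * s.ints - s.sack_yards for s in seasons)
    den = sum(s.attempts + s.sacks for s in seasons)
    return num / den

def moss_effect(seasons, min_games=4):
    """ANY/A in seasons with Moss minus ANY/A in seasons without him."""
    kept = [s for s in seasons if s.games >= min_games]  # drop tiny samples
    with_moss = [s for s in kept if s.with_moss]
    without = [s for s in kept if not s.with_moss]
    return any_a(with_moss) - any_a(without)

# Invented numbers for one hypothetical quarterback.
seasons = [
    QBSeason(16, 500, 4000, 30, 10, 20, 150, with_moss=True),
    QBSeason(16, 500, 3400, 20, 14, 30, 200, with_moss=False),
    QBSeason(2, 40, 200, 0, 3, 5, 40, with_moss=False),  # under 4 games, dropped
]
effect = moss_effect(seasons)
```

A “Bookend” version would only change which non-Moss seasons are passed in: the ones immediately before or after the Moss seasons instead of the whole career.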

This magnitude of difference is down slightly from the calculations I did in 2010. This is partly because of a change in method (see “note” above), but (in fairness), also partly because Tom Brady’s “non Moss” numbers have improved a bit in the last couple of seasons. On the other hand, the samples are also larger, which makes the unambiguous end result a bit more reliable.

Even Smith clearly still had better statistics this season with Moss (not to mention Colin Kaepernick seems to be doing OK as well). Whether that improvement is due to Moss (or more likely, the fear of Moss), who knows. For any particular case(s), there may be/probably are other factors at play: By no means am I saying these are all fair comparisons. But better results in this type of comparison are more likely to occur the better the player actually was. Thus, as a Bayesian matter, extreme results like these make it likely that Randy Moss was extremely good.
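The Bayesian step can be made concrete with a toy update. This is only a sketch: the three skill tiers, the prior probabilities and the likelihoods are all invented to show the shape of the argument, and none of them is estimated from data.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a discrete set of hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

# Hypothetical prior beliefs about a receiver's true quality.
priors = {"average": 0.70, "very good": 0.25, "all-time great": 0.05}

# Hypothetical chance of seeing with/without splits this extreme,
# given each level of true quality.
likelihoods = {"average": 0.01, "very good": 0.10, "all-time great": 0.50}

updated = posterior(priors, likelihoods)
```

Even with a small prior on “all-time great,” the update moves most of the probability away from “average,” because extreme results are so much more likely to come from an extreme player.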

So does this mean I think Moss is right? Really, I have no idea. “Greatness” is a subjective term, and Rice clearly had a longer and more fruitful (3 Super Bowl rings) career. But for actual “impact” on the game: If I were a betting man (and I am), I’d say that the quality and strength of evidence in Moss’s favor makes him the most likely “best ever” candidate.

[1/31 Edit: Made some minor clarifying changes throughout.]

Don’t Play Baseball With Bill Belichick

[Note: I apologize for missing last Wednesday and Friday in my posting schedule. I had some important business-y things going on Wed and then went to Canada for a wedding over the weekend.]

Last week I came across this ESPN article (citing this Forbes article) about how Bill Belichick is the highest-paid coach in American sports:

Bill Belichick tops the list for the second year in a row following the retirement of Phil Jackson, the only coach to have ever made an eight-figure salary. Belichick is believed to make $7.5 million per year. Doc Rivers is the highest-paid NBA coach at $7 million.

Congrats to Belichick for a worthy accomplishment! Though I still think it probably under-states his actual value, at least relative to NFL players. As I tweeted:

Of course, coaches’ salaries are different from players’: they aren’t constrained by the salary cap, nor are they boosted by the mandatory revenue-sharing in the players’ collective bargaining agreement.  Yet, for comparison, this season Belichick will make a bit more than a third of what Peyton Manning will in Denver. As I’ve said before, I think Belichick and Manning have been (almost indisputably) the most powerful forces in the modern NFL (maybe ever). Here’s the key visual from my earlier post, updated to include last season (press play):

The x axis is wins in season n, y axis is wins in season n+1.

Naturally, Belichick has benefited from having Tom Brady on his team. However, Brady makes about twice as much as Belichick does, and I think you would be hard-pressed to argue that he’s twice as valuable—and I think top QB’s are probably underpaid relative to their value anyway.

But being high on Bill Belichick is about more than just his results. He is well-loved in the analytical community, particularly for some of his high-profile 4th down and other in-game tactical decisions.  But I think those flashy calls are merely a symptom of his broader commitment to making intelligent win-maximizing decisions—a commitment that is probably even more evident in the decisions he has made and strategies he has pursued in his role as the Patriots’ General Manager.

But rather than sorting through everything Belichick has done that I like, I want to take a quick look at one recent adjustment that really impressed me: the Patriots’ out-of-character machinations in the 2012 draft.

The New Rookie Salary Structure

One of the unheralded elements to the Patriots’ success—perhaps rivaling Tom Brady himself in actual importance—is their penchant for stock-piling draft-picks in the “sweet spot” of the NFL draft (late 1st to mid-2nd round), where picks have the most surplus value. Once again, here’s the killer graph from the famous Massey-Thaler study on the topic:

In the 11 drafts since Belichick took over, the Patriots have made 17 picks between numbers 20 and 50 overall, the most in the NFL (the next-most is SF with 15, league average is obv 11). To illustrate how unusual their draft strategy has been, here’s a plot of their 2nd round draft position vs. their total wins over the same period:

Despite New England having the highest win percentage (not to mention most Super Bowl wins and appearances) over the period, there are 15 teams with lower average draft positions in the 2nd round. For comparison, they have the 2nd lowest average draft position in the 1st round and 7th lowest in the third.

Of course, the new collective bargaining agreement includes a rookie salary scale. Without going into all the details (in part because they’re extremely complicated and not entirely public), the key points are that it keeps total rookie compensation relatively stable while flattening the scale at the top, reducing guaranteed money, and shortening the maximum number of years for each deal.

These changes should all theoretically flatten out the “value curve” above. Here’s a rough sketch of what the changes seem to be attempting:

Since the original study was published, the dollar values have gone up and the top end has gotten more skewed. I adjusted the Y-axis to reflect the new top, but didn’t adjust the curve itself, so it should actually be somewhat steeper than it appears.  I tried to make the new curves as conceptually accurate as I could, but they’re not empirical and should be considered more of an “artist’s rendition” of what I think the NFL is aiming for.

With a couple of years of data, this should be a very interesting issue to revisit.  But, for now, I think it’s unlikely that the curve will actually be flattened very much. If I had to guess, I think it may end up “dual-peaked”: By far the greatest drop in guaranteed money will be for top QB prospects taken with the first few picks. These players already provide the most value, and are the main reason the original M/T performance graph inclines so steeply on the left. Additionally, they provide an opportunity for continued surplus value beyond the length of the initial contract. This should make the top of the draft extremely attractive, at least in years with top QB prospects.

On the other hand, I think the bulk of the effect on the rest of the surplus-value curve will be to shift it to the left. My reasons for thinking this are much more complicated, and include my belief that the original Massey/Thaler study has problems with its valuation model, but the extremely short version is that I have reason to believe that people systematically overvalue upper/middle 1st round picks.

How the Patriots Responded

Since I’ve been following the Patriots’ 2nd-round-oriented drafting strategy for years now, naturally my first thoughts after seeing the details of the new deal went to how this could kill their edge. Here’s a question I tweeted at the Sloan conference:

Actually, my concern about the Patriots’ drafting strategy was two-fold:

  1. The Patriots’ favorite place to draft could obviously lose its comparative value under the new system. If they left their strategy as-is, it could lead to their picking sub-optimally. At the very least, it should eliminate their exploitation opportunity.
  2. Though a secondary issue for this post, at some point taking an extreme bang-for-your-buck approach to player value can run into diminishing returns and cause stagnation. Since you can only have so many players on your roster or on the field at a time, your ability to hoard and exploit “cheap” talent is constrained. This is a particularly big concern for teams that are already pretty good, especially if they already have good “value” players in a lot of positions: At some point, you need players who are less cheap but higher quality, even if their value per dollar is lower than the alternative.

Of course, if you followed the draft, you know that the Patriots, entering the draft with far fewer picks than usual, still traded up in the 1st round, twice.

Taken out of context, these moves seem extremely out of character for the Patriots. Yet the moves are perfectly consistent with an approach that understands and attacks my concerns: Making fewer, higher-quality picks is essentially the correct solution, and if the value-curve has indeed shifted up as I expect it has, the new epicenter of the Patriots’ draft activity may be directly on top of the new sweet spot.

Baseball

The entire affair reminds me of an old piece of poker wisdom that goes something like this: In a mixed game with one truly expert poker player and a bunch of completely outclassed amateurs, the expert’s biggest edge wouldn’t come in the poker variant with which he has the most expertise, but in some ridiculous spontaneous variant with tons of complicated made-up rules.

I forget where I first read the concept, but I know it has been addressed in various ways by many authors, ranging from Mike Caro to David Sklansky. I believe it was the latter (though please correct me if I’m wrong), who specifically suggested a Stud variant some of us remember fondly from childhood:

Several different games played only in low-stakes home games are called Baseball, and generally involve many wild cards (often 3s and 9s), paying the pot for wild cards, being dealt an extra upcard upon receiving a 4, and many other ad-hoc rules (for example, the appearance of the queen of spades is called a “rainout” and ends the hand, or that either red 7 dealt face-up is a rainout, but if one player has both red 7s in the hole, that outranks everything, even a 5 of a kind). These same rules can be applied to no peek, in which case the game is called “night baseball”.

The main ideas are that A) the expert would be able to adapt to the new rules much more quickly, and B) all those complicated rules make it much more likely that he would be able to find profitable exploitations (for Baseball in particular, there’s the added virtue of having several betting rounds per hand).

It will take a while to see how this plays out, and of course the abnormal outcome could just be a circumstances-driven coincidence rather than an explicit shift in the Patriots’ approach. But if my intuitions about the situation are right, Belichick may deserve extra credit for making deft adjustments in a changing landscape, much as you would expect from the Baseball-playing shark.

The Clock: A Graph and Some Thoughts

If you’re a hardcore follower of this blog, you know that one of the things I have frequently complained about is the failure of NBA play-by-play data to include the shot clock. It’s so obviously important and, relative to other play-by-play data, so easy to track that it’s a complete mystery to me why doing so isn’t completely standard. OTOH, I see stats broken down by “early” and “late” in the shot clock all the time, so someone must have this information.

In the meantime, I went through the 2010 play-by-play dataset and kluged a proxy stat from the actual clock, reflecting the number of seconds passed since a team took possession. Here’s a chart summarizing the number and outcomes of possessions of various lengths:

The orange X’s represent the number of league-wide possessions in which the first shot took place at the indicated time. The red diamonds represent the average number of points scored on those possessions (including from any subsequent shots following an offensive rebound, etc).

We should expect there to be a constant trade-off at any given time between taking a shot “now” and waiting for a better one to open up: the deeper you get into a possession, the more your shot standards should drop. And, indeed, this is reflected in the graph by the downward-sloping curve.
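A minimal sketch of how a proxy like this could be built, assuming each possession has already been reduced to three numbers: the game clock when the team took possession, the game clock at the first shot, and the total points scored on the possession. The tuple layout, the function name and the sample rows are hypothetical, and discarding anything outside 0 to 24 seconds is a simplification of this sketch rather than a description of how the chart above was made.

```python
from collections import defaultdict

def summarize_possessions(possessions):
    """Group possessions by seconds elapsed before the first shot.

    `possessions` is an iterable of (start_clock, first_shot_clock, points),
    with both clocks given as seconds remaining in the period.
    Returns {elapsed_seconds: (count, average_points)}.
    """
    buckets = defaultdict(lambda: [0, 0])
    for start, shot, points in possessions:
        elapsed = int(start - shot)  # the game clock counts down
        if 0 <= elapsed <= 24:       # ignore obviously broken clock data
            buckets[elapsed][0] += 1
            buckets[elapsed][1] += points
    return {t: (n, pts / n) for t, (n, pts) in sorted(buckets.items())}

# Made-up possessions, not real play-by-play.
sample = [(720, 708, 2), (650, 638, 0), (600, 595, 3), (500, 470, 2)]
summary = summarize_possessions(sample)
```

The counts correspond to the orange X’s and the averages to the red diamonds.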

For now, I’m just throwing this out there. Though it represents a very basic idea, it is difficult to overstate its importance:

  1. Accounting for the clock can help evaluate players where standard efficiency ratings break down. Most simply, you can take the results of each shot and compare them to the expected value of a shot taken under the same amount of time-pressure. E.g., if someone averages .9 points per attempt with only a couple of seconds left, you can spot value where normal efficiency calculations wouldn’t.
  2. Actually, I’ve calculated just such preliminary “value-added” shooting for the entire league (with pretty interesting results), but I’d like to see more accurate data before posting or basing any substantial analysis on it. Among other problems, I think the right side of the curve is overly generous, as it includes possessions where it took a while to get the clock started (a process that is, unfortunately, highly variable), or where time was added and the cause wasn’t scored (also disappointingly common).
  3. Examining this information can tell you some things about the league generally: For example, it’s interesting to me that there’s a noticeable dip right around where the most shots actually take place (14 to 16 seconds in). Though speculative, I suspect that this is when players are most likely to settle for mediocre 2 point jumpers. Similarly, though with a bit more difficulty, you can compare the actual curve with a derived curve to examine whether NBA players, on the whole, seem to wait too long (or not long enough) to pull the trigger.
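The comparison described in the first item can be sketched in a few lines. This is not the preliminary calculation mentioned in the second item, only an illustration of the idea: each shot is credited with its points minus the league-average points for possessions of the same length. The baseline values, player labels and shots below are invented.

```python
def value_added(shots, expected_by_elapsed):
    """Total points above expectation for each player.

    `shots` is an iterable of (player, elapsed_seconds, points).
    `expected_by_elapsed` maps elapsed seconds to league-average points.
    Shots with no baseline for their elapsed time are skipped.
    """
    totals = {}
    for player, elapsed, points in shots:
        baseline = expected_by_elapsed.get(elapsed)
        if baseline is None:
            continue
        totals[player] = totals.get(player, 0.0) + (points - baseline)
    return totals

# Hypothetical league baselines: early shots are worth more than late ones.
expected = {5: 1.1, 22: 0.7}

# Both players average 1.0 points per attempt, under very different pressure.
shots = [("A", 22, 2), ("A", 22, 0), ("B", 5, 2), ("B", 5, 0)]
added = value_added(shots, expected)
```

Player A comes out ahead of player B despite identical raw efficiency, which is exactly the kind of value a normal efficiency calculation would miss.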

With better data, the possibilities would open up further (even more so when combined with other play-by-play information, like shot type, position, defense, etc). For example, you could look at the curve for individual players and impute whether they should be more or less aggressive with their shot selection.

So, yeah, if any of you can direct me to a dataset that has what I want, please let me know.

Sports Geek Mecca: Recap and Thoughts, Part 2

This is part 2 of my “recap” of the Sloan Sports Analytics Conference that I attended in March (part 1 is here), mostly covering Day 2 of the event, but also featuring my petty way-too-long rant about Bill James (which I’ve moved to the end).

Day Two

First I attended the Football Analytics panel despite finding it disappointing last year, and, alas, it wasn’t any better. Eric Mangini must be the only former NFL coach willing to attend, b/c they keep bringing him back:

Overall, I spent more time in day 2 going to niche panels, research paper presentations and talking to people.

The last, in particular, was great. For example, I had a fun conversation with Henry Abbott about Kobe Bryant’s lack of “clutch.” This is one of Abbott’s pet issues, and I admit he makes a good case, particularly that the Lakers are net losers in “clutch” situations (yes, relative to other teams), even over the periods where they have been dominant otherwise.

Kobe is kind of a pivotal case in analytics, I think. First, I’m a big believer in “Count the Rings, Son” analysis: That is, leading a team to multiple championships is really hard, and only really great players do it. I also think he stands at a kind of nexus, in that stats like PER give spray shooters like him an unfair advantage, but more finely tuned advanced metrics probably over-punish the same. Part of the burden of Kobe’s role is that he has to take a lot of bad shots—the relevant question is how good he is at his job.

Abbott also mentioned that he liked one of my tweets, but didn’t know if he could retweet the non-family-friendly “WTF”:

I also had a fun conversation with Neil Paine of Basketball Reference. He seemed like a very smart guy, but this may be attributable to the fact that we seemed to be on the same page about so many things. Additionally, we discussed a very fun hypo: How far back in time would you have to go for the Charlotte Bobcats to be the odds-on favorites to win the NBA Championship?

As for the “sideshow” panels, they’re generally more fruitful and interesting than the ESPN-moderated super-panels, but they offer fewer easy targets for easy blog-griping. If you’re really interested in what went down, there is a ton of info at the SSAC website. The agenda can be found here. Information on the speakers is here. And, most importantly, videos of the various panels can be found here.

Box Score Rebooted

Featuring Dean Oliver, Bill James, and others.

This was a somewhat interesting, though I think slightly off-target, panel. They spent a lot of time talking about new data and metrics and pooh-poohing things like RBI (and even OPS), and the brave new world of play-by-play and video tracking, etc. But too much of this was about data at a different level of granularity, rather than about what can be improved at the current level. Or, in other words:

James acquitted himself a bit on this subject, arguing that boatloads of new data isn’t useful if it isn’t boiled down into useful metrics. But a more general way of looking at this is: If we were starting over from scratch, with a box-score-sized space to report a statistical game summary, and a similar degree of game-scoring resources, what kinds of things would we want to include (or not) that are different from what we have now?  I can think of a few:

  1. In basketball, it’s archaic that free-throws aren’t broken down into bonus free throws and shot-replacing free throws.
  2. In football, I’d like to see passing stats by down and distance, or at least in a few key categories like 3rd and long.
  3. In baseball, I’d like to see “runs relative to par” for pitchers (though this can be computed easily enough from existing box scores).

In this panel, Dean Oliver took the opportunity to plug ESPN’s bizarre proprietary Total Quarterback Rating. They actually had another panel devoted just to this topic, but I didn’t go, so I’ll put a couple of thoughts here.

First, I don’t understand why ESPN is pushing this as a proprietary stat. Sure, no one knows how to calculate regular old-fashioned quarterback ratings, but there’s a certain comfort in at least knowing it’s a real thing. It’s a bit like Terms of Service agreements, which people regularly sign without reading: at least you know the terms are out there, so someone actually cares enough to read them, and presumably they would raise a stink if you had to sign away your soul.

As for what we do know, I may write more on this come football season, but I have a couple of problems:

One, I hate the “clutch effect.” TQBR makes a special adjustment to value clutch performance even more than its generic contribution to winning. If anything, clutch situations in football are so bizarre that they should count less. In fact, when I’ve done NFL analysis, I’ve often just cut the 4th quarter entirely, and I’ve found I get better results. That may sound crazy, but it’s a bit like how some very advanced soccer analysts have cut goal-scoring from their models, instead just focusing on how well a player advances the ball toward his goal: even if the former matters more, its unreliability may make it less useful.

Two, I’m disappointed in the way they “assign credit” for play outcomes:

Division of credit is the next step. Dividing credit among teammates is one of the most difficult but important aspects of sports. Teammates rely upon each other and, as the cliché goes, a team might not be the sum of its parts. By dividing credit, we are forcing the parts to sum up to the team, understanding the limitations but knowing that it is the best way statistically for the rating.

I’m personally very interested in this topic (and have discussed it with various ESPN analytics guys since long before TQBR was released). This is basically an attempt to address the entanglement problem that permeates football statistics.  ESPN’s published explanation is pretty cryptic, and it didn’t seem clear to me whether they were profiling individual players and situations or had created credit-distribution algorithms league-wide.

At the conference, I had a chance to talk with their analytics guy who designed this part of the metric (his name escapes me), and I confirmed that they modeled credit distribution for the entire league and are applying it in a blanket way.  Technically, I guess this is a step in the right direction, but it’s purely a reduction of noise and doesn’t address the real issue.  What I’d really like to see is something like a recursive model that imputes how much credit various players deserve broadly, then uses those numbers to re-assign credit for particular outcomes (rinse and repeat).
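To make the "rinse and repeat" idea a little more concrete, here is a toy sketch in Python. To be clear, this is not ESPN's method, and it is only one of many ways such a model could be set up: the function name `iterate_credit`, the play format, and the numbers are all made up for illustration. It starts everyone with equal ratings, splits each play's value among its participants in proportion to those ratings, recomputes each rating as average credit per play, and repeats.

```python
from collections import defaultdict

def iterate_credit(plays, rounds=50):
    """Toy recursive credit model (illustrative assumption, not ESPN's method).

    plays: list of (value, [player names]) with non-negative values.
    Returns a dict mapping each player to an average-credit-per-play rating.
    """
    players = {p for _, names in plays for p in names}
    rating = {p: 1.0 for p in players}  # start with everyone rated equally
    for _ in range(rounds):
        total = defaultdict(float)
        count = defaultdict(int)
        for value, names in plays:
            weight_sum = sum(rating[p] for p in names)
            for p in names:
                if weight_sum > 0:
                    # split this play's value in proportion to current ratings
                    share = value * rating[p] / weight_sum
                else:
                    # nobody has earned any credit yet: split equally
                    share = value / len(names)
                total[p] += share
                count[p] += 1
        # re-impute each rating as average credit per play, then repeat
        rating = {p: total[p] / count[p] for p in players}
    return rating

# Hypothetical data: one QB, one play with receiver A, one with receiver B.
toy_plays = [(10.0, ["QB", "A"]), (2.0, ["QB", "B"])]
toy_ratings = iterate_credit(toy_plays)
```

In this toy case the receiver on the big play ends up with most of the credit, and the QB settles somewhere in between. With so little data, where the ratings settle may depend on the starting values, which is the entanglement problem in miniature.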

Deconstructing the Rebound With Optical Tracking Data

Rajiv Maheswaran, and other nerds.

This presentation was so awesome that I offered them a hedge bet for the “Best Research Paper” award. That is, I would bet on them at even money, so that if they lost, at least they would receive a consolation prize. They declined. And won. Their findings are too numerous and interesting to list, so you should really check it out for yourself.

Obviously my work on the Dennis Rodman mystery makes me particularly interested in their theories of why certain players get more rebounds than others, as I tweeted in this insta-hypothesis:

Following the presentation, I got the chance to talk with Rajiv for quite a while, which was amazing. Obviously they don’t have any data on Dennis Rodman directly, but Rajiv was also interested in him and had watched a lot of Rodman video. Though anecdotal, he did say that his observations somewhat confirmed the theory that a big part of Rodman’s rebounding advantage seemed to come from handling space very well:

  1. Even when away from the basket, Rodman typically moved to the open space immediately following a shot. This is a bit different from how people often think about rebounding as aggressively attacking the ball (or as being able to near-psychically predict where the ball is going to come down).
  2. Also, rather than simply attacking the board directly, Rodman’s first inclination was to insert himself between the nearest opponent and the basket. In theory, this might slightly decrease the chances of getting the ball when it heads in toward his previous position, but would make up for it by dramatically increasing his chances of getting the ball when it went toward the other guy.
  3. Though a little less purely strategic, Rajiv also thought that Rodman was just incredibly good at #2. That is, he was just exceptionally good at jockeying for position.

To some extent, I guess this is just rebounding fundamentals, but I still think it’s very interesting to think about the indirect probabilistic side of the rebounding game.

Live B.S. Report with Bill James

Quick tangent: At one point, I thought Neil Paine summed me up pretty well as a “contrarian to the contrarians.”  Of course, I don’t think I’m contrary for the sake of contrariness, or that I’m a negative person (I don’t know how many times I’ve explained to my wife that just because I hated a movie doesn’t mean I didn’t enjoy it!), it’s just that my mind is naturally inclined toward considering the limitations of whatever is put in front of it. Sometimes that means criticizing the status quo, and sometimes that means criticizing its critics.

So, with that in mind, I thought Bill James’s showing at the conference was pretty disappointing, particularly his interview with Bill Simmons.

I have a lot of respect for James.  I read his Historical Baseball Abstract and enjoyed it considerably more than Moneyball.  He has a very intuitive and logical mind. He doesn’t say a bunch of shit that’s not true, and he sees beyond the obvious. In Saturday’s “Rebooting the Box-score” panel, he made an observation that having 3 of 5 people on the panel named John implied that the panel was [likely] older than the rest of the room.  This got a nice laugh from the attendees, but I don’t think he was kidding.  And whether he was or not, he still gets 10 kudos from me for making the closest thing to a Bayesian argument I heard all weekend.  And I dutifully snuck in for a pic with him:

James was somewhat ahead of his time, and perhaps he’s still one of the better sports analytic minds out there, but in this interview we didn’t really get to hear him analyze anything, you know, sportsy. This interview was all about Bill James and his bio and how awesome he was and how great he is and how hard it was for him to get recognized and how much he has changed the game and how, without him, the world would be a cold, dark place where ignorance reigned and nobody had ever heard of “win maximization.”

Bill Simmons going this route in a podcast interview doesn’t surprise me: his audience is obviously much broader than the geeks in the room, and Simmons knows his audience’s expectations better than anyone. What got to me was James’s willingness to play along, and everyone else’s willingness to eat it up. Here’s an example of both, from the conference’s official Twitter account:

Perhaps it’s because I never really liked baseball, and I didn’t really know anyone did any of this stuff until recently, but I’m pretty certain that Bill James had virtually zero impact on my own development as a sports data-cruncher.  When I made my first PRABS-style basketball formula in the early 1990s (which was absolutely terrible, but is still more predictive than PER), I had no idea that any sports stats other than the box score even existed. By the time I first heard the word “sabermetrics,” I was deep into my own research, and didn’t bother looking into it seriously until maybe a few months ago.

Which is not to say I had no guidance or inspiration.  For me, a big epiphanous turning point in my approach to the analysis of games did take place—after I read David Sklansky’s Theory of Poker. While ToP itself was published in 1994, Sklansky’s similar offerings date back to the 70s, so I don’t think any broader causal pictures are possible.

More broadly, I think the claim that sports analytics wouldn’t have developed without Bill James is preposterous. Especially if, as I assume we do, we firmly believe we’re right.  This isn’t like L. Ron Hubbard and Incident II: being for sports analytics isn’t like having faith in a person or his religion. It simply means trying to think more rigorously about sports, and using all of the available analytical techniques we can to gain an advantage. Eventually, those who embrace the right will win out, as we’ve seen begin to happen in sports, and as has already happened in nearly every other discipline.

Indeed, by his own admission, James liked to stir controversy, piss people off, and talk down to the old guard whenever possible. As far as we know, he may have set the cause of sports analytics back, either by alienating the people who could have helped it gain acceptance, or by setting an arrogant and confrontational tone for his disciples (e.g., the uplifting “don’t feel the need to explain yourself” message in Moneyball). I’m not saying that this is the case or even a likely possibility, I’m just trying to illustrate that giving someone credit for all that follows—even a pioneer like James—is a dicey game that I’d rather not participate in, and that he definitely shouldn’t.

On a more technical note, one of his oft-quoted and re-tweeted pearls of wisdom goes as follows:

Sounds great, right? I mean, not really, I don’t get the metaphor: if the sea is full of ignorance, why are you collecting water from it with a bucket rather than some kind of filtration system? But more importantly, his argument in defense of this claim is amazingly weak. When Simmons asked what kinds of things he’s talking about, he repeatedly emphasized that we have no idea whether a college sophomore will turn out to be a great Major League pitcher.  True, but, um, we never will. There are too many variables, the inputs and outputs are too far apart in time, and the contexts are too different.  This isn’t the sea of ignorance, it’s a sea of unknowns.

Which gets at one of my big complaints about stats-types generally.  A lot of people seem to think that stats are all about making exciting discoveries and answering questions that were previously unanswerable. Yes, sometimes you get lucky and uncover some relationship that leads to a killer new strategy or to some game-altering new dynamic. But most of the time, you’ll find static. A good statistical thinker doesn’t try to reject the static, but tries to understand it: Figuring out what you can’t know is just as important as figuring out what you can know.

On Twitter I used this analogy:

Success comes with knowing more true things and fewer false things than the other guy.

The Case Against the Case for Dennis Rodman: Initial Volleys

When I began writing about Dennis Rodman, I was so terrified that I would miss something and the whole argument would come crashing down that I kept pushing it further and further and further, until a piece I initially planned to be about 10 pages of material ended up being more like 150. [BTW, this whole post may be a bit too inside-baseball if you haven’t actually read—or at least skimmed—my original “Case for Dennis Rodman.” If so, that link has a helpful guide.]

The downside of this, I assumed, is that the extra material should open up many angles of attack. It was a conscious trade-off, knowing that individual parts in the argument would be more vulnerable, but the Case as a whole would be thorough and redundant enough to survive any battles I might end up losing.

Ultimately, however, I’ve been a bit disappointed in the critical response. Most reactions I’ve seen have been either extremely complimentary or extremely dismissive.

So a while ago, I decided that if no one really wanted to take on the task, I would do it myself. In one of the Rodman posts, I wrote:

Give me an academic who creates an interesting and meaningful model, and then immediately devotes their best efforts to tearing it apart!

And thus The Case Against the Case for Dennis Rodman is born.

Before starting, here are a few qualifying points:

  1. I’m not a lawyer, so I have no intention of arguing things I don’t believe. I’m calling this “The Case Against the Case For Dennis Rodman,” because I cannot in good faith (barring some new evidence or argument I am as yet unfamiliar with) write The Case Against Dennis Rodman.
  2. Similarly, where I think an argument is worth being raised and discussed but ultimately fails, I will make the defense immediately (much like “Objections and Replies”).
  3. I don’t have an over-arching anti-Case hypothesis to prove, so don’t expect this series to be a systematic takedown of the entire enterprise. Rather, I will point out weaknesses as I consider them, so they may not come in any kind of predictable order.
  4. If you were paying attention, of course you noticed that The Case For Dennis Rodman was really (or at least concurrently) about demonstrating how player valuation is much more dynamic and complicated than either conventional or unconventional wisdom gives it credit for. But, for now, The Case Against the Case will focus mainly on the Dennis Rodman part.

Ok, so with this mission in mind, let me start with a bit of what’s out there already:

A Not-Completely-Stupid Forum Discussion

I admit, I spend a fair amount of time following back links to my blog. Some of that is just ego-surfing, but I’m also desperate to find worthy counter-arguments.

As I said above, that search is sometimes more fruitless than I would like. Even the more intelligent discussions usually include a lot of uninspired drivel. For example, let’s look at a recent thread on RealGM. After one person lays out a decent (though imperfect) summary of my argument, there are several responses along the lines of poster “SVictor”s:

I won’t pay attention to any study that states that [Rodman might be more valuable than Michael Jordan].

Actually, I’m pretty sympathetic to this kind of objection. There can be a Bayesian ring of truth to “that is just absurd on its face” arguments (I once made a similar argument against an advanced NFL stat after it claimed Neil O’Donnell was the best QB in football). However, it’s not really a counter-argument, it’s more a meta-argument, and I think I’ve considered most of those to death. Besides, I don’t actually make the claim in question, I merely suggest it as something worth considering.

A much more detailed and interesting response comes from poster “mysticbb.” Now, he starts out pretty insultingly:

The argumentation is biased, it is pretty obvious, which makes it really sad, because I know how much effort someone has to put into such analysis.

I cannot say affirmatively that I have no biases, or that bias never affects my work. Study after study shows that this is virtually impossible. But I can say that I am completely and fundamentally committed to identifying it and stamping it out wherever I can. So, please—as I asked in my conclusion—please point out where the bias is evident and I will do everything in my power to fix it.

Oddly, though, mysticbb seems to endorse (almost verbatim) the proposition that I set out to prove:

Let me start with saying that Dennis Rodman seems to be underrated by a lot of people. He was a great player and deserved to be in the HOF, I have no doubt about that. He had great impact on the game and really improved his team while playing.

(People get so easily distracted: You write one article about a role-player maybe being better than Michael Jordan, and they forget that your overall claim is more modest.)

Of course, my analysis could just be way off, particularly in ways that favor Rodman. To that end, mysticbb raises several valid points, though with various degrees of significance.

Here he is on Rodman’s rebounding:

Let me start with the rebounding aspect. From 1991 to 1998 Rodman was leading the league in TRB% in each season. He had 17.7 ORB%, 33 DRB% and overall 25.4 TRB%. Those are AWESOME numbers, if we ignore context. Let us take a look at the numbers for the playoffs during the same timespan: 15.9 ORB%, 27.6 DRB% and 21.6 TRB%. Still great numbers, but obviously clearly worse than his regular season numbers. Why? Well, Rodman had the tendency to pad his rebounding stats in the regular season against weaker teams, while ignoring defensive assignments and fighting his teammates for rebounds. All that was eliminated during the playoffs and his numbers took a hit.

Now, I don’t know how much I talked about the playoffs per se, but I definitely discussed—and even argued myself—that Rodman’s rebounding numbers are likely inflated. But I also argued that if that IS the case, it probably means Rodman was even more valuable overall (see that same link for more detail). He continues:

Especially when we look at the defensive rebounding part, during the regular season he is clearly ahead of Duncan or Garnett, but in the playoffs they are all basically tied. Now imagine, Rodman brings his value via rebounding, what does that say about him, if that value is matched by players like Duncan or Garnett who both are also great defenders and obviously clearly better offensive players?

Now, as I noted at the outset, Rodman’s career offensive rebounding percentage is approximately equal to Kevin Garnett’s career overall rebounding percentage, so I think Mystic is making a false equivalency based on a few cherry-picked stats.

But, for a moment, let’s assume it were true that Garnett and Duncan had rebounding numbers similar to Rodman’s. So what? Rodman’s crazy rebounding numbers cohere nicely with the rest of the puzzle as an explanation of why he was so valuable—his absurd rebounding stats make his absurd impact stats more plausible and vice versa—but they’re technically incidental. Indeed, they’re even incidental to his rebounding contribution: The number (or even percent) of rebounds a player gets does not correlate very strongly with the number of rebounds he has actually added to his team (nor does a player’s offensive “production” correlate very strongly with improvement in a team’s offense), and it does so the most on the extremes.

But I give the objection credit in this regard: The playoff/regular season disparity in Rodman’s rebounding numbers (though let’s not overstate the case, Rodman has 3 of the top 4 TRB%’s in playoff history) does serve to highlight how dynamic basketball statistics are. The original Case For Dennis Rodman is perhaps too willing to draw straight causal lines, and that may be worth looking into. Also, a more thorough examination of Rodman’s playoff performance may be in order as well.
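For anyone unfamiliar with the percentages being thrown around here, this is a minimal sketch of how a rebounding percentage is typically computed, following the standard Basketball-Reference-style definition: the share of available rebounds a player grabbed while on the floor. The function name and the sample numbers are made up for illustration, not taken from Rodman's actual box scores.

```python
def rebound_pct(player_reb, player_min, team_min, available_reb):
    """Estimated percentage of available rebounds a player grabbed while on the floor.

    For TRB%, available_reb is team rebounds plus opponent rebounds.
    For ORB%, use team offensive plus opponent defensive rebounds.
    For DRB%, use team defensive plus opponent offensive rebounds.
    team_min is total team minutes (240 for a regulation game).
    """
    # team_min / 5 is the time one lineup slot spends on the floor
    return 100.0 * (player_reb * (team_min / 5)) / (player_min * available_reb)

# Made-up single-game line: 15 rebounds in 36 minutes, 90 total rebounds available.
example_trb_pct = rebound_pct(15, 36, 240, 90)
```

Because the stat is normalized by minutes and by rebounds available, the same raw rebound total can translate into quite different percentages depending on pace and playing time, which is part of why regular season and playoff numbers can diverge.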

On the indirect side of The Case, mysticbb has this to say:

[T]he high difference between the team performance in games with Rodman and without Rodman is also caused by a difference in terms of strength of schedule, HCA and other injured players.

I definitely agree that my crude calculation of Win % differentials does not control for a number of things that could be giving Rodman, or any other player, a boost. Controlling for some of these things is probably possible, if more difficult than you might think. This is certainly an area where I would like to implement some more robust comparison methods (and I’m slowly working on it).

But, ultimately, all of the factors mysticbb mentions are noise. Circumstances vary and lots of things happen when players miss games, and there are a lot of players and a lot of circumstances in the sample that Rodman is compared to: everyone has a chance to get lucky. That chance is reflected in my statistical significance calculations.
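As a rough illustration of what a with/without comparison and its significance check can look like, here is a minimal sketch using a plain two-proportion z-test. This is an assumption for illustration only: it is not necessarily the calculation used in the original series, the function name is hypothetical, and the win/loss numbers in the example are invented.

```python
import math

def win_pct_differential(wins_in, games_in, wins_out, games_out):
    """Win% with a player minus win% without him, plus a two-proportion z-test.

    Returns (differential, z_score, two_sided_p_value).
    Assumes the pooled win% is strictly between 0 and 1.
    """
    p_in = wins_in / games_in
    p_out = wins_out / games_out
    diff = p_in - p_out
    # pooled win% under the null hypothesis that the player makes no difference
    pooled = (wins_in + wins_out) / (games_in + games_out)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_in + 1 / games_out))
    z = diff / se
    # two-sided p-value from the normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value

# Invented example: 60-20 with the player in the lineup, 10-10 without him.
diff, z, p_value = win_pct_differential(60, 80, 10, 20)
```

A test like this treats schedule strength, home court, and other injuries as noise rather than controlling for them, so it only says how surprising the gap would be if luck were the whole story.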

Mysticbb makes some assertions about Rodman having a particularly favorable schedule, but cites only the 1997 Bulls, and it’s pretty thin gruel:

If we look at the 12 games with Kukoc instead of Rodman we are getting 11.0 SRS. So, Rodman over Kukoc made about 0.5 points.

Of course, if there is evidence that Rodman was especially lucky over his career, I would like to see it. But, hmm, since I’m working on the Case Against myself, I guess that’s my responsibility as well. Fair enough, I’ll look into it.

Finally, mysticbb argues:

The last point which needs to be considered is the offcourt issues Rodman caused, which effected the outcome of games. Take the 1995 Spurs for example, when Rodman refused to guard Horry on the perimeter leading to multiple open 3pt shots for Horry including the later neck-breaker in game 6. The Spurs one year later without Rodman played as good as in 1995 with him.

I don’t really have much to say on the first part of this. As I noted at the outset, there’s some chance that Rodman caused problems on his team, but I feel completely incompetent to judge that sort of thing. But the other part is interesting: It’s true that the Spurs were only 5% worse in 95-96 than they were in 94-95 (OFC, they would be worse measuring only against games Rodman played in), but cross-season comparisons are obviously tricky, for a number of reasons. And even if such comparisons were reliable, I’m not sure they would break the way suggested. For example, the 2nd Bulls 3-peat teams were about as much better than the first Bulls 3-peat as the first Bulls 3-peat was better than the 93-95 teams that were sans Michael Jordan.

That said, I actually do find multi-season comparisons to be a valid area for exploration. So, e.g., I’ve spent some time looking at rookie impact and how predictive it is of future success (answer: probably more than you think).

Finally, a poster named “parapooper” makes some points that he credits to me, including:

He also admits that Rodman actually has a big advantage in this calculation because he missed probably more games than any other player due to reasons other than health and age.

I don’t actually remember making this point, at least this explicitly, but it is a valid concern IMO. A lot of the In/Out numbers my system generated include seasons where players were old or infirm, which disadvantages them. In fact, I initially tried to excise these seasons, and tried accounting for them in a variety of ways, such as comparing “best periods” to “best periods”, etc. But I found such attempts to be pretty unwieldy and arbitrary, and they shrank the sample size more than I thought they were worth, without affecting the bottom line: Rodman just comes out on top of a smaller pile. That said, some advantage to Rodman relative to others must exist, and quantifying that advantage is a worthy goal.

A similar problem that “para” didn’t mention specifically is that a number of the in/out periods for players include spots where the player was traded. In subsequent analysis, I’ve confirmed what common sense would probably indicate: A player’s differential stats in trade scenarios are much less reliable. Future versions of the differential comparison should account for this, one way or another.

The differential analysis in the series does seem to be the area that most needs upgrading, though the constant trade-off between more information and higher quality information means it will never be as conclusive as we might want it to be. Not mentioned in this thread (that I saw), but what I will certainly deal with myself, are broader objections to the differential comparisons as an enterprise. So, you know. Stay tuned.