The Case for Dennis Rodman, Part 4/4(a): All-Hall?

First of all, congrats to Dennis for his well-deserved selection as a 2011 Hall of Fame inductee—of course, I take full credit.  But seriously, when the finalists were announced, I immediately suspected that he would make the cut, mostly for two reasons:

  1. Making the finalists this year after failing to make the semi-finalists last year made it more likely that last year’s snub really was more about eligibility concerns than general antipathy or lack of respect toward him as a player.
  2. The list of co-finalists was very favorable.  First, Reggie Miller not making the list was a boon, as he could have taken the “best player” spot, and Rodman would have lacked the goodwill to make it as one of the “overdue”—without Reggie, Rodman was clearly the most accomplished name in the field.  Second, Chris Mullin being available to take the “overdue” spot was the proverbial “spoonful of sugar” that allowed the bad medicine of Rodman’s selection to go down.

Congrats also to Artis Gilmore and Arvydas Sabonis.  In my historical research, Gilmore’s name has repeatedly popped up as an excellent player, both by conventional measures (11-time All-Star, 1x ABA Champion, 1x ABA MVP, led league in FG% 7 times), and advanced statistical ones (NBA career leader in True Shooting %, ABA career leader in Win Shares and Win Shares/48, and a great all-around rebounder).  It was actually only a few months ago that I first discovered—to my shock—that he was NOT in the Hall [Note to self: cancel plans for “The Case for Artis Gilmore”].  Sabonis was an excellent international player with a 20+ year career that included leading the U.S.S.R. to an Olympic gold medal and winning 8 European POY awards.  I remember following him closely when he finally came to the NBA, and during his too-brief stint, he was one of the great per-minute contributors in the league (though obviously I’m not a fan of the stat, his PER over his first 5 seasons—which came at ages 31 to 35—was 21.7, which would place him around 30th in NBA history).  Though his sample size was too small to qualify for my study, his adjusted win percentage differential over his NBA career was a very respectable 9.95%, despite only averaging 24 minutes per game.

I was hesitant to publish Part 4 of this series before knowing whether Rodman made the Hall or not, as obviously the results shape the appropriate scope for my final arguments. So by necessity, this section has changed dramatically from what I initially intended.  But I am glad I waited, as this gives me the opportunity to push the envelope of the analysis a little bit:  Rather than simply wrapping up the argument for Rodman’s Hall-of-Fame candidacy, I’m going to consider some more ambitious ideas.  Specifically, I will articulate two plausible arguments that Rodman may have been even more valuable than my analysis so far has suggested.  The first of these is below, and the second—which is the most ambitious, and possibly the most shocking—will be published Monday morning in the final post of this series.

Introduction

I am aware that I’ve picked up a few readers since joining “the world’s finest quantitative analysts of basketball” in ESPN’s TrueHoop Stat Geek Smackdown.  If you’re new, the main things you need to know about this series are that it’s 1) extremely long (sprawling over 13 sections in 4 parts, plus a Graph of the Day), 2) ridiculously (almost comically) detailed, and 3) only partly about Dennis Rodman.  It’s also a convenient vehicle for me to present some of my original research and criticism about basketball analysis.

Obviously, the series includes a lot of superficially complicated statistics, though if you’re willing to plow through it all, I try to highlight the upshots as much as possible.  But there is a lot going on, so to help new and old readers alike, I have a newly-updated “Rodman Series Guide,” which includes a broken-down list of articles, a sampling of some of the most important graphs and visuals, and as of now, a giant new table summarizing the entire series by post, including the main points on both sides of the analysis.  It’s too long to embed here, but it looks kind of like this:

summary

As I’ve said repeatedly, this blog isn’t just called “Skeptical” Sports because the name was available: When it comes to sports analysis—from the mundane to the cutting edge—I’m a skeptic.  People make interesting observations, perform detailed research, and make largely compelling arguments—which is all valuable.  The problems begin when they start believing too strongly in their results: they defend and “develop” their ideas and positions with an air of certainty far beyond what is objectively, empirically, or logically justified.

With that said, and being completely honest, I think The Case For Dennis Rodman is practically overkill.  As a skeptic, I try to keep my ideas in their proper context: There are plausible hypotheses, speculative ideas, interim explanations requiring additional investigation, claims supported by varying degrees of analytical research, propositions that have been confirmed by multiple independent approaches, and the things I believe so thoroughly that I’m willing to write 13-part series to prove them.  That Rodman was a great rebounder, that he was an extremely valuable player, even that he was easily Hall-of-Fame caliber—these propositions all fall into the latter category: they require a certain amount of thoughtful digging, but beyond that they practically prove themselves.

Yet, surely, there must be a whole realm of informed analysis to be done that is probative and compelling but which might fall short of the rigorous standards of “true knowledge.”  As a skeptic, there are very few things I would bet my life on, but as a gambler—even a skeptical one—there are a much greater number of things I would bet my money on.  So as my final act in this production, I’d like to present a couple of interesting arguments for Rodman’s greatness that are both a bit more extreme and a bit more speculative than those that have come before.  Fortunately, I don’t think it makes them any less important, or any less captivating:


The Case for Dennis Rodman, Part 3/4(d)—Endgame: Statistical Significance

The many histograms in sections (a)-(c) of Part 3 reflect fantastic p-values (probability that the outcome occurred by chance) for Dennis Rodman’s win percentage differentials relative to other players, but, technically, this doesn’t say anything about the p-values of each metric in itself.  What this means is, while we have confidently established that Rodman didn’t just get lucky in putting up better numbers than his peers, we haven’t yet established the extent to which his being one of the best players by this measure actually proves his value.  This is probably a minor distinction to all but my most nitpicky readers, but it is exactly one of those nagging “little insignificant details” that ends up being a key to the entire mystery.

The Technical Part (Feel Free to Skip)

The challenge here is this: My preferred method for rating the usefulness and reliability of various statistics is to see how accurate they are at predicting win differentials.  But, now, the statistic I would like to test actually is win differential.  The problem, of course, is that a player’s win differential is always going to be exactly identical to his win differential. If you’re familiar with the halting problem or Gödel’s incompleteness theorem, you can probably see why this isn’t directly solvable: that is, I probably can’t design a metric for evaluating metrics that is capable of evaluating itself.

To work around this, our first step must be to independently assess the reliability of win predictions that are based on our inputs.  As in sections (b) and (c), we should be able to do this on a team-by-team basis and adapt the results for player-by-player use.  Specifically, what we need to know is the error distribution for the outcome-predicting equation—but this raises its own problems.

Normally, to get an error distribution of a predictive model, you just run the model a bunch of times and then measure the predicted results versus the actual results (calculating your average error, standard deviation, correlation, whatever).  But, because my regression was to individual games, the error distribution gets “black-boxed” into the single-game win probability.

[A brief tangent: “Black box” is a term I use to refer to situations where the variance of your input elements gets sucked into the win percentage of a single outcome.  E.g., in the NFL, when a coach must decide whether to punt or go for it on 4th down late in a game, his decision one way or the other may be described as “cautious” or “risky” or “gambling” or “conservative.”  But these descriptions are utterly vapid: with respect to winning, there is no such thing as a play that is more or less “risky” than any other—there are only plays that improve your chances of winning and plays that hurt them.  One play may seem like a bigger “gamble,” because there is a larger immediate disparity between its possible outcomes, but a 60% chance of winning is a 60% chance of winning.  Whether your chances come from superficially “risky” plays or superficially “cautious” ones, outside the “black box” of the game, they are equally volatile.]

For our purposes, what this means is that we need to choose something else to predict: specifically, something that will have an accurate and measurable error distribution.  Thus, instead of using data from 81 games to predict the probability of winning one game, I decided to use data from 41 season-games to predict a team’s winning percentage in its other 41 games.

To do this, I split every team season since 1986 in half randomly, 10 times each, leading to a dataset of 6000ish randomly-generated half-season pairs.  I then ran a logistic regression from each half to the other, using team winning percentage and team margin of victory as the input variables and games won as the output variable.  I then measured the distribution of those outcomes, which gives us a baseline standard deviation for our predicted wins metric for a 41 game sample.
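For the curious, the splitting step can be sketched in a few lines of code.  This is a rough illustration using fake point margins in place of real game data—`split_half_seasons` is a hypothetical helper, not anything from the actual analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

def split_half_seasons(game_margins, n_splits=10):
    """Randomly split one team-season's games into two 41-game halves,
    n_splits times. Returns (predictor_half, outcome_half) index pairs."""
    n = len(game_margins)
    pairs = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        pairs.append((idx[: n // 2], idx[n // 2:]))
    return pairs

# Example: one 82-game season of point margins (fake data)
margins = rng.normal(3.0, 12.0, size=82)
pairs = split_half_seasons(margins)
a, b = pairs[0]

# Inputs from one half: win% and average margin of victory...
win_pct = (margins[a] > 0).mean()
mov = margins[a].mean()
# ...predicting the output from the other half: games won
games_won = int((margins[b] > 0).sum())
```

Running this over every team-season since 1986, ten splits each, is what produces the 6000ish half-season pairs described above.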

Next, as I discussed briefly in section (b), we can adapt the distribution to other sample sizes, so long as everything is distributed normally (which, at every point along the way so far, it has been).  This is a feature of the normal distribution: it is easy to predict the error distribution of larger and smaller datasets—your standard deviation will be directly proportional to the square root of the ratio of the new sample size to the original sample size.

Since I measured the original standard deviations in games, I converted each player’s “Qualifying Minutes” into “Qualifying Games” by dividing by 36.  So the sample-size-adjusted standard deviation is calculated like this:

=[41GmStDev]*SQRT([PlQualGames]/41)

Since the metrics we’re testing are all in percentages, we then divide the new standard deviation by the size of the sample, like so:

=([41GmStDev]*SQRT([PlQualGames]/41))/[PlQualGames]

This gives us a standard deviation for actual vs. predicted winning percentages for any sample size.  Whew!
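As a sanity check, here is the same two-step calculation as a small Python function (the variable names are hypothetical, just mirroring the Excel fields above):

```python
import math

def adjusted_stdev(base_stdev_41gm, qualifying_minutes):
    """Scale the 41-game standard deviation (measured in games) to a
    player's sample size, then convert it to a percentage-scale stdev."""
    qualifying_games = qualifying_minutes / 36.0  # minutes -> "games"
    # stdev scales with the square root of the sample-size ratio
    stdev_games = base_stdev_41gm * math.sqrt(qualifying_games / 41.0)
    # divide by the sample size to put it on the same scale as Win%
    return stdev_games / qualifying_games

# e.g. a hypothetical 41-game stdev of 2.5 games and 3600 qualifying minutes
sd = adjusted_stdev(2.5, 3600)
```

Note that the combined effect is that the percentage-scale standard deviation shrinks as a player’s qualifying minutes grow—exactly the behavior we want from a sample-size adjustment.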

The Good, Better, and Best Part

The good news is: now that we can generate standard deviations for each player’s win differentials, this allows us to calculate p-values for each metric, which allows us to finally address the big questions head on: How likely is it that this player’s performance was due to chance?  Or, put another way: How much evidence is there that this player had a significant impact on winning?
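That question maps onto a simple one-sided test against a zero-impact null.  Here is a sketch—the 10% differential and 4% standard deviation are placeholder numbers for illustration, not any particular player’s figures:

```python
import math

def z_and_p(win_diff, stdev):
    """Z-score and one-sided p-value for the null hypothesis that the
    player's true impact on winning is zero."""
    z = win_diff / stdev
    # survival function of the standard normal, via the error function
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

# e.g. a hypothetical 10% win differential with a 4% standard deviation
z, p = z_and_p(0.10, 0.04)  # z = 2.5, p ~ 0.006
```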

The better news is: since our standard deviations are adjusted for sample size, we can greatly increase the size of the comparison pool, because players with smaller samples are “punished” accordingly.  Thus, I dropped the 3-season requirement and the total minutes requirement entirely.  The only remaining filters are that the player missed at least 15 games for each season in which a differential is computed, and that the player averaged at least 15 minutes per game played in those seasons.  The new dataset now includes 1539 players.

Normally I don’t weight individual qualifying seasons when computing career differentials for qualifying players, because the weights are an evidentiary matter rather than an impact matter: when it comes to estimating a player’s impact, conceptually I think a player’s effect on team performance should be averaged across circumstances equally.  But this comparison isn’t about whose stats indicate the most skill, but whose stats make for the best evidence of positive contribution.  Thus, I’ve weighted each season (by the smaller of games missed or played) before making the relevant calculations.

So without further ado, here are Dennis Rodman’s statistical significance scores for the 4 versions of Win % differential, as well as where he ranks against the other players in our comparison pool:

image_thumb7

Note: I’ve posted a complete table of z scores and p values for all 1539 players on the site.  Note also that due to the weighting, some of the individual differential stats will be slightly different from their previous values.

You should be careful to understand the difference between this table of p-values and ranks vs. similar ones from earlier sections.  In those tables, the p-value was determined by Rodman’s relative position in the pool, so the p-value and rank basically represented the same thing.  In this case, the p-value is based on the expected error in the results.  Specifically, they are the answer to the question “If Dennis Rodman actually had zero impact, how likely would it be for him to have posted these differentials over a sample of this size?”  The “rank” is then where his answer ranks among the answers to the same question for the other 1538 players.  Depending on your favorite flavor of win differential, Rodman ranks anywhere from 1st to 8th.  His average rank among those is 3.5, which is 2nd only to Shaquille O’Neal (whose differentials are smaller but whose sample is much larger).

Of course, my preference is for the combined/adjusted stat.  So here is my final histogram:

image5_thumb

Note: N=1539.

Now, to be completely clear, as I addressed in Part 3(a) and 2(b), so that I don’t get flamed (or stabbed, poisoned, shot, beaten, shot again, mutilated, drowned, and burned—metaphorically): Yes, actually I AM saying that, when it comes to empirical evidence based on win differentials, Rodman IS superior to Michael Jordan.  This doesn’t mean he was the better player: for that, we can speculate, watch the tape, or analyze other sources of statistical evidence all day long.  But for this source of information, in the final reckoning, win differentials provide more evidence of Dennis Rodman’s value than they do of Michael Jordan’s.

The best news is: That’s it.  This is game, set, and match.  If the 5 championships, the ridiculous rebounding stats, the deconstructed margin of victory, etc., aren’t enough to convince you, this should be:  Looking at Win% and MOV differentials over the past 25 years, when we examine which players have the strongest, most reliable evidence that they were substantial contributors to their teams’ ability to win more basketball games, Dennis Rodman is among the tiny handful of players at the very very top.

The Case for Dennis Rodman, Part 3/4(c)—Beyond Margin of Victory

In the conventional wisdom, winning is probably overrated.  The problem ultimately boils down to information quality: You only get one win or loss per game, so in the short run, great teams, mediocre teams, or teams that just get lucky can all achieve the same results.  Margin of victory, on the other hand, has a whole range of possible outcomes that, while imperfectly descriptive of the bottom line, correlate strongly with team strength.  You can think about it like sample size: a team’s margin of victory over a handful of games gives you a lot more data to work with than their won-loss record.  Thus, particularly when the number of games you draw your data from is small, MOV tends to be more probative.

Long ago, the analytic community recognized this fact, and has moved en masse to MOV (and its ilk) as the main element in their predictive statistics.  John Hollinger, for example, uses margin exclusively in his team power ratings—completely ignoring winning percentage—and these ratings are subsequently used for his playoff prediction odds, etc.  Note, Hollinger’s model has a lot of baffling components, like heavily weighting a team’s performance in their last 10 games (or later, their last 25% of games), when there is no statistical evidence that L10 is any more predictive than the first 10 (or any other 10).  But this choice is of particular interest, as it is indicative of an almost uniform tendency among analysts to substitute MOV-style stats for winning percentage entirely.

This is both logically and empirically a mistake.  As your sample size grows, winning percentage becomes more and more valuable.  The reason for this is simple:  Winning percentage is perfectly accurate—that is, it perfectly reflects what it is that we want to know—but has extremely high variance, while MOV is an imperfect proxy, whose usefulness stems primarily from its much lower variance.  As sample sizes increase, the variance for MOV decreases towards 0 (which happens relatively quickly), but the gap between what it measures and what we want to know will persist in perpetuity.  Thus, after a certain point, the “error” in MOV remains effectively constant, while the “error” in winning percentage continuously decreases.  To get a simple intuitive sense of this, imagine the extremes:  after 5 games, clearly you will have more faith in a team that has won 2 but has a MOV of +10 over a team that has won 3 but has a MOV of +1.  But now imagine 1000 games with the same MOV’s and winning percentages: one team has won 400 and the other has won 600.  If you had to place money on one of the two teams to win their next game, you would be a fool to favor the first.  But beyond the intuitive point, this is essentially an empirical matter: with sufficient data, we should be able to establish the relative importance of each for any given sample-size.
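One piece of that intuition is easy to formalize: the sampling error of an observed winning percentage shrinks as one over the square root of the number of games, so Win%’s “high variance” problem evaporates with enough games.  A minimal illustration (not part of the original analysis):

```python
import math

def win_pct_stderr(true_p, n_games):
    """Sampling error (binomial) of a team's observed Win% after n games."""
    return math.sqrt(true_p * (1.0 - true_p) / n_games)

# a true .600 team: a noisy record after 5 games, a precise one after 1000
e5 = win_pct_stderr(0.6, 5)        # ~0.22
e1000 = win_pct_stderr(0.6, 1000)  # ~0.015
```

After 1000 games, the observed record pins down the team’s true strength to within a couple of percentage points—while MOV’s built-in gap between what it measures and what we want to know never shrinks at all.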

So for this post, I’ve employed the same method that I used in section (b) to create our MOV-> Win% formula (logistic regression for all 55,000+ team games since 1986), except this time I included both Win % and MOV (over the team’s other 81 games) as the predictive variables.  Here, first, are the coefficients and corresponding p-values (probability that the variable is not significant):

image_thumb20

It is thus empirically incontrovertible that, even with an 81-game predictive sample, both MOV and Win% are statistically significant predictive factors.  Also, for those who don’t eat logistic regression outputs for breakfast, I should be perfectly clear what this means: It doesn’t just mean that both W% and MOV are good at predicting W%—this is trivially true—it means that, even when you have one, using the other as well will make your predictions substantially better.  To be specific, here is the formula that you would use to predict a team’s winning percentage based on these two variables:

\large{PredictedWin\% = \dfrac{1}{1+e^{-(1.43wp+.081mv-.721)}}}

Note: Again, e is Euler’s number, or ~2.72.  wp is the variable for winning % over the other 81 games, and mv is the variable for Margin of Victory over the other 81 games.

And again, for your home-viewing enjoyment, here is the corresponding Excel formula:

=1/(1+EXP(-(1.43*[W%]+.081*[MOV]-.721)))

Finally, in order to visualize the relative importance of each variable, we can look at their standardized coefficients (shown here with 95% confidence bars):

image12_thumb1

Note: Standardized coefficients, again, are basically a unit of measurement for comparing the importance of things that come in different shapes and sizes.

For an 81-game sample (which is about as large of a consistent sample as you can get in the NBA), Win% is about 60% as important as MOV when it comes to predicting outcomes.  At the risk of sounding redundant, I need to make this extremely clear again: this does NOT mean that Win% is 60% as good at predicting outcomes as margin of victory (actually, it’s more like 98% as good at that)—it means that, when making your ideal prediction, which incorporates both variables, Win % gets 60% as much weight as MOV (as an aside, I should also note that the importance of MOV drops virtually to zero when it comes to predicting playoff outcomes, largely—though not entirely—because of home court advantage).

This may not sound like much, but I think it’s a pretty significant result:  At the end of the day, this proves that there IS a skill to winning games independent of the rates at which you score and allow points.  This is a non-obvious outcome that is almost entirely dismissed by the analytical community.  If NBA games were a random walk based on possession-to-possession reciprocal advantages, this would not be the case at all.

Now, note that this is formally the same as the scenario discussed in section (b): We want to predict winning percentages, but using MOV alone leaves a certain amount of error.  What this regression proves is that this error can be reduced by incorporating win percentage into our predictions as well.  So consider this proof-positive that X-factors are predictively valuable.  Since the predictive power of Win% and MOV should be equivalent no matter their source, we can now use this regression to make more accurate predictions about each player’s true impact.

Adapting this equation for individual player use is simple enough, though slightly different from before:  Before entering the player’s Win% differential, we have to convert it into a raw win percentage, by adding .5.  So, for example, if a player’s W% differential were 21.6%, we would enter 71.6%.  Then, when a number comes out the other side, we can convert it back into a predicted differential by subtracting .5, etc.
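Here is that conversion wired into the regression formula from above.  This is just a sketch; the pairing of the 21.6% Win% differential with a 3.78 MOV differential is used purely as a worked example:

```python
import math

# coefficients from the two-variable logistic regression above
B_WP, B_MOV, INTERCEPT = 1.43, 0.081, -0.721

def predicted_win_diff(wp_diff, mov_diff):
    """Convert a Win% differential to a raw Win% (+.5), run it through
    the regression, then convert back to a differential (-.5)."""
    wp = wp_diff + 0.5
    x = B_WP * wp + B_MOV * mov_diff + INTERCEPT
    return 1.0 / (1.0 + math.exp(-x)) - 0.5

# e.g. the 21.6% differential from the text, with a 3.78 MOV differential
diff = predicted_win_diff(0.216, 3.78)  # ~0.148, i.e. about 14.8%
```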

Using this method, Rodman’s predicted win differential comes out to 14.8%.  Here is the new histogram:
image_thumb16

Note: N is still 470.

This histogram is also weighted by the sample size for each player (meaning that a player with 100 games worth of qualifying minutes counts as 100 identical examples in a much larger dataset, etc.).  I did this to get the most accurate distribution numbers to compute P values (which, in this case, work much like a percentile) for individual players.  Here is a summary of the major factors for Dennis Rodman:

image_thumb13

For comparison, I’ve also listed the percentage of eligible players that match the qualifying thresholds of my dataset (minus the games missed) who are in the Hall of Fame.  Specifically, that is, those players who retired in 2004 or earlier and who have at least 3 seasons since 1986 with at least 15 games played in which they averaged at least 15 minutes per game.  This gives us a list of 462 players, of which 23 are presently IN the Hall. The difference in average skill between that set of players and the differential set is minimal, and the reddish box on the histogram above surrounds the top 5% of predicted Win% differentials in our main data.

While we’re at it, let’s check in on the list of “select” players we first saw in section (a) and how they rank in this metric, as well as in some of the others I’ve discussed:

image_thumb27

For fun, I’ve put average rank and rank of ranks (for raw W% diff, adjusted W% diff, MOV-based regression, raw W%/MOV-based regression, raw X-Factor, adjusted X-Factor, and adjusted W%/MOV-based regression) on the far right.  I’ve also uploaded the complete win differential table for all 470 players to the site, including all of the actual values for these metrics and more.  No matter which flavor of metric you prefer (and I believe the highlighted one to be the best), Rodman is solidly in Hall of Fame territory.

Finally, I’m not saying that the Hall of Fame does or must pick players based on their ability to contribute to their team’s winning percentages.  But if they did, and if these numbers were accurate, Rodman would deserve a position with room to spare.  Thus, naturally, one burning question remains: how much can we trust these numbers (and Dennis Rodman’s in particular)?  This is what I will address in section (d) tomorrow.

The Case for Dennis Rodman, Part 3/4(b)—Rodman’s X-Factor

The sports analytical community has long used Margin of Victory or similar metrics as their core component for predicting future outcomes.  In situations with relatively small samples, it generally slightly outperforms win percentages, even when predicting win percentages.

There are several different methods for converting MOV into expected win-rates.  For this series, I took the 55,000+ regular-season team games played since 1986 and compared their outcomes to the team’s Margin of Victory over the other 81 games of the season.  I then ran this data through a logistic regression (a method for predicting things that come in percentages) with MOV as the predictor variable.  Here is the resulting formula:

\large{PredictedWin\% = \dfrac{1}{1+e^{-(.127mv-.004)}}}

Note: e is Euler’s number, or ~2.72.  mv is the variable for margin of victory.

This will return the probability between 0 and 1, corresponding to the odds of winning the predicted game.  If you want to try it out for yourself, the excel formula is:

=1/(1+EXP(-(-0.0039+0.1272*[MOV])))

So, for example, if a team’s point differential (MOV) over 81 games is 3.78 points per game, their odds of winning their 82nd game would be 61.7%.
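That worked example checks out in code (a sketch using the coefficients from the Excel version above, which carries an extra digit of precision):

```python
import math

def predicted_win_pct(mov):
    """Probability of winning game 82, given MOV over the other 81 games."""
    return 1.0 / (1.0 + math.exp(-(0.1272 * mov - 0.0039)))

p = predicted_win_pct(3.78)  # ~0.617, i.e. 61.7%
```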

Of course, we can use this same formula to predict a player’s win% differential based on his MOV differential.  If, based on his MOV contribution alone, a player’s team would be expected to win 61.7% of the time, then his predicted win% differential is what his contribution would be above average, in this case 11.7% (this is one reason why, for comparison purposes, I prefer to use adjusted win differentials, as discussed in Part 3(a)).

As discussed in Part 2(b) of this series (“With or Without Worm”), Dennis Rodman’s MOV differential was 3.78 points, which was tops among players with at least a season’s worth of qualifying data, corresponding to the aforementioned win differential of 11.7%.  Yet this under-predicts his actual win percentage differential by 9.9%.  This could be the result of a miscalibrated prediction formula, but as you can see in the following histogram, the mean for win differential minus predicted win differential for our 470 qualifying player dataset is actually slightly below zero at –0.7%:

clip_image002

Rodman has the 2nd highest overall, which is even crazier considering that he had one of the highest MOV’s (and the highest of anyone with anywhere close to his sample size) to begin with.  Note how much of an outlier he is in this scatterplot (red dot is Rodman):

clip_image003

I call this difference the “X-Factor.”  For my purposes, “X” stands for “unknown”:  That is, it is the amount of a player’s win differential that isn’t explained by the most common method for predicting win percentages.  For any particular player, it may represent an actual skill for winning above and beyond a player’s ability to contribute to his team’s margin of victory (in section (c), I will go about proving that such a skill exists), or it may simply be a result of normal variance.  But considering that Rodman’s sample size is significantly larger than the average in our dataset, the chances of it being “error” should be much smaller.  Consider the following:

clip_image004

Again, Rodman is a significant outlier:  no one else with more than 2500 qualifying minutes breaks 7.5%.  Rodman’s combination of large sample with large Margin of Victory differential with large X-Factor is remarkable.  To visualize this, I’ve put together a 3-D scatter plot of all 3 variables:

clip_image005

It can be hard to see where a point stands in space in a 2-D image, but I’ve added a surface grid to try to help guide you: the red point on top of the red mountain is Dennis Rodman.

To get a useful measure of how extreme this is, we can approximate a sample-size adjustment by comparing the number of qualifying minutes for each player to the average for the dataset, and then adjusting the standard deviation for that player accordingly (proportional to the square root of the ratio, a method which I’ll discuss in more detail in section (d)).  After doing this, I can re-make the same histogram as above with the sample-adjusted numbers:

clip_image006

No man is an island.  Except, apparently, for Dennis Rodman.  Note that he is about 4 standard deviations above the mean (and observe how the normal distribution line has actually blended with the axis below his data point).

Naturally, of course, this raises the question:

Where does Rodman’s X-Factor come from?

Strictly speaking, what I’m calling “X-Factor” is just the prediction error of this model with respect to players.  Some of that error is random and some of it is systematic.  In section (c), I will prove that it’s not entirely random, though where it comes from for any individual player, I can only speculate.

Margin of Victory treats all contributions to a game’s point spread equally, whether they came at the tail end of a blowout, or in the final seconds of a squeaker.  One thing that could contribute to a high X-Factor is “clutch”ness.  A “clutch” shooter (like a Robert Horry), for example, might be an average or even slightly below-average player for most of the time he is on the floor, but an extremely valuable one near the end of games that could go either way.  The net effect from the non-close games would be small for both metrics, but the effect of winning close games would be much higher on Win% than MOV.  Of course, “clutch”ness doesn’t have to be limited to shooters:  e.g., if one of a particular player’s skill advantages over the competition is that he makes better tactical decisions near the end of close games (like knowing when to intentionally foul, etc.), that would reflect much more strongly in his W% than in his MOV.

Also, a player who contributes significantly whenever he is on the floor but is frequently taken out of non-close games as a precaution against fatigue or injury may have a Win% that accurately reflects his impact, but a significantly understated MOV.  E.g., in the Boston Celtics “Big 3” championship season, Kevin Garnett was rested constantly—a fact that probably killed his chances of being that season’s MVP—yet the Celtics won by far the most games in the league.  In this case, the player is “clutch” just by virtue of being on the floor more in clutch spots.

The converse possibility also exists:  A player could be “reverse clutch,” meaning that he plays worse when the game is NOT on the line.  This would ultimately have the same statistical effect as if he played better in crunch time.  And indeed, based on completely non-rigorous and anecdotal speculation, I think this is a possible factor in Rodman’s case.  During his time in Chicago, I definitely recall him doing a number of silly things in the 4th quarter of blowout games (like launching up ridiculous 3-pointers) when it didn’t matter—and in a game of small margins, these things add up.

Finally, though it cuts a small amount against the absurdity of Rodman’s rebounding statistics, I would be derelict as an analyst not to mention the possibility that Rodman may have played sub-optimally in non-close games in order to pad his rebounding numbers.  The net effect, of course, would be that his rebounding statistics could be slightly overstated, while his value (which is already quite prodigious) could be substantially understated.  To be completely honest, with his rebounding percentages and his X-Factor both being such extreme outliers, I have to think that at least some relationship between the two likely exists.

If you’re emotionally attached to the freak-alien-rebounder hypothesis, this might seem to be a bad result for you.  But if you’re interested in Rodman’s true value to the teams he played for, you should understand that, if this theory is accurate, it could put Rodman’s true impact on winning into the stratosphere.  That is, this possibility gives no fuel to Rodman’s potential critics: the worst cases on either side of the spectrum are that Rodman was the sickest rebounder with a great impact on his teams, or that he was a great rebounder with the sickest impact.

In the next section, I will be examining the relative reliability and importance of Margin of Victory vs. Win % generally, across the entire league.  In my “endgame” analysis, this is the balance of factors that I will use.  But the league patterns do not necessarily apply in all situations:  In some cases, a player’s X-factor may be all luck, in some cases it may be all skill, and in most it is probably a mixture of both.  So, for example, if my speculation about Rodman’s X-Factor were true, my final analysis of Rodman’s value could be greatly understated.

The Case for Dennis Rodman, Part 3/4(a)—Just Win, Baby (in Histograms)

First off, congratulations to Dennis for making the Hall of Fame finalist list for 2011. The circumstances seem favorable to his making it, and if I had to guess I’d say he probably will. While his under-appreciated status has been a useful vehicle for my analytical agenda, I certainly hope he will be voted in—though I might prefer it be with a copy of my series in the voters’ hands.

Second, I apologize for the delay in getting this section out. I’m reminded of the words of the always brilliant Detective Columbo:

I worry. I mean, little things bother me. I’m a worrier. I mean, little insignificant details – I lose my appetite. I can’t eat. My wife, she says to me, “You know, you can really be a pain.”

Of course, as Columbo understood, the “insignificant” details that nag at you are usually anything but. Since Part 3 of this series should be the last to include heavily-quantitative analysis—and because it is so important to understanding Rodman’s true value—I really tried to tie up all the loose ends (even those that might at first seem to be redundant or obvious).

As a result, what began as a simple observation grew into something painfully detailed and extremely long (even by my standards)—but well worth it. So, once again, I’ve decided to break it down into 4 sections—however, each of these will be relatively short, and I’ll be posting them back-to-back each morning from now through Saturday. Here is the Cliff’s Notes version:

  1. Rodman had an observably great impact on his teams’ winning percentages.
  2. This impact was much greater than his already great impact on Margin of Victory would have predicted.
  3. Contrary to certain wisdom in the analytical community, Margin of Victory and Win% are both valuable indicators predictively, and combining Rodman’s differentials in both put him deep in Hall of Fame territory.
  4. Rodman’s differentials are statistically significant at one of the highest levels in NBA history.

Now, on with the show:

Introduction

One of the most common doubts I hear about Dennis Rodman’s value stems from the belief that his personal successes—5 NBA championships, freakish rebounding statistics, etc.—were probably largely a result of his having played for superior teams. For example, his prodigious rebounding may have been something he was “allowed” to do because he played for good offensive teams and (as the argument goes) had few other offensive responsibilities.

In its weaker form, I think this argument is plausible but irrelevant: Perhaps Rodman would not have been able to put up the numbers that he did if he were “required” to do more on offense. But the implication that this diminishes his value is absurd—it would be like saying that Cy Young wasn’t a particularly valuable baseball player because he couldn’t have put up such a great ERA if he were “required” to hit every night.

The stronger form, however, suggests that Rodman’s anomalous rebounding statistics probably weren’t due to any particularly anomalous talent or contribution, but were merely (or at least mostly) a byproduct of his fortunate circumstances.

If this were true, however, one of the following things would necessarily have to follow:

  1. His rebounding must not have contributed much value to his teams, or
  2. The situations he played in must have been uniquely favorable to leveraging value from a designated rebounder, or
  3. The choice to use a designated rebounder on an offensively strong team must have been an extremely successful exploitative strategy.

The third, I technically cannot disprove: It is theoretically possible that Rodman’s refusal to take a lot of shots on offense unintentionally caused his teams to stumble upon an amazing exploitative strategy that no one had discovered before and that no one has duplicated since (though, if that were the case, he still might deserve some credit for forcing their hands).

But 1 and 2 simply aren’t supported by the data: As I will show, Rodman had wildly positive impacts on 4 different teams that had little in common, except of course for being solid winners with Rodman in the lineup.

Rodman’s Win % Differential

As I’ve discussed previously, a player’s differential statistics are simply the difference in their team’s performance in the games they played versus the games they missed. One very important differential stat we might be interested in is winning percentage.
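In code, the raw computation is trivial (a quick illustrative sketch of my own—the function name and the example record are hypothetical, not from my actual spreadsheets):

```python
def win_pct_differential(wins_with, games_with, wins_without, games_without):
    """Team's winning percentage in games the player played,
    minus its winning percentage in the games he missed."""
    return wins_with / games_with - wins_without / games_without

# Hypothetical example: team goes 50-16 with the player, 6-10 without him.
win_pct_differential(50, 66, 6, 16)  # ~0.758 - 0.375 = ~0.383
```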

To look at Rodman’s numbers in this area, I used exactly the same process that I described in Part 2(b) to look at his other differentials. However, for comparison purposes, I’ve greatly expanded the pool of players by dropping the qualifying minutes requirement from 3000 to 1000. This grows the pool from 164 players to 470.

Why expand? Honestly, because Rodman’s extreme win % differential allows it. I think the more stringent filters produce a list that is more reliable from top to bottom—but in this case, I am mostly interested in (literally) the top. There are some players on the list with barely 1/3 of a season’s worth of qualifying playing time to back up their numbers—which should produce extreme volatility—yet still no one is able to overtake Rodman.

Here is Rodman’s raw win differential, along with those of a number of select players (including a few whose styles are often compared to Rodman’s, some Hall of Famers, some future first-ballot Hall of Fame selections, and Rodman’s 2011 Hall of Fame co-finalists Chris Mullin and Maurice Cheeks):

image

I will put up a table of the entire list of 470 players—including win differentials and a number of other metrics that I will discuss throughout the rest of Part 3—along with section (c) on Friday.

Amazingly, this number may not even reflect Rodman’s true impact, because he generally played for extremely good teams, where it is not only harder to contribute, but where a given impact will have less of an effect on win percentage (for example, if your team normally wins 90% of its games, it is clearly impossible to have a win% differential above 10%). To account for this, I’ve also created “adjusted” win% differentials, which attempt to normalize a player’s percentage increase/decrease to what it would be on a .500 team.

This adjustment is done somewhat crudely, by measuring how far the player gets you toward 100% (for positive impacts) or toward 0% (for negative). E.g., if someone plays for a team that normally wins 70%, and they win 85% with him in the lineup, that is 50% of the way to 100%. Thus, as 50% of the way from 50% to 100% is 75%, that player’s adjusted differential is 25% (as opposed to their raw value of 15%).
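In code, the adjustment looks like this (a sketch of the crude method just described; the function and argument names are mine):

```python
def adjusted_differential(base_pct, with_pct):
    """Normalize a raw win% differential to a .500 team: take the fraction
    of the distance the player moves his team toward 100% (if positive) or
    toward 0% (if negative), then apply that same fraction to the 50
    points between .500 and that boundary."""
    raw = with_pct - base_pct
    if raw >= 0:
        fraction = raw / (1.0 - base_pct)  # share of the distance to 100%
    else:
        fraction = raw / base_pct          # share of the distance to 0% (negative)
    return fraction * 0.5

adjusted_differential(0.70, 0.85)  # ~0.25, vs. the raw differential of 0.15
```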

A few notes about this method: While I prefer the adjusted numbers for this situation, they have their drawbacks. They are most accurate when dealing with consistently good or bad teams, over multiple seasons, and with bigger sample sizes. They are less accurate with smaller sample sizes, in individual seasons, and with uncertain team quality. This is because regression to the mean can become an interfering factor. When looking at individual seasons in a void, it is relatively easy to account for both effects, which I do for my league-wide win differential analysis. But when aggregating independent seasons that have a common related element—such as the same team or player—you basically have to pick your poison (of course, there may be some way to deal with this issue that I just don’t know or haven’t thought of yet). I will tend to use the adjusted numbers for this analysis, but though they are slightly more favorable to Rodman, either metric leads to the same bottom line. In any case, the tables I will be posting include both metrics (as well as other options).

Dennis Rodman’s adjusted numbers boost his win differential to 21.6%, widening the margin between him and 2nd place. I know I will be flamed if I don’t add that (just as I noted in part 2(b)) I am not claiming that Rodman was actually the best player in the last 25 years. This is a volatile statistic, and Rodman merely happening to have the best win differential among the group of 470 qualifying players does not mean he was actually the best player overall, or even that he was the best player in the group. That said, we should not dismiss the extremeness of the result either:

image

I will be using a number of (eerily similar) histograms through the rest of Part 3 as well. If you’re not familiar, histograms are one of the simplest and most useful graphical representations of single-variable data (yet, inexplicably, they aren’t built into Excel): each bar represents the number of data points of the designated value. If the variable is continuous (as it is in this case), each bar is basically a “container” that tells you how many data points fit in between the left and right values of the bar (technically it tells you the “density” of points near the center of the container, but those are effectively the same in most circumstances). Their main purpose is to eyeball how the variable is distributed—in this case, as you can see it is distributed normally.

The red line is an overlay of the normal distribution of the sample, which has a mean of –0.5% and standard deviation of 6.3%. This puts Rodman just over 3.5 standard deviations above the mean, a value that should occur about once in every 4000 instances—and he does this based on a standard deviation that is derived from a pool that includes the statistics of many players that have as little as 1/4th as much relevant data as he has.
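You can check that rarity figure yourself with nothing more than the sample mean and standard deviation above (a quick sketch using the standard normal tail probability; no stats library required):

```python
import math

def normal_tail(z):
    """Upper-tail (one-sided) probability of a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = (0.216 - (-0.005)) / 0.063  # Rodman's 21.6% vs. mean -0.5%, SD 6.3%
one_in = 1 / normal_tail(z)     # on the order of one in four thousand
```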

Moreover, as I will discuss in section (b) tomorrow, his win % differential is not only extreme relative to the rest of the NBA, it is even extreme relative to himself—and this has important implications in its own right.

The Case for Dennis Rodman, Part 2/4 (a)(ii)—Player Valuation and Unconventional Wisdom

In my last post in this series, I outlined and criticized the dominance of gross points (specifically, points per game) in the conventional wisdom about player value. Of course, serious observers have recognized this issue for ages, responding in a number of ways—the most widespread still being ad hoc (case by case) analysis. Not satisfied with this approach, many basketball statisticians have developed advanced “All in One” player valuation metrics that can be applied broadly.

In general, Dennis Rodman has not benefitted much from the wave of advanced “One Size Fits All” basketball statistics. Perhaps the most notorious example of this type of metric—easily the most widely disseminated advanced player valuation stat out there—is John Hollinger’s Player Efficiency Rating:

image_thumb13_thumb

In addition to ranking Rodman as the 7th best player on the 1995-96 Bulls championship team, PER is weighted to make the league average exactly 15—meaning that, according to this stat, Rodman (career PER: 14.6) was actually a below average player. While Rodman does significantly better in a few predictive stats (such as David Berri’s Wages of Wins) that value offensive rebounding very highly, I think that, generally, those who subscribe to the Unconventional Wisdom typically accept one or both of the following: 1) that despite Rodman’s incredible rebounding prowess, he was still just a very good role-player, and likely provided less utility than those who were more well-rounded, or 2) that, even if Rodman was valuable, a large part of his contribution must have come from qualities that are not typically measurable with available data, such as defensive ability.

My next two posts in this series will put the lie to both of those propositions. In section (b) of Part 2, I will demonstrate Rodman’s overall per-game contributions—not only their extent and where he fits in the NBA’s historical hierarchy, but exactly where they come from. Specifically, contrary to both conventional and unconventional wisdom, I will show that his value doesn’t stem from quasi-mystical unmeasurables, but from exactly where we would expect: extra possessions stemming from extra rebounds. In part 3, I will demonstrate (and put into perspective) the empirical value of those contributions to the bottom line: winning. These two posts are at the heart of The Case for Dennis Rodman, qua “case for Dennis Rodman.”

But first, in line with my broader agenda, I would like to examine where and why so many advanced statistics get this case wrong, particularly Hollinger’s Player Efficiency Rating. I will show how, rather than being a simple outlier, the Rodman data point is emblematic of major errors that are common in conventional unconventional sports analysis – both as a product of designs that disguise rather than replace the problems they were meant to address, and as a product of uncritically defending and promoting an approach that desperately needs reworking.

Player Efficiency Ratings

John Hollinger deserves much respect for bringing advanced basketball analysis to the masses. His Player Efficiency Ratings are available on ESPN.com under Hollinger Player Statistics, where he uses them as the basis for his Value Added (VA) and Expected Wins Added (EWA) stats, and regularly features them in his writing (such as in this article projecting the Miami Heat’s 2010-11 record), as do other ESPN analysts. Basketball Reference includes PER in their “Advanced” statistical tables (present on every player and team page), and also use it to compute player Value Above Average and Value Above Replacement (definitions here).

The formula for PER is extremely complicated, but its core idea is simple: combine everything in a player’s stat-line by rewarding everything good (points, rebounds, assists, blocks, and steals), and punishing everything bad (missed shots, turnovers). The values of particular items are weighted by various league averages—as well as by Hollinger’s intuitions—then the overall result is calculated on a per-minute basis, adjusted for league and team pace, and normalized on a scale averaging 15.

Undoubtedly, PER is deeply flawed. But sometimes apparent “flaws” aren’t really “flaws,” but merely design limitations. For example: PER doesn’t account for defense or “intangibles,” it is calculated without resort to play-by-play data that didn’t exist prior to the last few seasons, and it compares players equally, regardless of position or role. For the most part, I will refrain from criticizing these constraints, instead focusing on a few important ways that it fails or even undermines its own objectives.

Predictivity (and: Introducing Win Differential Analysis)

Though Hollinger uses PER in his “wins added” analysis, its complete lack of any empirical component suggests that it should not be taken seriously as a predictive measure. And indeed, empirical investigation reveals that it is simply not very good at predicting a player’s actual impact:

image4_thumb3

This bubble-graph is a product of a broader study I’ve been working on that correlates various player statistics to the difference in their team’s per-game performance with them in and out of the line-up.  The study’s dataset includes all NBA games back to 1986, and this particular graph is based on the 1300ish seasons in which a player who averaged 20+ minutes per game both missed and played at least 20 games.  Win% differential is the difference in the player’s team’s winning percentage with and without him (for the correlation, each data-point is weighted by the smaller of games missed or played.  I will have much more to write about the nitty-gritty of this technique in separate posts).

So PER appears to do poorly, but how does it compare to other valuation metrics?

image_thumb1

SecFor (or “Secret Formula”) is the current iteration of an empirically-based “All in One” metric that I’m developing—but there is no shame in a speculative purely a priori metric losing (even badly) as a predictor to the empirical cutting-edge.

However, as I admitted in the introduction to this series, my statistical interest in Dennis Rodman goes way back. One of the first spreadsheets I ever created was in the early 1990’s, when Rodman still played for San Antonio. I knew Rodman was a sick rebounder, but rarely scored—so naturally I thought: “If only there were a formula that combined all of a player’s statistics into one number that would reflect his total contribution.” So I came up with this crude, speculative, purely a priori equation:

Points + Rebounds + 2*Assists + 1.5*Blocks + 2*Steals – 2*Turnovers.
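Or, as code (a direct transcription of the formula above; the example stat line is hypothetical, not any particular player’s):

```python
def prabs(points, rebounds, assists, blocks, steals, turnovers):
    """'PRABS,' exactly as written above: P + R + 2A + 1.5B + 2S - 2TO."""
    return (points + rebounds + 2 * assists
            + 1.5 * blocks + 2 * steals - 2 * turnovers)

# A hypothetical Rodman-esque line: few points, a mountain of rebounds.
prabs(10, 18, 2, 1, 1, 2)  # 31.5
```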

Unfortunately, this metric (which I called “PRABS”) failed to shed much light on the Rodman problem, so I shelved it.  PER shares the same intention and core technique, albeit with many additional layers of complexity.  For all of this refinement, however, Hollinger has somehow managed to make a bad metric even worse, getting beaten by my OG PRABS by nearly as much as he is able to beat points per game—the Flat Earth of basketball valuation metrics.  So how did this happen?

Minutes

The trend in much of basketball analysis is to rate players by their per-minute or per-possession contributions.  This approach does produce interesting and useful information, and such rate stats may be especially useful to a coach who is deciding who to give more minutes to, or to a GM who is trying to evaluate which bench player to sign in free agency.

But a player’s contribution to winning is necessarily going to be a function of how much extra “win” he is able to get you per minute and the number of minutes you are able to get from him.  Let’s turn again to win differential:

image19_thumb

For this graph, I set up a regression using each of the major rate stats, plus minutes played (TS%=true shooting percentage, or one half of average points per shot, including free throws and 3 pointers).  If you don’t know what a “normalized coefficient” is, just think of it as a stat for comparing the relative importance of regression elements that come in different shapes and sizes. The sample is the same as above: it only includes players who average 20+ minutes per game.

Unsurprisingly, “minutes per game” is more predictive than any individual rate statistic, including true shooting.  Simply multiplying PER by minutes played significantly improves its predictive power, managing to pull it into a dead-heat with PRABS (which obviously wasn’t minute-adjusted to begin with).

I’m hesitant to be too critical of the “per minute” design decision, since it is clearly an intentional element that allows PER to be used for bench or rotational player valuation, but ultimately I think this comes down to telos: So long as PER pretends to be an arbiter of player value—which Hollinger himself relies on for making actual predictions about team performance—then minutes are simply too important to ignore. If you want a way to evaluate part-time players and how they might contribute IF they could take on larger roles, then it is easy enough to create a second metric tailored to that end.

Here’s a similar example from baseball that confounds me: Rate stats are fine for evaluating position players, because nearly all of them are able to get you an entire game if you want—but when it comes to pitching, how often someone can play and the number of innings they can give you are of paramount importance. E.g., at least for starting pitchers, it seems to me that ERA is backwards: rather than calculate runs allowed per inning, why don’t they focus on runs denied per game? Using a benchmark of 4.5, it would be extremely easy to calculate: Innings Pitched/2 – Earned Runs. So, if a pitcher gets you 7 innings and allows 2 runs, their “Earned Runs Denied” (ERD) for the game would be 1.5. I have no pretensions of being a sabermetrician, and I’m sure this kind of stat (and much better) is common in that community, but I see no reason why this kind of statistic isn’t mainstream.
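The proposed stat is a one-liner (a sketch; the general benchmark parameter is my own generalization of the 4.5 figure above):

```python
def earned_runs_denied(innings, earned_runs, benchmark=4.5):
    """'Earned Runs Denied': runs saved versus a benchmark of 4.5 earned
    runs per 9 innings. With the default, this is just IP/2 - ER."""
    return innings * (benchmark / 9.0) - earned_runs

earned_runs_denied(7, 2)  # 1.5, matching the example above
```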

More broadly, I think this minutes SNAFU is reflective of an otherwise reasonable trend in the sports analytical community—to evaluate everything in terms of rates and quality instead of quantity—that is often taken too far. In reality, both may be useful, and the optimal balance in a particular situation is an empirical question that deserves investigation in its own right.

PER Rewards Shooting (and Punishes Not Shooting)

As described by David Berri, PER is well-known to reward inefficient shooting:

“Hollinger argues that each two point field goal made is worth about 1.65 points. A three point field goal made is worth 2.65 points. A missed field goal, though, costs a team 0.72 points. Given these values, with a bit of math we can show that a player will break even on his two point field goal attempts if he hits on 30.4% of these shots. On three pointers the break-even point is 21.4%. If a player exceeds these thresholds, and virtually every NBA player does so with respect to two-point shots, the more he shoots the higher his value in PERs. So a player can be an inefficient scorer and simply inflate his value by taking a large number of shots.”
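Berri’s break-even thresholds are easy to verify yourself (a quick sketch using only the weights quoted above—solve p × make_value = (1 − p) × miss_cost for p):

```python
def per_breakeven(points_per_make, cost_per_miss=0.72):
    """FG% at which makes exactly offset misses under the PER weights
    quoted above: p = cost / (make_value + cost)."""
    return cost_per_miss / (points_per_make + cost_per_miss)

per_breakeven(1.65)  # ~30.4% for two-point attempts
per_breakeven(2.65)  # ~21.4% for three-point attempts
```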

The consequences of this should be properly understood: Since this feature of PER applies to every shot taken, it is not only the inefficient players who inflate their stats.  PER gives a boost to everyone for every shot: Bad players who take bad shots can look merely mediocre, mediocre players who take mediocre shots can look like good players, and good players who take good shots can look like stars. For Dennis Rodman’s case—as someone who took very few shots, good or bad—the necessary converse of this is even more significant: since PER is a comparative statistic (even directly adjusted by league averages), players who don’t take a lot of shots are punished.

Structurally, PER favors shooting—but to what extent? To get a sense of it, let’s plot PER against usage rate:

image_thumb10

Note: Data includes all player seasons since 1986. Usage % is the percentage of team possessions that end with a shot, free throw attempt, or turnover by the player in question. For most practical purposes, it measures how frequently the player shoots the ball.

That R-squared value corresponds to a correlation of .628, which might seem high for a component that should be in the denominator. Of course, correlations are tricky, and there are a number of reasons why this relationship could be so strong. For example, the most efficient shooters might take the most shots. Let’s see:

image_thumb13

Actually, that trend-line doesn’t quite do it justice: that R-squared value corresponds to a correlation of .11 (even weaker than I would have guessed).

I should note one caveat: The mostly flat relationship between usage and shooting may be skewed, in part, by the fact that better shooters are often required to take worse shots, not just more shots—particularly if they are the shooter of last resort. A player that manages to make a mediocre shot out of a bad situation can increase his team’s chances of winning, just as a player that takes a marginally good shot when a slam dunk is available may be hurting his team’s chances.  Presently, no well-known shooting metrics account for this (though I am working on it), but to be perfectly clear for the purposes of this post: neither does PER. The strong correlation between usage rate and PER is unrelated.  There is nothing in its structure to suggest this is an intended factor, and there is nothing in its (poor) empirical performance that would suggest it is even unintentionally addressed. In other words, it doesn’t account for complex shooting dynamics either in theory or in practice.

Duplicability and Linearity

PER strongly rewards broad mediocrity, and thus punishes lack of the same. In reality, not every point that a player scores means their team will score one more point, just as not every rebound grabbed means that their team will get one more possession.  Conversely—and especially pertinent to Dennis Rodman—not every point that a player doesn’t score actually costs his team a point.  What a player gets credit for in his stat line doesn’t necessarily correspond with his actual contribution, because there is always a chance that the good things he played a part in would have happened anyway. This leads to a whole set of issues that I typically file under the term “duplicability.”

A related (but sometimes confused) effect that has been studied extensively by very good basketball analysts is the problem of “diminishing returns” – which can be easily illustrated like this:  if you put a team together with 5 players that normally score 25 points each, it doesn’t mean that your team will suddenly start scoring 125 points a game.  Conversely—and again pertinent to Rodman—say your team has 5 players that normally score 20 points each, and you replace one of them with somebody that normally only scores 10, that does not mean that your team will suddenly start scoring only 90. Only one player can take a shot at a time, and what matters is whether the player’s lack of scoring hurts his team’s offense or not.  The extent of this effect can be measured individually for different basketball statistics, and, indeed, studies have shown wide disparities.

As I will discuss at length in Part 2(c), despite hardly ever scoring, differential stats show that Rodman didn’t hurt his teams’ offenses at all: even after accounting for extra possessions that Rodman’s teams gained from offensive rebounds, his effect on offensive efficiency was statistically insignificant.  In this case (as with Randy Moss), we are fortunate that Rodman had such a tumultuous career: as a result, he missed a significant number of games in a season several times with several different teams—this makes for good indirect data.  But, for this post’s purposes, the burning question is: Is there any direct way to tell how likely a player’s statistical contributions were to have actually converted into team results?

This is an extremely difficult and intricate problem (though I am working on it), but it is easy enough to prove at least one way that a metric like PER gets it wrong: it treats all of the different components of player contribution linearly.  In other words, one more point is worth one more point, whether it is the 15th point that a player scores or the 25th, and one more rebound is worth one more rebound, whether it is the 8th or the 18th. While this equivalency makes designing an all-in-one equation much easier (at least for now, my Secret Formula metric is also linear), it is ultimately just another empirically testable assumption.

I have theorized that one reason Rodman’s PER stats are so low compared to his differential stats is that PER punishes his lack of mediocre scoring, while failing to reward the extremeness of his rebounding.  This is based on the hypothesis that certain extreme statistics would be less “duplicable” than mediocre ones.  As a result, the difference between a player getting 18 rebounds per game vs. getting 16 per game could be much greater than the difference between them getting 8 vs. getting 6.  Or, in other words, the marginal value of rebounds would (hypothetically) be increasing.

Using win percentage differentials, this is a testable theory. Just as we can correlate an individual player’s statistics to the win differentials of his team, we can also correlate hypothetical statistics the same way.  So say we want to test a metric like rebounds, except one that has increasing marginal value built in: a simple way to approximate that effect is to make our metric increase exponentially, such as using rebounds squared. If we need even more increasing marginal value, we can try rebounds cubed, etc.  And if our metric has several different components (like PER), we can do the same for the individual parts:  the beauty is that, at the end of the day, we can test—empirically—which metrics work and which don’t.

For those who don’t immediately grasp the math involved, I’ll go into a little detail: A linear relationship is really just an exponential relationship with an exponent of 1.  So let’s consider a toy metric, “PR,” which is calculated as follows: Points + Rebounds.  This is a linear equation (exponent = 1) that could be rewritten as follows: (Points)^1 + (Rebounds)^1.  However, if, as above, we thought that both points and rebounds should have increasing marginal values, we might want to try a metric (call it “PRsq”) that combined points and rebounds squared, as follows:  (Points)^2 + (Rebounds)^2.  And so on.  Here’s an example table demonstrating the increase in marginal value:

image1_thumb1
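For those who want to reproduce the gist of that table, the marginal values fall out of simple differences (an illustrative sketch of my own, not the original spreadsheet):

```python
def marginal_values(exponent, n_max=10):
    """Value of the nth rebound (or point, etc.) under a metric that
    counts stat**exponent: f(n) - f(n-1) for n = 1..n_max."""
    return [n ** exponent - (n - 1) ** exponent for n in range(1, n_max + 1)]

marginal_values(1)  # [1, 1, 1, ...] -- linear: every rebound worth the same
marginal_values(2)  # [1, 3, 5, ...] -- squared: increasing marginal value
```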

The fact that each different metric leads to vastly different magnitudes of value is irrelevant: for predictive purposes, the total value for each component will be normalized — the relative value is what matters (just as “number of pennies” and “number of quarters” are equally predictive of how much money you have in your pocket).  So applying this concept to an even wider range of exponents for several relevant individual player statistics, we can empirically examine just how “exponential” each statistic really is:

image8_thumb

For this graph, I looked at each of the major rate metrics (plus points per game) individually.  So, for each player-season in my (1986-) sample, I calculated the number of points, points squared, points cubed, and so on up to points to the 10th power, and then correlated all of these to that player’s win percentage differential.  From those calculations, we can find roughly how much the marginal value for each metric increases, based on what exponent produces the best correlation:  The smaller the number at the peak of the curve, the more linear the metric is—the higher the number, the more exponential (i.e., extreme values are that much more important).  When I ran this computation, the relative shape of each curve fit my intuitions, but the magnitudes surprised me:  That is, many of the metrics turned out to be even more exponential than I would have guessed.
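The procedure itself can be sketched in a few lines (illustrative only: the data below is synthetic, since the real per-season samples aren’t included here, and the function names are mine):

```python
import random

def correlation(xs, ys):
    """Plain Pearson correlation, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def best_exponent(stat, win_diff, max_exp=10):
    """Sweep exponents 1..max_exp, correlating stat**k against win
    differential, and return the exponent that correlates best."""
    scores = {k: correlation([s ** k for s in stat], win_diff)
              for k in range(1, max_exp + 1)}
    return max(scores, key=scores.get)

# Synthetic demonstration: if win differential really responds to the
# 4th power of rebounding (plus noise), the sweep should favor a high
# exponent rather than a linear one.
random.seed(0)
rebounds = [random.uniform(2, 18) for _ in range(500)]
win_diff = [(r / 18) ** 4 + random.gauss(0, 0.1) for r in rebounds]
k = best_exponent(rebounds, win_diff)
```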

As I know this may be confusing to many of my readers, I need to be absolutely clear:  the shape of each curve has nothing to do with the actual importance of each metric.  It only tells us how much that particular metric is sensitive to very large values.  E.g., the fact that Blocks and Assists peak on the left and sharply decline doesn’t make them more or less important than any of the others, it simply means that having 1 block in your scoreline instead of 0 is relatively just as valuable as having 5 blocks instead of 4.  On the other extreme, turnovers peak somewhere off the chart, suggesting that turnover rates matter most when they are extremely high.

For now, I’m not trying to draw a conclusive picture about exactly what exponents would make for an ideal all-in-one equation (polynomial regressions are very very tricky, though I may wade into those difficulties more in future blog posts).  But as a minimum outcome, I think the data strongly supports my hypothesis: that many stats—especially rebounds—are exponential predictors.  Thus, I mean this less as a criticism of PER than as an explanation of why it undervalues players like Dennis Rodman.

Gross, and Points

In subsection (i), I concluded that “gross points” as a metric for player valuation had two main flaws: gross, and points. Superficially, PER responds to both of these flaws directly: it attempts to correct the “gross” problem both by punishing bad shots, and by adjusting for pace and minutes. It attacks the “points” problem by adding rebounds, assists, blocks, steals, and turnovers. The problem is, these “solutions” don’t match up particularly well with the problems “gross” and “points” present.

The problem with the “grossness” of points certainly wasn’t minutes (note: for historical comparisons, pace adjustments are probably necessary, but the jury is still out on the wisdom of doing the same on a team-by-team basis within a season). The main problem with “gross” was shooting efficiency: If someone takes a bunch of shots, they will eventually score a lot of points.  But scoring points is just another thing that players do that may or may not help their teams win. PER attempted to account for this by punishing missed shots, but it didn’t go far enough. The original problem with “gross” persists: As discussed above, taking shots helps your rating, whether they are good shots or not.

As for “points”: in addition to any problems created by having arbitrary (non-empirical) and linear coefficients, the strong bias towards shooting causes PER to undermine its key innovation—the incorporation of non-point components. This “bias” can be represented visually:

image

Note: This data comes from a regression to PER including each of the rate stats corresponding to the various components of PER.

This pie chart is based on a linear regression including rate stats for each of PER’s components. Strictly, what it tells us is the relative value of each factor to predicting PER if each of the other factors were known. Thus, the “usage” section of this pie represents the advantage gained by taking more shots—even if all your other rate stats were fixed.  Or, in other words, pure bias (note that the number of shots a player takes is almost as predictive as his shooting ability).
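The regression behind a chart like this can be sketched as follows.  Again, this is a toy version: the rate stats and the “PER-like” rating below are synthetic, constructed so that usage is rewarded almost as heavily as shooting efficiency (mirroring the bias described above), and I’m using absolute standardized coefficients as a crude stand-in for relative importance.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Hypothetical standardized rate stats for n player-seasons
names = ["shooting efficiency", "usage (shot attempts)", "rebound rate", "assist rate"]
X = rng.normal(size=(n, len(names)))

# Toy PER-like rating: taking shots (usage) helps almost as much as making them
rating = (1.0 * X[:, 0] + 0.9 * X[:, 1] + 0.5 * X[:, 2]
          + 0.4 * X[:, 3] + rng.normal(0, 0.3, n))

# Ordinary least squares; with standardized inputs, the absolute coefficients
# give a rough "relative importance" share for each factor (the pie slices)
design = np.column_stack([X, np.ones(n)])
coefs, *_ = np.linalg.lstsq(design, rating, rcond=None)
shares = np.abs(coefs[:-1]) / np.abs(coefs[:-1]).sum()

for name, share in zip(names, shares):
    print(f"{name}: {share:.0%}")
```

In this toy setup, the “usage” slice comes out nearly as large as the “efficiency” slice, which is exactly the kind of pure-bias signal the chart above is meant to expose.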

For fun, let’s compare that pie to the exact same regression run on Points Per Game rather than PER:

image

Note: These would not be the best variables to select if you were actually trying to predict a player’s Points Per Game.  Note also that “Usage” in these charts is NOT like “Other”—while other variables may affect PPG, and/or may affect the items in this regression, they are not represented in these charts.

Interestingly, Points Per Game was already somewhat predictable by shooting ability, turnovers, defensive rebounding, and assists. While I hesitate to draw conclusions from the aesthetic comparison, we can guess why PER doesn’t beat PPG as decisively as we might expect: it appears to share much of the same DNA. (My wilder, more ambitious suspicion is that these similarities reflect the strength of our broader pro-points bias: even when designing an All-in-One statistic, Hollinger’s linear, non-empirical, a priori coefficients still mostly reflect the conventional wisdom about the importance of the various factors, as reflected in the way they relate directly to points per game.)

I could make a similar pie-chart for Win% differential, but I think it might give the wrong impression: these aren’t even close to the best set of variables to use for that purpose.  Suffice it to say that it would look very, very different (for an imperfect picture of how much so, you can compare to the values in the Relative Importance chart above).

Conclusions

The deeper irony with PER is not just that it could theoretically be better, but that it adds many levels of complexity to the problem it purports to address while ultimately failing in ways strikingly similar to the simple points metrics it was meant to improve upon.  It has been dressed up around the edges with various adjustments for team and league pace, incorporation of league averages to weight rebounds and the value of possession, etc. This is, to borrow a phrase, like putting lipstick on a pig. The energy that Hollinger has spent on dressing up his model could have been better spent rethinking its core.

In my estimation, this pattern persists among many extremely smart people who generate innovative models and ideas: once created, they spend most of their time—entire careers, even—on, in order: 1) defending it, 2) applying it to new situations, and 3) tweaking it.  This happens in just about every field: hard and soft sciences, economics, history, philosophy, even literature. Give me an academic who creates an interesting and meaningful model and then immediately devotes their best efforts to tearing it apart! In all my education, I have had perhaps two professors who embraced this approach, and I would rank both among my very favorites.

This post and the last were admittedly relatively light on Rodman-specific analysis, but that will change with a vengeance in the next two.  Stay tuned.


Update (5/13/11): Commenter “Yariv” correctly points out that an “exponential” curve is technically one in the form a^x (such as 2^x, 3^x, etc.), where the increasing marginal value I’m referring to in the “Linearity” section above is about terms in the form x^a (e.g., x^2, x^3, etc.), or monomial terms with an exponent not equal to 1.  I apologize for any confusion, and I’ll rewrite the section when I have time.