This was my first time attending this conference, and Day 1 was an amazing experience. At this point last year, I literally didn’t know that there was a term (“sports analytics”) for the stuff I liked to do in my spare time. Now I learn that there is not only an entire industry built up around the practice, but a whole army of nerds in its society. Naturally, I have tons of criticisms of various things that I saw and heard—that’s what I do—but I loved it, even the parts I hated.
Here are the panels and presentations that I attended, along with some of my thoughts:
Birth to Stardom: Developing the Modern Athlete in 10,000 Hours?
Featuring Malcolm Gladwell (Author of Outliers), Jeff Van Gundy (ESPN), and others I didn’t recognize.
In this talk, Gladwell rehashed his absurdly popular maxim about how it takes 10,000 hours to master anything, and then made a bunch of absurd claims about talent. (Players with talent are at a disadvantage! Nobody wants to hire Supreme Court clerks! Etc.) The most re-tweeted item to come out of Day 1 by far was his highly speculative assertion that “a lot of what we call talent is the desire to practice.”
While this makes for a great motivational poster, IMO his argument in this area is tautological at best, and highly deceptive at worst. Some people have the gift of extreme talent, and some people have the gift of incredible work ethic. The streets of the earth are littered with the corpses of people who had one and not the other. Unsurprisingly, the most successful people tend to have both. To illustrate, here’s a random sample of 10,000 “people” with independent normally distributed work ethic and talent (each with a mean of 0, standard deviation of 1):
The blue dots (left axis) are simply Hard Work plotted against Talent. The red dots (right axis) are Hard Work plotted against the sum of Hard Work and Talent—call it “total awesome factor” or “success” or whatever. Now let’s try a little Bayes’ Theorem intuition check: You randomly select a person and they have an awesome factor of +5. What are the odds that they have a work ethic of better than 2 standard deviations above the mean? High? Does this prove that all of the successful people are just hard workers in disguise?
Hint: No. And this illustration is conservative. This sample is only 10,000 strong; increase it to 10 billion, and the biggest outliers will even more uniformly be extremely hard workers (and they will all be extremely talented as well). Moreover, this “model” for greatness is just a sum of the two variables, when in reality it is probably closer to a product, which would lead to even greater disparities. E.g.: I imagine total greatness achieved might be something like great stuff produced per minute worked (a function of talent) times total minutes worked (a function of willpower, determination, fortitude, blah blah, etc).
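For anyone who wants to run the intuition check themselves, here is a minimal sketch under the same assumptions as the sample above (work ethic and talent are independent standard normals, and "awesome factor" is just their sum). The function names, the seed, and the top-10 cutoff are my own illustrative choices. The first function uses the standard result that, given the sum, each trait is normal with mean equal to half the sum and variance 1/2. By symmetry, the exact same probability applies to talent, which is the whole point.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_trait_above(cutoff, success):
    """P(trait > cutoff | work + talent == success) for independent N(0, 1) traits.

    Conditional on the sum, either trait is normal with mean success / 2
    and variance 1 / 2, so the answer is identical for work and for talent.
    """
    mean = success / 2.0
    sd = math.sqrt(0.5)
    return 1.0 - normal_cdf((cutoff - mean) / sd)


def simulate(n=10_000, seed=0):
    """Draw n hypothetical (work, talent) pairs; the seed is arbitrary."""
    rng = random.Random(seed)
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]


def shares_above(people, cutoff, top_k):
    """Among the top_k people by work + talent, the share whose work
    (and, separately, whose talent) exceeds the cutoff."""
    top = sorted(people, key=lambda p: p[0] + p[1], reverse=True)[:top_k]
    work_share = sum(w > cutoff for w, _ in top) / top_k
    talent_share = sum(t > cutoff for _, t in top) / top_k
    return work_share, talent_share


if __name__ == "__main__":
    print("P(work > 2 | success = 5):", prob_trait_above(2, 5))
    print("P(work > 2) with no conditioning:", 1.0 - normal_cdf(2))
    print("Top-10 shares (work, talent):", shares_above(simulate(), 2, 10))
```

Under these assumptions, a person with an awesome factor of +5 is roughly three-to-one to be more than 2 standard deviations above the mean in work ethic, and exactly as likely to be that far above the mean in talent.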
The general problem with Gladwell, I think, is that his emphatic de-emphasis of talent (which has no evidence backing it up) cheapens his much stronger underlying observation: for any individual to fully maximize their potential takes the accumulation of a massive amount of hard work, and this is true for people regardless of what their full potential may be. Of course, this could just be a shrewd marketing ploy on his part: you probably sell more books by selling the hope of greatness rather than the hope of being an upper-level mid-manager (especially since you don’t have to worry about that hope going unfulfilled for at least 10 years).
Performance and Injury Analytics
Featuring John Brenkus (ESPN Sports Science guy), Will Carroll (Sports Illustrated), and others. Moderated by Peter Keating (ESPN).
From what I stayed for, this panel was pretty disappointing. There are two burning “analytical” questions about injuries that I would love to know the answers to:
- Is there really such a thing as “injury-prone” (i.e., how strongly do injuries correlate with more injuries)? And,
- How much do injuries affect a player’s future performance?
But rather than addressing these issues, the panel seemed to focus more on concussions and how bad they are, and whether pre-season games are worth it, etc. (including an incredible claim by one panelist that 70% of all NFL injuries occur within the first two weeks of minicamp, which I am not even close to believing without seeing the data for myself).
The Real Reasons Behind the Home Field Advantage
By Tobias J. Moskowitz.
This talk got a lot of buzz all day, but by the time I showed up it was almost over and observers were 3-deep out the door. The Cliffs Notes version: Biased refs. Unfortunately I missed most of the details, but apparently it relied heavily on European soccer data. I will definitely go back into my materials to look this one over, but it seems like a plausible and obviously very meaningful hypothesis.
What Optical Tracking Data Says About NBA Field Goal Shooting
By Sandy Weil.
This talk was an extremely rich look at a number of different things that can be examined with the most cutting-edge data out there. I think it may have literally been an infomercial for STATS Inc. (please correct me if I’m wrong), but it was very impressive regardless. The biggest disappointment to me is that their new system tracks approximately 1 million data points *per game* (yes, that’s double-bold), yet STILL doesn’t tell us how much time was left on the shot clock when each shot was taken. Argh!
- Number of defenders in the vicinity is much more important than proximity of defender(s).
- Shots immediately following a pass are more successful, even when controlling for distance and defensive positioning.
- The presenter did a long, detailed hypothesis-test about whether there was arbitrage opportunity for players to “step back” and take worse shots against softer defenses, but concluded that there wasn’t.
- Maybe the most interesting part to me was the ultra-nerdy discussion of how they cleaned the new data-set by matching up inconsistencies with the old data-set.
Overall, the guy admitted that he hadn’t really made many ostensibly exciting or counter-intuitive discoveries, but I think that’s probably a better result on balance. While it may not be as sexy, I think a system that confirms a bunch of prior beliefs is more likely to be right when it finally finds something wrong than one that finds fault everywhere it looks.
Featuring Eric Mangini, Aaron Schatz, and others. Moderated by Gary Belsky (ESPN the Magazine).
I was warned that this panel is usually a snore, and it was. Aaron seemed like the only one who knew what he was talking about, and even he was being modest and charitable. Some of the low-lights include:
- A lengthy discussion of “subjectivity” in football statistics, as if this were a meaningful problem. Um, data doesn’t have to be perfect to be useful. This is a clear case of “taboo”: “subjective” is a taboo word that people think is supposed to be bad w/r/t data, and thus they reflexively react negatively whenever they see “subjective” and “data” in the same paragraph. Then they say the data is “flawed” and go back to making decisions the old way—which is to say, based on their subjective intuitions.
- When asked about why NFL teams don’t go for it on 4th downs, Eric Mangini rambled on about how it may look like you should go for it based on league averages, but in the circumstances of a particular game there are a lot of other factors to consider, like your kicker might be tired or there might be too much wind or something. In addition to everything else that’s wrong about this answer, can I just note the obvious point that both of those things would seem to favor going for it instead of kicking?
- Somebody gave us this brilliant introduction to game theory: If teams started going for it on 4th down, defenses might adjust to try to stop them from converting on 4th down! While this might be relevant for fake punts and fake field goals, I’m pretty sure that in most cases defenses do that already (relatedly, moratorium on “Schrödinger’s Cat” metaphors, please).
Anyway, I didn’t stay to see if it got any better.
Sports Gambling: The Source of Sports Innovation?
Featuring Jeff Ma (of Bringing Down the House fame), Michael Konik, and others. Moderated by Chad Millman (ESPN).
I only caught a few minutes of this one. As a former professional gambler, I suspected that I would find this discussion tiresome, and I was right.
- Jeff Ma claimed that the main problem with sports gambling as a profitable enterprise is the psychological strain of potentially losing for “weeks” at a time (hmm… as opposed to, say, that other game that people play for money called the stock market?).
- Everyone on the panel seemed to be in agreement that the sports-betting markets are heading for near-perfect efficiency, which I guess is probably true. But I’m not sure it follows, as everyone up there seemed to think it does, that combined with the rake, this will necessarily make sports-betting an unprofitable endeavor. This is pure (and fresh) speculation, but couldn’t the rake also provide a buffer against market perfection? E.g., say the line based on casual money would be 7% off for a particular game, and the rake at the book is effectively 5%. We would expect the “smart” money to pour in immediately, but then stop once the line drops to within that 5%, where correcting the line is no longer profitable. This is functionally very similar to the rake being 0%, where the money would keep pouring in until the line was perfect. Either way, you only have to be able to beat the smart money by a small percentage in order for it to be profitable. The rake makes profitable opportunities rarer (as they will only exist when the casual money would be off by more than it), but if the smart money is rational, it shouldn’t make those opportunities any less exploitable. In fact, the relative scarcity could decrease the amount of smart money in the market overall, actually making it less efficient w/r/t individual bets (I’m not saying this scenario is accurate, just that it’s possible).
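To make the rake-as-buffer scenario concrete, here is a toy sketch of it. This is my own simplification, not anything presented on the panel: the function name and the fixed step size are hypothetical, everything is in basis points (700 = 7%) to avoid rounding noise, and "smart money" is assumed to keep betting only while the line's error is bigger than the rake.

```python
def settle_line(casual_error_bp, rake_bp, step_bp=10):
    """Return (remaining line error, number of smart-money bets), in basis points.

    Toy assumption: each smart-money bet moves the line step_bp closer to
    "correct", and smart money stops once the error no longer exceeds the rake.
    """
    error = casual_error_bp
    bets = 0
    while error > rake_bp:
        # Never correct past the point where the bet stops being profitable.
        error -= min(step_bp, error - rake_bp)
        bets += 1
    return error, bets


if __name__ == "__main__":
    print(settle_line(700, 500))  # line 7% off, 5% rake
    print(settle_line(700, 0))    # same line, no rake
    print(settle_line(300, 500))  # error already inside the rake
```

In this toy version, the 7%-off line with a 5% rake settles with a 5% error still baked in, while the no-rake line gets corrected all the way to zero. A line that starts only 3% off never attracts any smart money at all.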
Anyway, as had been the case all day, the paper presentations were getting all the buzz, so I went to another of those:
How Much Trouble is Early Foul Trouble?
By Phillip Z. Maymin, Allan Maymin, and Eugene Shen.
In this one, the authors really did take a counter-intuitive stand (at least relative to the intuitions of the analytical community), by claiming that the default NBA strategy of pulling starters with Quarter+1 fouls is actually a good thing. The main reasoning is this: while no one seems to disagree that pulling starters in foul trouble reduces the number of minutes they are able to contribute, there is actually strong empirical evidence that starters in foul trouble play much worse, making fewer minutes at full strength more valuable.
With apologies to anyone who was in the room, I very inarticulately tried to ask the author whether this didn’t in fact suggest that the problem was with players reacting to the possibility of ejection sub-optimally, rather than with the strategy of keeping them in itself. The author noted that playing worse to avoid ejection may itself be optimal, and we went around in circles a bit from there.
After wasting everyone’s time, I did get a chance to talk with one of the authors at length about this afterward, and we made much more headway. Upon reflection, I’m increasingly convinced that my point was correct. The issue breaks down like this:
- No one disputes that pulling your starters hurts your team by failing to maximize their number of minutes played. So for our purposes, it’s fair to assume that’s true.
- The main justification given for the default player-pulling strategy is that late-game minutes are more important than early-game minutes. While undoubtedly true to some (small) extent, the conventional wisdom almost certainly overstates this effect. In any case, the authors don’t appear to rely on this difference in forming their conclusions (obv I could be mistaken), so again, for purposes of analyzing their results, it’s fair to assume that all minutes are more or less equal.
- Now, for logical purposes, assume that you could command your players to play exactly the same as usual regardless of their foul situation.
- Consider the strategy of maximizing your best player’s minutes combined with the commandment to play the same regardless of number of fouls.
- Value-added per minute would be the same as normal, yet their average minutes would increase. Thus, this strategy weakly dominates the pulling strategy.
At this point, the author objected that players playing worse than normal to avoid being ejected may actually be playing closer to optimal than if they ignored their foul situation—which is to say, seemingly playing worse may actually be better for their team to some degree.
But this objection misses its mark: if the strategy of not pulling/instructing is better than the strategy of pulling, any further adjustments that the players make toward optimality should make this strategy even better.
Thus, working within the authors’ framework, and combined with the authors’ findings, I think it follows that players are probably making suboptimal adjustments to foul trouble (whether this is a result of bad instructions or inability to follow good instructions doesn’t matter).
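Here is the skeleton of that argument as a small sketch. All of the numbers are made up for illustration (none come from the paper), and it leans on the same simplifying assumptions as the list above: all minutes count equally, and a player's contribution is just value per minute times minutes played.

```python
def total_value(value_per_minute, minutes):
    """Contribution under the all-minutes-are-equal assumption."""
    return value_per_minute * minutes


def compare_strategies(normal_rate, adjusted_rate, pulled_minutes, kept_minutes):
    """Compare three hypothetical ways of handling a starter in foul trouble.

    pull:          fewer minutes, all at the normal per-minute rate
    keep_same:     more minutes, player commanded to play the same as usual
    keep_adjusted: more minutes, player changes his play because of the fouls
    """
    pull = total_value(normal_rate, pulled_minutes)
    keep_same = total_value(normal_rate, kept_minutes)
    keep_adjusted = total_value(adjusted_rate, kept_minutes)
    return {
        "pull": pull,
        "keep_same": keep_same,
        "keep_adjusted": keep_adjusted,
        # Same rate, at least as many minutes: keep_same can't be worse.
        "keep_same_dominates_pull": keep_same >= pull,
        # If adjusting leaves the team worse off than playing the same,
        # the adjustment itself is the problem.
        "adjustment_is_suboptimal": keep_adjusted < keep_same,
    }


if __name__ == "__main__":
    # Made-up numbers: 10 units/minute normally, 8 when adjusting to fouls,
    # 30 minutes if pulled versus 36 if left in.
    print(compare_strategies(10, 8, 30, 36))
```

With these made-up numbers, the adjusting player is worth less than the pulled player (matching the direction of the authors' finding), yet the play-the-same player beats both, so the shortfall has to come from the adjustment rather than from leaving him in.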
Featuring Mark Cuban, John Hollinger, Kevin Pritchard, and Mike Zarren. Moderated by Marc Stein.
This was the “main event” of the day, playing to a completely packed main hall. All-in-all, it was thoroughly entertaining and interesting, if not especially provocative or informative. As someone tweeted, Mark Cuban appears to be genetically engineered for panel discussions, and he was wearing a T-shirt (I believe the only one in the room) reading “talk nerdy to me.”
Here are some interesting things that came up:
- Cuban repeatedly said that the main problem with analytics is coaching: Every coach thinks they can coach a guy up, but you don’t know how a player will actually respond, etc. This seems like just another variable to me, but interesting to think about.
- Lots of interesting stuff on trades between Cuban, Pritchard, and Zarren. Cuban thinks all that action right before the deadline is arbitrary, but I think that’s incorrect as a game-theoretical matter: the deadline adds credibility to walk-away threats. Zarren seemed charmingly shaken up by the Perk trade. Very interesting to me: Cuban & Pritchard both said that teams deal with each other differently based on their reputations as analytics-oriented or not. At first I thought they meant that the more analytical teams were shunned by the more old-fashioned, but then it also sounded like maybe they were saying more analytical teams are wary of each other.
- 2/3 of teams have analytics operations of some sort. I haven’t decided yet whether that sounds low or high.
- There was a modest discussion of the Heat losing close games and how much of a sample you need to start taking such trends seriously. I believe the answer is probably smaller than most analytical types think: As I’ve discussed before, it can be shown that winning is a skill beyond a random walk of points scored and allowed. I assume that a lot of this comes from superior execution in close games (although this could be looked at in more detail). Bayes’ Theorem does the rest.
- One of the audience questions was about why NBA players are such bad free throw shooters. Pritchard gave a series of lengthy answers having to do with pressure, but I think those miss the main point. The more you are required to do aside from free throws, the more variables are going to go into your selection to play in the NBA; thus, the more likely it is that your strengths will lie in other areas. NBA players are the most talented free throw shooters in the world for people with their skillsets. This is almost true by definition for the present, but I suspect it is probably true historically as well. If we asked how good of a free throw shooter every player in the league is relative to the history of players who do everything else as well as they do, I suspect that modern players are the best free throw shooters ever. E.g., Dwight Howard may seem like a clanker, but he may be the best free throw shooter among people who play center as well as he does in NBA history.
Guts vs. Data: How Do Coaches Make Decisions?
Featuring Del Harris, Mike Leach, Eric Mangini, and Steve Pagliuca. Moderated by Howard Beck (New York Times).
I thought this panel was awful. Del Harris rambled on and on about stuff that apparently a lot of people found entertaining—maybe I was just burnt out. He did apparently use the term “statted out” repeatedly, which I love. As in: “That player statted out as a second round draft pick, but we thought he was better than that” (not an actual example).
Mangini again offered us a taste of his extensive wisdom, such as: when deciding whether to go for it on 4th down, teams should take distance and field position into account.
I can’t remember anything Leach said, but it wasn’t much better.
Overall, it seemed like a bunch of the old guard lamenting rather than celebrating the onset of sports analytics. They literally all guffawed over the idea that things were so much easier back before they had all these stats to confuse them.
Featuring Mark Cuban, Mike Carey (actual NFL ref), Jon Wertheim (Sports Illustrated), and Phil Birnbaum. Moderated by Bill Simmons.
This panel also featured Cuban, who was on his best behavior despite Bill Simmons’s repeated attempts to get him fined. Cuban offered many knowing smiles before declining to answer pointed questions, much to the audience’s amusement. A couple of notes:
- Bill Simmons was a great moderator. He introduced a number of provocative topics and asked relevant, very challenging questions and follow-ups. He was such a great moderator, in fact, that he basically demonstrated the major flaw with all of the other panels that I saw: lack of confrontation! E.g., Simmons literally tried to start a labor war between the NFL and its refs, but Carey wouldn’t bite. He also asked Carey what his biggest mistake was as a ref, and Carey responded “probably agreeing to do this panel.”
- Shortly after Simmons cracked one of his many (nearly identical) jokes about how few women were in attendance, a woman asked a great question that I thought was very interesting and didn’t get an adequate response from the panel (paraphrasing): If the regular season is all about jockeying for home court advantage in the playoffs, and the main source of home court advantage is officiating bias, are we really sure that we want unbiased officials? Obv this seemed to be a natural offshoot of the earlier paper about where HFA comes from, but it was met mostly with platitudes about fairness. I’m not so sure: I mean, even more broadly, is home field advantage really something we’re willing to sacrifice? Not just in the playoffs, but from game to game: paying fans love it when home teams win! Of course, you can’t codify unfairness, so perhaps the optimal solution is actually the status quo: tolerate biased refs but act like you don’t.
Overall, a great day for fans of sports analysis.