MIT Sloan Sports Analytics Conference, Day 1: Recap and Thoughts

This was my first time attending this conference, and Day 1 was an amazing experience. At this point last year, I literally didn’t know that there was a term (“sports analytics”) for the stuff I liked to do in my spare time. Now I learn that there is not only an entire industry built up around the practice, but a whole army of nerds in its society. Naturally, I have tons of criticisms of various things that I saw and heard—that’s what I do—but I loved it, even the parts I hated.

Here are the panels and presentations that I attended, along with some of my thoughts:

Birth to Stardom: Developing the Modern Athlete in 10,000 Hours?

Featuring Malcolm Gladwell (Author of Outliers), Jeff Van Gundy (ESPN), and others I didn’t recognize.

In this talk, Gladwell rehashed his absurdly popular maxim about how it takes 10,000 hours to master anything, and then made a bunch of absurd claims about talent. (Players with talent are at a disadvantage! Nobody wants to hire Supreme Court clerks! Etc.) The most re-tweeted item to come out of Day 1 by far was his highly speculative assertion that “a lot of what we call talent is the desire to practice.”

While this makes for a great motivational poster, IMO his argument in this area is tautological at best, and highly deceptive at worst. Some people have the gift of extreme talent, and some people have the gift of incredible work ethic. The streets of the earth are littered with the corpses of people who had one and not the other. Unsurprisingly, the most successful people tend to have both. To illustrate, here’s a random sample of 10,000 “people” with independent normally distributed work ethic and talent (each with a mean of 0, standard deviation of 1):

The blue dots (left axis) are simply Hard Work plotted against Talent. The red dots (right axis) are Hard Work plotted against the sum of Hard Work and Talent—call it “total awesome factor” or “success” or whatever. Now let’s try a little Bayes’ Theorem intuition check: You randomly select a person and they have an awesome factor of +5. What are the odds that they have a work ethic of better than 2 standard deviations above the mean? High? Does this prove that all of the successful people are just hard workers in disguise?

Hint: No. And this illustration is conservative. This sample is only 10,000 strong; increase it to 10 billion, and the biggest outliers will even more uniformly be extremely hard workers (and they will all be extremely talented as well). Moreover, this “model” for greatness is just a sum of the two variables, when in reality it is probably closer to a product, which would lead to even greater disparities. E.g.: I imagine total greatness achieved might be something like great stuff produced per minute worked (a function of talent) times total minutes worked (a function of willpower, determination, fortitude, blah blah, etc.).
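For anyone who wants to poke at the intuition check, here is a minimal Python sketch of the same toy model. It assumes exactly what the illustration assumes (independent standard normal work ethic and talent, success equal to their sum); the function names, the seed, the sample size, and the 4.5 to 5.5 band are my own hypothetical choices, not anything from Gladwell or the talk. Given the sum, each trait is itself normal with mean half the sum and variance 1/2, so the question can be answered exactly and then sanity-checked by simulation.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_trait_above(threshold, awesome):
    """P(trait > threshold | work + talent == awesome).

    For two independent standard normal traits, either trait given the
    sum is normal with mean awesome / 2 and variance 1 / 2.
    """
    mean = awesome / 2.0
    sd = math.sqrt(0.5)
    return 1.0 - normal_cdf((threshold - mean) / sd)


def simulate(n, lo, hi, threshold=2.0, seed=0):
    """Sample n people and look only at those with lo <= awesome <= hi.

    Returns (number selected, share with work > threshold,
    share with talent > threshold); shares are None if nobody qualifies.
    """
    rng = random.Random(seed)
    selected = hard_workers = talented = 0
    for _ in range(n):
        work = rng.gauss(0.0, 1.0)
        talent = rng.gauss(0.0, 1.0)
        if lo <= work + talent <= hi:
            selected += 1
            hard_workers += work > threshold
            talented += talent > threshold
    if selected == 0:
        return 0, None, None
    return selected, hard_workers / selected, talented / selected


if __name__ == "__main__":
    # Exact answer for an awesome factor of +5 (about 76%).
    print(prob_trait_above(2.0, 5.0))
    # Monte Carlo check on people whose awesome factor is near +5.
    print(simulate(200_000, 4.5, 5.5))
```

Under these assumptions, a person with an awesome factor of +5 has roughly a 76% chance of a work ethic above +2, and by symmetry exactly the same chance of talent above +2. The "high" answer for work ethic therefore says nothing about talent being unimportant.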

The general problem with Gladwell I think is that his emphatic de-emphasis of talent (which has no evidence backing it up) cheapens his much stronger underlying observation that for any individual to fully maximize their potential takes the accumulation of a massive amount of hard work—and this is true for people regardless of what their full potential may be. Of course, this could just be a shrewd marketing ploy on his part: you probably sell more books by selling the hope of greatness rather than the hope of being an upper-level mid-manager (especially since you don’t have to worry about that hope going unfulfilled for at least 10 years).

Performance and Injury Analytics

Featuring John Brenkus (ESPN Sports Science guy), Will Carroll (Sports Illustrated), and others. Moderated by Peter Keating (ESPN).

From what I stayed for, this panel was pretty disappointing. There are two burning “analytical” questions about injuries that I would love to know the answers to:

Is there really such a thing as “injury-prone” (i.e., how strongly do injuries correlate with more injuries)? And,
How much do injuries affect a player’s future performance?

But rather than addressing these issues, the panel seemed to focus more on concussions and how bad they are, and whether pre-season games are worth it, etc. (including an incredible claim by one panelist that 70% of all NFL injuries occur within the first two weeks of minicamp, which I am not even close to believing without seeing the data for myself).

The Real Reasons Behind the Home Field Advantage

By Tobias J. Moskowitz.

This talk got a lot of buzz all day, but by the time I showed up it was almost over and observers were 3-deep out the door. The Cliffs Notes version: Biased refs. Unfortunately I missed most of the details, but apparently it relied heavily on European soccer data. I will definitely go back into my materials to look this one over, but it seems like a plausible and obviously very meaningful hypothesis.

What Optical Tracking Data Says About NBA Field Goal Shooting

By Sandy Weil.

This talk was an extremely rich look at a number of different things that can be examined with the most cutting-edge data out there. I think it may have literally been an infomercial for STATS Inc. (please correct me if I’m wrong), but it was very impressive regardless. The biggest disappointment to me is that their new system tracks approximately 1 million data points *per game* (yes, that’s double-bold), yet STILL doesn’t tell us how much time was left on the shot clock when each shot was taken. Argh!

Highlights included:

Number of defenders in the vicinity is much more important than proximity of defender(s).
Shots immediately following a pass are more successful, even when controlling for distance and defensive positioning.
The presenter did a long, detailed hypothesis-test about whether there was arbitrage opportunity for players to “step back” and take worse shots against softer defenses, but concluded that there wasn’t.
Maybe the most interesting part to me was the ultra-nerdy discussion of how they cleaned the new data-set by matching up inconsistencies with the old data-set.

Overall, the guy admitted that he hadn’t really made many ostensibly exciting or counter-intuitive discoveries, but I think that’s probably a better result on balance. While it may not be as sexy, I think a system that confirms a bunch of prior beliefs is more likely to be right when it finally finds something wrong than one that finds fault everywhere it looks.

Football Analytics

Featuring Eric Mangini, Aaron Schatz, and others. Moderated by Gary Belsky (ESPN the Magazine).

I was warned that this panel is usually a snore, and it was. Aaron seemed like the only one who knew what he was talking about, and even he was being modest and charitable. Some of the low-lights include:

A lengthy discussion of “subjectivity” in football statistics, as if this were a meaningful problem. Um, data doesn’t have to be perfect to be useful. This is a clear case of “taboo”: “subjective” is a taboo word that people think is supposed to be bad w/r/t data, and thus they reflexively react negatively whenever they see “subjective” and “data” in the same paragraph. Then they say the data is “flawed” and go back to making decisions the old way—which is to say, based on their subjective intuitions.
When asked about why NFL teams don’t go for it on 4th downs, Eric Mangini rambled on about how it may look like you should go for it based on league averages, but in the circumstances of a particular game there are a lot of other factors to consider, like your kicker might be tired or there might be too much wind or something. In addition to everything else that’s wrong about this answer, can I just note the obvious point that both of those things would seem to favor going for it instead of kicking?
Somebody gave us this brilliant introduction to game theory: If teams started going for it on 4th down, defenses might adjust to try to stop them from converting on 4th down! While this might be relevant for fake punts and fake field goals, I’m pretty sure that in most cases defenses do that already (relatedly, moratorium on “Schrodinger’s Cat” metaphors, please).

Anyway, I didn’t stay to see if it got any better.

Sports Gambling: The Source of Sports Innovation?

Featuring Jeff Ma (of Bringing Down the House fame), Michael Konik, and others. Moderated by Chad Millman (ESPN).

I only caught a few minutes of this one. As a former professional gambler, I suspected that I would find this discussion tiresome, and I was right.

Jeff Ma claimed that the main problem with sports gambling as a profitable enterprise is the psychological strain of potentially losing for “weeks” at a time (hmm… as opposed to, say, that other game that people play for money called the stock market)?
Everyone on the panel seemed to be in agreement that the sports-betting markets are heading for near-perfect efficiency, which I guess is probably true. But I’m not sure it follows, as everyone up there seemed to think it does, that combined with the rake, this will necessarily make sports-betting an unprofitable endeavor. This is pure (and fresh) speculation, but couldn’t the rake also provide a buffer against market perfection? E.g., say the line based on casual money would be 7% off for a particular game, and the rake at the book is effectively 5%. We would expect the “smart” money to pour in immediately, but then stop once the line drops to within that 5%, where correcting the line is no longer profitable. This is functionally very similar to the rake being 0%, where the money would keep pouring in until the line was perfect. Either way, you only have to be able to beat the smart money by a small percentage in order for it to be profitable. The rake makes profitable opportunities rarer (as they will only exist when the casual money would be off by more than the rake), but if the smart money is rational, it shouldn’t make those opportunities any less exploitable. In fact, the relative scarcity could decrease the amount of smart money in the market overall, actually making it less efficient w/r/t individual bets (I’m not saying this scenario is accurate, just that it’s possible).
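The rake-as-buffer speculation can be written down as a tiny toy model. This is only a sketch under loud assumptions: line error and rake are both expressed as simple fractions, smart money is perfectly rational, and it keeps betting exactly until the remaining error is no bigger than the rake. The function names are hypothetical.

```python
def settled_line_error(casual_error, rake):
    """Line error left once smart money stops betting.

    Toy assumption: smart money bets until the remaining error is no
    larger than the rake, and never pushes the line past the truth.
    """
    return min(casual_error, rake)


def opportunity_exists(casual_error, rake):
    """Smart money has a profitable bet only if the error exceeds the rake."""
    return casual_error > rake


if __name__ == "__main__":
    # The example from the text: casual money is 7% off, the rake is 5%.
    print(settled_line_error(0.07, 0.05))  # line stays 5% off
    print(settled_line_error(0.07, 0.00))  # with no rake the line ends up perfect
    print(opportunity_exists(0.03, 0.05))  # errors inside the rake go uncorrected
```

In this toy version the rake sets a floor under how accurate the line ever needs to get, which is the sense in which it could act as a buffer against a perfectly efficient market.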

Anyway, as had been the case all day, the paper presentations were getting all the buzz, so I went to another of those:

How Much Trouble is Early Foul Trouble?

By Phillip Z. Maymin, Allan Maymin, and Eugene Shen.

In this one, the authors really did take a counter-intuitive stand (at least relative to the intuitions of the analytical community), by claiming that the default NBA strategy of pulling starters with Quarter+1 fouls is actually a good thing. The main reasoning is this: while no one seems to disagree that pulling starters in foul trouble reduces the number of minutes they are able to contribute, there is actually strong empirical evidence that starters in foul trouble play much worse, making fewer minutes at full strength more valuable.

With apologies to anyone who was in the room, I very inarticulately tried to ask the author whether this didn’t in fact suggest that the problem was with players reacting to the possibility of ejection sub-optimally, rather than with the strategy of keeping them in itself. The author noted that playing worse to avoid ejection may itself be optimal, and we went around in circles a bit from there.

After wasting everyone’s time, I did get a chance to talk with one of the authors at length about this afterward, and we made much more headway. Upon reflection, I’m increasingly convinced that my point was correct. The issue breaks down like this:

No one disputes that pulling your starters hurts your team by failing to maximize their number of minutes played. So for our purposes, it’s fair to assume that’s true.
The main justification given for the default player-pulling strategy is that late-game minutes are more important than early-game minutes. While undoubtedly true to some (small) extent, the conventional wisdom almost certainly overstates this effect. In any case, the authors don’t appear to rely on this difference in forming their conclusions (obv I could be mistaken), so again, for purposes of analyzing their results, it’s fair to assume that all minutes are more or less equal.
Now, for logical purposes, assume that you could command your players to play exactly the same as usual regardless of their foul situation.
Consider the strategy of maximizing your best player’s minutes combined with the commandment to play the same regardless of number of fouls.
Value-added per minute would be the same as normal, yet their average minutes would increase. Thus, this strategy weakly dominates the pulling strategy.

At this point, the author objected that players playing worse than normal to avoid being ejected may actually be playing closer to optimal than if they ignored their foul situation—which is to say, seemingly playing worse may actually be better for their team to some degree.

But this objection misses its mark: if the strategy of not pulling/instructing is better than the strategy of pulling, any further adjustments that the players make toward optimality should make this strategy even better.

Thus, working within the authors’ framework, and combined with the authors’ findings, I think it follows that players are probably making suboptimal adjustments to foul trouble (whether this is a result of bad instructions or inability to follow good instructions doesn’t matter).
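The chain of reasoning above can be laid out with made-up numbers. Everything in this sketch is hypothetical (the per-minute values, the minute totals, and the names); it is not the authors' data or model. It just treats a strategy's worth as value per minute times expected minutes, with all minutes counted equally, as assumed above.

```python
def expected_value(value_per_minute, expected_minutes):
    """Total contribution, treating every minute as equally important."""
    return value_per_minute * expected_minutes


def best_option(options):
    """Name of the (value_per_minute, expected_minutes) pair worth the most."""
    return max(options, key=lambda name: expected_value(*options[name]))


# Hypothetical numbers, chosen only to mirror the structure of the argument.
OPTIONS = {
    # Pull the starter: normal play, fewer minutes.
    "pull": (1.00, 30.0),
    # Keep him in and (by assumption) command him to play the same as usual.
    "keep_play_same": (1.00, 33.0),
    # Keep him in and let him adjust the way players seem to in practice:
    # noticeably worse per minute, per the authors' empirical finding.
    "keep_observed_adjustment": (0.85, 34.0),
}

if __name__ == "__main__":
    for name, pair in OPTIONS.items():
        print(name, expected_value(*pair))
    print("best:", best_option(OPTIONS))
```

An optimal adjustment is free to pick "play the same," so it can never be worth less than that option, which in turn is worth at least as much as pulling. If the observed adjustment comes out below pulling, as the authors' findings suggest, then it cannot be the optimal one.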

Basketball Analytics

Featuring Mark Cuban, John Hollinger, Kevin Pritchard, and Mike Zarren. Moderated by Marc Stein.

This was the “main event” of the day, playing to a completely packed main hall. All-in-all, it was thoroughly entertaining and interesting, if not especially provocative or informative. As someone tweeted, Mark Cuban appears to be genetically engineered for panel discussions, and he was wearing a T-shirt (I believe the only one in the room) reading “talk nerdy to me.”

Here are some interesting things that came up:

Cuban repeatedly said that the main problem with analytics is coaching: Every coach thinks they can coach a guy up, but you don’t know how a player will actually respond, etc. This seems like just another variable to me, but interesting to think about.
Lots of interesting stuff on trades between Cuban, Pritchard, and Zarren. Cuban thinks all that action right before the deadline is arbitrary, but I think that’s incorrect as a game-theoretical matter: the deadline adds credibility to walk-away threats. Zarren seemed charmingly shaken-up by the Perk trade. Very interesting to me: Cuban & Pritchard both said that teams deal with each other differently based on their reputations as analytics-oriented or not. At first I thought they meant that the more analytical teams were shunned by the more old-fashioned, but then it also sounded like maybe they were saying more analytical teams are wary of each other.
2/3 of teams have analytics operations of some sort. I haven’t decided yet whether that sounds low or high.
There was a modest discussion of the Heat losing close games and how much of a sample you need to start taking such trends seriously. I believe the answer is probably smaller than most analytical types think: As I’ve discussed before, it can be shown that winning is a skill beyond a random walk of points scored and allowed. I assume that a lot of this comes from superior execution in close games (although this could be looked at in more detail). Bayes’ Theorem does the rest.
One of the audience questions was about why NBA players are such bad free throw shooters. Pritchard gave a series of lengthy answers having to do with pressure, but I think those miss the main point. The more you are required to do aside from free throws, the more variables are going to go into your selection to play in the NBA; thus, the more likely it is that your strengths will lie in other areas. NBA players are the most talented free throw shooters in the world for people with their skillsets. This is almost true by definition for the present, but I suspect it is probably true historically as well. If we asked how good of a free throw shooter every player in the league is relative to the history of players who do everything else as well as they do, I suspect that modern players are the best free throw shooters ever. E.g., Dwight Howard may seem like a clanker, but he may be the best free throw shooter among people who play center as well as he does in NBA history.
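On the close-games point above ("Bayes’ Theorem does the rest"), here is a rough sketch of how the prior drives the sample size you need. All numbers are hypothetical, not the Heat's actual record: it compares just two made-up hypotheses, a "poor closer" team that wins 40% of close games and an "average" team that wins 50%, and updates on a run of close-game results.

```python
def posterior_poor_closer(wins, games, prior_poor, p_poor=0.4, p_average=0.5):
    """Posterior probability of the 'poor closer' hypothesis via Bayes' Theorem.

    Uses binomial likelihoods; the binomial coefficient is the same under
    both hypotheses, so it cancels and is left out.
    """
    losses = games - wins
    like_poor = p_poor ** wins * (1.0 - p_poor) ** losses
    like_average = p_average ** wins * (1.0 - p_average) ** losses
    numerator = prior_poor * like_poor
    return numerator / (numerator + (1.0 - prior_poor) * like_average)


if __name__ == "__main__":
    # Hypothetical record: 2 wins in 10 close games.
    # If you already believe close-game performance is a real skill,
    # you start with a healthier prior that a team might be bad at it.
    print(posterior_poor_closer(2, 10, prior_poor=0.30))
    # If you think close games are basically coin flips, you start low.
    print(posterior_poor_closer(2, 10, prior_poor=0.05))
```

With the same ten games, the first prior ends up above 50% while the second stays far below it. That is the sense in which believing winning is a skill shrinks the sample needed before a trend is worth taking seriously.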

Guts vs. Data: How Do Coaches Make Decisions?

Featuring Del Harris, Mike Leach, Eric Mangini, and Steve Pagliuca. Moderated by Howard Beck (New York Times).

I thought this panel was awful. Del Harris rambled on and on about stuff that apparently a lot of people found entertaining—maybe I was just burnt out. He did apparently use the term “statted out” repeatedly, which I love. As in: “That player statted out as a second round draft pick, but we thought he was better than that” (not an actual example).

Mangini again offered us a taste of his extensive wisdom, such as: when deciding whether to go for it on 4th down, teams should take distance and field position into account.

I can’t remember anything Leach said, but it wasn’t much better.

Overall, it seemed like a bunch of the old guard lamenting rather than celebrating the onset of sports analytics. They literally all guffawed over the idea that things were so much easier back before they had all these stats to confuse them.

Referee Analytics

Featuring Mark Cuban, Mike Carey (actual NFL ref), Jon Wertheim (Sports Illustrated), and Phil Birnbaum. Moderated by Bill Simmons.

This panel also featured Cuban, who was on his best behavior despite Bill Simmons’s repeated attempts to get him fined. Cuban offered many knowing smiles before declining to answer pointed questions, much to the audience’s amusement. A couple of notes:

Bill Simmons was a great moderator. He introduced a number of provocative topics and asked relevant, very challenging, questions and follow-ups. He was such a great moderator, in fact, that he basically demonstrated the major flaw with all of the other panels that I saw: lack of confrontation! E.g., Simmons literally tried to start a labor war between the NFL and its refs, but Carey wouldn’t bite. He also asked Carey what his biggest mistake was as a ref, and Carey responded “probably agreeing to do this panel.”
Shortly after Simmons cracked one of his many (nearly identical) jokes about how few women were in attendance, a woman asked a great question that I thought was very interesting and didn’t get an adequate response from the panel (paraphrasing): If the regular season is all about jockeying for home court advantage in the playoffs, and the main source of home court advantage is officiating bias, are we really sure that we want unbiased officials? Obv this seemed to be a natural offshoot of the earlier paper about where HFA comes from, but it was met mostly with platitudes about fairness. I’m not so sure: I mean, even more broadly, is home field advantage really something we’re willing to sacrifice? Not just in the playoffs, but from game to game: paying fans love it when home teams win! Of course, you can’t codify unfairness, so perhaps the optimal solution is actually the status quo: tolerate biased refs but act like you don’t.

Overall, a great day for fans of sports analysis.

7 Responses to “MIT Sloan Sports Analytics Conference, Day 1: Recap and Thoughts”

Sander says:

March 5, 2011 at 11:01 am

Malcolm Gladwell’s argument reminds me of the argument that to be a millionaire all you have to do is take risks and believe in yourself/your idea, because that’s what all millionaires do (read a random biography of a rich person and you’ll see this argument). Problem is: that ignores all the people who do that and fail.

I find the sports betting argument interesting because it dovetails closely with what I did for a living for a couple years: poker. And yes, the biggest problem with playing poker and gambling is dealing with an extended losing streak, and yes that also holds true for the stock market (there are hundreds of books on how to handle stock market swings mentally out there). I’ve seen a lot of seemingly good pro poker players play well and make a profit for months on end, only to completely blow up once they hit a rough spot and throw away all their money in a matter of two days. Other times it’s more subtle, like a refusal to play for lower stakes because of ego concerns. Gambling, playing poker and gaming the stock market all aren’t that hard in and of themselves and you can teach everyone to do that, but it’s a lot harder for people to deal with it mentally.
Interestingly in the early years of online casinos you could see this quite clearly. Those casinos used to hand out certain sign-up bonuses, which would make playing on their site profitable. A whole sub-culture of gaming these bonuses arose, and a lot of people made some money off that. The casinos knew this but didn’t care, because a lot of the people who were trying to game these sites for a profit couldn’t do it correctly and blew up their bankroll – because they couldn’t handle the mentality needed to do this sort of thing.

In any case, I don’t know if it’s that easy to say that sports gambling will turn into a perfect market precisely for the reason you illustrate: there will always be dumb money, and as such there will always be money to be made for some people. It’ll get harder, but I don’t see it becoming a perfect market, well, ever.
The same thing has been happening for years in poker. The games are getting harder and harder, but there are always people who will throw away their money. There will always be people who profit from playing poker; that group will simply get smaller.

As for home field advantage, I don’t think it matters if we get rid of it or not (which I don’t think is doable if the source is biased refs, but that’s beside the point). Whether home field is important for teams or not in terms of winning percentage, it will always be important for teams in terms of gate receipts, and hence teams will place a premium on getting home field advantage in the playoffs.
Furthermore, for public perception I don’t think it matters whether home field advantage is real or not. People will continue to believe it is regardless of the reality of the situation.

Tony C. says:

March 9, 2011 at 7:53 pm

Thanks for your interesting and lively review. With regards to this:

“NBA players are the most talented free throw shooters in the world for people with their skillsets. This is almost true by definition for the present, but I suspect it is probably true historically as well. If we asked how good of a free throw shooter every player in the league is relative to the history of players who do everything else as well as they do, I suspect that modern players are the best free throw shooters ever. E.g., Dwight Howard may seem like a clanker, but he may be the best free throw shooter among people who play center as well as he does in NBA history.”

I’d say the following:

Jabbar and Olajuwon were both clearly better all-around centers than Howard, and were far superior free-throw shooters. Pedantic, perhaps, but still…

I’m also not completely clear on your use of skillsets in this context. I say that because, for example, WNBA players, while obviously nowhere near comparable physically, develop similar skillsets and are (as a group) far superior free-throw shooters relative to their male counterparts. So, even though it is apples and oranges in terms of their level of play, how do you explain the striking disparity between male and female professional basketball players when it comes to free-throws?

Lebron James, a player whose “skillset” is obviously off the charts, is an inferior free-throw shooter relative to most earlier, dominant non-centers who were good outside shooters.

Etc.

Finally, I’ve long been struck by the fact that free-throw shooting might be the most important skill in any sport that many highly paid professional athletes choose not to develop to their full potential for aesthetic reasons. I have no doubt whatsoever that an extremely high percentage of the poor free-throw shooters in the NBA (especially big men) would see their percentages rise to a meaningful degree if they were to switch to an underhand style. However, as Rick Barry discovered when he attempted to convince some players years ago to switch, the embarrassment of “shooting like a girl” prevents virtually all players from making a change that could potentially help their respective teams enormously.

As a related aside, I don’t think that there is any doubt that physics plays a role in that it is an important advantage to be able to initiate a shot from well below the rim. In other words, really big men are at a distinct disadvantage given the lack of arc on their shots. This is also a likely contributing factor to the superiority of WNBA players.

- Sam says:
  
  April 26, 2012 at 1:40 pm
  
  I know this is a super late reply to this post, but a women’s basketball is 28.5″ in circumference compared to 29.5″ for a men’s ball. I would assume this accounts for much of the FT% advantage that women have.
  
Was sabermetrics, now analytics. « Code and Football says:

April 13, 2011 at 7:50 am

[…] this kind of analysis is going in-house (a group of speakers (including Mark Cuban) are quoted here as saying that perhaps 2/3 of all basketball teams now have a team of analysts), it’s being […]

Alex T says:

March 30, 2012 at 9:51 am

As a student of sport psychology, I can say from reading studies on “talent” that the maxim postulated above is not actually Gladwell’s. Gladwell merely rehashes actual studies and literature on talent, and they are in fact out there. The research is conclusive that a lot of practice, but also specific practice, is key, among some other requirements like social support and sport-specific financial requirements. It sounds simpler than it is, but it’s not entirely off base either.

Great plays + bad decision = season over – Steve Prestegard.com: The Presteblog says:

January 22, 2016 at 7:01 am

[…] situation-specific variables affect the balance of probabilities. And the variables cited often don’t even cut the way they think they do. For example: In this case, an oft-cited factor is that the Packers’ receiving corps was […]

NFL Coaches Are Getting Away With Crimes Against Middle-School Math – FiveThirtyEightAll Breaking News | All Breaking News says:

January 24, 2016 at 9:28 am

[…] situation-specific variables affect the balance of probabilities. And the variables cited often don’t even cut the way they think they do. For example: In this case, an oft-cited factor is that the Packers’ receiving corps was […]
