I did some freelance work for ESPN The Magazine. You can see it here; I am responsible for the little charts at the bottom of each preview. ESPN’s design department is actually responsible for the look. I just did the analysis and provided them with something more akin to my shot charts, which they then converted, so kudos to them on the nifty design.
This was a fun project, but there were a lot of little pitfalls, and while most fans won’t care that much, I feel like I have to do a post-mortem for the analytics community to explain some of the details.
What the hell are these things?
The ESPN Mag obviously isn’t going to explain all the methodological stuff that went into this, so let me give a quick overview. They asked me to predict each team’s defense (so an opposing shot chart for each team, basically) based on the team’s projected starting 5. THIS IS SO IMPORTANT. These are not predictions for the team per se. They are predictions for when all five of a team’s starters are on the court. That’s going to have a big impact on certain teams.
I won’t repeat anything from those two posts. The gist of it is that I am running a ton of RAPMs (regularized adjusted plus-minus regressions) to try to predict a player’s defensive prowess at particular points on the court. I have made a few modifications since those posts. What I did for ESPN uses two years of data instead of one, and ridge regression instead of plain OLS.
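For anyone who wants to see the basic idea in code, here is a minimal sketch of a ridge-regularized plus-minus fit for a single court zone. It is not the model used for ESPN. The toy shot log, the player labels A, B, and C, the league-average FG% of 0.4, the penalty value, and the `ridge_fit` helper are all made up for illustration.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y with Gaussian elimination."""
    k = len(X[0])
    # Normal equations, with the ridge penalty added on the diagonal.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (b[i] - tail) / A[i][i]
    return beta


# Hypothetical shot log for one zone: each row flags which of the
# defenders A, B, C were on the court (toy data, two at a time).
X = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
# Target: shot result (1 = make, 0 = miss) minus an assumed league FG% of 0.4.
y = [-0.4, 0.6, 0.6, -0.4, -0.4, 0.6]

ols = ridge_fit(X, y, 0.0)    # lam = 0 reduces to plain OLS
ridge = ridge_fit(X, y, 5.0)  # lam > 0 shrinks every estimate toward zero

for name, plain, shrunk in zip("ABC", ols, ridge):
    # Negative values mean opponents shoot worse with that defender on court.
    print(f"{name}: OLS {plain:+.3f}  ridge {shrunk:+.3f}")
```

The penalty matters because any one player has few shots in any one zone, so plain OLS estimates swing wildly. Ridge pulls them toward the league average, and in practice the penalty would be tuned rather than hard-coded.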
Are these good predictions?
Fair question. I think that overall they are pretty decent, but I admit that this is still a pretty nascent area of bball stats. I would have liked to have done a lot of things differently, but I was in the position of needing to get this done in about 3 weeks and having essentially none of the analysis ready to go. In fact, I didn’t even really reuse any old code because the requirements were so different in some ways.
I don’t have a lot of data, so I wasn’t able to do much out-of-sample testing, and this is definitely a weakness. I did some in-sample stuff and I think it is pretty robust. If I were to just make a shot chart of opposing shots taken against some common 5-man lineup for last year, and compare it to a prediction, things match up pretty nicely in some ways. That said, there are two huge stinking caveats here:
1. These charts only apply to the starting 5. There are some teams that have subs who specialize in defense, or who have really bad defenders in their starting 5 but better defenders on the bench, and so at the end of the year, their defense is not going to look like what I have. These are not predictions of team defense.
2. These are just predictions of FG%. As I was making these, sometimes I’d get a really weird result, and I’d think ‘how does team x not have a better-looking chart? They’re a good defensive team!’ So I’d go look at opposing FG% on NBA.com and I’d try to think about things holistically, and more often than not it all sort of started to make sense. I don’t have an example offhand, but sometimes I’d find out that a team I thought of as a good defensive team was mostly good at forcing turnovers, forcing midrange shots, and avoiding fouls. Those things are great for your DRTG, and you can allow a high FG% at the basket and still be a good defense if you do those things.
Speaking of shot locations…
One thing that has kind of driven me insane with shot charts is accurately portraying where a player shoots from. There’s just no robust way (as far as I know) to represent shot volume accurately when you’re putting a bunch of 1ft x 1ft boxes down on a court. A heat map would be better, but heat maps don’t show FG%, so I have made some compromises, detailed in other blog posts, to try to get good volume representation while maintaining ease of interpretation on the shot charts.
This project presented a different challenge. Forcing midrange shots is a super important part of a good defense, but it’s so so hard to show graphically. The difference between the worst defense in the league at forcing midrange shots and the best defense in the league at forcing midrange shots is just 10 percentage points! If you do the math, that means that the very best team should have something like 40% more shot volume in the midrange. That may seem like a lot, but visually you’re talking about the difference between 40 boxes and 56 boxes, and that difference is not easy to spot with the naked eye. And that’s the best team versus the worst. How do you distinguish between the 5th best team and the 25th best team? I thought about exaggerating the midrange discrepancies, and maybe I should have, but ultimately I just went with what the model spit out. Unfortunately, I think the changes in midrange volume are just too damn subtle.
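To make the "do the math" step concrete, here is a rough sketch of the arithmetic. Only the 10-point gap and the 40-versus-56 box counts come from the discussion above. The 25% baseline midrange share is an assumption picked so the numbers line up, and `relative_volume_gain` is a hypothetical helper.

```python
def relative_volume_gain(base_share, gap_points):
    """Relative increase in midrange volume when the midrange share of
    opposing shots rises by gap_points (both given as fractions)."""
    return (base_share + gap_points) / base_share - 1.0


# ASSUMPTION: the worst team forces 25% of opposing shots into the midrange.
worst_share = 0.25
gap = 0.10  # the 10 percentage point spread between worst and best

gain = relative_volume_gain(worst_share, gap)

# Translate that into chart boxes, starting from 40 midrange boxes.
boxes_worst = 40
boxes_best = round(boxes_worst * (1.0 + gain))

print(f"relative gain: {gain:.0%}")
print(f"boxes: {boxes_worst} vs {boxes_best}")
```

The same helper shows why the middle of the league is even harder to tell apart: a gap of a few percentage points between two teams works out to only a handful of extra boxes.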
This makes the Lakers look ok
Let me close by addressing some results that surprised me, and will probably surprise you.
1. What’s the deal with the Lakers? Maybe I’m crazy, but I stand by this one. I think the Lakers are going to be slightly better defensively this year than they were last year. Here’s the thing: they lost Jodie Meeks and Kendall Marshall. Those guys are bad defensive players. Really bad. Look at just about any APM if you don’t believe me. I don’t try to account for Kobe’s injury, so basically my analysis assumes that he hasn’t lost a step. That’s probably not true, but we’ll see. Also, if you look at opposing FG% for the Lakers last year, they really weren’t that bad at depressing it. They weren’t good, but they weren’t last anywhere, and their only really awful rating was >25 feet (29th in the league from there), which I think has something to do with Meeks and Marshall. This goes back to caveat 2 above, about the charts only showing FG%.
2. What’s the deal with the Mavs? OK, maybe this one only matters to me. I was shocked that Tyson Chandler doesn’t improve the Mavs’ interior defense more. First, my model does show him moving shot volume to the midrange, something Dalembert didn’t do. So they’ll get better just because of that. Surprisingly, the model actually attributes a lot of the Mavs’ interior problems to poor guard play. The converse of this is that the Knicks look pretty decent inside! My model is weirdly into Dalembert, and to be fair, he was pretty decent last year when he was actually on the court, but he was only on the court for 20 minutes a game.
3. Are the Pacers really still going to be good? Yes. Stop it. Their whole defense isn’t going to fall apart because of PG13.
4. The Pelicans won’t be better inside? Yeah, I dunno. I think the interaction between Davis and Asik will produce better results than what I’ve presented, but this is definitely something I’m going to be keeping an eye on.
Those were the ones that stood out to me, but hit me up on Twitter or in the comments if you have more questions. I don’t have the model up per se, so I can’t say much about individual players without booting the whole thing up, but I’ll give what insight I can.