@Cmrn_DP put together some code to make Matplotlib graphs that look like fivethirtyeight.com graphs. I see the attraction–fivethirtyeight graphs have a very simple, attractive look–but I’m not much of a Matplotlib user, so I took a few minutes to try to get the same style in R’s ggplot2 package. Here’s the result:
I have finally completed work on my visual NBA stat: Adjusted Defensive Impact by Court Location. I first explained how this stat works here, but in a nutshell, this is a way to visualize how a player defends shots in the NBA, adjusting for the other defenders on the court with him, the expected probability of the shot being made, and the other (non-shooting) offensive players on the court. This new model also adds the possibility of a team-wide effect that you might attribute to coaching (this is not visualized in any way just yet). I had many requests to also include something about how players affect the location of a shot. You can now see this at the bottom of each player’s chart. This is a simple regression that controls for defensive players only and shows you how a given player affects the volume of shots that are close (<8 feet), midrange, or 3-pointers. I have lots more to say below the jump, but here’s the widget. Have fun poking around! Warm colors mean that opponents are more likely to hit a shot when the player is defending; cool colors mean opponents are less likely to hit their shots.
2013-2014 Adjusted Defensive Impact by Court Location
Select a team from the drop-down below to see that team’s defense visualized for the 2013-2014 season. There’s nothing fancy going on with these visualizations–just a raw, unadjusted comparison to the average. Blue squares indicate that the team defended that location better than league average, while red squares indicate that the team defended that location worse than league average. The numbers give FG% from that location for the opposing team. I talked about these visualizations previously here and here, and explained how I make them here.
I will have adjusted defensive impact for all players in the NBA soon-ish. It’s actually ready to go, but I got gun-shy and decided I needed to test alternate model specifications. It might be a couple of weeks before I get them all up, but it will be a more reliable product.
2013-2014 NBA team defense compared to average
This is just a catch-all post about methods that I will reference in the future when I post a graph or a regression or whatever. My plan is to update this every time I add something new that I think requires further explanation. So without further ado…
Adjusted defensive impact by court location
From here on out, the way I do the shot-chart visualizations should be fairly stable. There are really only a couple of things that need explanation here. Data is usually current as of the date of the blog post but does not update automatically, so backdate appropriately. All shots taken against a team, by a player, or by whatever else is being charted are grouped into 1ft x 1ft squares that cover the court. 2- and 3-point shots are not mixed in this process. Basically, if a square’s center is inside the arc, it should contain only 2pt shots, and if the square’s center is outside the arc, it should contain only 3pt shots.
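Here is a rough Python sketch of that kind of binning. It is a simplification rather than the exact method used for the charts: instead of testing each square's center against the arc, it puts the shot's own point value into the bin key, which also keeps 2s and 3s apart. The `(x_ft, y_ft, points, made)` record layout, the function name `bin_shots`, and the sample shots are all made up for illustration.

```python
import math
from collections import defaultdict

def bin_shots(shots):
    """Group shots into 1ft x 1ft squares, keeping 2s and 3s apart.

    `shots` is an iterable of (x_ft, y_ft, points, made) tuples. This
    record layout is hypothetical, not the blog's actual data format.
    """
    bins = defaultdict(lambda: {"attempts": 0, "made": 0})
    for x, y, points, made in shots:
        # A square is identified by its lower-left corner. Including the
        # shot value in the key guarantees 2s and 3s never share a bin.
        key = (math.floor(x), math.floor(y), points)
        bins[key]["attempts"] += 1
        bins[key]["made"] += int(made)
    return dict(bins)

# Made-up shots for illustration only.
shots = [
    (3.2, 4.7, 2, True),
    (3.9, 4.1, 2, False),
    (-22.4, 1.5, 3, True),
    (-22.6, 1.2, 3, False),
    (-22.1, 1.9, 2, True),  # same square as the two 3s, but counted as a 2
]
binned = bin_shots(shots)
```

FG% for a square is then just `made / attempts` for that key.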
Face it: Greg Monroe is not going to be dealt. Kyle Lowry is not going to be dealt. Pau Gasol is not going to be dealt. We’re all going to wake up on Friday and ask ourselves, “wait a second, wasn’t yesterday the trade deadline?” Here’s a look at some NBA defenses to ease the pain.
How about them Pacers? The boxes here are colored according to the league average, so blue indicates that opponents shoot worse than the league average when they face the Pacers, while warm colors indicate that opponents shoot better than the league average. No big surprises here–the Pacers have a dominant defense. But something interesting jumped out at me and I’ve flagged it by labeling the PPS (points per shot) of high-volume locations. The PPS for some midrange shots is actually higher than the PPS for some 3-pt and rim shots! That’s just crazy. Generally speaking, mid-range shots are a poor value compared to 3s and shots at the rim. In a version of this graph that used 4-week-old data, there were even more mid-range locations that paid off, but it looks like Indiana has gotten even a little better since then. Just brutal.
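For anyone who wants the arithmetic behind PPS, here is a minimal Python sketch. The function name `points_per_shot` and the shooting numbers are invented for illustration; they are not actual Pacers data. The point is only that a midrange location can out-earn a 3-point location when the 3-point percentage is low enough.

```python
def points_per_shot(made, attempts, shot_value):
    """Average points scored per attempt from one location."""
    if attempts == 0:
        raise ValueError("no attempts from this location")
    return shot_value * made / attempts

# Made-up numbers for illustration only, not actual Pacers data.
midrange_pps = points_per_shot(made=45, attempts=100, shot_value=2)
three_pps = points_per_shot(made=28, attempts=100, shot_value=3)
```

With these hypothetical inputs, 45% shooting on 2s is worth more per shot than 28% shooting on 3s.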
Opposing FG% compared to league average, Indiana Pacers
More graphs below the jump!
This is a quick little demonstration I made for my POLS 206 class to show how single-member districts in the House of Representatives can cause weird stuff to happen. The map shows the composition of the House delegation from each state. House delegations from red states are mostly Republican, those from blue states are mostly Democratic, and those from purple states are split in some way.
What I want to demonstrate to my class is the difference between the composition of a state’s House delegation and the popular vote for members of the House in that state. In Maine, for example, two out of two Representatives are Democrats, but 38% of the state’s voters voted for a Republican representative. If you believe that a ‘fair’ House delegation is one in which the numbers of Rs and Ds reflect the split between Rs and Ds in the state’s voters, then Maine should have 1 Republican rep and 1 Democratic rep (0.38*2=0.76, rounds to 1 Republican). When you mouse over a state, the state will change color to reflect the House delegation split that would most accurately mirror the popular vote split. Both of these numbers are shown in the upper right corner of the map.
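The Maine calculation can be sketched in a few lines of Python. This assumes a simple two-party split and plain rounding of share times seats, which matches the 0.38*2 example above but is not necessarily how the map itself computes its numbers; the function name `proportional_seats` is made up.

```python
import math

def proportional_seats(vote_share, total_seats):
    """Seats a party would hold if seats mirrored its vote share.

    Rounds half up instead of using Python's round(), which sends
    exact halves to the nearest even number.
    """
    return math.floor(vote_share * total_seats + 0.5)

# Maine example from the text: 38% Republican vote, 2 seats.
republican_seats = proportional_seats(0.38, 2)
democratic_seats = 2 - republican_seats
```

Giving the other party the remaining seats, rather than rounding its share separately, keeps the two counts from adding up to more or fewer seats than the state has.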
Partisan Composition of State House Delegations
In keeping with the subject of my last post, I’ve slapped together a partial fix before I start working on a much bigger change in this whole endeavor. One way to deal with binning problems is just to smooth the data somehow. Here’s a picture of a histogram, for example, showing both the binned data and a kernel density plot.
I am still tweaking the graphs I’ve already shown and working on some new things, but I want to post something about the decisions I made in this process. The biggest puzzle in making these kinds of visuals has to do with binning. Binning means sorting data into discrete bins to make it easier to interpret. Here’s a shooting graph where the data has been binned very little (into 1ft x 1ft squares):
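To make the binned-versus-smoothed contrast concrete, here is a small Python sketch using only the standard library. It is a generic illustration with a Gaussian kernel, not the code behind the picture; the sample data, bin width, and bandwidth are all arbitrary choices of mine.

```python
import math

def histogram(data, bin_width, start):
    """Count observations in bins [start + k*w, start + (k+1)*w)."""
    counts = {}
    for x in data:
        k = math.floor((x - start) / bin_width)
        counts[k] = counts.get(k, 0) + 1
    return counts

def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Every observation contributes a small bell curve centered on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

data = [1.1, 1.9, 2.0, 2.1, 3.4, 3.6, 5.0]  # made-up sample
counts = histogram(data, bin_width=1.0, start=0.0)
density = gaussian_kde(data, bandwidth=0.5)
```

Note that 1.9 and 2.0 land in different histogram bins even though they are nearly identical, which is the kind of binning problem smoothing avoids: the density estimate treats them as neighbors.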