A couple years ago Uber claimed that it had reduced DWIs in Seattle. I analyzed that claim here (and an interesting extension by Lindsay Pettingill here). This claim is now coming back and there’s a pretty compelling test-ground for it: both Uber and Lyft withdrew from Austin, TX early in 2016. A number of media outlets have made the claim that this has increased DWIs in Austin.
So: new data! Maybe it’s time to update my priors! I went and grabbed monthly DWI arrests for 2010-2016 from the Austin Police Department (previous years are not available AFAIK). You can download the data here. Data is missing for August 2010 and 2011, and I have added a dummy variable for months where Uber was operating in Austin (June 2014 to April 2016).
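For anyone who wants to rebuild that dummy variable, here is a minimal Python sketch. It only assumes the June 2014 to April 2016 window given above; the function name `uber_dummy` and the row layout are made up for illustration and are not the format of the downloadable data.

```python
from datetime import date

# Window taken from the text: Uber operated in Austin June 2014 to April 2016.
UBER_START = date(2014, 6, 1)
UBER_END = date(2016, 4, 1)


def uber_dummy(year, month):
    """Return 1 if the month falls inside the Uber window, else 0."""
    return int(UBER_START <= date(year, month, 1) <= UBER_END)


# One row per calendar month, 2010 through 2016 (hypothetical layout).
rows = [
    {"year": year, "month": month, "uber": uber_dummy(year, month)}
    for year in range(2010, 2017)
    for month in range(1, 13)
]
```

Months with missing arrest counts would still need to be dropped or flagged before fitting anything.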
One year is driving the results
That’s the headline finding. First, let’s look at a couple of descriptive graphs. Here’s the year by year total for DWIs in Austin (all graphs are made with my open-source graphing software for making presentation-ready graphs, Playfair!):
Our President-elect is a man who calls women fat or ugly to dismiss them, makes crude references to menstruation, and boasts about getting away with sexual assault. The United States has elected a misogynist (also racist, also other ists), a cartoonishly crude cave-man who will nominate at least one Supreme Court justice and set the tone for the discussion of gender in America. This election was a referendum on men’s right to cease emotional maturation at the age of 18, and America has ruled that this is acceptable. This probably wasn’t the intent of most of the people who voted for Trump, but when you elect a bigot and a bully to the highest office in the land, you embolden bigots and bullies everywhere.
I’m going to try to do something quick for Andy Kriebel and Andy Cotgreave’s #makeovermonday every week so I can continue showcasing some of the different types of graphs you can make in my graphing web app Playfair and so I can identify and quash bugs! This week they chose an interactive from David McCandless’s informationisbeautiful.net showing the number of records leaked in various data breaches between 2004 and 2016. Andy gives a nice run-down of the pros and cons of the original graphic.
I made the mistake of looking at a few early entries and a major theme that seems to have struck several people is the division between data breaches that were the result of hacks and those that weren’t (including user error and lost equipment). A quick look at the data shows that the former are increasing rapidly. My first thought was that an area chart would show this trend nicely, but then I remembered that I recently implemented a variation on area charts that might be neat here. I’m actually not sure what you call this kind of chart, but it’s simply an area chart with two categories where both areas originate from y=0. The area for one category is above the x-axis and the area for the other is below it. Here’s my entry for this data set:
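The data prep behind a chart like this is small: total each category by year, then flip the sign of one category so it draws below the axis. Here is a rough Python sketch of that idea. It is not Playfair's code, the `(year, is_hack, records)` tuple layout is hypothetical, and putting hacks on top is an arbitrary choice.

```python
from collections import defaultdict


def mirrored_series(breaches):
    """Split (year, is_hack, records) tuples into two yearly series.

    Hacks are returned as positive totals (drawn above y=0) and everything
    else as negative totals (drawn below y=0).
    """
    hacked = defaultdict(int)
    other = defaultdict(int)
    for year, is_hack, records in breaches:
        if is_hack:
            hacked[year] += records
        else:
            other[year] += records
    years = sorted(set(hacked) | set(other))
    upper = [hacked.get(y, 0) for y in years]
    lower = [-other.get(y, 0) for y in years]
    return years, upper, lower
```

Both series share the same list of years, so a year with no breaches in one category just gets a zero on that side of the axis.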
Stacked bar charts are often bad. It can be hard to compare categories that don’t start from the axis, more than a couple of categories can be confusing, and totals can be deceptive. Take a look at this stacked bar chart from Vox.com and see if you understand what’s going on in less than a couple of seconds. The data is simple and the shifts in public opinion are large, so it’s certainly not difficult to see what the graph is trying to say, but this design is muddy at best.
I re-made the graph in my web app for creating presentation-ready charts, Playfair. The major sin of this graph is that there is little distinction between the first two survey answers, which are ‘good’ answers, and the second two ‘bad’ answers. There are two distinct buckets of data here but in the Vox graph all data is on the same color spectrum and no attempt is made to draw visual attention to this distinction. The total length of each bar is only meaningful in the sense that public opinion must add to 100%, but I find even this a little confusing as these bars *do not* sum to 100, which initially made me think that the total length of each bar had meaning.
Andy Kriebel runs a Twitter hashtag #makeovermonday where he revises an existing data visualization every Monday and invites his Twitter followers to do the same. The competition is Tableau-centric (or maybe even exclusive) but I’m going to inject a teensy bit of Playfair into it as an excuse to make a new Playfair graph every now and then and show off some of the app’s capabilities.
This week’s visualization is pretty basic – it started out as a mildly crummy bubble chart and the obvious thing to do is to make it a bar chart. I added a tiny bit of complexity by dividing my bars into owned and chartered capacity (all available at the original datasource, alphaliner), ordering the dataset by owned capacity, and fading out the chartered portion of the bars a little bit (you don’t have to do this by bar in Playfair, you can just fade out the chartered key element by right-clicking on it). This lends some additional visual importance to owned capacity. I have no domain knowledge here so I’m not sure if this division between owned and chartered capacity is interesting, but it seemed like a significant division in the data.
I’ve added another video to my YouTube channel demonstrating how to make the above graph in my web app for graphing data, Playfair. Find out more about Playfair at the GitHub page. I’ve made a simple tutorial that walks you through a couple Playfair graphs. And you can try Playfair online here (Chrome only right now).
Alberto Cairo called out this crappy Fox News chart the other day and solicited remakes:
He collected the resulting submissions in a blog post. One submission that caught my eye was Catherine Mulbrandon’s. This submission has some shading and a fair amount of annotation on it, both of which showcase Playfair’s talents. It also serves as a simple example of how to create a line graph with two series.
I hadn’t intended to do another of these so soon but Nate Silver had a piece at FiveThirtyEight today about Trump’s convention bounce. There’s one graph with the post and it caught my eye because it’s a great showcase for Playfair. Before I go further, take a look at the post.
So here’s why I really like the graph. It has a lot of annotation on it, and it uses two different chart elements – lines and shading. Creating the shading you see here is a snap in Playfair. Before I get to that, here’s a little gif showing you how I set up my data. Data in Playfair has to be long instead of wide so here you can see how I converted from wide format data to long (incidentally, the data isn’t given in the post, so I made it up to match the graph, but clearly I’m a little off).
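If you would rather reshape the data in code than by hand, the wide-to-long conversion looks roughly like this in plain Python. This is just a sketch: the column names (`day`, `trump`, `clinton`) and the numbers are made up for illustration, and the exact column layout Playfair expects is not shown here.

```python
def wide_to_long(rows, id_key, value_keys, var_name="series", value_name="value"):
    """Turn one row per x value (wide) into one row per x value per series (long)."""
    long_rows = []
    for row in rows:
        for key in value_keys:
            long_rows.append(
                {id_key: row[id_key], var_name: key, value_name: row[key]}
            )
    return long_rows


# Hypothetical wide data: one column per series.
wide = [
    {"day": 1, "trump": 40.0, "clinton": 44.0},
    {"day": 2, "trump": 41.5, "clinton": 43.0},
]

long_rows = wide_to_long(wide, "day", ["trump", "clinton"])
```

Each wide row becomes one long row per series, so two days with two series each give four rows.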
One feature of Playfair that I haven’t really talked about yet is the ability to create themes. There isn’t any documentation on this yet so I’m going to give a quick walkthrough here by looking at the example theme that comes in the GitHub repo (the default theme is built into Playfair so that you can load it without hosting Playfair).
Because it’s easy to get the data, I’m going to re-create this FRED graph showing how some economic data can be leading indicators for unemployment. Let’s start by looking at the example theme file. Here’s the beginning of it:
This is going to be a semi-regular feature where I take some graphs I found and re-make them in my web app for charting data, Playfair. Find out more about Playfair at the GitHub page. I’ve made a simple tutorial that walks you through a couple Playfair graphs. And you can try Playfair online here (Chrome only right now).
Two-dimensional scatter plot
Someone in my Twitter timeline posted this (15-month-old) piece from the Pew Research Center. I liked the idea of remaking the first graph because it’s a bit unorthodox – basically a one-dimensional scatter plot with quite a few annotations. Here’s the Playfair version: