Vallandingham discussed “stepper graphics” at OpenVis Conf, and I’ve noticed more of them since I wrote about appreciating maps that walk users through stories. Claudia Núñez, my editor at Hoy and founder of Migrahack, had a special request for a visualization of sports data: use bubbles.
At first I balked at bubble viz because bar charts can be more effective, but Hoy sports staff wanted to explore Major League Soccer player salaries by nationality, and I felt more comfortable being creative with a sports dataset. A vision came to me as I tried to sleep one night, a vision with soccer balls representing player salaries organizing themselves according to different categories.
The skewed distribution of salaries made visualization tricky. Also, more than half of MLS players were born in the United States, and I was surprised to discover so few players were born in Mexico. Working with sports editor Eduard Cauich, I included these quirks as steps in the interactive to help the user understand the salary data. The finished project applied this sequence of operations to SVG circle elements:
bounce the 25 highest salaries around a force layout
list the circles in rows by salary, team and country
plot the circles on a TopoJSON map
I probably spent the most time fine-tuning my technique listing the circles in rows and had some funny slips along the way.
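A minimal sketch of the row-listing idea, assuming each circle gets a fixed-size grid cell and the rows are filled left to right. The helper name `layoutInRows`, the `perRow` and `spacing` parameters and the sample salaries are all hypothetical and not taken from the project.

```javascript
// Hypothetical helper: lay circles out in rows, highest salary first.
// `perRow` and `spacing` are illustrative values, not the project's settings.
function layoutInRows(salaries, perRow, spacing) {
  return salaries
    .slice() // copy so the caller's array keeps its order
    .sort((a, b) => b - a) // highest salary first
    .map((salary, i) => ({
      salary,
      x: (i % perRow) * spacing + spacing / 2, // column position
      y: Math.floor(i / perRow) * spacing + spacing / 2, // row position
    }));
}

// Made-up salaries, two circles per row, 20px cells.
const positions = layoutInRows([50000, 6000000, 120000, 75000, 300000], 2, 20);
```

Grouping by team or country could reuse the same index arithmetic inside each group, with an offset per group.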
I also went back and forth considering whether to plot the circles on the map as lists or using multiple force layouts clustered around map points. I chose the linear presentation in the end because my network implementation ran slowly on my phone.
I used an exponential scale to size the circles because the skewed distribution made a linear scale difficult to read. I also assigned colors in five categories based on what I determined to be significant breaks on the distribution. These were subjective decisions.
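As a rough illustration of those two choices, here is a dependency-free sketch of a power scale for circle radius and a threshold lookup for five color categories. It assumes “exponential scale” means something like d3’s pow scale; the exponent, domain, range and break values below are placeholders, not the ones used in the project.

```javascript
// Build a power scale: value -> radius. Exponent, domain and range are
// illustrative assumptions, not the project's actual settings.
function powerScale(domain, range, exponent) {
  const [d0, d1] = domain.map((d) => Math.pow(d, exponent));
  const [r0, r1] = range;
  return (value) =>
    r0 + ((Math.pow(value, exponent) - d0) / (d1 - d0)) * (r1 - r0);
}

// Return a category index (0..breaks.length) from hand-picked thresholds.
function colorCategory(value, breaks) {
  let i = 0;
  while (i < breaks.length && value >= breaks[i]) i += 1;
  return i;
}

const radius = powerScale([0, 7000000], [2, 40], 0.5);
const salaryBreaks = [100000, 250000, 1000000, 3000000]; // hypothetical breaks
```

With an exponent below 1, the few very large salaries are compressed and the many small ones spread out, which is one way to keep a skewed distribution readable.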
Katie and I went all in on the U.S. women’s national soccer team as they prepared for the World Cup. We supported them at a friendly match, watched every tournament game and cheered them on at a trophy rally.
Carli Lloyd’s midfield goal in the title game was perhaps the most incredible moment of the title run. Inspired by a New York Times 3D graphic of a key Super Bowl play and some tools I learned at a recent meetup, I tried to recreate the moment Lloyd struck the ball with graphics software. Click the images to see a WebGL version, but be warned: the page loads an 8MB data file.
I found an awesome model of the World Cup stadium and used a soccer data tutorial and game footage to place the ball and key players as accurately as I could in Sketchup. I’d never used WebGL before, so I learned how to export Sketchup models for use in three.js and hacked my way around positioning the camera with the TweenMax library. I would have liked to more closely replicate the lighting and shadows that affected the goal, but I encountered problems.
Unfortunately, even after removing the awesome jumbotron and most of the seating in the incredible arena model, the data file is still way too large to reasonably serve online. I’ll try to build on this experience and do more efficient 3D projects in the future.
I was most interested in seeing what Lloyd and Japan goalkeeper Ayumi Kaihori saw when Lloyd took the shot, and my WebGL page animates between their two perspectives. Here are some other interesting angles from my Sketchup project.
I started fiddling with turf.js during a recent Maptime meetup. I forked the demo map and used turf to find pizza shops nearest to neighborhood council offices.
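The nearest-point lookup can be sketched without any library. This is not turf’s implementation, only an illustration of the idea using the haversine great-circle distance; the function names and every coordinate below are made up.

```javascript
// Great-circle distance in kilometers between two [longitude, latitude] pairs.
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(a)); // 6371 km: mean Earth radius
}

// Return the candidate place closest to the target point.
function nearestPlace(target, candidates) {
  return candidates.reduce((best, place) =>
    haversineKm(target, place.coords) < haversineKm(target, best.coords)
      ? place
      : best
  );
}

// Made-up coordinates for illustration only.
const office = [-118.25, 34.05];
const shops = [
  { name: "Shop A", coords: [-118.3, 34.1] },
  { name: "Shop B", coords: [-118.26, 34.06] },
  { name: "Shop C", coords: [-118.0, 34.0] },
];
const closest = nearestPlace(office, shops);
```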
It got me thinking about the distance formula and a scatterplot by LA Times reporter Jon Schleuss that used FBI crime data. I wrote my first Makefile to grab a similar dataset of state crime rates and make a d3 scatterplot with ProPublica StateFace icons.
When a state is selected, the distance formula loops over each point and sorts the results to find the state with the “nearest” crime rates, according to the scatterplot x and y coordinates.
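That loop-and-sort approach might look roughly like the sketch below. The names `distance` and `nearestState` are hypothetical, the data is made up, and treating x and y as changes in violent and property crime rates is an assumption about the chart’s axes.

```javascript
// Euclidean distance between two points with x and y properties.
function distance(a, b) {
  return Math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2);
}

// Loop over every other point, sort by distance, and return the closest.
function nearestState(selected, states) {
  return states
    .filter((s) => s.name !== selected.name)
    .map((s) => ({ name: s.name, dist: distance(selected, s) }))
    .sort((a, b) => a.dist - b.dist)[0];
}

// Made-up values: x = change in violent crime, y = change in property crime.
const states = [
  { name: "A", x: 2, y: -3 },
  { name: "B", x: -1, y: -4 },
  { name: "C", x: 10, y: 5 },
];
const match = nearestState(states[0], states);
```

Sorting every point is more work than a single pass that tracks the minimum, but with about 50 points the difference is negligible.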
“Nearest” probably has more statistical meaning than I realize, and I could have used other, more efficient d3 algorithms like quadtrees.
I also think having a short distance between x and y coordinates is not the same as being “similar.” For example, Washington’s property crime rate dropped while its violent crime rate rose, but the point closest to Washington is Colorado, which had declines in both crime types. An algorithm for the most “similar” changes in crime rates could consider whether rates increased or decreased in addition to the distance between values.
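One hypothetical way to fold direction into the comparison is to prefer points whose changes have the same signs on both axes, then pick the closest among those. This is only a sketch of the idea described above, with made-up data and function names, not a tested statistical method.

```javascript
// True when two points changed in the same direction on both axes.
function sameDirection(a, b) {
  return Math.sign(a.x) === Math.sign(b.x) && Math.sign(a.y) === Math.sign(b.y);
}

// Closest point among those that share the selected point's directions,
// falling back to all other points when none share them.
function mostSimilar(selected, points) {
  const others = points.filter((p) => p.name !== selected.name);
  const sameSign = others.filter((p) => sameDirection(selected, p));
  const pool = sameSign.length > 0 ? sameSign : others;
  return pool.reduce((best, p) =>
    Math.hypot(p.x - selected.x, p.y - selected.y) <
    Math.hypot(best.x - selected.x, best.y - selected.y)
      ? p
      : best
  );
}

// Made-up values: x = change in violent crime, y = change in property crime.
const sample = [
  { name: "P", x: 1, y: -2 }, // violent up, property down
  { name: "Q", x: -1, y: -2 }, // both down, but closest to P
  { name: "R", x: 4, y: -6 }, // same directions as P, farther away
];
const similar = mostSimilar(sample[0], sample);
```

Here plain distance would pick Q, while the direction-aware version picks R.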
I like scatterplots because they lay out all the data, but I’ll continue exploring interactive algorithms as ways to guide the user.
It’s happened several times since I moved to Los Angeles. I’ll get off a bus or take the wrong freeway exit and find myself asking, “What neighborhood is this?” Fortunately, news developers built tools to answer this question quickly with maps.
Using those resources, I whipped up an interactive map that finds the user’s L.A. Times neighborhood location and provides a link to neighborhood data compiled by the Times. It’s only useful in L.A. County, and surely there are similar services, but I’ll always know where to find my tool: danhillreports.com/where/.
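The tool may well rely on an existing lookup service rather than its own geometry, but the test underneath any “which neighborhood contains this point” lookup can be sketched with ray casting. Everything here, including the function names and the square “neighborhoods,” is hypothetical.

```javascript
// Ray-casting test: is [x, y] inside the polygon ring (array of [x, y] pairs)?
function pointInPolygon([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray from the point cross the edge between i and j?
    const crosses =
      (yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Return the name of the first neighborhood containing the point, or null.
function findNeighborhood(point, neighborhoods) {
  const hit = neighborhoods.find((n) => pointInPolygon(point, n.ring));
  return hit ? hit.name : null;
}

// Made-up square "neighborhoods" for illustration only.
const neighborhoods = [
  { name: "West", ring: [[0, 0], [1, 0], [1, 1], [0, 1]] },
  { name: "East", ring: [[1, 0], [2, 0], [2, 1], [1, 1]] },
];
```

In a browser, the point would come from the geolocation API and the rings from neighborhood boundary files.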
While reporting a story on Chicago gun regulations, I pulled a year’s worth of gun possession and shooting crimes from the city data portal (first mistake). I wanted to explore the relationship between the types of crimes but wasn’t confident in a statistical or visualization method. I applied the correlation and Bayesian probability techniques I was learning in my math class to the data, but I couldn’t grasp the output.
For two years, an ONA recap of an event in Minnesota that mentions my side project haunted me, because the same data files sat unfinished on my hard drive.
Then I learned about bivariate choropleth maps and the possibilities of showing two variables at the same time with colors. This fantastic how-to by Joshua Stevens showed me the way, and I found a real-world journalism application of the method with goats and sheep in the Washington Post.
So here’s my first bivariate choropleth, which shows gun possession and assault with firearm crimes in Chicago binned into hexagons. Each hexagon has at least one possession or shooting incident. As with any visualization, make sure you understand the legend.
Selecting color breaks was tricky because the distributions of the possession and shooting crime datasets are both skewed. I’m relieved to have finally mapped this dataset, but you’ll probably learn more about gun regulation by watching my final project video than by looking at that map.
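For readers new to the technique, here is a small sketch of how each hexagon could be assigned one of nine bivariate classes. It assumes tercile (quantile) breaks, which are one common response to skewed data; the breaks actually used on the map are not stated, and the counts and function names below are made up.

```javascript
// Tercile cut points (values at the 1/3 and 2/3 positions) for a list of counts.
function tercileBreaks(values) {
  const sorted = values.slice().sort((a, b) => a - b);
  const at = (k) => sorted[Math.floor((k * (sorted.length - 1)) / 3)];
  return [at(1), at(2)];
}

// Class 0, 1 or 2 depending on which tercile the value falls in.
function classify(value, [low, high]) {
  if (value <= low) return 0;
  if (value <= high) return 1;
  return 2;
}

// Combine the two classes into one of nine cells, e.g. "2-0".
function bivariateClass(hex, possessionBreaks, shootingBreaks) {
  const p = classify(hex.possession, possessionBreaks);
  const s = classify(hex.shooting, shootingBreaks);
  return `${p}-${s}`;
}

// Made-up hexagon counts.
const hexagons = [
  { possession: 0, shooting: 1 },
  { possession: 2, shooting: 0 },
  { possession: 5, shooting: 3 },
  { possession: 40, shooting: 2 },
];
const pBreaks = tercileBreaks(hexagons.map((h) => h.possession));
const sBreaks = tercileBreaks(hexagons.map((h) => h.shooting));
const classes = hexagons.map((h) => bivariateClass(h, pBreaks, sBreaks));
```

Each class string would then key into a 3-by-3 color palette, which is what the legend of a bivariate choropleth shows.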
Data journalist from Sacramento, not the r&b singer