The Evolution of a Bar Chart

How I transformed my chart in a few easy steps

Jason Pauley
10 min read · Mar 26, 2018

“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.” This is what Edward Tufte wrote in his book The Visual Display of Quantitative Information. From the simplest bar chart to heat maps, dashboards, radar charts or infographics, our goal is to find the most effective way to organize and interpret our data. Often great analysis can get overlooked because we fail to deliver in the final step, the communication of the data. How we choose to communicate the information is an essential final step that can make or break your project. Often in my career, and this is the same for most analysts, I have had to present a lot of analyses to audiences who are not as passionate about data as I am. But a good data visualization will make your information consumable to all types of audiences. My goal has always been to create visualizations that will help any audience quickly and accurately grasp the message that I’m trying to convey.

This month I came across a data visualization challenge called the #SWDchallenge organized by Cole Knaflic, author of Storytelling with Data. The objective was for the participants to submit their best bar chart about any topic they choose. This seemed quite easy, but it’s not easy to stand out from all the talented people who are submitting their works of art. Using your tool of choice, you can create a bar chart in seconds and it will probably be good enough to deliver the information. But if you really want your message to resonate with your audience, there is a significant level of nuance and detail required. Everything in your chart should be intentional, from grid lines to font color to annotations and labels.

This challenge had no winners or losers, no rankings, and no gold medalists. It was more about learning from other examples, exploring the latest trends in data visualization, and hopefully adding value with your own ideas. The timing was perfect since I had just finished two weeks of immersion in the Winter Olympics and was already tinkering with historic Winter Olympics medal data. So, I already had my subject. Now I had to organize the data and build my visualization. For the remainder of this article, I’ll briefly discuss the data and analysis for this project, but most of the focus will be on the steps that I took to transform my bar chart.

The Analysis

While watching the dominance of the Netherlands in speed skating this year, I recalled their success in past Olympic games. I wondered which nations are most unusually dominant in only one sport in the Winter Olympics. Although my gut feeling was that the Netherlands would be on the short list, I wasn’t sure until I pulled all the data that they would be that specialized. I wanted to further research and visualize the nations that are the biggest outliers in terms of specialization in one sport. Hopefully, I have done these unique and talented nations justice by visualizing just how specialized they are.

My hope is that this chart would inspire a curious mind to dig deeper. Maybe someone would research the data and find that all 10 of Croatia’s medals in alpine skiing are from the same family (the Kostelic siblings). This chart might trigger someone to research and analyze the most balanced nations, since the nations in my chart are arguably the least balanced. Perhaps, the Why is more interesting to some. Why do these nations excel at one particular sport? Are the driving factors culture, geography, history, funding or something else?

A good analysis or visualization will answer questions, but I believe it should spark curiosity and create questions as well.

The Data

I downloaded the data from data.world which has thousands of public data sets on a wide range of topics. The original source that data.world used for the information was sports-reference.com, which is another site I go to for most of the historical sports data I work with. I have links to both sources in the notes section. The download was already in tabular format and ready for easy importing into most tools that are used to analyze data. Below is an example of how the data looked in data.world.

Because the data was straightforward and clean, I was able to jump right into creating a few pivot tables to get the information that I needed for my analysis. I came up with two criteria that had to be met for a country to make the list.

  1. The country needs a minimum of five total Winter Olympics medals from 1924–2014. The reason for the cutoff is to eliminate the many nations with 100% of their medals in only one sport, but only one or two medals in total.
  2. A minimum of 90% of a country’s medals would have to be allocated to only one sport.
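The two criteria above can be sketched in a few lines of Python. This is only an illustration of the filter, not the pivot-table workflow I actually used. The function name `specialized_nations`, the `(country, sport)` record layout, and the sample records are all assumptions made up for this example, and the sample counts are not real medal data.

```python
from collections import Counter, defaultdict

def specialized_nations(medals, min_total=5, min_share=0.90):
    """Return {country: (top_sport, top_count, total)} for nations meeting both criteria.

    `medals` is an iterable of (country, sport) pairs, one pair per medal.
    """
    by_country = defaultdict(Counter)
    for country, sport in medals:
        by_country[country][sport] += 1

    result = {}
    for country, sports in by_country.items():
        total = sum(sports.values())
        top_sport, top_count = sports.most_common(1)[0]
        # Criterion 1: enough medals overall. Criterion 2: at least 90% in one sport.
        if total >= min_total and top_count / total >= min_share:
            result[country] = (top_sport, top_count, total)
    return result

# Made-up records for illustration only (not real medal counts).
sample = (
    [("Nation A", "Sport X")] * 19 + [("Nation A", "Sport Y")] * 1   # 95% in one sport
    + [("Nation B", "Sport X")] * 2                                   # 100%, but only 2 medals
    + [("Nation C", "Sport X")] * 6 + [("Nation C", "Sport Y")] * 4   # only 60% in one sport
)
print(specialized_nations(sample))  # {'Nation A': ('Sport X', 19, 20)}
```

Nation B shows why the first criterion matters: it has 100% of its medals in one sport, but with only two medals that tells us very little.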

This level of specialization is extremely rare. Only five nations meet these criteria; they are listed below. This was the final table I worked with to create my bar chart. The table doesn’t look great, nor is it intended to. It is simply the data source for my bar chart.

The Visualization

For the remainder of this article, I’ll review how I started with the default Excel bar chart and, after a few relatively simple steps, created something much more consumable for the audience. The focus of this post is effective visualization; it is not intended to be an Excel tutorial. While Excel was the tool that I worked with, the points that I’m making about aesthetics can be applied to charts in almost any tool.

The first decision I had to make was the metric that would be the focus of my chart. Should the bars display the percentage of medals in each of the countries’ specialized sport or the number of medals? The two options are below:

Option 1: The data source is the percentage of medals in one sport

Option 2: The data source is the number of medals

I chose to use the number of medals as the data source instead of the percentage since I’ll have text on the chart highlighting the criteria. The audience will know that these nations are clustered between 90% and 100%. The quantity-based chart really emphasizes how anomalous the Dutch are by winning so many medals with nearly all those medals in one sport. If the percentage were the focus of the chart, it would take away from the magnitude of the Netherlands’ dominance and would leave the audience wondering how many medals each of the nations won. I could have also used a combination chart displaying both the percentage and quantity, but I decided that might be too much information. I want the audience to focus on quantity only; knowing that the percentage is at least 90% is good enough.

Although I decided not to use percentage as the data source, I want to show that chart one more time below and talk about a glaring flaw that is too common in bar charts. Notice that the x-axis does not begin at zero. This was the default for this data in Excel. I can’t think of any good reason that a bar or column chart should start at anything other than zero. It will only mislead the audience. Because the x-axis doesn’t start at zero, the 91% bar appears to be about 65% lower than the 100% bar, when it’s actually only 9% lower. Often when I have seen a bar or column chart with the axis starting at a number other than zero, it is from a source that has a bias and is trying to exaggerate the difference (common in politics or major networks). Always be skeptical when you see this and avoid using this strategy in your own charts.
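To make the distortion concrete, here is a rough sketch of the arithmetic. A bar's drawn length is its value minus the axis minimum, so the apparent gap between two bars grows as the axis start moves up. The function name `apparent_drop` is made up for this example, and the axis start of 86% is an assumption chosen because it roughly reproduces the "about 65%" effect described above; I did not record the exact value Excel picked.

```python
def apparent_drop(value, reference, axis_min=0.0):
    """Fraction by which the bar for `value` looks shorter than the bar for
    `reference` when the axis starts at `axis_min` instead of zero."""
    visible_value = value - axis_min
    visible_reference = reference - axis_min
    return 1 - visible_value / visible_reference

# Axis starting at zero: the 91% bar is drawn 9% shorter than the 100% bar.
print(round(apparent_drop(91, 100), 2))               # 0.09
# Hypothetical axis start of 86%: the same bar is drawn about 64% shorter.
print(round(apparent_drop(91, 100, axis_min=86), 2))  # 0.64
```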

I decided that a stacked bar chart would be the best option to represent my analysis. The first section (blue) of each stacked bar displays the number of medals for each country in their specialized sport. The second section (orange) shows the number of medals in all other sports. The two stacked bars will sum up to the total number of medals for each country.
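For readers building this outside of Excel, the layout of one stacked bar comes down to two segments, where the second begins where the first ends. The sketch below shows that idea only; the function name `stacked_segments` and the medal counts are hypothetical and not taken from my data.

```python
def stacked_segments(specialized, other):
    """Return (start, width) pairs for the two segments of one horizontal stacked bar."""
    first = (0, specialized)        # specialized-sport medals start at the axis
    second = (specialized, other)   # remaining medals start where the first segment ends
    return [first, second]

# Hypothetical nation with 18 medals in its specialized sport and 2 in all others.
segments = stacked_segments(18, 2)
print(segments)                                  # [(0, 18), (18, 2)]

# The end of the last segment is the nation's total medal count.
print(segments[-1][0] + segments[-1][1])         # 20
```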

For the remainder of this post, I’ll review all the changes I made, step by step to create a much-improved chart.

Step #1: I reversed the y-axis so that the country with the most medals would show up on top.

Step #2: I removed the grid lines which, in most cases, are redundant information and can distract the audience.

Step #3: My preference when working with bar/column charts is to decrease the gap between the bars to 40%-50%. The default in Excel looks odd, with thin bars and too much white space between the bars.

Step #4: I increased the font size significantly. I wanted to make sure the information was easily legible, especially since an increasing number of people are viewing content on their phones.
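The gap setting in Step #3 is easier to reason about with a little arithmetic. As I understand it, Excel expresses gap width as a percentage of the bar's own thickness, so for a single series the bar takes up 1 / (1 + gap/100) of each category's slot. Treat that relationship as an assumption to check against your version of Excel. The function name `bar_share_of_slot` is made up for this sketch, and 150 is just an arbitrary wide-gap value for comparison, not a claim about the default.

```python
def bar_share_of_slot(gap_width_pct):
    """Share of each category slot occupied by the bar, for a single series,
    assuming the gap is measured as a percentage of the bar's own thickness."""
    return 1 / (1 + gap_width_pct / 100)

for gap in (150, 50, 40):   # 150 is an arbitrary "wide gap" value for comparison
    print(gap, round(bar_share_of_slot(gap), 2))
# 150 0.4
# 50 0.67
# 40 0.71
```

Under that assumption, a gap of 40% to 50% means the bars fill roughly two thirds or more of the available space, which is why they stop looking thin.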

This is how the chart looked after these four steps:

Step #5: My next decision was the design of the bars. I decided I wanted something a little different, but hopefully not too distracting. For each sport represented in my graph, I clipped an image of the logo from the Winter Olympics website and used that to make the visualization more interesting. If this chart were for a presentation in a corporate setting, I’d likely choose a standard solid color, but different audiences require different tactics. I set each image on the bars to be stacked and scaled with 10 units per picture (e.g., 20 medals would have two images, 100 medals would have 10 images). The logo images represent the medal count of the sport each nation specializes in.

Step #6: The second part of the stacked bar shows the medal count in all other sports. Orange was the Excel default, but I didn’t want much attention drawn to the second section of the bar, so I changed it to light grey, a more understated color.

Step #7: Next, I added data labels to show the value and series name. The Excel default places the labels in the middle of each bar, but the information would not be easily legible since I changed the solid bars to images. Even with a solid color, I would rather have the labels side by side at the end of the bars. I think it looks better and it’s easier for the reader to follow.
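The picture scaling from Step #5 is simple division, sketched below. With 10 units per picture, each full logo stands for 10 medals, and I am assuming that any leftover medals show up as a partial (cropped) image at the end of the bar. The function name `images_for` is made up for this example, and 25 is a hypothetical count used only to show the partial case.

```python
def images_for(medals, units_per_picture=10):
    """Return (whole_images, fraction_of_last_image) for a stacked-and-scaled bar."""
    whole, remainder = divmod(medals, units_per_picture)
    return whole, remainder / units_per_picture

print(images_for(20))    # (2, 0.0)
print(images_for(100))   # (10, 0.0)
print(images_for(25))    # (2, 0.5)  hypothetical count: two logos plus half a logo
```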

I was satisfied with my chart after these seven steps, so I submitted the version below for the #SWDchallenge. But it turns out, I had just a little more work to do.

As I mentioned earlier, this challenge is about getting feedback and improving. There were two main points that Cole Knaflic made regarding my initial submission.

Recommendation #1: She suggested that I change my labels from a black font to match the color of the stacked bars (blue and grey). This might seem minor, but when I did this, it brought everything together and made it easier for the audience to associate the text with the data.

Recommendation #2: Change my title or add text to the chart to highlight the key takeaway. As Cole writes in this post, “…if there is a key takeaway — which there absolutely should be if you’re at the point of communicating the information — you need to make that point clearly to your audience”

Both revisions required very little effort but had a large impact on how my information came across. Here is the final version after these recommendations.

All the changes in this makeover were quick and easy. The thought process took longer than the actual work. The most time-consuming part was finding and clipping the Olympic images for the bars. I hope you agree that the final version was a much more effective way to communicate the analysis and I also hope that you found the subject of the analysis to be interesting.

If you made it this far in the post, you have likely found some value in what I have to say, but don’t stop here. Continue to seek out other great sources. In the notes, I provided a link to the blog storytelling with data by Cole Knaflic, who organized the #SWDchallenge. You can find many great data visualization bloggers, and I also urge you to search for data visualization best practices. What I have shared are simply some of the practices that have worked for me in my career as an analyst.

The first step you can take is to start building your charts with data visualization in mind and to be more deliberate about aesthetics. Most of the time, the chart is the only part of the project that your audience will see. You work long and hard to research, process the data, analyze, and report your findings. Don’t miss the opportunity to communicate all that hard work through appealing charts that allow the right message to resonate with your audience.

Jason Pauley

Passionate about Analytics (Football, Sports, Marketing, Sales, Demographics)