December 17, 2014

Henry or Shearer? Who is the greatest Premier League Striker?

December 16, 2014 will always be remembered in Arsenal lore as the day Thierry Henry officially retired from football.  He's been away from Arsenal for the most part since he left for Barcelona after the 2006-07 season. At that point in his career, he had scored 174 Premier League goals, getting one more in January 2012 in a brief return, closing his Premier League account with 175 goals.

In a tribute to The King, the Premier League created this great compilation video.

Henry's retirement led to the inevitable debate: who is the greatest Premier League striker ever? Most people agree that Thierry Henry and Alan Shearer are in a class by themselves. The great thing about sport is that it leads to lots of opinions and great debates. There are Henry camps and there are Shearer camps.

I gathered their career data from Wikipedia (here and here) and combined it into a single spreadsheet here. I built this quick viz in Tableau to allow you to answer the question for yourself. Download the workbook here.

December 16, 2014

College Football's Richest Teams

As a follow-up to my post about how rich the University of Alabama football team is, I wanted to show a quick overview of the top 10 richest teams. Note that, despite making $47.1M in revenue in the 2012-2013 season, Alabama only ranks 6th.

The data comes from I would encourage you to read their article if you're curious to know where all of this money comes from. It's pretty fascinating just how big of a business college football has become.

Initially, I was going to create the viz for this post in Tableau, but I was reading the Interactive Inspiration article on Visualoop and saw this ad for

The ad reminded me of Datawrapper, so I thought I would give it a try. I must say I was super impressed with infogram's simplicity. In just a couple of minutes, I registered for an account and had this nice interactive chart.

There are three things I wish I could do with infogram that I can't out of the box:
  1. Auto-sort the bars based on the metric selected 
  2. Color the bars by a dimension, conference in this case; this would allow me to highlight that 5 of the top 10 teams are in the SEC. 
  3. Hide the axis; I don't need it on this view since I'm labeling the bars directly.
My advice: Keep your eyes open and continue to try new tools.  It's fun to learn new things!

December 15, 2014

Makeover Monday: ESPN is biggest reason cable TV isn’t going to die anytime soon

I'm a HUGE sports fan and love to watch live sports, which means I love ESPN. This also means I'm tied to cable or satellite TV since ESPN does not broadcast online without a subscription. Today's Makeover Monday takes a look at ESPN's broadcast rights to major sports.

Consider this simple bar chart by Cork Gaines of Business Insider:

Seems simple enough, right? Simple, yes. But is it 100% truthful? No.
  • I'm not convinced the data is accurate. The rights listed on Wikipedia don't align, but you shouldn't assume Wikipedia is 100% accurate either.
  • The very first bar bugs me. How can you bucket a bunch of sports into NCAA when each sport has a separate contract?
  • Using a bar chart assumes that all of these contracts started in 2010, which is not the case.
  • This chart shows that the college football playoff contract started in 2010, yet the CFB playoffs don't start until 2015. This is clearly wrong.
I was torn between a couple of alternatives, so I'll show you both of them. If you want the data, you can download it here. You can download the Tableau workbook here.

My first alternative was to make a Gantt chart so that I could see the entire length of the contracts. In this view, I've updated the timeline to go back to 1981.

My second alternative is simply a dot plot view of the original chart. This view keeps the timeline starting at 2010.

Which do you prefer? I'm torn.

December 11, 2014

Makeover Monday (on Thursday): How one of the richest teams in college football makes & spends its money

In late November, Cork Gaines of Business Insider wrote about how the University of Alabama football team makes and spends its money. The two articles were accompanied by these two pie charts:

There are many issues with these charts:
  1. They are pie charts, which makes comparing the slices difficult.
  2. The slices are not labeled with the amount, so I have to make some guesses as to their contribution to the whole and then multiply that by the total shown in the titles. That's way too much work.
  3. There are categories missing from the source data.
  4. Why are these in two separate articles since they are a related story? Why aren't they combined in a single story?
  5. There's no mention of the profit the football team turns.
I could go on, but I'll stop there. One of the great things Cork does in his articles is link to the source data. This allowed me to download the data from this website and build my own visualization.

NOTE: This is an update of the original chart based on feedback from Nelson Davis. Nelson suggested making the bar sizes relative across the charts, which my first version failed to do.

First, I need to give a special thanks to Emily Kund for reviewing this viz and providing some great feedback. For example, it was her idea to use the Alabama official colors in the viz.  Thanks Em!

In my version of the viz I wanted to:
  1. Bring the revenue and expense data into the same view
  2. Provide a high-level overview, including profit
  3. Rank the categories in descending order, except for "Other", which I prefer to place last in the sort
  4. Include the actual amounts by labeling the bars
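
The "descending, but Other goes last" ranking from item 3 is easy to get wrong, so here is a minimal sketch of the idea in Python rather than Tableau. The category names and amounts are hypothetical placeholders, not the real Alabama figures.

```python
# Hypothetical category totals in millions of dollars (NOT the real Alabama data).
revenue = {
    "Ticket sales": 30.0,
    "Other": 12.5,
    "Contributions": 20.0,
    "Media rights": 9.0,
}


def rank_with_other_last(amounts, other_label="Other"):
    """Sort categories by amount, largest first, but always put `other_label` last."""
    return sorted(
        amounts.items(),
        # False sorts before True, so every real category precedes "Other";
        # within each group the negative amount gives descending order.
        key=lambda item: (item[0] == other_label, -item[1]),
    )


ranked = rank_with_other_last(revenue)
for category, amount in ranked:
    print(f"{category:<15} ${amount:.1f}M")
```

Note that "Other" lands at the bottom even though it is larger than "Media rights" in this made-up data, which is the whole point of the custom sort.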
Thoughts? Which do you prefer? Why? You can build your own by downloading the Tableau workbook used to create this viz here or view it on Tableau Public here.

December 10, 2014

Using data blending to compare a superset of the same data source

Yesterday at work, I received the following scenario that one of our users was stuck on:
There is a list of products that the user can filter. The monthly sales for each product need to be compared to the median of all products across all of the months, but the median should recalculate if the user filters the region and/or priority. The end result should be a "Relative Sales" calc that is the sales for each subcategory divided by the median of all sales.
The trick here is that the Subcategory filter cannot impact the median calculation. The steps below could also easily be applied to a sum, min or max. I'm demonstrating a median because that was the question at hand.

First, consider that you have a view like this, which is sales by month by subcategory, with quick filters for region, order priority and subcategory.

Next, create the median calculation and add it to the view. Remember, the median should account for the entire view.

I've added Median to the view and changed the Compute using to Table (Across then down).

Duplicate the data source by right-clicking on it and choosing Duplicate.

You should now see a copy of the data source in the data window. Go to your secondary data source and drag Median into the view and change the Compute using to Table (Across then down).

Sweet! We're almost done!

Create a calculated field in your primary data source for the Relative Sales. This is going to use Sales from the primary data source and divide it by the Median Sales from the secondary data source.

Change the default format of this new field to percentage. Choose the right level of precision for your data. Add Relative Sales to the view and again change the Compute using to Table (Across then down).

It might look like we're done now, but we're not.  If I filter by Subcategory at this point, my median changes, which I don't want. To make this work, we have to tell Tableau which fields to use in the blend.

In the Dimensions list, we want Tableau to blend on Order Date, Region and Order Priority, but not Subcategory, so click/un-click the link icons as appropriate.

Wait! What just happened? Our median has changed! Why? Is that right? Yes, that is correct, because we removed Subcategory from the blend, therefore when Tableau calculates the median, it's no longer considering Subcategory in the calculation, which is what we want.
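
If it helps to see the logic outside of Tableau, here is a rough Python sketch of what the blend is doing. It rests on two assumptions: the rows are made-up sample data, and the median is taken over monthly sales totals once Subcategory is out of the blend, which is one reading of the behavior described above. The key idea is that Region and Order Priority filter both the sales and the median, while Subcategory filters only the sales.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (month, region, priority, subcategory, sales)
rows = [
    ("2014-01", "East", "High", "Chairs", 100.0),
    ("2014-01", "East", "High", "Tables", 300.0),
    ("2014-02", "East", "High", "Chairs", 200.0),
    ("2014-02", "East", "High", "Tables", 400.0),
    ("2014-03", "East", "Low", "Chairs", 150.0),
    ("2014-03", "West", "High", "Tables", 500.0),
]


def relative_sales(rows, regions, priorities, subcategories):
    """Return (median, {(month, subcategory): sales / median})."""
    # Region and priority apply to BOTH the numerator and the median.
    shared = [r for r in rows if r[1] in regions and r[2] in priorities]

    # "Secondary source": monthly totals that ignore the subcategory filter.
    monthly_totals = defaultdict(float)
    for month, _, _, _, sales in shared:
        monthly_totals[month] += sales
    med = median(monthly_totals.values())

    # "Primary source": sales by month and subcategory, subcategory filter applied.
    cells = defaultdict(float)
    for month, _, _, subcat, sales in shared:
        if subcat in subcategories:
            cells[(month, subcat)] += sales

    return med, {key: value / med for key, value in cells.items()}


med_all, rel_all = relative_sales(rows, {"East"}, {"High"}, {"Chairs", "Tables"})
med_chairs, rel_chairs = relative_sales(rows, {"East"}, {"High"}, {"Chairs"})
print(med_all, med_chairs)  # the median is the same with or without the subcategory filter
```

Changing the region or priority sets changes the median, but narrowing the subcategory set does not, which mirrors the requirement in the scenario.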

Now I need to make this look a little prettier. I'm going to change it to a line chart and allow the user to pick the Subcategory they want to highlight.

Download the workbook here.

December 1, 2014

Makeover Monday: What it feels like when a bar chart doesn't start at zero

This week's Makeover Monday is written by reader and friend Victor Blaer. Victor sent me this rant:

Victor provided the following two examples:

Victor's explanation about why these don't work:
The primary problem with these graphs is that they visually misrepresent the truth. They mislead the viewer by manipulating the data.
The primary misrepresentation involves the use of a non-zero baseline. If you look at the vertical axes on all of these bar graphs, you'll notice that none of them start at zero. This would be fine if lines or dots were used to encode the values, but because the length of the bar encodes its value rather than just the position of its endpoint in relation to the quantitative scale, the use of a non-zero baseline doesn't work. By starting the quantitative scale above zero, the relative differences in the values represented by the bars have been exaggerated. 
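
To put a number on that exaggeration, here is a small Python sketch. The values 50 and 55 and the baseline of 45 are made up for illustration; they are not taken from Victor's examples.

```python
def bar_length_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)


# Hypothetical values: 55 is 10% larger than 50.
true_ratio = bar_length_ratio(50, 55)                # axis starts at zero
drawn_ratio = bar_length_ratio(50, 55, baseline=45)  # axis starts at 45

print(f"zero baseline: second bar is {true_ratio:.2f}x the first")
print(f"baseline of 45: second bar is {drawn_ratio:.2f}x the first")
```

With a zero baseline the second bar is 1.10 times as long as the first, which matches the data. With the axis starting at 45, the same two values are drawn with one bar twice as long as the other.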

Additional references:

November 28, 2014

Tableau Tip: Conditional Axis Formatting Using an Axis Selector

Back in July 2012, I wrote about "Dynamic Axis Selections". The problem with this approach, though, was that it created a single axis, which only allows for a single format.  Reader Dave Andrade posted a different, yet related question in the comments.
Let's say we choose 3 metrics - Mail Volume, Spend, and Response Rate as part of our parameter metrics. That means we now have a regular number, a dollar value, and a percentage as part of our available measures. After building the case statement and building a chart similar to the one you've shown in this post, the best thing we can do to show the correct values (sans label of $ or %) on the y-axis is using the "automatic" number formatting option for the 'Measure chosen' measure. I know Tableau has made it much easier to add the label on the chart itself with the Label mark after v8.0, but what about the actual y-axis value? Is there any way to dynamically update the measure value's axis label to reflect it's true number format? 
While the direct answer to Dave's question is no, there isn't a way to dynamically update the format of the Measure Values pill, there is a workaround using containers, which can give the perception of conditional formatting. Here is the final output of the technique I've used. Read farther down for detailed step-by-step instructions, plus a video.
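
To make Dave's problem concrete, here is a small Python sketch of the behavior he is asking for: one selected metric, three different number formats. This is not the container workaround itself, and the format strings are illustrative assumptions; only the metric names come from Dave's comment.

```python
# Illustrative format rules keyed by the metric names in Dave's comment.
# The exact formats (decimals, separators) are assumptions.
FORMATS = {
    "Mail Volume": "{:,.0f}",    # plain number
    "Spend": "${:,.2f}",         # dollar value
    "Response Rate": "{:.1%}",   # percentage
}


def format_for_metric(metric, value):
    """Format `value` the way the selected metric's axis should display it."""
    return FORMATS[metric].format(value)


print(format_for_metric("Mail Volume", 12500))
print(format_for_metric("Spend", 1234.5))
print(format_for_metric("Response Rate", 0.034))
```

A single axis can carry only one of these formats at a time, which is why the workaround relies on separate, conditionally shown views instead.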

Download the workbook here.

November 27, 2014

My new website focusing on data viz best practices

This week on the Tableau Wannabe Podcast, Emily and Matt were joined by Andy Cotgreave, Tableau's Social Content Manager. When they were discussing the Viz of the Day content, one of the things they talked about was that VotD isn't necessarily about best practices and that there was a need to highlight great content.

This led me to create a new website, which I will use to highlight examples of data viz best practices I find around the web. The content will not be exclusively Tableau focused, as there is tons of great content outside the Tableau community as well.

The first post is up. Go check it out here.

November 17, 2014

Makeover Monday: The Facebook Election

From Buzzfeed: "The social network (Facebook) may end TV’s long dominance of American politics — and open the door to a new kind of populism." The purpose of this article was to demonstrate that conversations on Facebook are becoming a dominant force in the U.S. political landscape. The article included this infographic about Facebook users' sentiment towards 2016 Presidential candidates.

There are several things about this infographic that I don't like:
  1. The title doesn't tell us what the graphic is about.
  2. The pie charts: there are simply too many of them, which makes comparisons difficult.
  3. There's no apparent order to the candidates.
  4. Candidate names are in ALL CAPS...why?
  5. The label for the Democrats section is off to the right, why?
  6. Some of the pies add up to less than 100% and some to more than 100%.
Those are the immediate things that stuck out in my initial review. I recreated the data in Excel, imported it into Tableau and built my own infographic. Download the workbook here.

I believe I've made improvements to all of my concerns above:
  1. The title makes it more clear what you're looking at.
  2. I switched the pie charts to stacked bars.
  3. The candidates are ordered by positive sentiment.
  4. The candidates' names are easier to read since they're in proper case.
  5. The labeling for the two sections is aligned.
  6. Since I'm using stacked bars, the fact that some candidates' totals don't add up to 100% is irrelevant.
According to this sentiment data, it will be Condoleezza Rice vs. Joe Biden in 2016. I certainly am not looking forward to all of the political ads coming our way.

Thoughts? Which one do you prefer? What would you do differently?

November 9, 2014

Tableau Tip: KPIs and Sparklines in the Same Worksheet

I'm writing this blog post outside of a Starbucks in the Sao Paulo airport.  Sao Paulo you say? I'm in Brazil this week with four other folks from the San Francisco Bay Area TUG to help Tableau leaders in Rio de Janeiro and Sao Paulo get their own TUGs started. I see this as a way that I can give more than I take from the Tableau community. Yes, this is what I choose to do with my vacation time (it's a bit of a sickness) and no, Tableau doesn't pay me to do this.

Anyway, as I was on the plane, I thought it would be great to kick off the week with a new tip. Today, I'm writing about combining KPIs and sparklines in a single view. It's very common for business users to want to see KPIs and trends in the same view. These give them a sense for the overall direction of their product and also highlight the most meaningful numbers to them. I often see people create these as separate worksheets in Tableau, but with this post, I'm going to show you how to combine them into a single view.

Combining them into a single view provides a couple of benefits:
  1. Tableau only needs to render a single sheet, so until parallel processing comes out in v9, you'll see a performance benefit.
  2. If you have a hierarchy, then expanding the hierarchy will keep the table and the sparklines together. 
This example is using a very simple data set of daily volume for several stocks.  My KPIs include:
  1. Trading volume for the last 7 days
  2. Trading volume for the prior 7 days
  3. Week over week change (raw & %)
This should be a fairly typical set of KPIs for most products. You could easily expand this technique to include m/m or y/y calculations depending on how your organization calculates those.  Here's the final solution, with details on how to create this view below.
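
The arithmetic behind the three KPIs above can be pinned down in a few lines of plain Python. This is only a sketch of the calculations, not the Tableau build: the function name, the dictionary layout and the sample volumes are all hypothetical.

```python
from datetime import date, timedelta


def weekly_kpis(daily_volume, as_of):
    """Compute the KPIs from a dict of {date: volume}; `as_of` is the most recent day."""

    def window_total(end, days=7):
        # Sum the `days` days ending on `end` (inclusive); missing days count as 0.
        return sum(daily_volume.get(end - timedelta(days=i), 0) for i in range(days))

    last_7 = window_total(as_of)
    prior_7 = window_total(as_of - timedelta(days=7))
    change = last_7 - prior_7
    # Guard against dividing by zero when there was no volume in the prior week.
    pct = change / prior_7 if prior_7 else None
    return {"last_7": last_7, "prior_7": prior_7, "wow_change": change, "wow_pct": pct}


# Hypothetical data: 14 days of steadily rising volume for one stock.
start = date(2014, 10, 27)
volumes = {start + timedelta(days=i): 100 + 10 * i for i in range(14)}

kpis = weekly_kpis(volumes, as_of=start + timedelta(days=13))
print(kpis)
```

Swapping the 7-day windows for calendar months or years would give the m/m and y/y variants mentioned above, subject to however your organization defines those periods.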