Hey all, thanks so much for making the SportsVizSunday session such a fun time this past week in Berlin! It was so fun to kick off the conference with such an engaged and excited group of attendees. It was truly a pleasure for Simon, James, and me to help facilitate this session which was hopefully full of learning and fun.
Because time flies when you’re having fun, we didn’t get through all the solutions to our data viz making session (Round 2 of the presentation), so I wanted to walk through them in a little more detail here. The viz that participants were encouraged to recreate, à la #WorkoutWednesday, was one of my vizzes on the 500 Home Run Club.
There are 3 main elements to the viz, of which we highly encouraged participants to choose ONE to complete in the 20 minutes. We were extremely impressed with the amount of work that the members in the session got done, but if you didn’t finish and wanted to know how something was created, here is how I did them.
*As a side note, this is how I created them; that doesn’t mean it’s the right or the best way.
Since I only wanted to analyze three of the players in particular, I created a Set that included Barry Bonds, Sammy Sosa, and Mark McGwire so I could reference this set in my calculations. This step chart was created using a continuous dimension for the columns and a basic running total calculation of the number of records (each row was a HR).
To get the dots to appear only at the ends of the lines for the Big 3 home run hitters being analyzed, we need to use a table calculation, which I have listed below. One of our session participants noted that you can also do this same thing with a much easier LAST() function, but this is the way I originally did it.
(IF WINDOW_MAX(RUNNING_SUM(SUM([Number of Records]))) = RUNNING_SUM(SUM([Number of Records])) THEN RUNNING_SUM(SUM([Number of Records])) ELSE NULL END)
The way I think about this calc is that it says: if the max running sum in the window is equal to the running sum at that point on the x axis, then return that running sum total, else return NULL. When you combine this field with your original running sum field via a dual axis, it will plot just the point where the line has reached its max amount. There is no such thing as a negative home run, but theoretically, if this were data that did not always increase in value, the dot would sit at the line’s peak rather than at its end.
To get the colors right, drag your Set to the Color marks card. You can then right click on whichever field you wish to call out and select Assign Highlight Colors to Palette. In order to have just the dots for the Big 3 show up, right click the Out category on the Color Legend and select Hide Data. This will hide any data that is not in your set, in this case, the end-of-line dots for everyone outside the Big 3.
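If it helps to see that logic outside of Tableau, here is a minimal Python sketch of the two approaches. It is only an illustration, not the Tableau implementation: the function names and the sample numbers are made up, and the LAST()-style version assumes the participant's suggestion amounts to "keep only the final mark in the partition."

```python
from itertools import accumulate


def end_dot_by_max(hr_counts):
    """Mimic the WINDOW_MAX approach: keep the running total only
    where it equals the largest running total in the window."""
    running = list(accumulate(hr_counts))
    if not running:
        return []
    window_max = max(running)
    return [r if r == window_max else None for r in running]


def end_dot_by_last(hr_counts):
    """Mimic a LAST()-style check (assumed): keep the running total
    only on the final mark, whatever its value."""
    running = list(accumulate(hr_counts))
    last = len(running) - 1
    return [r if i == last else None for i, r in enumerate(running)]


# Hypothetical HR counts per point on the x axis (not real player data).
season_hrs = [16, 25, 24, 19]
print(end_dot_by_max(season_hrs))    # [None, None, None, 84]
print(end_dot_by_last(season_hrs))   # [None, None, None, 84]

# Hypothetical data that does not always increase: the two approaches differ.
up_and_down = [5, 3, -4, 1]          # running totals: 5, 8, 4, 5
print(end_dot_by_max(up_and_down))   # [None, 8, None, None]
print(end_dot_by_last(up_and_down))  # [None, None, None, 5]
```

For home runs the two versions agree, because the running total never decreases. The second example shows why the max-based version would put the dot at the peak instead of the end for data that can go down.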
Connected Dot Analysis
In order to compare the average number of HRs hit by each group of players at specific ages, I created the two calcs detailed below.
Avg Big 3: SUM(IF [Set 1]=TRUE THEN [Number of Records] ELSE NULL END)/COUNTD(IF [Set 1]=TRUE THEN [Player] ELSE NULL END)
Avg Field: SUM(IF [Set 1]=FALSE THEN [Number of Records] ELSE NULL END)/COUNTD(IF [Set 1]=FALSE THEN [Player] ELSE NULL END)
I then used a Measure Values dual axis, setting one axis to lines and the other to circles. Dragging Measure Names onto the line’s Marks card connects the Measure Values. I then created a field that color codes whether the Big 3 or the rest of the field hit more homers at that age.
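As a rough sketch of what those averages and the color-coding field work out to, here is the same idea in Python. The sample rows are invented for illustration (one row per home run, as in the data set), the function name is hypothetical, and the color rule is an assumption since the exact field isn't shown above.

```python
from collections import defaultdict

BIG_3 = {"Barry Bonds", "Sammy Sosa", "Mark McGwire"}


def avg_hrs_by_age(rows, in_set):
    """rows holds one (player, age) tuple per home run.
    Returns {age: HRs / distinct players} for players whose Big 3
    membership matches in_set, like SUM(...)/COUNTD(...) in the calcs."""
    hr_count = defaultdict(int)
    players = defaultdict(set)
    for player, age in rows:
        if (player in BIG_3) == in_set:
            hr_count[age] += 1
            players[age].add(player)
    return {age: hr_count[age] / len(players[age]) for age in hr_count}


# Made-up sample: each tuple is one home run hit at that age.
sample_rows = (
    [("Barry Bonds", 36)] * 3
    + [("Sammy Sosa", 36)] * 1
    + [("Hank Aaron", 36)] * 2
    + [("Babe Ruth", 36)] * 4
)

avg_big3 = avg_hrs_by_age(sample_rows, in_set=True)    # {36: 2.0}
avg_field = avg_hrs_by_age(sample_rows, in_set=False)  # {36: 3.0}

# Assumed color rule: which group averaged more HRs at each age.
leader = {
    age: "Big 3" if avg_big3[age] > avg_field.get(age, 0) else "Field"
    for age in avg_big3
}
print(avg_big3, avg_field, leader)
```

Note that, like COUNTD on rows of home runs, this only counts players who hit at least one HR at a given age.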
The labels are above the top dot and below the bottom dot so that the viz itself does not get covered up by text. I do this by clicking the Show Mark Labels box in the Label card. For the dots I chose the max value to be shown and for the line I chose the min values to be shown.
Small Multiple Chart
Using a discrete bin of the ages and Ryan Sleeper’s trellis chart formulas, I created a small multiple chart with one slight variation. Instead of using just one measure, in this instance the number of HRs, I added ATTR(1) as a second measure. This creates another row, but since the 1 plots against all other values in the calculations, it creates a horizontal line. If you go to Color for that measure and set the Opacity to 0%, you can then use that space to drag fields to the Text card to create labels underneath each section of the trellis chart.
y-axis: INT((INDEX()-1)/INT(SQRT(SIZE())))
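To see how that formula lays the panels out, here is a small Python sketch. The row value follows the y-axis calc above. The column value is an assumption on my part (the usual companion to this kind of formula is the remainder of the same division), and the function name is made up.

```python
import math


def trellis_positions(n_panels):
    """Return a (row, column) pair for each panel, numbered 1..n_panels.
    row mirrors INT((INDEX()-1)/INT(SQRT(SIZE()))).
    column is assumed to be (INDEX()-1) % INT(SQRT(SIZE()))."""
    if n_panels < 1:
        return []
    per_row = int(math.sqrt(n_panels))  # INT(SQRT(SIZE()))
    return [
        ((index - 1) // per_row, (index - 1) % per_row)
        for index in range(1, n_panels + 1)
    ]


# Nine panels fill a 3 x 3 grid.
print(trellis_positions(9))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

# Ten panels still use 3 per row, so the tenth starts a fourth row.
print(trellis_positions(10)[-1])  # (3, 0)
```

When the number of panels is not a perfect square, the last row is only partly filled, which is worth keeping in mind when sizing the bins.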
The other elements in here are reference lines that show the median number of home runs hit by each player in their careers and max number of HRs hit by age of the player.
I hope these solutions are helpful for anyone who was trying to recreate this during our session. Thanks again to all who came and made this session a special one. If you have any other questions, comments, or feedback about this presentation, make sure to reach out to me or one of the SportsVizSunday guys so we can help you out. Thanks!