Recently I had a dashboard that included 6 data sources, 5 of them being Hadoop extracts, the last one being an Excel file used to make a waterfall chart. It was just a singular column numbered 1-20 in column A that I used to blend in another data source. This Excel files in my workbook was causing my scheduled refreshes on Tableau Server to fail. After consulting with Tessellation’s Tableau Server expert, Aidan Bramel, there are two normal fixes to this issue:
- Putting it on a shared drive that the tableau service account (s-tableau-p for prod) and connecting to it using the UNC path
- Putting it on One Drive and using the One Drive connector
In this blog post you’ll learn about a third way to work around this issue that does not involve either of the solutions above.
Understanding the Issue
First, we have to understand our problem. Tableau Server will not allow you to schedule a refresh on an embedded data source within a workbook. Why not make it a published data source you say? Well, in this instance the data is a unique join that is being used for a pretty specific use case, so it’s not something I’d want to publicize as the data isn’t easily understandable. This leaves me wanting to have the data sources embedded within the workbook, not published as certified data sets on Server. But I still want to schedule them to refresh. Setting a refresh schedule on an Excel workbook usually causes the refresh to fail….until now!
In the screenshot above you can see all of my data sets including the Excel file that I’m using for the waterfall chart. All of the other extracts were refreshing properly with the exception being the Excel file. Originally I had scheduled all of the extracts to refresh in 4 hour intervals as the data we were trying to capture was supposed to be as real time as possible. As you can see in the screenshot below, those data refreshes were failing every time. This was due to the Excel file.
Since the Excel file, when zipped up in the workbook, was being saved to a temporary location on my laptop, Tableau Server was not able to refresh the file. Then came my first attempt at a work around.
What if I copied in from clipboard just the column I needed to my workbook?
This did not work when I tried to extract the source and publish it, as the temporary file was still being stored on my laptop. When a refresh would happen the data source would point to the file on my laptop and cause the refresh to fail.
What if I created a stale Tableau extract, connected my workbook to the extract as a new data source, then replaced data source and published?
Yes. This works. So the workaround is this:
- Create an extract of your Excel file and place it somewhere on your local drives.
- Add data source, and choose the newly created excel extract as your data source
- Replace data source so that the waterfall calculations are now being driven off of the stale extract (I call it stale because it’s not actually connected to a data source at this point)
- Deleted old excel file from your workbook
- Publish data source to Server and schedule extract refreshes
Since Tableau Server will read this as an extract that is not necessarily pointed to any data source, then this refresh will occur without a problem and your workbook will be able to refresh on time. Now, I know that this will only work on Excel files that do not need to change on a regular basis, but in this case the file is static. Also, I find when this issue arises, often the Excel sheet is being used for a mapping or a supplementary data source that will not change often. This will allow you to create a dashboard with scheduled extract refreshes while pulling from an Excel file.
I do not endorse using Excel files as data sources, but when the need is there, hopefully you can use this work around to complete your dashboard refreshes.
Thanks for reading! If you have any questions, comments, or concerns please feel free to email me at firstname.lastname@example.org.
Credit to Mark Boone for helping me with this solution