3 Tips for Data Ingestion

Here are three tips for ingesting your data into Tableau!

Join Calculations

Oftentimes when joining data from several different sources you will run into a time when two fields that seemingly should join, don’t. This is commonly due to the two fields not being in the same format. Let’s say we’re trying to join a data set on Invoice Number, but in data set A the Invoice Number shows as an integer, “1”, and data set B has the Invoice Number field showing as text, “01”. When you try and join these fields, you will get an error and your circle on the join in Tableau will turn red.

The way to get around this without having to insert another field is to use a Join Calculation. Click on the join circle that has an error and scroll down to the bottom until you see the option to “Create Join Calculation.” Click on this option and a screen will pop up.

If you are trying to get your field to show as an integer so that all Invoice Numbers show as an integer, use the INT() function to do this. If you are wanting to convert to string use the STR() function.

Once both of your fields are in the same format your join should now be completed.

Custom SQL Field Naming

When bringing in your data to Tableau via a Customer SQL query, it is good to name your fields in a way that make them easy to group later. Instead of naming fields as FieldName_Topic, I name them as Topic_FieldName. This way, when the data comes into Tableau it’s that much easier to create folders and group the fields coming from the same table since they will sort alphabetically on the Topic name.

a.InvoiceNo as ‘Invoice_InvoiceNo’
a.InvoiceAmt as ‘Invoice_InvoiceAmt’
b.InvoiceNo as ‘InvoiceDetails_InvoiceNo’
b.InvoiceItem as ‘InvoiceDetails_InvoiceItem’
b.InvoiceAmount as ‘InvoiceDetails_InvoiceAmount’
FROM a.Invoice
LEFT JOIN b.InvoiceDetails ON a.InvoiceNo=b.InvoiceNo

Once you bring this query into Tableau you will be able to group these tables together in folders in order to keep your query fields organized.

Hiding All Unused Fields

When you are done with your viz, make sure to go back in your extract and click ‘Hide All Unused Fields.’ This will help immensely with performance and rendering on Server. One thing to remember is that when you are using an extract, each time you hide/unhide fields in your extract, the extract will be rerun each time.

As an example, I am wanting to publish a draft of my dashboard to Server so I go in and hide all unused fields before creating my extract and uploading to Server. I realize that I forgot to include a metric that my stakeholder needed, so I go in and realize a field needed for my calc was hidden. Once you unhide this field and try to go back into the workbook to make the necessary changes, your extract will need to be recreated.

Long story short, be careful in hiding/unhiding your fields so that you don’t waste a bunch of time.

If you have any comments of questions about this post please make sure to send me a DM on Twitter (@jsbaucke) or email me at spencer.baucke@gmail.com.

Leave a Reply