Self-service BI

Review: Tableau makes sophisticated analysis a snap

Innovative, self-service, interactive data visualization tool makes quick work of exploratory data analysis from 50-odd sources

At a Glance
  • Tableau Software Tableau 9.0

Data science is best left to Ph.D. statisticians who can program in R and Python, compose complex SQL and MDX queries in their sleep, and leap tall Hadoop data sets in a single bound. Right?

Not according to Tableau. The company claims its products make analytics easy not only for analysts but for “executives, IT, everyone.” While my own training (a doctorate in physics with lots of statistics, SQL, and programming experience) is more than adequate for conventional data science, I tested Tableau 9.0, hot off the presses, with a “beginner’s mind.”

Tableau is considered the market leader in the BI and analytics space, having usurped Qlik, which in turn displaced Cognos (now IBM Cognos) and the other first-generation enterprise BI tools. Tableau is a prime exemplar of the business-user-driven data discovery and interactive analysis trend in BI that has largely taken over from traditional IT-driven reporting and analytics.

New in Tableau 9.0

If you’re already familiar with Tableau, you might want to know what’s new and different in the latest version. The two big areas of improvement:

  1. The tool is smarter about what you are doing.
  2. It's faster to process data and show you analyses.

In the area of “smarter,” Tableau 9.0 has a much better start experience, both on the desktop and on the server, with easy access to your workbooks, data connections, training, and shared visualizations (Figure 1). It offers stories, which I will discuss later on, and it shows preview thumbnails for your sheets, so you can insert the right sheet into a dashboard or story. It adds ways to easily create analytics, zoom into your data, and create level-of-detail expressions. Smart maps include geographic search and census data. Data preparation does more with whatever format you happen to have in your data source, without requiring you to reformat your spreadsheets.

Figure 1: Tableau 9’s Welcome screen offers easy access to data connections, your workbooks, training, and resources.

In the area of “faster,” Tableau 9.0 consolidates queries and aggregates in parallel; sends the queries for all independent views to the database in parallel; fuses multiple queries at the same level of detail into a single query; and caches query results to avoid rerunning them in the same session. The difference is night and day: 20 seconds versus five minutes for a moderate-sized project.
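
To make the last two optimizations concrete, here is a minimal Python sketch of query fusion and session caching. It is not Tableau's implementation; the `SessionQueryCache` class, the `fuse` helper, and the `sales` table are hypothetical names invented for illustration, and SQLite stands in for whatever database sits behind the workbook.

```python
import sqlite3


class SessionQueryCache:
    """Hypothetical session cache: each distinct SQL string hits the database once."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.executed = 0  # number of queries actually sent to the database

    def run_query(self, sql):
        if sql not in self.cache:
            self.cache[sql] = self.conn.execute(sql).fetchall()
            self.executed += 1
        return self.cache[sql]


def fuse(table, group_by, measures):
    """Combine several aggregates at the same level of detail into one query."""
    cols = ", ".join(measures)
    return (f"SELECT {group_by}, {cols} FROM {table} "
            f"GROUP BY {group_by} ORDER BY {group_by}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 10.0), ("East", 30.0), ("West", 5.0)])

session = SessionQueryCache(conn)
# Two views want SUM and AVG by region: same level of detail, so one fused query.
sql = fuse("sales", "region", ["SUM(amount)", "AVG(amount)"])
first = session.run_query(sql)
second = session.run_query(sql)  # served from the cache, no database round trip
print(first, session.executed)
```

The second call returns the cached rows, so only one query reaches the database even though the result is used twice.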

Choose your data source

Tableau Professional can connect to a wide assortment of file (Figure 2) and server data sources, including Excel workbooks, character- and tab-delimited files, statistical files, and upward of 40 server types, although 19 of those are only available from Windows. Tableau Personal is restricted to six kinds of data source; the free Tableau Public can only use four kinds of data source.

You can connect multiple data sources to a worksheet and create joins between tables and/or files. If you know the joined data has referential integrity, you can improve performance by telling Tableau to assume referential integrity.
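
Why would assuming referential integrity help? One plausible reading is that when every row in one table is guaranteed a match in the other, a query that only needs fields from the first table can skip the join entirely. The Python sketch below illustrates that idea with SQLite; the `customers` and `orders` tables and their contents are made up for the example, and it shows the general principle rather than the queries Tableau actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 60.0);
""")

# The same total, computed with and without the join to customers.
JOINED = "SELECT SUM(o.amount) FROM orders o JOIN customers c ON o.customer_id = c.id"
CULLED = "SELECT SUM(amount) FROM orders"


def totals(conn):
    """Return (total with join, total without join)."""
    return conn.execute(JOINED).fetchone()[0], conn.execute(CULLED).fetchone()[0]


# Every order points at a real customer, so skipping the join is safe.
joined_ok, culled_ok = totals(conn)

# Add an orphan order (customer 99 does not exist): integrity no longer holds,
# and the two queries now disagree.
conn.execute("INSERT INTO orders VALUES (13, 99, 100.0)")
joined_bad, culled_bad = totals(conn)

print(joined_ok, culled_ok, joined_bad, culled_bad)
```

The shortcut is only correct while the guarantee holds, which is why it makes sense as an option you enable deliberately rather than a default.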

Figure 2: Tableau 9 can read a wide range of files (shown above) and servers. New in this version is support for SAS, SPSS, and R data files; Apache Spark SQL servers; and regular expressions in calculated fields for PostgreSQL, TDE, Apache Hive, and Oracle.

It’s very common for raw data to be full of nulls, to have fields (especially name, date, and geographical fields) that aren’t quite in the right format for analysis, and (along the lines of the toast always falling jelly side down) to have the rows and columns reversed from where you need them. It’s also common for there to be a big title and subtitle in a spreadsheet that has been used for a presentation, which can mess up the usual assumption that the table starts with a row of column titles, then a block of actual data.

Tableau 9 handles all of that easily. It’s no problem to turn addresses into a hierarchy of country, state, city, and street address, and Tableau can infer latitude and longitude from addresses. It’s also easy to skip over any irrelevant material that precedes the real table without going back to Excel, and to pivot the rows and columns at any time during the analysis in Tableau.
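
For readers who have done this cleanup by hand, the two chores look roughly like the Python sketch below: find the real header row beneath a presentation title, then swap rows and columns. The sample spreadsheet text and the `read_table` and `transpose` helpers are hypothetical; Tableau does the equivalent interactively, and nothing here reflects how it works internally.

```python
import csv
import io

# A made-up export with a title and subtitle above the real table.
raw = """Quarterly Sales Report
Prepared for the board

Region,Q1,Q2
East,100,120
West,80,95
"""


def read_table(text, header_first_col):
    """Skip everything before the row whose first cell is the known header name."""
    rows = list(csv.reader(io.StringIO(text)))
    start = next(i for i, row in enumerate(rows)
                 if row and row[0] == header_first_col)
    # Keep the header and the non-empty rows that follow it.
    return [row for row in rows[start:] if row]


def transpose(table):
    """Swap rows and columns."""
    return [list(col) for col in zip(*table)]


table = read_table(raw, "Region")
flipped = transpose(table)
for row in flipped:
    print(row)
```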

Powerful analytics

Analysis, specifically ad-hoc analysis, is an area where Tableau shines. Once you have imported data, you can explore it using as many views into the data as you like.

Tableau analysis is a drag-and-drop process with property sheets, kind of like a Visual Basic for data scientists. As we can see in Figure 3, the data dimensions (fields used for classification) and measures (data values both primary and calculated) appear in the tab to the left. You can drag them into rows and columns, attributes and filters. By the way, this chart shows Case-Shiller home price index data; the geocoding and the U.S. map background were all generated by Tableau without any help from me.

The floating Show Me palette seen at the lower right has hints about what measures and dimensions each kind of display requires. If you add more rows and columns than required for individual charts, Tableau will automatically create a grid of smaller charts for you.

Figure 3: Setting up a Tableau worksheet is even easier than creating a graphic or pivot table in Excel, and it offers many more options both in terms of graphics and analytics.

If you'd prefer letting the viewer select one or more parameters for exploratory purposes -- for example, to see the evolution of a trend over time -- you can add a quick filter with a user control, such as the Year slider at the upper right. To experience the power of this, look at the “See how your home town compares” sheet of the CNBC Recovery Watch story, drag the date slider back to January 2011, and click the right arrow to step through the housing recovery month by month.

Every feature of Tableau has additional options, though the default settings are often pretty good. For example, the size of bubbles can be controlled in the Edit Sizes dialog that comes up when you double-click the bubble card, below the marks and colors cards to the left of the chart in Figure 3. In this particular case, the default was too small and uniform for the data and my taste, so I enlarged the bubbles and widened the range of sizes to be more visible. I couldn’t decide whether the mean HPI (home price index) or the median HPI was more meaningful for this particular chart, so I assigned one to size and one to color.

Color, size, and shape give you the ability to represent extra dimensions and measures on a chart in addition to the row and column measures. You can also do a lot with actions and tool tips.

Tableau has had the ability to do calculations for a long time; version 9 adds ad-hoc calculations, which make it easy to add and edit calculated fields in your analysis. Tableau expressions look a lot like Excel formulas, except they use field names rather than cell names. Tableau formulas will conveniently autocomplete as you type them.

The new Analytics pane lets you easily drag summary, model, and reference lines onto a worksheet. Exactly what analytics you can use depends on the chart type.
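
To give a sense of what such lines compute, the Python sketch below fits a straight-line trend by ordinary least squares and takes the mean as an average reference line. It assumes the dragged items are a linear trend line and an average line; the `trend_line` helper and the index values are invented for illustration and are not taken from the Case-Shiller data or from Tableau's own model.

```python
from statistics import mean


def trend_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


years = [2011, 2012, 2013, 2014]
index = [100.0, 104.0, 108.0, 112.0]  # made-up values, not real index data

slope, intercept = trend_line(years, index)
average_line = mean(index)  # height of a horizontal average reference line

print(slope, intercept, average_line)
```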

InfoWorld Scorecard

Tableau 9.0
  • Analytic power (20%): 9
  • Data sources (20%): 9
  • Presentation flexibility (20%): 9
  • Ease of use (20%): 9
  • Ease of learning (10%): 9
  • Value (10%): 8
  • Overall Score (100%): 8.9

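
The overall score is consistent with a weighted average of the category scores. A quick check in Python, assuming the percentages in the scorecard are the weights (the `scorecard` dictionary is just a convenient way to hold the published numbers):

```python
# Category: (weight in percent, score), copied from the scorecard.
scorecard = {
    "Analytic power": (20, 9),
    "Data sources": (20, 9),
    "Presentation flexibility": (20, 9),
    "Ease of use": (20, 9),
    "Ease of learning": (10, 9),
    "Value": (10, 8),
}

total_weight = sum(weight for weight, _ in scorecard.values())
overall = sum(weight * score for weight, score in scorecard.values()) / total_weight

print(total_weight, overall)
```

The weights sum to 100, and the weighted average works out to the published 8.9.
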
At a Glance

Pros

  • Supports a wide assortment of data sources
  • Provides a large selection of chart types
  • Provides excellent control over chart and dashboard appearance
  • Makes deep statistics available without writing code
  • Not hard to learn considering the complexity of the product

Cons

  • It isn’t cheap
  • It isn’t the best BI product for reporting purposes