A common data request is for the business to ask IT for a dashboard on a certain piece of data. Such a request is usually motivated by a need to understand some aspect of business operations more clearly and to act more quickly and with greater confidence when certain conditions occur. For instance, it could be a request to understand how certain rebates and discounts cut into sales revenue, or to analyze travel spend relative to seasonal patterns.
This often leads to a project cycle: the business gets the dashboard prioritized, time is spent defining the various metrics to be shown, IT or consultancies pick up the implementation, and eventually a dashboard is delivered. The process repeats, each step seems good and necessary, many dashboards are delivered -- but then large businesses often find that the dashboards have proliferated beyond anyone's understanding.
Before long, one has to consult several dashboards to solve simple problems, cross-referencing information between them. It's clear that the data exists somewhere, but connecting it all is a gargantuan task. Consolidation efforts start calling for hundreds of tools to be pared down to "merely" dozens. IT teams face months of cleanup work to pull together a common view before they can get back to helping the business grow. Everything appears to bog down.
What has happened? The organization has bound data to UI.
The initial pitfall is that visual metaphors are attractive because they open up a class of data to a wide audience, so it's natural to approach each data need in terms of a new dashboard. In addition, technical teams are often keen to build from scratch. App-building technology has evolved to the point where putting up an initial dashboard as a custom app is significantly easier than it was even a decade ago. Thus, many dashboards get built from scratch in Python and HTML/CSS/JavaScript. Initially, this leads to a quick development and iteration cycle, but it often results in a completely custom system that requires ongoing engineering effort.
Once the iteration gets past the issues that are easy to communicate, like presentation and visualization, the harder questions of governance, security, caching models, and scalability set in. Tools like Tableau and Qlik have provided useful generic platforms for delivering these types of dashboards quickly and securely, although they have not solved the scalability issue. Still, there are known patterns of combining these front-end systems with scalable back ends to achieve a complete solution.
The deeper issue is the data itself. Fundamentally, what the business is trying to do is reason about the events taking place in the real world, presented in an easy-to-understand, digital form. Over time, different aspects of the business need to be correlated with each other in unexpected ways. Is it possible that the travel patterns of the sales organization correlate with certain deal behaviors? Perhaps it had never previously occurred to anyone to correlate expense reports of a certain type with refunds or rebates of a certain other type. It would have been natural for that type of information to end up baked into different dashboards. Now combining that information is hard.
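To make the point concrete, here is a minimal sketch of the kind of cross-domain question described above. Every dataset, table, and field name here is hypothetical; the point is simply that the join is only possible at all because both datasets happen to share a common key (a region and a quarter), which is exactly the shared vocabulary that siloed dashboards tend to lose.

```python
# Hypothetical extracts from two separate dashboards' underlying data.
travel_expenses = [
    {"sales_region": "EMEA", "quarter": "2024Q1", "travel_spend": 120_000},
    {"sales_region": "AMER", "quarter": "2024Q1", "travel_spend": 95_000},
]

rebates = [
    {"sales_region": "EMEA", "quarter": "2024Q1", "rebate_total": 40_000},
    {"sales_region": "AMER", "quarter": "2024Q1", "rebate_total": 12_000},
]

# Index one dataset by the shared (region, quarter) key, then join.
rebates_by_key = {
    (r["sales_region"], r["quarter"]): r["rebate_total"] for r in rebates
}

combined = [
    {
        "sales_region": e["sales_region"],
        "quarter": e["quarter"],
        "travel_spend": e["travel_spend"],
        "rebate_total": rebates_by_key.get(
            (e["sales_region"], e["quarter"]), 0
        ),
    }
    for e in travel_expenses
]
```

When the two datasets instead live inside two different dashboards with incompatible definitions of "region" or "quarter", even this trivial join becomes a project of its own.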
What's the fix? There is no silver bullet, but a big step is to realize that over time, organizations need to build up a common language for talking about the elements of the business. That entails being able to read and write precise questions using a common vocabulary, and learning how to talk about events in far-flung parts of the organization.
The trick is that this vocabulary needs to be a living, evolving element of the business, rather than a fixed, predetermined ontology built through a massive a priori data warehouse or master data management exercise that remains frozen going forward. This requires loose coupling between the layers that do the visual analysis, the computation, and the storage. In order to keep up, organizations need to flexibly capture the distributed institutional knowledge of how to use each data source, without necessarily requiring slow and potentially expensive face-to-face communication to achieve that.
The answer lies in being able to see how experts access the data they are using. Non-experts need to be able to learn from those examples, because the manual is always out of date and the experts themselves are always fully booked. Systems that help track the data flow within and across organizations are therefore growing in importance, whether they expose the raw SQL or summarize the same information at a higher level in some other way.
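One way to picture this is a query log that records not just who ran what, but which data sources each query touched, so that a non-expert can pull up prior expert examples for a given source. This is only a sketch of the idea under assumed names; it is not a reference to any particular lineage or cataloging tool.

```python
from datetime import datetime, timezone

# A running log of queries; each entry notes the author, the SQL text,
# and the data sources the query touches.
query_log = []

def log_query(user, sql, sources):
    """Record a query together with the data sources it reads."""
    query_log.append({
        "user": user,
        "sql": sql,
        "sources": list(sources),
        "at": datetime.now(timezone.utc).isoformat(),
    })

def examples_for(source):
    """Surface prior queries against a given source, so others can
    learn from the experts' examples."""
    return [q for q in query_log if source in q["sources"]]

# A hypothetical expert query against a hypothetical "rebates" table.
log_query(
    "alice",
    "SELECT sales_region, SUM(rebate_total) FROM rebates GROUP BY sales_region",
    ["rebates"],
)
```

Real systems layer access control, parsing, and summarization on top of this, but the core value is the same: the institutional knowledge of how to use each source is captured as it is exercised, rather than written down after the fact.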
This is a big topic and deserves some follow-ups. More to come. As always, please do send thoughts/comments/questions.
In the 1960s, Edsger Dijkstra wrote the heartfelt "Go To Statement Considered Harmful," which inspired many other "XYZ Considered Harmful" essays. The title was intended to challenge orthodox views on a topic. This is a lighthearted series of posts in that vein.