Self-service BI: Why desktop cowboys need a metadata back end

Organizations need back-end metadata repository solutions to maintain the flexibility that self-service BI requires

Self-service business intelligence is one of the hottest topics in IT. But there are many definitions and implementations of self-service BI. Ask the business group, and they will say:

Self-service business intelligence is the ability to quickly get the analytics that we need without having to fill out requirements documents and talk with the IT department.

Ask the analytics architects and they will say:

Self-service BI? Right ... that's just an excuse for the desktop cowboys ... errr ... business analysts to make it up as they go along. No structure. No processes. No repeatability.

To a certain extent, both groups are correct. Self-service BI is about the freedom to use new technologies to delve deeper into data than we were able to before and to leverage the investigative skills of the business analyst community to drive analytics deeper into the organization.

However, the problem is that most of the so-called "self-service" options depend on business analysts downloading whatever desktop application they want and analyzing an Excel spreadsheet to find the next great insight. And while this can be just fine for organizations that employ responsible and thoughtful desktop cowboys, even the best intentions can go awry. Data can become siloed on a single laptop that gets dunked in Red Bull one night. Or a spreadsheet that's the crucial underpinning of the new marketing conversation formula can get corrupted while being distributed via email.

What is required is a back-end server, either in the data center or in the cloud, that can manage the results of analysis and the definitions that are key to consistent and repeatable analytical results. In geeksplaining, this is referred to as a metadata repository. The metadata housed in this repository can include any or all of the following:

  • Technical description of the data: These are the types and constraints on the data set. Many times this is represented by a describe table command in SQL. This prevents desktop cowboys from attempting to link two columns with the same name but two different data types. While users can try to match "numbers" with "strings," many times the match flubs when they run a query or analysis.
  • Business description of the data: What do all those column names represent when they aren't labeled with the most obvious column name? CustomerName could be a full name such as "John Smith," or it could be just "Smith," or it could be a company named "Mel's Diner." In each of these instances, it's important to make sure that the business has a repository of information so that each of the desktop cowboys doesn't have to reinvent the wheel when it comes to data.
  • Business description of analytics: Believe it or not, businesses can calculate concepts like profit, margin, and revenue in different ways. It depends greatly on how businesses view accounting and how they recognize revenues and define expenses. As a result, a business can end up with multiple definitions of a single, relatively vanilla concept. By having a metadata repository backing up the desktop cowboys, businesses can avoid the problem of two different charts with the same analysis pointing to different results from the same spreadsheet.
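
To make the three kinds of metadata above a little more concrete, here is a minimal sketch in Python. It is not taken from any particular BI product: the class names (ColumnMeta, MetadataRepository), the table and column names, and the margin formula are all hypothetical, and a real repository would sit on a shared server rather than in memory. The sketch only shows the idea of keeping types, business descriptions, and metric definitions in one place so that analyses can be checked against them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnMeta:
    """Technical and business description of one column (hypothetical schema)."""
    name: str
    dtype: str          # technical description, e.g. "INTEGER" or "TEXT"
    description: str    # business description in plain language


@dataclass
class MetadataRepository:
    """In-memory stand-in for a shared back-end repository (illustrative only)."""
    columns: dict = field(default_factory=dict)   # (table, column) -> ColumnMeta
    metrics: dict = field(default_factory=dict)   # metric name -> formula text

    def register_column(self, table, meta):
        self.columns[(table, meta.name)] = meta

    def register_metric(self, name, formula):
        # One agreed definition per metric: a conflicting redefinition is rejected.
        if name in self.metrics and self.metrics[name] != formula:
            raise ValueError(
                f"metric {name!r} is already defined as {self.metrics[name]!r}"
            )
        self.metrics[name] = formula

    def can_join(self, left, right):
        """Return True only if both columns are registered and share a data type."""
        left_meta = self.columns.get(left)
        right_meta = self.columns.get(right)
        return (
            left_meta is not None
            and right_meta is not None
            and left_meta.dtype == right_meta.dtype
        )


# Assumed example data: three tables that all carry a "customer_id" column.
repo = MetadataRepository()
repo.register_column("orders", ColumnMeta("customer_id", "INTEGER", "Internal customer key"))
repo.register_column("crm", ColumnMeta("customer_id", "TEXT", "External CRM account code"))
repo.register_column("invoices", ColumnMeta("customer_id", "INTEGER", "Internal customer key"))
repo.register_metric("margin", "(revenue - cost_of_goods_sold) / revenue")

# Same column name, different types: the repository flags the mismatch up front.
print(repo.can_join(("orders", "customer_id"), ("crm", "customer_id")))
print(repo.can_join(("orders", "customer_id"), ("invoices", "customer_id")))
```

In this sketch, a desktop tool would ask the repository before linking two columns, and a second analyst who tried to register a different formula for "margin" would get an error instead of a second, quietly conflicting chart.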

There are different and much more complex definitions of metadata, but the core point is that, even with the best intentions, self-service BI can run into problems. Forward-looking organizations are going to look for back-end metadata repository solutions for their self-service BI platforms. Those solutions need to keep the flexibility that lets the desktop cowboys ... errr ... end data consumers develop their own analytics, but in a controlled environment that supports the processes necessary to make self-service analytics repeatable.

This article is published as part of the IDG Contributor Network.