In response to a recent, wry Advice Line aimed at taking a little air out of the big data hype machine, reader Kayza Kleinman offered an excellent opportunity for follow-up advice on getting started with data analytics:
How does an organization with a serious amount of data that doesn't reach the heights of big data (i.e., tens of thousands of clients, and millions of service records, rather than 10X or 100X that size) and without the technical resources for a big-data setup (i.e., data mart plus Hadoop or the like) get better analytics from the data they do have? Especially since not all of the data is in one database.
I'm not talking about a cookbook recipe, but a sane approach that can be done on a reasonable budget.
There's no one right answer to this question. In fact, there might not be any right answers at all -- data warehousing and analytics projects seem to have a special propensity for getting out of hand.
Here are some guidelines that should help you get started while keeping you out of trouble, beginning with the most important consideration: determining whether your company is ready to take on serious analytics in the first place.
How to tell if your company is ready for data analytics
With or without big data, the criteria provided in "You want big data?" are the best starting point for any effort focused on improving a company's analytics abilities:
- Does your company have sophisticated statisticians and analysts on staff?
- Do your executives prefer data-driven decision making to "trusting their guts"?
- Overall, does the company have a culture of honest inquiry? (How to tell: If everyone understands that changing your mind when new evidence doesn't fit your old opinion is a sign of strength rather than weakness, your company has a culture of honest inquiry. Otherwise, it doesn't.)
Let's take a closer look at these criteria, one at a time.
Data analytics criterion No. 1: Statisticians and analysts
Cool business intelligence tools let you put together dashboards and other types of interactive reports that anyone can use to "explore the data." But the people using those dashboards aren't exploring the data for real, any more than someone who buys a book that provides a walking tour of some exotic locale is exploring that region.
The person who wrote the book did the exploring. In the same way, your analysts and statisticians are the people who will "do the exploring" of your data, by creating the dashboards and interactive reports -- the walking tours -- that executives can dig into.
This isn't just a matter of business executives and managers lacking the patience to learn your BI tool of choice. It's a matter of their lacking the patience to understand what constitutes a valid statistical sample and a valid statistical inference.
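To make "valid statistical inference" a bit more concrete, here is a minimal Python sketch of the kind of check an analyst might run before a difference lands on a dashboard. Everything in it is an assumption made for illustration: the renewal counts for two offices are invented, the function names are hypothetical, and the interval uses the textbook normal approximation for a proportion. Comparing intervals for overlap is a rough, conservative heuristic rather than a formal hypothesis test.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion.

    Uses the normal approximation, which is only reasonable when the
    sample is not tiny and the proportion is not close to 0 or 1.
    """
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)


def intervals_overlap(a, b):
    """Return True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical data: clients who renewed service at two offices.
# Same renewal rates (70% vs. 55%), very different sample sizes.
small_a = proportion_interval(14, 20)
small_b = proportion_interval(11, 20)
large_a = proportion_interval(1400, 2000)
large_b = proportion_interval(1100, 2000)

for label, a, b in [("n=20", small_a, small_b), ("n=2000", large_a, large_b)]:
    verdict = "overlap" if intervals_overlap(a, b) else "do not overlap"
    print(f"{label}: A={a[0]:.2f}-{a[1]:.2f}, B={b[0]:.2f}-{b[1]:.2f} ({verdict})")
```

With 20 clients per office, the two intervals overlap, so the 15-point gap could easily be noise. With 2,000 clients per office, the same gap is much harder to dismiss. A dashboard shows "70% vs. 55%" in both cases, which is why someone who understands sampling needs to build the walking tour.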