Cloud data warehouse: The technology no one knows about

Amazon Redshift, Google BigQuery, and Microsoft Azure SQL Data Warehouse are cool tools in search of a category


We’ve all heard of exciting new technologies in the data warehouse world—tools like Amazon Redshift, Google BigQuery, and more recently Azure SQL Data Warehouse. What would you call this category of tools?

Well, of course, “cloud data warehouse.” Check out the Google Trends graph for this search term. Explosive growth.

[Image: Google Trends graph for “cloud data warehouse.” Credit: Gilad David Maayan]

But look at this:

[Image: Google Trends comparison of “Amazon Redshift” (red) and “cloud data warehouse.” Credit: Gilad David Maayan]

The red line represents searches for “Amazon Redshift,” compared to “cloud data warehouse.” It is growing much more rapidly, and appears many times larger, than the category it belongs to.

In fact, according to Google, only about 300 people per month in the entire world searched for the term “cloud data warehouse” over the past year. (By comparison, “Amazon Redshift” is searched 14,800 times per month worldwide.)

“Ha,” you might be thinking, “they must be searching for other things, maybe ‘data warehouse on the cloud.’” As someone who has done several rounds of market research in this field, I can tell you with confidence that there are no larger search terms that describe the category.

In fact, the category does not exist.

In search of a label

I know it sounds funny to say the cloud data warehouse category doesn’t exist. After all, the market is hot, and the tools are popular and growing rapidly. But the fact is that what comes to people’s minds is the tools and the brands, not the category.

But wait a minute. Data warehouses have been around for ages. As far back as the 19th century, Thomas Edison stored the results of his electricity experiments in a (legacy) data warehouse, installed on the highly scalable Kinetoscope Platform in his Menlo Park laboratory.

So surely there must be searches for just “data warehouse”?

[Image: Google Trends comparison of “data warehouse” (blue) and “amazon redshift” (red). Credit: Gilad David Maayan]

Yes, there are. They’re declining. But look at the comparison between “data warehouse” in blue and “amazon redshift” in red. Redshift’s 14,800 monthly searches are a drop in the ocean compared to “data warehouse.” That term alone is searched 90,500 times per month globally, and there are many other related search terms. “Data warehouse” is huge.

“Plain” data warehouse is much bigger than cloud

Let me show you another data point. I went through the laborious exercise of gathering all the possible search terms people have used recently around “plain” data warehouse vs. the three leading cloud data warehouses (Hey, I used that phrase! That’s 301 mentions worldwide).

We’re not just talking about the brand name “Amazon Redshift” or “BigQuery,” but any possible combination: what is Redshift, Redshift architecture, Redshift clusters, etc.

Excluding redshift the cosmological phenomenon, of course.

Here are the results:

[Image: monthly search volumes for “data warehouse” vs. the three leading cloud data warehouses. Credit: Gilad David Maayan]

In words: every month around half a million people, probably wannabe data engineers, express interest in “data warehouse” or some variation of it, compared to 38,000 searching for Redshift, 26,700 for BigQuery, and a measly 13,000 for Microsoft’s SQL Data Warehouse.
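To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. The three brand figures are the ones quoted above; the 500,000 for “data warehouse” is an assumed round number standing in for “around half a million,” and the function names are made up for illustration.

```python
# Monthly search volumes quoted in the article, aggregated across term variations.
# ASSUMPTION: "data warehouse" is only given as "around half a million";
# 500_000 is a round stand-in, not a measured value.
MONTHLY_SEARCHES = {
    "data warehouse": 500_000,
    "Redshift": 38_000,
    "BigQuery": 26_700,
    "SQL Data Warehouse": 13_000,
}


def ratio_to_generic(volumes, generic="data warehouse"):
    """Return how many times larger the generic term is than each brand."""
    base = volumes[generic]
    return {
        name: base / count
        for name, count in volumes.items()
        if name != generic
    }


def combined_ratio(volumes, generic="data warehouse"):
    """Return the generic term's volume divided by all brands combined."""
    brand_total = sum(
        count for name, count in volumes.items() if name != generic
    )
    return volumes[generic] / brand_total


if __name__ == "__main__":
    for name, ratio in ratio_to_generic(MONTHLY_SEARCHES).items():
        print(f"data warehouse vs. {name}: {ratio:.1f}x")
    print(f"data warehouse vs. all three: {combined_ratio(MONTHLY_SEARCHES):.1f}x")
```

Under these assumed inputs, the generic term comes out roughly 13 times larger than Redshift on its own and roughly 6 times larger than the three brands combined.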

Opium for the masses

Hundreds of thousands of people search for “data warehouse.” They are only now taking their first steps in the data warehouse world. They want to learn basic things like what data warehouses are for, how they work, and how much they cost. But their eyes are closed to the truth of cool new products and architectures.

This is what Google gives them (see diagram #1). You are warned, it’s not a pretty sight. A circa 2008 diagram (yes, I checked) of the “Enterprise Data Warehouse.”

Wow. That’s miles away from the shiny technology from Amazon and friends (see diagram #2).

Those half a million people will never (well, not really, but please allow me some dramatic effect) see diagram #2. They’ll see diagram #1 and move on to read about solutions like Oracle and Teradata. Not to knock those products, they’re great. But how many of this audience could be interested in the new generation of tools on the cloud with unlimited scalability and blazing fast query speeds?

There’s money lying on the table

Let’s sum it up:

  • Apparently no one is aware of a category called “cloud data warehouse”
  • Lots of people know about specific brands like Redshift
  • But there are about 10X more people who don’t. They simply search for “data warehouse”
  • And what they get is old architectures and old-guard solutions;
    Amazon, Google, Microsoft, and friends aren’t there
  • If they were there, maybe their market share would be 5X by now

Does that make sense to you? Let me know in the comments. IMHO there is a huge missed opportunity here, a market education gap that none of the big players has noticed or seems to care about. Hundreds of thousands who are lost in Thomas Edison Land with little chance of graduating to the new and cool.

This creates a lot of room for smaller players, such as Snowflake, which is going head to head with the big players, and Panoply’s self-optimizing data warehouse, which differentiates itself by making data ingestion, preparation, and query optimization much easier (disclaimer: I am an advisor for Panoply). They can, and will, use this opportunity to grab market share from the unsexy “old guard,” instead of fighting very hard to grab it from Amazon, Google, and Microsoft.

Copyright © 2017 IDG Communications, Inc.
