Snowflake now offers data warehousing to the masses

The cloud data warehouse solution spearheaded by a former Microsoft exec is now cheaper and easier to use, but competition might loom from Amazon -- the very platform it's hosted on

Credit: Alexey Kljatov

Snowflake, the cloud-based data warehouse solution co-founded by Microsoft alumnus Bob Muglia, is lowering storage prices and adding a self-service option, meaning prospective customers can open an account with nothing more than a credit card.

These changes also raise an intriguing question: How long can a service like Snowflake expect to reside on Amazon, which itself offers services that are more or less in direct competition -- and whose raw storage costs undercut Snowflake's own pricing for the same storage?

Open to the public

The self-service option, called Snowflake On Demand, is a change from Snowflake's original sales model. Rather than calling a sales representative to set up an account, Snowflake users can now provision services themselves with no more effort than would be needed to spin up an AWS EC2 instance.

In a phone interview, Muglia said the reason for moving to this model only now was more technical than anything else. Before self-service could be offered, Snowflake had to put protections in place so that both the service itself and its customers would be shielded from everything from malice (denial-of-service attacks) to incompetence (honest customers submitting massively malformed queries).

"We wanted to make sure we had appropriately protected the system," Muglia said, "before we opened it up to anyone, anywhere."

This effort was further complicated by Snowflake's relative lack of hard usage limits, which Muglia characterized as being one of its major standout features. "There is no limit to the number of tables you can create," Muglia said, but he further pointed out that Snowflake has to strike a balance between what it can offer any one customer and protecting the integrity of the service as a whole.

"We get some crazy SQL queries coming in our direction," Muglia said, "and regardless of what comes in, we need to continue to perform appropriately for that customer as well as other customers. We see SQL queries that are a megabyte in size -- the query statements [themselves] are a megabyte in size." (Many such queries are poorly formed, auto-generated SQL, Muglia claimed.)

Fewer costs, more competition

The other major change is a reduction in storage pricing for the service: $30/TB/month for capacity storage, $50/TB/month for on-demand storage, and $10/TB/month for uncompressed storage.
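To put those rates in concrete terms, here is a minimal Python sketch that multiplies a data volume by the per-terabyte prices quoted above. The tier keys and the function name are illustrative labels chosen for this example, not Snowflake identifiers, and the calculation ignores compute charges and any billing rules the article does not describe.

```python
# Monthly storage prices in USD per terabyte, as quoted in the article.
# The dictionary keys are illustrative labels, not official Snowflake names.
PRICE_PER_TB_MONTH = {
    "capacity": 30.0,
    "on_demand": 50.0,
    "uncompressed": 10.0,
}


def monthly_storage_cost(terabytes: float, tier: str) -> float:
    """Return the monthly storage bill in USD for a given volume and tier."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    if tier not in PRICE_PER_TB_MONTH:
        raise ValueError(f"unknown tier: {tier!r}")
    return terabytes * PRICE_PER_TB_MONTH[tier]


if __name__ == "__main__":
    # Compare the quoted tiers for a hypothetical 100 TB of stored data.
    for tier in PRICE_PER_TB_MONTH:
        cost = monthly_storage_cost(100, tier)
        print(f"{tier}: ${cost:,.2f}/month")
```

At these rates, the gap between the capacity and on-demand prices for a hypothetical 100 TB comes to $2,000 a month.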

It's enough of a reduction in price that Snowflake will be unable to rely on storage costs as a revenue source, since those prices barely pay for the use of Amazon's services as a storage provider. But Muglia is confident Snowflake is profitable enough overall that such a move won't impact the company's bottom line.

"We did the data modeling on this," said Muglia, "and our margins were always lower on storage than on compute running queries."

According to the studies Snowflake performed, "when customers put more data into Snowflake, they run more queries.... In almost every scenario you can imagine, they were very much revenue-positive and gross-margin neutral, because people run more queries."

The long-term implications for Snowflake continuing to reside on Amazon aren't clear yet, especially since Amazon might well be able to undercut Snowflake by directly offering competitive services.

Muglia, though, is confident that Snowflake's offering is singular enough to stave off competition for a good long time, and is ready to change things up if need be. "We always look into the possibility of moving to other cloud infrastructures," Muglia said, "although we don't have plans to do it right now."

He also noted that Snowflake competes with Amazon and Redshift right now, but "we have a very different shape of product relative to Redshift.... Snowflake is storing multiple petabytes of data and is able to run hundreds of simultaneous concurrent queries. Redshift can't do that; no other product can do that. It's that differentiation that allows [us] to effectively compete with Amazon, and for that matter Google and Microsoft and Oracle and Teradata."

[An earlier version of this article incorrectly identified "uncompressed storage" as "compressed storage". The pricing of this feature is the same.]
