Why you should jump into big data

At a low cost of entry, the emerging technologies that define the big data trend are already delivering value, so first consider the problems you need to solve -- then dive in

With an RDBMS, data needs to fit into rows and columns, and required fields rule, so a request to alter the data model kicks off an elaborate change management process. On the other hand, NoSQL databases such as Couchbase, MongoDB, or Cassandra are not intended to enforce rigid data structures, so new data elements can be added on the fly.

As for ramping up capacity, a big part of the NoSQL value proposition is the ability to scale out rather than up. In other words, with NoSQL, you can simply add commodity servers as needed. In contrast, with an RDBMS, you first upgrade the horsepower of a single server, and when you need to add more servers, you must "shard" the database across them, which brings complications of its own.
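
To make the sharding complication concrete, here is a minimal sketch of application-level sharding, assuming the simplest possible scheme: hash the record key and take the remainder by the number of servers. The function names and the "customer-N" keys are hypothetical, chosen only for illustration. Real databases generally use more careful partitioning schemes precisely to avoid the problem this exposes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash (naive modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def moved_keys(keys, old_count, new_count):
    """Return the keys whose shard changes when the server count changes."""
    return [k for k in keys if shard_for(k, old_count) != shard_for(k, new_count)]


# Hypothetical record keys, for illustration only.
keys = [f"customer-{i}" for i in range(1000)]

# Going from 3 servers to 4 forces many records to be relocated.
moved = moved_keys(keys, 3, 4)
print(f"{len(moved)} of {len(keys)} keys change shards when a fourth server is added")
```

With this naive scheme, adding a single server reassigns most of the keys, so the data has to be physically moved while the application keeps running. That rebalancing work is one of the complications of sharding by hand.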

Aside from extreme scalability, what are NoSQL databases good for? You'll be amazed to learn that NoSQL database software vendors are inclined to say "almost everything." In a recent InfoWorld interview, 10gen CEO Dwight Merriman offered a good example: "One telco wrote a product catalog application for their company, a giant company with 100,000 products. Some of them are phones, some of them are extended warranties, and some of them are service plans. They have all these different properties to their products. They found it was very easy to do that with MongoDB because of the way the data model works."
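
A rough sketch of what such a catalog might look like, using plain Python dictionaries as stand-ins for documents. This is not MongoDB's API, and the SKUs, field names, and the `find` helper are invented for illustration. The point is only that each product carries its own set of properties, and a new property can be added without touching the others.

```python
# Each "document" has only the fields that make sense for that product type.
catalog = [
    {"sku": "P-100", "type": "phone", "name": "Example Phone",
     "screen_inches": 4.5, "storage_gb": 16},
    {"sku": "W-200", "type": "extended_warranty", "name": "Two-Year Warranty",
     "term_months": 24},
    {"sku": "S-300", "type": "service_plan", "name": "Unlimited Talk",
     "monthly_price": 49.99, "data_gb": 2},
]


def find(products, **criteria):
    """Return the products whose fields match every given criterion.

    A product that lacks a field simply does not match; nothing breaks.
    """
    return [
        p for p in products
        if all(p.get(field) == value for field, value in criteria.items())
    ]


# Adding a new attribute to one product requires no schema change.
catalog[0]["color"] = "black"

phones = find(catalog, type="phone")
two_year_items = find(catalog, term_months=24)
print([p["sku"] for p in phones], [p["sku"] for p in two_year_items])
```

In a relational design, the same catalog would need either one wide table full of empty columns or a separate table per product type, and the new `color` attribute would mean a schema change.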

Shoot first, aim later
Another characteristic of NoSQL databases, including Hadoop in its own way, is that developers tend to like writing to them a lot more than they enjoy working with relational databases. NoSQL lends itself to the shorter dev cycles characteristic of agile development. That yields faster development times on top of breaking the bottleneck of rigid data models.

Data security is still a concern. But all the enterprise-class vendors building on the major open source projects -- Cassandra, Couchbase, Hadoop, MongoDB -- are adding security controls at a furious rate.

Meanwhile, there's little excuse to sit on the sidelines. Almost all new big data software comes in an open source version; in many organizations, developers are already downloading and experimenting whether management knows it or not.

As for analytics, everyone is accumulating terabytes of semistructured data by default for fear of breaking compliance regulations, so why not derive some insight from that obnoxious quantity of bits? In some instances, IT can benefit directly. In the case study "Big data drives high performance for Cars.com," for example, you can see how a major website used the big data tool Splunk to ensure snappy application performance and defend against malicious bots.

The cost of entry is low, and the potential benefits are high. You don't need to jump into big data with a fully baked strategy; in fact, that would run counter to the whole idea. Wade in, experiment with Hadoop and NoSQL technologies, and see what works. As you build, you'll discover along the way which investments of time, effort, and money have the potential to pay off most.

This article, "Why you should jump into big data," originally appeared at InfoWorld.com. Read more of Eric Knorr's Modernizing IT blog. And for the latest business technology news, follow InfoWorld on Twitter.

Copyright © 2013 IDG Communications, Inc.
