It’s still a relational database (RDBMS) world, but Hadoop and NoSQL are starting to make a dent in the realm of structured data.
That’s the key takeaway from a Dell Software-sponsored Unisphere survey, which finds that 75 percent of enterprise data remains under the lock and key of RDBMSes, primarily Oracle and Microsoft’s SQL Server for most enterprises. More surprising is the finding that nearly one-third of organizations are not yet actively managing unstructured data at all.
What century are they living in?
Interestingly, while the survey uncovers growing adoption of NoSQL and Hadoop, the biggest finding may well be that the buttoned-down DBA is the last to know how the enterprise’s critical data is being managed.
Structured approaches to structured data
The relational database is one of technology’s great innovations. Earlier incarnations of the database (such as IMS) forced developers to think about query design and schema design upfront, limiting flexibility as data needs changed.
The relational database’s SQL (structured query language), however, decoupled query design from schema design, which let developers focus on schema design with confidence that they could later query their data as they wanted. This significant shift made databases much more accessible and powerful.
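To make that decoupling concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are all hypothetical, invented purely for illustration; nothing here comes from the survey. The point is only that the schema is defined once and two unrelated questions are asked of it afterward, without any change to the table design.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, region, total) VALUES (?, ?, ?)",
    [
        ("acme", "east", 120.0),
        ("acme", "west", 80.0),
        ("globex", "east", 50.0),
    ],
)

# Question 1: total revenue per customer.
per_customer = conn.execute(
    "SELECT customer, SUM(total) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# Question 2: something nobody planned for when the table was designed,
# answered without touching the schema.
east_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE region = ?", ("east",)
).fetchone()[0]

print(per_customer)
print(east_count)
```

In a pre-relational system, each of those questions would typically have had to be anticipated in the data layout itself.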
But the comfortably structured world of the RDBMS is increasingly challenged by mountains of unstructured or semistructured data. Much of this new data is created by what Geoffrey Moore calls systems of engagement, whereas the last several decades were built on systems of record (such as ERP and CRM systems). An RDBMS is great when the variety, velocity, and volume of data are predictable.
Our big data world no longer looks much like that.
Even so, the future takes a long time to arrive. As such, it’s not surprising to see Unisphere’s survey respondents preoccupied with structured data:
- Eighty-three percent of organizations cite growth in transactional data (including e-commerce) as one of the most important sources of structured data growth within their organization, with 51 percent also citing growth in management data, such as ERP systems.
- Although there's an increasing industry focus on the proliferation of social data, an increase in the creation of internally generated documents was seen as the top driver of unstructured data growth, identified by more than 50 percent of respondents.
Despite this emphasis on RDBMS-friendly data, it’s also worth noting the increasing reliance on NoSQL and Hadoop:
- Approximately 70 percent of respondents using MongoDB are running more than 100 databases, 30 percent are running more than 500 databases, and nearly 60 percent work for companies with more than 5,000 employees.
- Sixty percent of respondents currently using Hadoop are running more than 100 databases, 45 percent are running more than 500 databases, and approximately two-thirds work for companies with more than 1,000 employees.
As great as this is for nonrelational data technologies, why isn’t it more? The answer (maybe): Blame the DBA.
Why not more?
After all, as the report notes, among respondents in companies with both Hadoop and NoSQL installed, DBAs are responsible for managing the nonrelational technologies 72 percent of the time. In fact, these same DBAs make up 48 percent of the survey respondents; IT directors account for another 20 percent.
These roles tend to be lagging -- not leading -- indicators of technology adoption. Those same DBAs have built their careers running Oracle or Microsoft’s SQL Server. It’s not surprising they would stick with what they know.
When the survey finds a mere 10 percent penetration rate for NoSQL databases (with more than half of respondents saying they have no plans to embrace them over the next three years), and only 20 percent of those same people claim to be using Hadoop (with 57 percent not planning to embrace it in the next three years), it’s worth calling out that these are precisely the wrong people to ask about the spread of more modern data technologies.
In fact, the surprise is that they’re running NoSQL and Hadoop at all.
Or not so surprising, when you acknowledge that they may not have any choice. To achieve the scale and flexibility required by today’s enterprise, modern data technologies are increasingly important.
While these same survey respondents claim to be primarily concerned with the growth of structured and unstructured data (66 percent), as well as the impact of cloud computing, they have yet to tie together how those two trends are leading toward big data’s poster children, Hadoop and NoSQL.
More than you think
But other non-DBA-centric surveys put such technologies on display.
Forrester’s own surveys, for example, show NoSQL with a 20 percent adoption rate today, a figure projected to double by 2017. And DB-Engines, which ranks database popularity (according to jobs data, LinkedIn profiles, and more), shows three NoSQL databases in its top 10, outranking more established RDBMSes such as DB2 and Postgres.
On the Hadoop side, yes, a 451 Research report (from 2013) shows Hadoop claiming a mere 3 percent of enterprise storage, but Gartner highlights a clear march toward greater adoption of big data, generally, and Hadoop, specifically. Each year Gartner asks enterprises about their big data plans, in which Hadoop often factors heavily, and they are clearly moving out of the proof-of-concept phase.
Years ago, Billy Marshall declared “the CIO is the last to know,” referring to the CIO’s ignorance of the open source adoption rampant within the enterprise. Today, the same is true of Hadoop and NoSQL.
No, they’re nowhere near displacing Oracle or SQL Server, and they won’t for traditional use cases. But as companies look for better ways to store and process the rising tide of unstructured or semistructured data, the DBAs will have to get used to Hadoop and NoSQL. They simply won’t have a choice.