In an era of cheap and powerful commodity servers, it may seem odd to argue that big data requires moving on from the PC world -- and into new computing platforms that are better suited to big data's specific needs.
Dexter Henderson is Vice President and Business Line Executive for Power Systems at IBM. His essay in this week's New Tech Forum explains why he believes that commodity servers aren't up to snuff for big data, and enterprise-grade servers are the way forward. -- Paul Venezia
Embracing big data means leaving old technology behind
As organizations struggle to keep pace with the changes wrought by big data, choosing the right server technology becomes ever more important. Big data is spurring an evolution of complex analytics and cognitive computing that require architectures that can do multiple tasks simultaneously, efficiently, and affordably. Servers that are built from the ground up with big data in mind are better equipped to handle new workloads than servers not optimized for the task.
The Internet of things, where intelligent devices equipped with sensors collect and transmit gobs of data, is forcing enterprises to chart their course from big data to big insights to gain competitive advantage. This journey includes supporting terabytes of streaming data sets from a variety of devices -- and analyzing those oceans of data in the context of domain knowledge in real time.
The four big data activities of gather, connect, reason, and adapt will be the keys to driving business value in the next decade, as organizations recognize the strategic importance of big data insights.
As the big data trend accelerates, complex analytics workloads will become increasingly common in both large and midsize businesses. In a survey done by Gabriel Consulting Group, big data users were asked what type of workloads they were using. Not surprisingly, MapReduce workloads ranked dead last -- with enterprise analytics, complex event processing, visualization, and data mining ranking higher.
A number of organizations have been trying to address emerging big data workloads with static data analysis models and multimachine server architectures with low throughput and high latency. What organizations really need is a new software and hardware environment that takes into consideration the new nature of these workloads, as well as data scale, and supports high compute intensity, data parallelism, data pipelining, and real-time analytic processing of data in motion.
Enterprise analytics/big data workloads are becoming increasingly compute-intensive, sharing common ground with scientific and technical computing applications. The amount of data and processing involved requires these workloads to use clusters of small systems running highly parallel code in order to handle the workload at a reasonable cost and timeframe.
Some believe that a high level of software customization in a distributed server environment is the answer to the problem. Instead, this often leads to wasted and unused resources, built-in inefficiencies, energy and floor space concerns, security issues, high software license costs, and maintenance nightmares.
Enterprise-grade servers that are well suited for modern big data analytics workloads have:
- Higher compute intensity (high ratio of operations to I/O)
- Increased parallel processing capabilities
- Increased VMs per core
- Advanced virtualization capabilities
- Modular systems design
- Elastic scaling capacity
- Enhancements for security and compliance and hardware-assisted encryption
- Increased memory and processor utilization
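The first item in the list, compute intensity, is defined there as the ratio of operations to I/O. As a rough illustration only, the short Python sketch below computes that ratio for two workloads. The function name `compute_intensity` and the operation and byte counts are hypothetical, chosen for illustration rather than taken from any measurement.

```python
def compute_intensity(operations: float, io_bytes: float) -> float:
    """Return the number of operations performed per byte of I/O."""
    if io_bytes <= 0:
        raise ValueError("io_bytes must be positive")
    return operations / io_bytes


# Hypothetical workloads that each read 1 GB of data (illustrative numbers only).
simple_scan = compute_intensity(operations=1e9, io_bytes=1e9)
analytics_model = compute_intensity(operations=5e11, io_bytes=1e9)

# The workload with the higher ratio does more work on every byte it reads.
print(f"simple scan:     {simple_scan:.1f} operations per byte")
print(f"analytics model: {analytics_model:.1f} operations per byte")
```

Under these assumed numbers, the analytics workload performs far more computation per byte read, which is the sense in which such workloads are called compute-intensive.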
Superior, enterprise-grade servers also offer a built-in resiliency that comes from integration and optimization across the full stack of hardware, firmware, hypervisor, operating system, databases, and middleware. These systems are often designed, built, tuned, and supported together -- and are easier to scale and manage.
For example, many large financial institutions have embarked on aggressive programs to use predictive analytics technology to enhance their revenues. This is placing greater demand on existing compute resources. Using an enterprise-grade server helps these institutions to run thousands of tasks in parallel to deliver analytics services faster, as well as create a virtualized environment that improves server utilization and shares server resources across business units. Server consolidation and virtualization help reduce the number of physical servers, saving data center space and yielding savings through reduced power and cooling, hardware maintenance, software licensing, and management costs.
To lay it out in more technical terms, there are three important computing requirements for big data workloads:
- Advanced big data analytics require a highly scalable system with extreme parallel processing capability and dense, modular packaging. A compute system with more memory, bandwidth, and throughput can run multiple tasks simultaneously, respond to millions of events per second, and process advanced analytics algorithms in parallel in a matter of seconds.
- Big data needs a computing system that is reliable and resilient and is able to absorb temporary increases in demand without failure or changes in architecture. This limits security breaches and enhances workload performance with little or no downtime.
- To support new big data workloads, computing systems must be built with open source technologies and support open innovation. Open source architecture allows more interoperability and flexibility and simplifies management of new workloads through advanced virtualization and cloud solutions.
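The parallel processing named in the first requirement above can be sketched in a few lines of Python. This is a minimal illustration of the data-parallel pattern (split the data, process the parts concurrently, combine the results), not a description of any particular server or product. The helper names `partition`, `partial_stats`, and `parallel_mean` are hypothetical, and a thread pool is used only to keep the sketch portable; CPU-bound analytics would more likely run on separate processes or cores.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_parts):
    """Split values into at most n_parts contiguous chunks of similar size."""
    size = -(-len(values) // n_parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_stats(chunk):
    """Compute the partial result for one chunk: (sum, count)."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_parts=4):
    """Compute the mean by processing chunks concurrently, then combining."""
    if not values:
        raise ValueError("values must not be empty")
    chunks = partition(values, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


if __name__ == "__main__":
    readings = list(range(1, 101))  # stand-in for a batch of sensor readings
    print(parallel_mean(readings))
```

Because each chunk is summarized independently, the same pattern extends to more workers as the data grows, which is the property the scalability requirement describes.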
Big data is a new, extraordinary resource to help companies gain competitive advantage. Applying real-time analytics to big data enables companies to serve customers better, identify new revenue potential, and make lightning-quick decisions based on market insights. For companies to capitalize on the real-world business benefits of big data, they must first let go of their love for older technologies and look to newer, optimized alternatives.
New Tech Forum provides a means to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all enquiries to email@example.com.