Hadoop

Hadoop news, information, and how-to advice

Spark 2.0 takes an all-in-one approach to big data

With a new streaming system, performance enhancements, and API refinements, Apache Spark 2.0 offers a big umbrella to data users

Spark-powered Splice Machine goes open source

An open source version of the Hadoop-based and Spark-accelerated RDBMS is now available sans a few enterprise features

With big data, CEOs find garbage in is still garbage out

BI has always topped the list of enterprise priorities -- and execs are always the least satisfied with BI initiatives. Why should big data be any different?

How to get your mainframe's data for Hadoop analytics

IT's mainframe managers don't want to give you access but do want the mainframe's data used. Here's how to square that circle

HDFS: Big data analytics' weakest link

Hadoop's distributed file system isn't as fast, efficient, or easy to operate as it should be

Prioritize predictable performance in Hadoop

Organizations running Hadoop in production can ensure that high-priority jobs complete on time, every time

The next steps for Spark in the cloud

Simply having Spark in the cloud isn't enough. What matters is what it can connect to and how easy it is to use

Had it with Apache Storm? Heron swoops to the rescue

Heron, Twitter's brand-new streaming replacement for Apache Storm, offers easier scaling and higher throughput while maintaining Storm code compatibility

Businesses harbor big data desires, but lack know-how

A fresh survey shows that while more companies are investing in big data, putting the results of all that processing to use remains dicey

Dear Silicon Valley: Stop saying stupid stuff

Silicon Valley has its head so far in the future it can't hear the laughter in response to its over-the-top pronouncements

Redis plants the seeds for an open source ecosystem

Redis Modules help the caching and in-memory storage system work with new data structures and database behaviors

HBase: The database big data left behind

You'd expect HBase, as the default database for Hadoop, to be more popular than it is, but its time may already have passed

Apache Spark powers live SQL analytics in SnappyData

The same team that created GemFire builds on Spark in a new open source database that can analyze OLTP and OLAP workloads side-by-side

Data lakes 101: Come on in, the water's fine

How to plan for and build a central hub for data analytics with the ever-evolving Hadoop ecosystem

Apache Beam wants to be uber-API for big data

New, useful Apache big data projects seem to arrive daily. Rather than relearn your way every time, what if you could go through a unified API?

What Spark's Structured Streaming really means

Thanks to an impressive grab bag of improvements in version 2.0, Spark's quasi-streaming solution has become more powerful and easier to manage

Review: Databricks makes big data dreams come true

Cloud-based Spark machine learning and analytics platform is an excellent, full-featured product for data scientists

Hadoop project ODP regroups under Linux Foundation's umbrella

The Open Data Platform's reorg aims to assuage criticism about vendor control over the initiative to create a consistent baseline Hadoop distribution
