Hadoop news, information, and how-to advice

Why Spark is spiking in the cloud

Interest and investment in Apache Spark have increased dramatically in recent months, to the benefit of cloud customers

Yahoo struts its Hadoop stuff

A peek under the hood of Yahoo's Hadoop deployment illustrates how vast the ecosystem has become -- and how the company that invented it is still leading the way

Streaming analytics enter the fast lane

Already we've moved on to a new phase in analytics where data never rests

Which freaking Hadoop engine should I use?

These four truths will help you determine which Hadoop technology to use for the types of workloads you anticipate

Hadoop keeps marching on, somehow

Deployments of Hadoop in production have been slower to arrive than many thought, but Hadoop job growth data shows that enterprises are keeping the faith

Big data, big challenges: Hadoop in the enterprise

Fresh from the front lines: Common problems encountered when putting Hadoop to work -- and the best tools to make Hadoop less burdensome

Kyvos serves up Hadoop on easy-to-parse data cubes

New big data software from startup Kyvos Insights can format Hadoop data into OLAP repositories


Debunked! 9 myths about big data and Hadoop

These unfounded beliefs about budget, skills, technology, and technology fit can lead you astray

IBM fires up Spark with Bluemix, machine learning contributions

IBM doubles up on Spark, adding it to Bluemix and contributing its SystemML machine-learning code to the Apache project

Spark 1.4 adds support for R, Python 3, cluster management

Spark data processing framework adds languages used by many data crunchers, as well as container-based cluster management features

LinkedIn fills another SQL-on-Hadoop niche

LinkedIn's open source, home-brew OLAP project is a new way for Hadoop users (and others) to query both real-time and historical data

Hortonworks eases path to Hadoop

With new setup, management, and data-governance features, Hortonworks' latest Hadoop distribution wants to be an enterprise darling -- if enterprises will let it

Spark and Storm face new competition for real-time Hadoop processing

DataTorrent is releasing its real-time data processing engine for Hadoop and beyond as the open source Project Apex

Salesforce wants enterprise big-data users to catch its Wave

With Wave for Big Data, Salesforce is determined to keep a foothold in enterprises with growing interests in Hadoop -- assuming existing self-service analytics outfits haven't gotten there first

4 strategies to distribute your data between front end and back end

Where you store and process your data has a significant impact on issues such as privacy or performance, but also on the ability for apps to access and deliver relevant data

The mythical Hadoop skills gap

Oh no! Big data is failing because we can't find enough people who know the technology! Relax, they're out there -- but don't fall for the buzzwords


Apache Drill 1.0 tears into data, with or without Hadoop

Drill 1.0 queries Hadoop data via SQL, but may have a life of its own outside of the framework

Hadoop demand falls as other big data tech rises

Hadoop isn’t living up to its hype -- which means that both Hadoop vendors and their customers need to widen their array of big data technologies
