HDFS: Big data analytics' weakest link

Hadoop's distributed file system isn't as fast, efficient, or easy to operate as it should be

06/23/16

5 big data sources for strategic sentiment analysis

Every company wants to know how its customers feel about it. But sentiment analysis can get more granular -- and turn inward to improve employee satisfaction

06/16/16

We have the big data tools -- let's learn to use them

Few enterprises enjoy even the first-order benefits of big data. The real payoff will come when we interact with systems much less than we do today

06/09/16

Had it with Apache Storm? Heron swoops to the rescue

Heron, Twitter's brand-new streaming replacement for Apache Storm, offers easier scaling and higher throughput while maintaining Storm code compatibility

06/02/16

Spark 2.0 prepares to catch fire

Today, Databricks subscribers can get a technical preview of Spark 2.0. Improved performance, SparkSessions, and streaming lead a parade of enhancements

05/26/16

OK computer: When pop music meets machine learning

Who needs 'American Idol'? 'Algorithm Idol' is poised to take its place in the pop music mill

05/19/16

8 reasons you'll do big data this year

To know what people are really doing with big data technology, you need to get dirty in the trenches. Here's what we've found

05/12/16

Look out, Spark and Storm, here comes Apache Apex

A new open source streaming analytics solution derived from DataTorrent's RTS platform, Apex offers blazing speed and simplified programmability. Let's give it a spin

04/21/16

Apache Beam wants to be uber-API for big data

New, useful Apache big data projects seem to arrive daily. Rather than relearn your way every time, what if you could go through a unified API?

04/14/16

What Spark's Structured Streaming really means

Thanks to an impressive grab bag of improvements in version 2.0, Spark's quasi-streaming solution has become more powerful and easier to manage

04/07/16

Which freaking big data programming language should I use?

When it comes to wrangling data at scale, R, Python, Scala, and Java have you covered -- mostly

04/01/16

Learn to live with Apache Hive in 12 easy steps

Hive lets you use SQL on Hadoop, but tuning SQL on a distributed system is different. Here are 12 tips to help your effort fly

03/24/16

8 horrifying Hollywood computing cliches

We've all rolled our eyes at ridiculous misinterpretations of computer technology on TV or in the movies. These eight seem to pop up again and again

03/17/16

Lies your database is telling you

A wise person once said time is a device invented to keep everything from happening at once. Jonas Bonér explains how the database world has abused time from the beginning

03/10/16

8 telltale signs of a bad data scientist

Was that a unicorn? No, it was a perfect data scientist. You won't find that person, but you can find a great hire -- if they don't suffer from these maladies

03/03/16

Let the car drive itself -- and let your business do the same

People who work in tech love the idea of cars that drive themselves. So why are they dragging their feet on new technology that helps them do their jobs?

02/25/16

EclairJS sweetens Spark for JavaScript coders

What? JavaScript instead of Scala or Python? The new EclairJS project bridges the language gap, especially if you already know Node.js

02/18/16

Big data needs big security changes

Access control for big data analytics needs policy-based security that includes context as well as users and roles

02/12/16

Not up for a data lake? Analyze in place

Spark and big memory have the potential to run big data workloads without copying gobs of data to a new storage infrastructure

01/29/16

Cook up big data orchestration with Kettle

Hadoop jobs can get complicated. The open source ETL tool Kettle beats the alternatives in providing the orchestration you need

01/21/16
