Big Data

Big Data | News, how-tos, features, reviews, and videos

Bossies 2018: The Best of Open Source Software Awards

InfoWorld recognizes the leading open source projects for software development, cloud computing, big data, and machine learning

The best open source software for data storage and analytics

InfoWorld’s 2018 Best of Open Source Software Award winners in databases and data analytics

What is a data lake? Flexible big data management explained

A data lake can be a much more flexible repository than a data warehouse. Or it can be a trash dump that grows and grows

Matei Zaharia, creator of the Apache Spark project, on the big data framework | True Technologist Ep 2

In this episode of True Technologist, host Eric Knorr talks with Matei Zaharia, chief technologist at Databricks and an assistant professor of computer science at Stanford, about the Apache Spark and Apache Mesos projects

Why there are no shortcuts to machine learning

As long as companies understand that good data science takes time in an enterprise, and give these people room to learn and grow, they won’t need shortcuts

Why we lose out if we leave everything to algorithms

If we trust a measurement system wholly to data and algorithms, will it inevitably be gamed by the humans it measures?

How to build stateful streaming applications with Apache Flink

Take advantage of Flink’s DataStream API, ProcessFunctions, and SQL support to build event-driven or streaming analytics applications

Introducing BigQuery ML for building predictive models with SQL

Google’s beta extension performs linear regression forecasting and binary logistic classification in the BigQuery data warehouse

Big data: enabling new approaches to IT infrastructure security

Big data technologies and advanced analytics, including AI, are promising a way to get ahead of cyber threats

3 big data platforms look beyond Hadoop

Learn how the Cloudera, Hortonworks, and MapR data platforms are evolving to meet the demands for real-time analytics and machine learning

Data lakes: Just a swamp without data governance and catalog

Most businesses’ data lakes are merely repositories of undefined data sets from multiple sources, resulting in data swamps

How to get real value from big data in the cloud

Cloud computing makes big data affordable, but few companies know how to actually take advantage of it

What is Julia? A fresh approach to numerical computing

A “no compromises” programming language for data scientists, Julia combines the ease of a dynamic language with the speed of a compiled language

In an age of fake news, is there really such a thing as fake data?

The pitfalls and benefits of using synthetic data to train AI algorithms

It’s time we tapped APIs for business analytics

With so much information flowing through APIs, the API management system offers a central hub for business insight

9 Splunk alternatives for log analysis

Splunk may be the most famous way to make sense of mass quantities of log data, but it is far from the only player around

Human data is the future of information

That’s good for everyone, as it respects that data has become so important to people’s livelihood, their credit scores just as much as their personalities, that it shouldn’t be treated differently than they would be treated

What is TensorFlow? The machine learning library explained

TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning faster and easier
