Big Data

Big Data | News, how-tos, features, reviews, and videos

What is Apache Spark? The big data platform that crushed Hadoop

Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning

Why data-driven businesses need a data catalog

Enterprises need better tools to learn and collaborate around data sources. Data catalogs with pioneering machine learning capabilities can help you tap your valuable data

Qubole review: Self-service big data analytics

Cloud-native data platform puts Spark, Presto, Hive, and Airflow at your fingertips, while controlling your cloud spending

Who should be responsible for your data? The knowledge scientist

Organizations that recognize the importance of clean and reliable data while elevating knowledge work will move faster along the path to true data-driven decision-making

Will data gravity favor the cloud or the edge?

An industry standard confidential computing framework could unlock secure data processing at both the center and the edge

What is big data analytics? Fast answers from diverse data sets

Analyzing large volumes of data is only part of what makes big data analytics different from traditional data analytics

Semi-supervised learning explained

Using a machine learning model’s own predictions on unlabeled data to add to the labeled data set sometimes improves accuracy, but not always

How Qubole addresses Apache Spark challenges

The Qubole Data Platform brings streamlined configuration, auto-scaling, cost management, and performance optimizations to Spark-as-a-service

PyTorch vs. TensorFlow: How to choose

If you actually need a deep learning model, PyTorch and TensorFlow are the two leading options

10 Splunk alternatives for log analysis

Splunk may be the most famous way to make sense of mass quantities of log data, but it is far from the only player around

Automated machine learning or AutoML explained

AutoML frameworks and services eliminate the need for skilled data scientists to build machine learning and deep learning models

How to do real-time analytics across historical and live data

5 in-memory computing platform capabilities that support analytical processing of both data lake data and operational streams

HPE plus MapR: Too much Hadoop, not enough cloud

MapR gives HPE superior big data analytics technology and expertise, but not what HPE needs most

The best machine learning and deep learning libraries

TensorFlow, Spark MLlib, Scikit-learn, PyTorch, MXNet, and Keras shine for building and training machine learning and deep learning models

Julia vs. Python: Which is best for data science?

Python has turned into a data science and machine learning mainstay, while Julia was built from the ground up to do the job

TensorFlow 2 review: Easier machine learning

Now more platform than toolkit, TensorFlow has made strides in everything from ease of use to distributed training and deployment

The data lake is becoming the new data warehouse

Platforms like AWS Lake Formation and Delta Lake point toward a central hub for decision support and AI-driven decision automation

Time series analysis with KNIME and Spark

Train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi data set
