TensorFlow 2 review: Easier machine learning

Now more platform than toolkit, TensorFlow has made strides in everything from ease of use to distributed training and deployment

The data lake is becoming the new data warehouse

Platforms like AWS Lake Formation and Delta Lake point toward a central hub for decision support and AI-driven decision automation

Time series analysis with KNIME and Spark

Train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi data set

Supervised learning explained

Supervised learning turns labeled training data into a tuned predictive model

What is TensorFlow? The machine learning library explained

TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning faster and easier

Hadoop runs out of gas

As big data customers flee complexity and embrace the cloud, the Hadoop vendors are sputtering

Natural language processing explained

Deep learning has improved machine translation and other NLP tasks by leaps and bounds

Deep learning explained

Deep neural networks can solve the most challenging problems, but require abundant computing power and massive amounts of data

4 reasons big data projects fail—and 4 ways to succeed

Nearly all big data projects fail, despite the mature technology available. Here's how to make big data efforts succeed

Machine learning explained

Able to learn from data, machine learning algorithms can solve problems that are too complex to solve with conventional programming

Machine learning algorithms explained

Machine learning uses algorithms to turn a data set into a model. Which algorithm works best depends on the problem

Delta Lake gives Apache Spark data sets new powers

A new open source project from Databricks adds ACID transactions, versioning, and schema enforcement to Spark data sources that don't have them

Pub/sub messaging: Apache Kafka vs. Apache Pulsar

Apache Kafka set the bar for large-scale distributed messaging, but Apache Pulsar has some neat tricks of its own

IBM preps Watson AI services to run on Kubernetes

IBM Watson services arrive in versions that can run on the public cloud or on privately hosted container infrastructure

How to use Azure Data Explorer for large-scale data analysis

Microsoft’s tool for querying terabytes of data finally arrives for everyone to use

Tutorial: Spark application architecture and clusters

Learn how Spark components work together and how Spark applications run on standalone and YARN clusters

Why you should use Gandiva for Apache Arrow

An execution engine for Arrow-based in-memory processing, Gandiva brings dramatic performance improvements to analytical workloads

Review: MXNet deep learning shines with Gluon

With the addition of the high-level Gluon API, Apache MXNet rivals TensorFlow and PyTorch for developing deep learning models

