Big Data

Big Data | News, how-tos, features, reviews, and videos

Why Apache Iceberg will rule data in the cloud

Apache Iceberg is an open table format that offers scalability, usability, and performance advantages for very large data sets. Here are five reasons Iceberg is optimal for cloud data workloads.

Databricks adds data governance, marketplace features

The data marketplace and other features are expected to accelerate data engineering tasks with an option for data monetization down the road, Databricks said.

Databricks open sources its Delta Lake data lakehouse

Databricks is open sourcing Delta Lake to counter criticism from rivals and take on Apache Iceberg as well as data warehouse products from Snowflake, Starburst, Dremio, Google Cloud, AWS, Oracle, and HPE.

12 programming tricks to cut your cloud bill

Cutting cloud costs is a team effort, and that includes developers. Here are 12 tricks for developing software that is cheaper to run in the cloud.

What is TensorFlow? The machine learning library explained

TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning and developing neural networks faster and easier.

What is a data lake? Massively scalable storage for big data analytics

Dive into data lakes: what they are, how they're used, and how they differ from and complement data warehouses.

Where AI has made real progress

Better data infrastructure has provided a big boost to AI’s growth, but some things still require a human.

Working with Azure Managed Instance for Cassandra

Use open-source tools to build big data systems that bridge on premises and cloud.

How to use R with BigQuery

See how to use R to query data in Google BigQuery with the bigrquery and dplyr R packages.

How the cloud and big compute are remaking HPC

High-performance computing projects require massive quantities of compute resources. Pairing simulation and specialized hardware with the cloud powers the breakthroughs of the future.

Why developers use Confluent to manage Apache Kafka

How the fully managed Kafka service can bring peace and simplicity to the lives of those who depend on event streaming infrastructure.

Google’s Logica language addresses SQL’s flaws

Open source logic programming language compiles to SQL and runs on Google BigQuery, with experimental support for PostgreSQL and SQLite.

Ahana Cloud for Presto review: Fast SQL queries against data lakes

Ahana Cloud for Presto turns a data lake on Amazon S3 into what is effectively a data warehouse, without moving any data. SQL queries run quickly even when joining multiple heterogeneous data sources.

Solving query optimization in Presto

By combining machine learning and adaptive query execution, query optimization in Presto could become smarter and more efficient over repeated use.

Microsoft brings .NET dev to Apache Spark

.NET for Apache Spark 1.0 provides high-performance .NET APIs to Apache Spark, including Spark SQL, Spark Streaming, and MLlib.

Machine learning megaguide: Amazon, Microsoft, Databricks, Google, HPE, IBM

Download InfoWorld's massive roundup of Amazon, Microsoft, Databricks, Google, HPE, and IBM machine learning toolkits.

Public cloud megaguide: Amazon, Microsoft, Google, IBM, and Joyent compared

The top five public clouds pile on the services and options, while adding unique twists.

Quick guide: Learn to crunch big data with R

Get started using the open source R programming language to do statistical computing and graphics on large data sets.
