Big Data

Big Data news, analysis, research, how-to, opinion, and video.

How in-memory computing drives digital transformation with HTAP

Meet in-memory computing (IMC) and hybrid transactional/analytical processing (HTAP), tech’s newest power couple

Are you treating your data as an asset?

The best thing you can do is encourage a data-focused culture, one that recognizes the importance of security and privacy and understands that data is crucial to your organization’s success

Azure Databricks: Fast analytics in the cloud with Apache Spark

Microsoft’s partnership with Databricks adds new analytics tools to Azure’s data platform

Use the cloud to create open, connected data lakes for AI, not data swamps

There needs to be a material change in the way people think of solving complex data problems

Spark tutorial: Get started with Apache Spark

A step-by-step guide to loading a dataset, applying a schema, writing simple queries, and querying real-time data with Structured Streaming

A speedy recovery: the key to good outcomes as health care’s dependence on data deepens

Data is transforming health care, but it is also making life-saving treatments far more vulnerable to IT system failures

What is Apache Spark? The big data analytics platform explained

Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning

Review: H2O.ai automates machine learning

Driverless AI really is able to create and train good machine learning models without requiring machine learning expertise from users

Dremio: Simpler and faster data analytics

Built on Apache Arrow and Apache Parquet, Dremio brings self-service to data analysts and SQL queries to NoSQL data sources

Apache PredictionIO: Easier machine learning with Spark

An open source project now under Apache’s guidance uses a template system for easy training and deployment of Spark-powered machine learning models

R tutorial: Learn to crunch big data with R

Get started using the open source R programming language to do statistical computing and graphics on large data sets

Your analytics strategy is obsolete

While analytics is a giant market filled with confusing marketing speak, a few big trends are shaping the industry and will dictate where organizations invest

ETL is dead

ETL is hard. And it's limiting. Unfortunately, there wasn't a better way because of the constraints of data technology. Until now

AI and quantum computing: technology that's fueling innovation and solving future problems

Far from being all about who'll be first to prove its value, it seems to be more about solving real-world problems for future generations in hopes of a better world

The rise and predominance of Apache Spark

Recent surveys and forecasts of technology adoption have consistently suggested that Apache Spark is being embraced at a rate that outperforms other big data frameworks

11 open source tools to make the most of machine learning

Tap the predictive power of machine learning with these diverse, easy-to-implement libraries and frameworks

Bossie Awards 2017: The best databases and analytics tools

InfoWorld picks the best open source software for large-scale search, SQL, NoSQL, and streaming analytics

Bossie Awards 2017: The best machine learning tools

InfoWorld picks the best open source software for machine learning and deep learning

The 80/20 data science dilemma

Most data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data, which is an inefficient data strategy

Using big data to improve customer experience and financial results

For enterprises both large and small, it’s important to have a strategy in place to better serve customers using collected information
