Ian Pointer

Ian Pointer is a senior big data and deep learning architect, working with Apache Spark and PyTorch. He has more than 15 years of development and operations experience.

DIY GPU server: Build your own PC for deep learning

Spark tutorial: Get started with Apache Spark

A step-by-step guide to loading a dataset, applying a schema, writing simple queries, and querying real-time data with Structured Streaming

What is Apache Spark? The big data analytics platform explained

Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning

Bossie Awards 2016: The best open source big data tools

InfoWorld’s top picks in large-scale search, SQL on Hadoop, streaming analytics, and other flavors of distributed data processing

Had it with Apache Storm? Heron swoops to the rescue

Heron, Twitter's brand-new streaming replacement for Apache Storm, offers easier scaling and higher throughput while maintaining Storm code compatibility

Spark 2.0 prepares to catch fire

Today, Databricks subscribers can get a technical preview of Spark 2.0. Improved performance, SparkSessions, and streaming lead a parade of enhancements

Look out, Spark and Storm, here comes Apache Apex

A new open source streaming analytics solution derived from DataTorrent's RTS platform, Apex offers blazing speed and simplified programmability. Let's give it a spin

Apache Beam wants to be uber-API for big data

New, useful Apache big data projects seem to arrive daily. Rather than relearn your way every time, what if you could go through a unified API?

Get started with Apache Spark

Reap the performance and developer productivity advantages of Spark for batch processing, streaming analysis, machine learning, and structured queries

What Spark's Structured Streaming really means

Thanks to an impressive grab bag of improvements in version 2.0, Spark's quasi-streaming solution has become more powerful and easier to manage

Which freaking big data programming language should I use?

When it comes to wrangling data at scale, R, Python, Scala, and Java have you covered -- mostly

Why Spark 1.6 is a big deal for big data

Already the hottest thing in big data, Spark 1.6 turns up the heat. Here are the high points, including improved streaming and memory management
