IBM sets up a machine learning pipeline for z/OS

z System users with data behind their firewalls can now access IBM's training and deployment system for machine learning, packaged for convenience

Stephen Lawson

If you're intrigued by IBM's Watson AI as a service, but reluctant to trust IBM with your data, Big Blue has a compromise. It's packaging Watson's core machine learning technology as an end-to-end solution available behind your firewall.

Now the bad news: It'll only be available to z System / z/OS mainframe users ... for now.

From start to finish

IBM Machine Learning for z/OS isn't a single machine learning framework. It's a collection of popular frameworks -- in particular Apache SparkML, TensorFlow, and H2O -- packaged with bindings to common languages used in the trade (Python, Java, Scala), and with support for "any transactional data type." IBM is pushing it as a pipeline for building, managing, and running machine learning models through visual tools for each step of the process and RESTful APIs for deployment and management.

There's a real need for this kind of convenience. Even as the number of frameworks for machine learning mushrooms, developers still have to perform a lot of heavy labor to create end-to-end production pipelines for training and working with models. This is why Baidu outfitted its PaddlePaddle deep learning framework with support for Kubernetes; in time the arrangement could serve as the underpinning for a complete solution that would cover every phase of machine learning.

Other components in IBM Machine Learning fit into this overall picture. The Cognitive Automation for Data Scientists element "assists data scientists in choosing the right algorithm for the data by scoring their data against the available algorithms and providing the best match for their needs," checking metrics like performance and fitness to task for a given algorithm and workload.
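To make the idea concrete, here is a minimal sketch of scoring a data set against several candidate algorithms and picking the best match. It is not IBM's implementation: the two toy candidates, the `cross_val_accuracy` and `pick_best` helpers, and the sample data are all hypothetical, and plain cross-validated accuracy stands in for whatever metrics the product actually uses.

```python
# Illustrative only: a toy "score the data against each algorithm" loop.
# All names and data here are assumptions, not part of IBM Machine Learning.

def majority_fit(X, y):
    """Baseline candidate: always predict the most common training label."""
    label = max(sorted(set(y)), key=y.count)
    return lambda x: label

def nearest_neighbor_fit(X, y):
    """Candidate: predict the label of the closest training point."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict

def cross_val_accuracy(fit, X, y, folds=3):
    """Average held-out accuracy over interleaved folds."""
    scores = []
    for k in range(folds):
        test_idx = set(range(k, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        model = fit(train_X, train_y)
        hits = sum(1 for i in test_idx if model(X[i]) == y[i])
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

def pick_best(candidates, X, y):
    """Score every candidate on the same data and return the top scorer."""
    scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

CANDIDATES = {"majority": majority_fit, "nearest_neighbor": nearest_neighbor_fit}

# Two well-separated clusters, three points each (made-up sample data).
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
y = [0, 0, 0, 1, 1, 1]

best, scores = pick_best(CANDIDATES, X, y)
print(best, scores)
```

A production tool would weigh more than accuracy, such as training time or resource use, but the shape of the loop is the same: run every candidate against the same data and rank the results.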

Another function "schedule[s] continuous re-evaluations on new data to monitor model accuracy over time and be alerted when performance deteriorates." Trained models, rather than the algorithms themselves, are what matter most in any machine learning deployment, so IBM is wise to provide such utilities.
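The re-evaluation idea can be sketched in a few lines. This is an assumption-laden illustration, not IBM's code: the `monitor` function, the `tolerance` parameter, and the sample batches are hypothetical, and the scheduling itself (a cron job or similar) is left out.

```python
# Illustrative only: re-score a deployed model on each new batch of labeled
# data and flag batches where accuracy falls too far below a baseline.
# Function names, the tolerance value, and the data are assumptions.

def accuracy(model, batch):
    """Fraction of (features, label) pairs the model predicts correctly."""
    hits = sum(1 for features, label in batch if model(features) == label)
    return hits / len(batch)

def monitor(model, batches, baseline, tolerance=0.1):
    """Return (batch_index, accuracy, alert) for every batch, in order."""
    report = []
    for i, batch in enumerate(batches):
        acc = accuracy(model, batch)
        report.append((i, acc, acc < baseline - tolerance))
    return report

# A stand-in "model": a fixed threshold chosen at training time.
model = lambda x: x > 0.5

batches = [
    # Early data still matches what the model was trained on.
    [(0.9, True), (0.1, False), (0.8, True), (0.2, False)],
    # Later data has drifted: mid-range values are now labeled False.
    [(0.9, True), (0.6, False), (0.7, False), (0.2, False)],
]

report = monitor(model, batches, baseline=1.0)
for index, acc, alert in report:
    print(index, acc, "ALERT" if alert else "ok")
```

The second batch trips the alert because the data has shifted under a model that never changed, which is exactly the case scheduled re-evaluation is meant to catch.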

z/OS for starters; Watson it ain't

The decision to limit the offering to z System machines for now makes the most sense as part of a general IBM strategy where machine learning advances are paired directly with branded hardware offerings. IBM's PowerAI system also pairs custom IBM hardware -- in this case, the POWER8 processor -- with commodity Nvidia GPUs to train models at high speed. In theory, PowerAI devices could run side by side with a mix of other, more mainstream hardware as part of an overall machine learning hardware array.

The z/OS incarnation of IBM Machine Learning is aimed at an even higher and narrower market: existing z/OS customers with tons of on-prem data. Rather than ask those (paying) customers to connect to something outside of their firewalls, IBM offers them first crack at tooling to help them get more from the data. The wording of IBM's announcement -- "initially make [IBM Machine Learning] available [on z/OS]" -- implies that other targets are possible later on.

It's also premature to read this as "IBM Watson behind the firewall," since Watson's appeal isn't the algorithms themselves or the workflow IBM's put together for them, but rather the volumes of data assembled by IBM, packaged into pretrained models and deployed through APIs. Those will remain exactly where IBM can monetize them best: behind its own firewall of IBM Watson as a service.