How to use Knime for data science

Free, open-source Knime allows you to visually assemble data processing “nodes” into machine learning, deep learning, and other analytics workflows

Knime (the K is silent, so it’s pronounced nīm) is a highly rated data analytics platform with wide applicability and many integrations with other products, including databases, programming languages, machine learning frameworks, and deep learning frameworks. The philosophy of Knime is to be inclusive and to “blend” whatever software and data sources you want to use.

The exploration, model building, visualization, reporting, and development portions of the platform are open source, as are the community extensions. Knime Server, which provides collaboration, automation, management, and deployment capabilities, is commercial, as are the partner extensions. Knime Analytics Platform and Knime Server are available for on-premises installation and for the AWS and Azure clouds.

In this tutorial, I’ll concentrate on the open-source Knime Analytics Platform and selected open-source extensions. My goal is to bring you to the point where you can find an existing Knime workflow to use as a starting point for your own data science work, and where you understand that workflow well enough to customize it. To accomplish that in limited space, I’ll refer you to some of Knime’s own materials to fill in the details.

Why use Knime?

Choose Knime for your analytics needs if you like building models by assembling processing pipelines (called workflows) graphically from processing elements (called nodes), as exemplified by the simple classifier workflow shown below. Choose another tool if you prefer to write code or to run your models in spreadsheets.
