How to solve AI’s reproducibility crisis

Society will never fully trust artificial intelligence unless AI systems can be shown to produce reasonably repeatable results in line with what their developers claim

Reproducibility is often trampled underfoot in AI’s rush to results. And the movement to agile methodologies may only exacerbate AI’s reproducibility crisis. Without reproducibility, you can’t really know what your AI system is doing or will do, and that’s a huge risk when you use AI for any critical work, from diagnosing medical conditions to driving trucks to screening for security threats to managing just-in-time production flows.

Data scientists’ natural inclination is to skimp on documentation in the interest of speed when developing, training, and iterating on machine learning, deep learning, and other AI models. But reproducibility depends on knowing the exact sequence of steps that produced a specific data-driven AI model, process, or decision.
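To make that concrete, the sketch below is a generic Python example, not any particular vendor’s tool; the file name, step names, and hyperparameter values are placeholders. It records the ingredients a later run would need to rebuild the same model: a hash of the training data, the hyperparameters, the random seed, and the ordered list of pipeline steps.

# Minimal, illustrative sketch: record what produced a model so the run
# can be repeated later. File names, steps, and parameters are placeholders.
import hashlib
import json
import random
from datetime import datetime, timezone

def sha256_of_file(path):
    # Hash the training data so a later run can verify it used the same inputs.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_run(data_path, params, steps, seed):
    random.seed(seed)  # fix randomness so the run can be repeated
    manifest = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data_sha256": sha256_of_file(data_path),
        "hyperparameters": params,
        "pipeline_steps": steps,  # the exact sequence that produced the model
        "random_seed": seed,
    }
    with open("run_manifest.json", "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

# Hypothetical usage:
record_run(
    data_path="training_data.csv",
    params={"learning_rate": 0.01, "epochs": 20},
    steps=["impute_missing", "normalize", "train_gradient_boosting"],
    seed=42,
)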

Reproducibility falls apart if the data scientists who built an AI model failed to follow a repeatable approach to their work or to document what they actually did in precise detail. In those scenarios, neither the original developers nor anyone else can be confident that the results can be reproduced at a later date.

The reproducibility issues multiply as the underlying AI pipeline platforms, including modeling frameworks, hardware accelerators, and distributed data lakes, evolve on every level, thereby reducing the feasibility of standing up a precise replica of the original platform for later cross-validation.
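One partial mitigation is to capture a snapshot of the software environment at training time. The sketch below is a minimal Python example using the standard library’s importlib.metadata; it records only the interpreter and installed package versions, so hardware accelerators and data-lake state remain outside its scope, and the output file name is just an illustrative choice.

# Capture the software side of the platform at training time so a later
# replication attempt at least knows which framework versions were used.
import json
import sys
from importlib import metadata

snapshot = {
    "python_version": sys.version,
    "packages": {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip broken or nameless installs
    },
}

with open("environment_snapshot.json", "w") as f:
    json.dump(snapshot, f, indent=2, sort_keys=True)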

Shared devops platforms can ensure reproducibility

To ensure that reproducibility isn’t undermined by agile methods, data science teams should perform all their work on shared devops platforms. Those platforms, now offered by dozens of vendors, enable AI development teams to maintain trustworthy audit trails of the specific processes data science professionals used to develop their AI deliverables. Data-science devops tools maintain rich repositories of associated metadata, along with logs that record precisely how particular data, models, metadata, code, and other artifacts were used in the context of a particular process or decision. They also automate the following AI pipeline functions:
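To make the audit-trail idea concrete, here is a minimal, vendor-neutral Python sketch that appends one record per model decision, tying together the model version, the code revision, a hash of the inputs, and the output produced. The model name, revision hash, and sample inputs shown are hypothetical.

# Illustrative sketch (not a specific vendor's devops platform): append one
# audit record per model decision to an append-only log.
import hashlib
import json
from datetime import datetime, timezone

AUDIT_LOG = "audit_trail.jsonl"  # one JSON record per line

def log_decision(model_version, code_revision, inputs, output):
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "code_revision": code_revision,  # e.g. a git commit hash
        "input_sha256": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()
        ).hexdigest(),
        "output": output,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(record) + "\n")

# Hypothetical call, as if wrapped around a real scoring function:
log_decision("fraud-model-1.3.0", "9f2c1ab", {"amount": 182.50}, "approve")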
