MLOps: Automatic, Zero-Touch and Reusable Machine Learning Training and Serving Pipelines

Abstract:

Growing interest in Machine Learning, Deep Learning and AI, together with the easy availability of ML toolkits, libraries and frameworks and of talent with the skills to use them, has made building machine learning models easier, and ML is now being employed to solve a variety of problems across different fields. Taking these ML/DL models to production, i.e., integrating them with business applications, is nevertheless often a challenge, because it requires many skill sets: software development, cloud, DevOps, system design and data engineering. Furthermore, ML engineers or data scientists and software developers often work in silos and make locally optimal decisions without thinking about how the model is going to be used in production, how it is going to be trained on fresh incoming data, and how and when it is going to be replaced with a better model. In short, most of the time there is no plan for how models will behave in production, nor any automatic pipelines to manage them. This makes productionalizing ML models difficult and leads to wasted time and effort. We will demonstrate how to take an ML model to production easily using components from the Acumos AI project, and how to go further by creating zero-touch, i.e., automatic and reusable, ML model training and serving pipelines and reusable model infrastructures using Acumos and NiFi. We will create a Cloud Forensics ML model, productionalize it and show how it can be consumed in a business application.

Published in: 2023 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS)

Date of Conference: 28-30 November 2023

Date Added to IEEE Xplore: 14 December 2023

DOI: 10.1109/IoTaIS60147.2023.10346079

Conference Location: Bali, Indonesia

I. Introduction

Building Machine Learning, Deep Learning and AI models has become accessible and commonplace thanks to cheap compute [1] [2], the availability of ML skills and talent, and ML tools, libraries and frameworks, and such models are increasingly employed to solve problems in various industries. Even so, most of the technical debt in creating usable ML can be attributed to the process of actually productionalizing an ML model. This is because productionalization requires considerable expertise in software development, data engineering, system design, DevOps and cloud, in addition to data science and ML skills [3]–[5]. This becomes a roadblock for many individuals and small organizations. Even in big organizations, where different teams do have these skills, it is difficult to plan how the model will be put into a production environment, how it will be used by business applications, how it will be trained on new data, how it will be retired without affecting the service, and how to make this process automatic or reduce manual interaction. Failure to plan for this leads to time and effort wasted on experimentation and other local optimizations, instead of an optimized, systematic and automatic roll-out of the ML model service and pipeline. In this paper, taking the example of a Cloud Forensics ML model, we will see how to create and operationalize automatic, zero-touch and reusable ML training and serving pipelines using Acumos. Other ways to create and operationalize pipelines using Acumos are shown in [6]–[8].

There are now many other MLOps tools. Kubeflow [9], from Google, is an open-source, end-to-end MLOps platform that supports workflow orchestration, experiment tracking, model management and model deployment, and provides a notebook workspace. Airflow [10] and Argo [11] are open-source projects for pipeline orchestration. MLflow [12] is another open-source project, supported by Databricks, that supports experiment tracking and model management. Metaflow [13] has features for orchestrating ML pipelines. Pachyderm [14] has data versioning and pipelining features. Most of these MLOps solutions, as well as TensorFlow Serving [15] and Seldon [16], also offer advanced capabilities such as traffic routing. Acumos, however, offers some distinct benefits: built-in federation that allows the use of shared models, and a design studio that helps create AI pipelines consisting of many heterogeneous models.
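
The lifecycle questions raised in this introduction (retraining on new data, replacing a model without interrupting service, and doing so with no manual steps) can be illustrated with a minimal Python sketch. It is not the paper's implementation and does not use the Acumos or NiFi APIs; the threshold "model", the `ServingSlot` class and the `retrain_and_maybe_promote` function are hypothetical names introduced purely for illustration, and the data is a toy stand-in.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label); toy stand-in for real data


@dataclass
class Model:
    """Hypothetical one-parameter classifier: predicts 1 when x >= threshold."""
    version: int
    threshold: float

    def predict(self, x: float) -> int:
        return 1 if x >= self.threshold else 0


def accuracy(model: Model, data: Sequence[Sample]) -> float:
    correct = sum(1 for x, y in data if model.predict(x) == y)
    return correct / len(data)


def train(version: int, data: Sequence[Sample]) -> Model:
    # Toy trainer: try every observed feature value as the threshold
    # and keep the one with the highest accuracy on the training data.
    candidates = sorted({x for x, _ in data})
    best = max(candidates, key=lambda t: accuracy(Model(version, t), data))
    return Model(version, best)


class ServingSlot:
    """Holds the model that answers requests; a swap is a single assignment,
    so callers of predict() never see a gap in service."""

    def __init__(self) -> None:
        self.live: Optional[Model] = None

    def predict(self, x: float) -> int:
        if self.live is None:
            raise RuntimeError("no model deployed yet")
        return self.live.predict(x)


def retrain_and_maybe_promote(
    slot: ServingSlot, train_data: Sequence[Sample], holdout: Sequence[Sample]
) -> bool:
    """Train a challenger and promote it only if it beats the live model
    on the holdout data. Returns True when a promotion happened."""
    next_version = 1 if slot.live is None else slot.live.version + 1
    challenger = train(next_version, train_data)
    if slot.live is None or accuracy(challenger, holdout) > accuracy(slot.live, holdout):
        slot.live = challenger  # the old model is retired by replacement
        return True
    return False


# Usage: a first batch deploys version 1; a shifted batch triggers promotion
# of version 2; retraining again on the same data yields no improvement.
slot = ServingSlot()
old_batch = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
new_batch = [(0.1, 0), (0.4, 1), (0.5, 1), (0.9, 1)]
promoted = [
    retrain_and_maybe_promote(slot, old_batch, old_batch),
    retrain_and_maybe_promote(slot, new_batch, new_batch),
    retrain_and_maybe_promote(slot, new_batch, new_batch),
]
```

In a real deployment the trigger would be the arrival of new data in a pipeline rather than an explicit function call, and the holdout set would be kept separate from the training data; both are simplified here to keep the sketch short.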
