What is AnalyticOps?

Vantage AnalyticOps Accelerator provides an easy-to-use web-based user interface (UI), a command-line interface (CLI), and Python and R Software Development Kits (SDKs) to handle the deployment and operation of advanced analytics models (MLOps/ModelOps) in Vantage using the Analytics 1-2-3 process.

Context

Machine Learning Operations (MLOps) is a methodology that unifies Machine Learning (ML) model solutions development (the ML element) with ML solutions operations (the Ops element). It automates critical steps of ML system construction. MLOps provides a set of standardized processes and technology capabilities for building, deploying, and operationalizing ML models rapidly and reliably. ModelOps applies MLOps methodologies to Machine Learning and non-Machine Learning model operations.

AnalyticOps covers the MLOps and ModelOps aspects of deploying and operationalizing advanced analytics in Teradata Vantage.

Analytics 1-2-3

Analytics 1-2-3 makes it easy for end users to get business value from data science use cases on the Vantage platform.

  1. Data Preparation – Leverage the Vantage platform to prepare data: connect to enterprise data and use its vast set of in-database functions for data integration, exploration, cleansing, and feature engineering.

  2. Train Model – The 1-2-3 process lets end users choose the tools, languages, and technologies they use daily (e.g., SAS, scikit-learn, XGBoost, MLlib, H2O…).

  3. Deploy Model – Through AnalyticOps, bring your own model (BYOM) trained externally, or track the training process, and publish the model into the Vantage platform, where it will run to predict outcomes and be monitored.
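
Step 3 ultimately puts a model behind a scoring interface. As a loose illustration only, here is a toy HTTP scoring endpoint of the kind an edge-style deployment might expose; everything in it (the handler and the "model" that doubles its input) is invented for the example, and AnalyticOps packages and serves real models differently.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Toy sketch of a deployed scoring endpoint. The handler and the "model"
# (which just doubles its input feature) are invented for illustration.
class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps({"score": 2 * body["feature"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence request logging for the example
        pass

server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Score one record against the running endpoint.
req = Request(f"http://127.0.0.1:{server.server_port}",
              data=json.dumps({"feature": 21}).encode(),
              headers={"Content-Type": "application/json"})
with urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
print(result)  # {'score': 42}
```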

AnalyticOps Accelerator Scope

AnalyticOps Accelerator automates and governs the model deployment, scheduling, and monitoring process in an industrialized and repeatable fashion. Our focus is on four areas:

  • Model Deployment: Once the model code is developed using the end user's tool and language of choice, AnalyticOps provides an easy, guided way to deploy external and git models in Vantage or in an edge container. Deploying in-Vantage leverages in-database scale with no extra technology or infrastructure investment. Deploying in an edge container may be useful for use cases where model execution must happen in a specific environment or device.

  • Model Lifecycle: With AnalyticOps the model implementation process is tracked from training and evaluation through approval, deployment, and retirement. Models are fully auditable through access to the end-to-end metadata of the process: for each step, the dataset IDs, parameters used, dates, users, comments, and tags are tracked.

  • Model Governance: Model information is organized into projects, letting each data science department control access to its own models. Each model represents a use case and groups all its trained versions within the Model Catalog. This provides enterprise governance of the models.

  • Model Monitoring: When models are deployed, AnalyticOps provides best-in-class capabilities to understand model metrics and dataset/feature drift. Data science and machine learning operations teams can get alerts and act early in the model updating process.
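
To make the lifecycle-tracking idea concrete, here is a minimal sketch of the kind of per-step metadata record described above. The field names and structure are illustrative only, not the product's actual schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical record of the metadata tracked per lifecycle step;
# field names are illustrative, not AnalyticOps' actual schema.
@dataclass
class LifecycleEvent:
    model_id: str
    stage: str            # e.g. "train", "evaluate", "approve", "deploy", "retire"
    dataset_id: str
    user: str
    parameters: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)
    comment: str = ""
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

# An audit trail is then just the ordered list of events for a model.
audit_trail = [
    LifecycleEvent("model-001", "train", "ds-42", "alice",
                   parameters={"max_depth": 4}, tags=["churn"]),
    LifecycleEvent("model-001", "evaluate", "ds-43", "alice",
                   comment="AUC 0.87 on holdout"),
]

for event in audit_trail:
    print(event.stage, event.dataset_id, event.user)
```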

AnalyticOps Accelerator can work with both Bring your Own Models (BYOM) / 3rd party Models and git models:

Bring your Own Models (BYOM) / 3rd party Models

AnalyticOps enables data scientists to operationalize models that have been trained externally and for which the model training code is not available. Data scientists use the model binaries (executable files or scripts) they want to deploy, leveraging all the capabilities of AnalyticOps with a different lifecycle (Import -> Evaluate -> Approve -> Deploy -> Retire).
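
The BYOM lifecycle above can be sketched as a small state machine. The transition table here is invented for illustration and is not the product's actual implementation.

```python
# Hypothetical sketch of the BYOM lifecycle
# (Import -> Evaluate -> Approve -> Deploy -> Retire);
# the transition table is illustrative only.
TRANSITIONS = {
    "imported": {"evaluate": "evaluated"},
    "evaluated": {"approve": "approved", "evaluate": "evaluated"},
    "approved": {"deploy": "deployed"},
    "deployed": {"retire": "retired"},
    "retired": {},
}

def advance(state: str, action: str) -> str:
    """Return the next lifecycle state, or raise if the action is not allowed."""
    allowed = TRANSITIONS[state]
    if action not in allowed:
        raise ValueError(f"cannot {action!r} a model in state {state!r}")
    return allowed[action]

state = "imported"
for action in ("evaluate", "approve", "deploy", "retire"):
    state = advance(state, action)
print(state)  # retired
```

Encoding the allowed transitions explicitly is what makes steps such as "deploy before approval" impossible to perform by accident.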

Git Models

Git models are those whose training, evaluation, and scoring files are stored in a git repository. The way we require these code recipes and files to be implemented is explained in the AnalyticOps methodology and in the different tutorials on this site (Train -> Evaluate -> Approve -> Deploy -> Retire).
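
In spirit, the recipes split the model code into train, evaluate, and score entry points that share saved artifacts. The sketch below uses a toy mean-predictor model and invented function signatures; the actual required interfaces and file layout are defined by the AnalyticOps methodology and tutorials.

```python
import json
import statistics
import tempfile
from pathlib import Path

# Toy "recipes" mirroring the train / evaluate / score split of a git model.
# Function names, signatures, and the mean-predictor "model" are all
# illustrative; real recipes follow the AnalyticOps methodology docs.

def train(data: list, artifacts_dir: Path) -> None:
    # Toy model: predict the training mean. A real recipe would fit a model
    # in-database or with the user's ML library of choice.
    model = {"mean": statistics.mean(data)}
    (artifacts_dir / "model.json").write_text(json.dumps(model))

def score(data: list, artifacts_dir: Path) -> list:
    model = json.loads((artifacts_dir / "model.json").read_text())
    return [model["mean"] for _ in data]

def evaluate(data: list, targets: list, artifacts_dir: Path) -> dict:
    preds = score(data, artifacts_dir)
    rmse = (sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)) ** 0.5
    return {"rmse": rmse}

with tempfile.TemporaryDirectory() as tmp:
    artifacts = Path(tmp)
    train([1.0, 2.0, 3.0], artifacts)
    metrics = evaluate([4.0, 5.0], [2.0, 2.0], artifacts)
    print(metrics)
```

The point of the split is that each stage can run independently: training writes artifacts, while evaluation and scoring only read them.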

AnalyticOps Methodology Overview

Our methodology is based on the Cross-Industry Standard Process for Data Mining (CRISP-DM), defined by Teradata, amongst others, in 1997. It remains the standard methodology used by the data science community.

In this methodology, data scientists work to understand the business and the data for the modeling process, iterating on models (by training and evaluating multiple experiments) until the right model is found.

It’s important to understand that data scientists start using AnalyticOps once the AI model is found and ready to operationalize. We don’t cover the data ingestion, data preparation, or model experimentation stages; users can use their own tools and environments in combination with Vantage for those.

  • Finalize AI Model: The code base the user produced during model experimentation simply needs to be adapted to the AnalyticOps recipes for model training, model evaluation, and scoring. Requirements and parameters files automate these configurations. These code files are stored in git repositories in a specific folder organization produced by the AnalyticOps command-line interface (CLI).

  • Train and Evaluate: Once the model recipes have been committed to git repositories, you can use the browser user interface to train, evaluate, and compare models in AnalyticOps, visualizing the same charts and pictures as in model experimentation and selecting the model champion to identify the model ID that will go into production.

  • Approve and Deploy: Request approval for the specific model ID, then publish the model in-Vantage or in an edge container, with optional scheduling capabilities.

  • Score and Monitor: AnalyticOps can monitor model evaluation metrics over time, enabling data scientists to understand model degradation as new data is validated. Additional monitoring is in place for the datasets used, comparing all feature distributions and statistics between training and scoring. Model owners receive alerts when model or data drift is happening.
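
One common way to quantify the training-versus-scoring distribution comparison described above is the population stability index (PSI). The following is a self-contained sketch of that general technique, not AnalyticOps' actual drift metric or implementation.

```python
import math

def psi(expected: list, actual: list, bins: int = 10) -> float:
    """Population stability index between two samples of one feature."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Small epsilon avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

train_sample = [i / 100 for i in range(100)]        # uniform on [0, 1)
score_sample = [0.5 + i / 200 for i in range(100)]  # shifted to [0.5, 1)

drift = psi(train_sample, score_sample)
print(round(drift, 2))  # a common rule of thumb flags PSI > 0.2 as drift
```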

We now recommend you continue to the next chapters to learn how to use AnalyticOps.

Learn How to Use AnalyticOps

In the next few chapters, you will learn how to use AnalyticOps and set up your own model repository. This guide provides details to introduce you to the AnalyticOps modules and environment.

This guide is divided into the following sections:

We also recommend referring to the frequently asked questions for additional information.

Assumptions

We have designed the documentation and tutorials for anyone new to AnalyticOps; they focus on the Data Scientist and Machine Learning Engineer personas.

AnalyticOps doesn’t provide or cover a full data science platform with an Integrated Development Environment (IDE) or notebook environment (e.g., JupyterHub notebooks). Every user/customer has the freedom to choose their tool of preference; this documentation uses a JupyterHub environment for reference, as it is an open-source tool that any user can download.