Deployments

After the model training and evaluation processes are complete, the best performing models are deployed to production. Model deployment is a way to integrate a model into an existing production environment to make practical business decisions based on data.

Model deployment is one of the last stages in the model lifecycle and one of the most difficult processes of gaining value from machine learning. It requires coordination between data scientists, IT teams, software developers, and business professionals to ensure the model works reliably in the organization’s production environment.

The AnalyticOps UI provides an effective and seamless way to deploy your models into production so that you can start using them to make practical decisions.

The Deployments module lets you monitor all model versions of a project that are deployed to production on a single screen. It allows you to visualize the deployments in production and take action based on the models’ performance. This chapter covers the following topics:

Types of Deployments

AnalyticOps allows you to deploy your models in production in one of the following three environments:

  • Docker RESTful: Allows you to deploy your model using a REST API.

  • Docker Batch: Lets you run a batch deployment of your model that processes input data from a dataset template and data connection and writes the output to a file. Batch deployments can be run on demand or on a schedule.

  • In-Vantage: Allows you to use the Teradata Vantage engine to perform the model deployment.

For details of all three environments and properties, see Model Deployment.

View List of Deployments

To view the model deployments for a project:

  1. Click Deployments in the Navigation bar of a selected project.
    The list of deployments for the project displays in the Work area and the Deployments option is highlighted in the Navigation bar.

    The list of Deployments displays the following properties for each model deployment.

    Property Description
    Deployment ID Specifies the deployment job ID.
    Model Name Specifies the deployed model name.
    Model Version ID Specifies the deployed model version ID.
    Engine Type Specifies the type of deployment engine used to run the deployment: RESTful, Batch, or In-Vantage.
    Engine Name Specifies the name of the engine used in the deployment.
    Deployed By Specifies the name of the user who deployed the model.
    Deployment Date Specifies the date on which the model was deployed.
    Published Only Specifies whether the deployment is published only and not served or scheduled.
    View Details Allows you to view the details of the selected deployment. For details, see View Deployment Details.

To retire an active deployment:

  1. Select a deployment and click the Retire Deployment button.
    The Retire Model confirmation dialog displays.

    https://docs.tdaoa.com/images/ug_deployments_retire.png
  2. Click Retire.
    The deployment is retired and its status changes to Retired. For more details of model retiring, see Model Retiring.

Active vs Retired Deployments

A model deployment status can be active or retired.

  • Active deployment: An active deployment is one that is still running in production and being monitored.

  • Retired deployment: A retired deployment is one that has been retired and is no longer running in production.

The Deployments list shows both types of deployments.


View Deployment Details

You can view the details of a model deployment, including its defined parameters, as well as drift monitoring for the deployment, covering feature drift, performance drift, and prediction drift.

In the Deployments list, click the View Details icon to view details for a deployment.

Deployment Details

The deployment details view varies by deployment type and by whether or not the deployment is published only. See Types of Deployments. Deployment details are displayed for:

Docker Batch Deployment

The deployment Properties are displayed in the first tab in three sections: Model Details, Deployment Details, and Advanced. The second tab displays the Execution Stats, which are directly linked with the scheduler to perform various operations. This tab is not displayed for deployments that were published only.

  1. Properties:
    The properties defined are displayed in read-only format.

    Scheduling details are not displayed when the deployment is published only.

    For details of deployment properties, see Model Deployment.

  2. Execution Stats:
    The execution stats for multiple jobs are displayed. This view also shows the next execution date and time for the job, as set when the model was deployed. AnalyticOps also allows you to trigger your job manually at any time, even if it has a schedule, by clicking the RUN NOW button at the top right.

    To view the live logs for an ongoing job or the historic logs for a completed job, either double-click a job row or click the View hyperlink adjacent to it. A dialog appears displaying all the current logs, with the ability to refresh and fetch new ones.

  3. Feature Drift, Prediction Drift and Performance Drift:
    These tabs remain the same for all deployment types. See Drift Monitoring for details.

Docker RESTful Deployment

The deployment Properties are displayed in the first tab in three sections: Model Details, Deployment Details, and Advanced. The second tab displays Testing, which enables you to test the current model endpoint. This tab is not displayed for deployments that were published only.

  1. Properties:
    The properties defined are displayed in read-only format.

    The Testing tab is not displayed when the deployment is published only.

    For details of deployment properties, see Model Deployment.

  2. Testing:
    The current model endpoint is displayed so it can be used directly, or you can send a live request from the editor in the view and retrieve the response (see the sketch after this list).

  3. Feature Drift, Prediction Drift and Performance Drift:
    These tabs remain the same for all deployment types. See Drift Monitoring for details.
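
As an illustration only, a scoring request could also be sent to the displayed endpoint from outside the UI. The following is a minimal sketch using Python's requests library; the endpoint URL and payload shape are placeholders that depend on how the model's scoring function is defined, not values taken from AnalyticOps.

```python
import requests

# Hypothetical endpoint copied from the Testing tab; replace with your deployment's URL.
ENDPOINT = "https://aoa.example.com/deployments/<deployment-id>/predict"

# Example payload; the expected field names depend on the model's scoring signature.
payload = {"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}

response = requests.post(ENDPOINT, json=payload, timeout=30)
response.raise_for_status()   # fail loudly on HTTP errors
print(response.json())        # predictions returned by the endpoint
```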

In-Vantage Deployment

The deployment Properties are displayed in the first tab in three sections: Model Details, Deployment Details, and Advanced. The second tab displays the Execution Stats, which are directly linked with the scheduler to perform various operations. This tab is not displayed for deployments that were published only.

  1. Properties:
    The properties defined are displayed in read-only format.

    Scheduling details are not displayed when the deployment is published only.

  2. Execution Stats:
    The execution stats for multiple jobs are displayed. This view also shows the next execution date and time for the job, as set when the model was deployed. AnalyticOps also allows you to trigger your job manually at any time, even if it has a schedule, by clicking the RUN NOW button at the top right.

    To view the live logs for an ongoing job or the historic logs for a completed job, either double-click a job row or click the View hyperlink adjacent to it. A dialog appears displaying all the current logs, with the ability to refresh and fetch new ones.

  3. Feature Drift, Prediction Drift and Performance Drift:
    These tabs remain the same for all deployment types. See Drift Monitoring for details.

Drift Monitoring

Model Drift represents the concept of a model degrading in performance over time due to changes in the underlying dataset or concepts. Monitoring model drift involves surfacing this performance information to the end user and configuring alerts so that action (preferably pre-emptive) can be taken to reduce the business impact of this drift. For more details, see Alerts.

AnalyticOps allows you to monitor three types of drift:

Feature Drift

Feature drift is based on understanding and monitoring changes in the statistics of the dataset the model was trained on versus the statistics of the data the model is currently predicting on. As mentioned earlier, data is expected to evolve over time. Therefore, the monitoring of this data needs to capture this evolution and detect when the data has drifted past a certain divergence threshold or has simply changed completely.

AnalyticOps lets you capture the training dataset statistics and the feature importance. With this information available, monitoring the online statistics relative to it becomes a metric capture and comparison process, as sketched below.
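
The following is a minimal sketch of the idea, assuming NumPy and a pandas-like training DataFrame: per-feature histograms are recorded at training time so that production data can later be binned against the same bin edges and compared. The function name and storage format are illustrative assumptions, not the AnalyticOps implementation.

```python
import json
import numpy as np

def capture_reference_stats(train_df, features, bins=10):
    """Record per-feature bin edges and bin proportions from the training data."""
    stats = {}
    for f in features:
        counts, edges = np.histogram(train_df[f].dropna(), bins=bins)
        stats[f] = {
            "edges": edges.tolist(),
            "proportions": (counts / counts.sum()).tolist(),
        }
    return stats

# Persist the snapshot alongside the trained model so production data can be
# binned with the same edges and compared later (hypothetical usage).
# reference = capture_reference_stats(train_df, ["age", "balance"])
# with open("training_statistics.json", "w") as fh:
#     json.dump(reference, fh)
```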

To view the Feature Drift of the selected deployment:

  1. Click the Feature Drift tab.

    The Feature Drift details display.

Features

Feature Drift generates a histogram for each feature and captures the drift value of the feature over time.

The Features table displays the following details.

Property Description
Name Specifies the name of the feature.
Group Specifies the group of the feature. By default, all features belong to the default group.
Type Specifies the data type of the feature: Continuous or Categorical.
Importance Specifies the feature importance. Feature importance measures the increase in the model's prediction error after the feature's values are permuted, which breaks the relationship between the feature and the true outcome. A feature is important if shuffling its values increases the model error, because in that case the model relied on the feature for the prediction. A feature is unimportant if shuffling its values leaves the model error unchanged, because in that case the model ignored the feature for the prediction. (A minimal sketch of this technique follows the table.)
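
Permutation-based importance, as described above, can be sketched as follows. This is an illustrative implementation only; the model, error metric, and data are placeholders, and the actual computation in AnalyticOps may differ.

```python
import numpy as np

def permutation_importance(model, X, y, error_metric, n_repeats=5, seed=42):
    """Importance = increase in prediction error after shuffling one feature.

    `error_metric(y_true, y_pred)` should return an error (higher is worse),
    for example mean squared error for regression.
    """
    rng = np.random.default_rng(seed)
    baseline = error_metric(y, model.predict(X))
    importances = {}
    for name in X.columns:                 # assumes X is a pandas DataFrame
        errors = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[name] = rng.permutation(X_perm[name].values)
            errors.append(error_metric(y, model.predict(X_perm)))
        # Positive value: error grew when the feature was shuffled, so the model relied on it.
        importances[name] = float(np.mean(errors) - baseline)
    return importances
```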

To view details of a feature:

  1. Select a feature in the Features table.
    The right section of the page displays the Distribution histogram and Drift over time for the selected feature.

    The Distribution histogram displays the feature value on the x-axis and the probability on the y-axis for both Training and Production data.

    https://docs.tdaoa.com/images/v6/deployment_details_predictiondrift.png

    The drift over time graph displays the following metrics for the selected feature.

    https://docs.tdaoa.com/images/v6/deployment_feature_drift_over_time.png
    • Population Stability Index (PSI): A measure of population stability between two population samples. PSI is a widely used statistic that measures how much a variable has shifted over time. A high PSI may alert the business to a change in the characteristics of a population. This shift may require investigation and possibly a model update.

    • Kullback–Leibler (KL) divergence: A way to measure the dissimilarity of two probability distributions.

    • Kolmogorov-Smirnov (KS) test: A nonparametric test of the equality of continuous (or discontinuous), one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test).

    • Chi-Square: A statistical test used to compare observed results with expected results. The purpose of this test is to determine if a difference between observed data and expected data is due to chance, or if it is due to a relationship between the variables you are studying. A chi-square test helps better understand and interpret the relationship between two categorical variables.
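
As a worked illustration, PSI and KL divergence can be computed from the binned training and production proportions captured for a feature. The snippet below is a sketch only; it assumes both distributions use the same bins, and the small epsilon guarding against empty bins is an implementation choice, not necessarily what AnalyticOps uses.

```python
import numpy as np

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two binned proportion vectors."""
    e = np.clip(np.asarray(expected, dtype=float), eps, None)
    a = np.clip(np.asarray(actual, dtype=float), eps, None)
    return float(np.sum((a - e) * np.log(a / e)))

def kl_divergence(p, q, eps=1e-6):
    """Kullback-Leibler divergence D(p || q) between two binned proportion vectors."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    return float(np.sum(p * np.log(p / q)))

# Example: training vs. production proportions over the same 5 bins.
train = [0.10, 0.20, 0.40, 0.20, 0.10]
prod = [0.05, 0.15, 0.35, 0.25, 0.20]
print(psi(train, prod))            # a common rule of thumb treats values above 0.25 as a major shift
print(kl_divergence(prod, train))
```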

Performance Drift

The simplest form of model drift monitoring is to regularly compare the predicted results to the ground truth results. To do this, you need access to the actual results so you can compare them. This information is not always available in a timely manner (email campaigns, churn, etc.), and sometimes not at all. However, when it is, this turns into an evaluation problem where the performance metrics determined by the data scientist are calculated on the new ground truth datasets and can be compared.

Using AnalyticOps, you can visualize the information generated from evaluation results over time.
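
As a simple illustration of this idea, once ground truth arrives, the same evaluation metric can be recomputed on each new batch of scored records and tracked over time. The sketch below assumes scikit-learn, pandas, and a DataFrame with prediction, actual, and scored_at columns; these column names are placeholders, not AnalyticOps fields.

```python
import pandas as pd
from sklearn.metrics import accuracy_score

def performance_over_time(scored, freq="W"):
    """Recompute a metric per time window once ground truth has arrived."""
    scored = scored.dropna(subset=["actual"])        # keep only rows with known outcomes
    grouped = scored.groupby(pd.Grouper(key="scored_at", freq=freq))
    return grouped.apply(
        lambda g: accuracy_score(g["actual"], g["prediction"]) if len(g) else None
    )

# Hypothetical usage: plot the weekly accuracy series to spot degradation.
# weekly_accuracy = performance_over_time(scored_records)
# weekly_accuracy.plot(title="Accuracy over time")
```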

Performance Drift includes the following details:

Model Version Details

This area displays the details of the model version that was deployed in production. The details include Model Name, Model Version ID and Status.

Model Drift Monitoring

This area lets you visualize the evaluation metrics plotted over time and allows you to make a comparison and monitor model performance over time.

The top section displays the stats of all selected metrics in the form of tiles. The information includes the current value of the metric, a comparison of the current value with the previous value, and the starting value of the metric.

https://docs.tdaoa.com/images/ug_deployments_performence_drift_tile.png

The next section displays the graphs plotted for the selected metrics.

Each graph plots the metric value over time. To view the details of a plotted metric value or data point:

  1. Hover over a data point in the graph.
    A tooltip displays, listing the information related to the value at that point, including the date, the metric value, its comparison with the previous value, and the evaluation Job ID.

To select the metrics that you want to plot in the graphs section:

  1. Click the Select Metrics button.
    The Select Evaluation Metrics dialog displays.

    https://docs.tdaoa.com/images/ug_deployments_select_metrics.png
  2. Hover over a field in the Available Fields list and click the Add icon.
    The field is added to the Selected Fields list.

  3. Add all the required fields to the Selected Fields list and click Select.
    The tiles and graphs sections update to display all the selected metrics.

Actions

You can perform the following actions on the model if you observe a performance degradation.

Action Description
Retire Model Version Lets you retire the model version. As the retirement process completes, the model version status changes to Retired if no active deployment is running for the model version. For more details of model retiring, see Model Retiring.
Evaluate Model Version Lets you re-evaluate the model version. As the evaluation process completes, the model version status changes to Evaluated if no active deployment is running for the model version. For more details of model evaluation, see Model Evaluation.

To retire the model version deployment:

  1. Click the Retire Model Version button.
    The Retire Model confirmation dialog displays.

    https://docs.tdaoa.com/images/ug_deployments_retire.png
  2. Click Retire to continue.
    The model is retired and its status changes to Retired if no active deployment is running for the model version. For more details of model retiring, see Model Retiring.

To evaluate the model version deployment:

  1. Click the Evaluate Model Version button.
    The Evaluate Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_evaluation_dialog.png
  2. Set the properties and click Evaluate.
    The model is evaluated and its status changes to Evaluated if no active deployment is running for the model version. For more details of model evaluation, see Model Evaluation.

Prediction Drift

Prediction drift is specific to monitoring the data statistics of the model output. You can monitor whether model predictions suddenly deviate from what the model normally predicts.

Predictions

Prediction Drift generates a histogram for each prediction and captures the drift value of the prediction over time.

The Predictions table displays the following details.

Property Description
Name Specifies the name of the prediction.
Group Specifies the group of the prediction. By default, all predictions belong to the default group.
Type Specifies the data type of the prediction: Continuous or Categorical.

To view details of a prediction:

  1. Select a prediction in the Predictions table.
    The right section of the page displays the Distribution histogram and Drift over time for the selected prediction.

    The Distribution histogram displays the prediction value on the x-axis and the counts on the y-axis for both Training and Production data.

    https://docs.tdaoa.com/images/ug_deployments_predictions_distribution.png

    The drift over time graph displays the following metrics for the selected prediction.

    https://docs.tdaoa.com/images/v6/deployment_prediction_drift_over_time.png
    • Population Stability Index (PSI): A measure of population stability between two population samples. PSI is a widely used statistic that measures how much a variable has shifted over time. A high PSI may alert the business to a change in the characteristics of a population. This shift may require investigation and possibly a model update.

    • Kullback–Leibler (KL) divergence: A way to measure the dissimilarity of two probability distributions.

    • Kolmogorov-Smirnov (KS) test: A nonparametric test of the equality of continuous (or discontinuous), one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test).

    • Chi-Square: A statistical test used to compare observed results with expected results. The purpose of this test is to determine if a difference between observed data and expected data is due to chance, or if it is due to a relationship between the variables you are studying. A chi-square test helps better understand and interpret the relationship between two categorical variables.
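
For a categorical prediction, the chi-square comparison described above can be illustrated as follows. This is a minimal sketch using SciPy; the class labels and counts are made-up example values, not AnalyticOps output.

```python
import numpy as np
from scipy.stats import chisquare

# Hypothetical class counts for a binary prediction ("no churn" / "churn").
train_counts = np.array([800, 200])   # distribution of predictions at training time
prod_counts = np.array([950, 350])    # distribution of predictions in production

# Scale the training proportions to the production sample size to form the
# expected counts, then test whether the observed shift is likely due to chance.
expected = train_counts / train_counts.sum() * prod_counts.sum()
stat, p_value = chisquare(f_obs=prod_counts, f_exp=expected)
print(f"chi-square={stat:.2f}, p-value={p_value:.4f}")  # a small p-value suggests prediction drift
```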