Model Lifecycle

Building a machine learning model is an iterative process. Many of the steps needed to build a machine learning model are repeated and modified until data scientists are satisfied with the model's performance. This process requires a great deal of data exploration, visualization, and experimentation, as each step must be explored, modified, and audited independently.

AnalyticOps covers an end-to-end methodology from business understanding to model consumption in production with several components used in the different stages of the model lifecycle.

This chapter covers the following details related to the model lifecycle:

  • Workspace

  • Model Training

  • Model Evaluation

  • Model Evaluation Report

  • Model Comparison

  • Mark Model as Champion

  • Model Approval

  • Model Deployment

  • Model Retiring

Workspace

To open the model version lifecycle page, see Model Version Lifecycle.

The Model Version Lifecycle page contains the following areas.

1. Header

The Header displays steps of the model lifecycle, guidelines to complete the current step and action buttons to go to the next step.

2. Model Version Details Area

The Model Version Details area displays all the information related to the model version including Version ID, Creation Date, Created By, and Status. You can add Tags to the model version.

3. Steps Detail Area

The Steps Detail area displays multiple sections for each step that is executed for the model lifecycle. Each section displays details of a particular step like Training, Evaluation or Deployment. You can expand/collapse these sections.

Model Training

During the model training, the training dataset is used to train the model. Model hyperparameter selection is a major task in the model training process. Models are algorithms, and hyperparameters are the knobs that you can tune to improve the performance of the model. For example, the depth of a decision tree is a hyperparameter.
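
For illustration, the eta (learning rate) and max_depth hyperparameters mentioned later in this chapter belong to gradient-boosted tree models. Below is a minimal sketch, assuming an XGBoost-style model trained in plain Python; the library, dataset, and parameter values are illustrative assumptions, not what AnalyticOps runs internally.

    # Minimal sketch: setting hyperparameters for a gradient-boosted tree model.
    # The library (xgboost), dataset, and parameter values are illustrative assumptions.
    import xgboost as xgb
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Hyperparameters are the "knobs" you tune: eta is the learning rate,
    # max_depth limits how deep each tree can grow.
    params = {"eta": 0.1, "max_depth": 4, "objective": "binary:logistic"}
    model = xgb.train(params, xgb.DMatrix(X_train, label=y_train), num_boost_round=100)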

Model training is discussed in Create and Import Model Versions. After a model is trained, you can see the training details on the Model Version Lifecycle page.

To view the training details:

  1. From the Model Versions list, select the Trained model and click the View Events icon.

    The Model Version Lifecycle page displays. The Train step in the header is marked as done.

  2. Expand the Training Details section.
    The Training Details section displays the training details.

    The following details display related to the training job:

    Property Description
    Job ID Specifies the training job ID. You can click on View Job Details to see the event details of the job. For details, see Jobs.
    Training Date Specifies the date on which the training job was executed.
    Trained By Specifies the username who executed the training job.
    Status Shows the status of the training job as Completed.
    Dataset ID Displays the ID of the training dataset used for the training job. You can click on View Dataset to see the dataset details. For details, see Datasets.
    Dataset Name Displays the name of the training dataset used for the training job.
    Resources Specifies the resources utilized in the training job, including CPU and Memory.
    Hyper Parameters Specifies the hyperparameters defined to run the job, including eta and max_depth.
    Job Progress Lists down all phases of the training job, including the Status of each phase (Created, Scheduled, Running, Trained, Completed) and the Start Date/Time, End Date/Time, and Duration of each phase.
    Training Artefacts Lets you view and download training artefacts. For details, see View and Download Model Artefacts.

Model Evaluation

Model evaluation is the next step in the model lifecycle when you evaluate a trained model using the test dataset. The test dataset is used to see how well the model performs on data it has not seen earlier.

Model evaluation generates an evaluation report that displays a number of metrics to monitor the model's performance. The details of the evaluation report are described in the Model Evaluation Report section below.
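
As a conceptual sketch of this step, the held-out test set is used only after training; the dataset, model, and metric below are illustrative assumptions, not the platform's internal implementation.

    # Conceptual sketch: evaluating a trained classifier on held-out test data.
    # The dataset, model, and metric are illustrative placeholders.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    # The training split fits the model; the test split is held back so the
    # evaluation measures performance on data the model has not seen before.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))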

Note: If the “Enable Model Evaluation” option is not selected for a BYOM model, the Model Evaluation step is not available on the Model Lifecycle page. In that case, you can go directly to the Approval step after training the model version.

To evaluate a trained model:

  1. On the Model Version Lifecycle page, click the Evaluate button.
    The Evaluate Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_evaluation_dialog.png
  2. In the Basic tab, set the properties:

    Property Description
    Model The name of the model, in read-only format.
    Dataset Template Specifies the required dataset template. For details, see Dataset Templates.
    Dataset Specifies the dataset to be used for the evaluation job. For details, see Datasets.
    Connection Specifies the dataset connection settings to be used for the evaluation job. For details, see Dataset Connections.

    https://docs.tdaoa.com/images/evaluate_basic.png
  3. In the Advanced tab, set the Engine, Docker Image, and Resource Template properties.

    https://docs.tdaoa.com/images/ug_model_lifecycle_evaluation_advanced.png
  4. Click Evaluate.
    The model version evaluation progress displays on the screen.

  5. Click Done when the evaluation progress completes.
    The Model Version Lifecycle page displays. You can see that the Evaluate step in the header is marked as Done and the model version status is changed to Evaluated.

  6. Expand the Evaluation Details section to see the details of the evaluation job.

    The following details display related to the evaluation job:

    Property Description
    Job ID Specifies the evaluation job ID. You can click on View Job Details to see the event details of the job. For details, see Jobs.
    Evaluation Date Specifies the evaluation date.
    Evaluated By Specifies the username who executed the evaluation job.
    Status Shows the status of the evaluation job as Completed.
    Dataset ID Displays the ID of the test dataset used for the evaluation job. You can click on View Dataset to see the dataset details. For details, see Datasets.
    Dataset Name Displays the name of the test dataset used for the evaluation job.
    Resources Specifies the resources utilized in the evaluation job, including CPU and Memory.
    Job Progress Lists down all phases of the evaluation job, including the Status of each phase (Created, Scheduled, Running, Evaluated, Completed) and the Start Date/Time, End Date/Time, and Duration of each phase.
    Evaluation Artefacts Lets you view and download evaluation artefacts. For details, see View and Download Model Artefacts.

Model Evaluation Report

AnalyticOps allows you to evaluate a model and mark a champion model based on its performance. You can view the evaluation report, which highlights the model's performance through a set of metrics, and compare models based on the metric values.

The Model Comparison section later in this chapter describes the model comparison feature in detail.

To view the Model Evaluation report:

  1. From the Evaluation Details section, click View Evaluation Report.

    The Model Evaluation Report page displays.

The model evaluation report displays the following areas:

  • Model Version Details

  • Key Metrics

  • Metrics

  • Performance Charts

  • Actions

Model Version Details

Lists down all the details of the model version and its training and evaluation jobs.

The following details display in this area:

Property Description
Model Version ID Specifies the model version ID. You can click on the Model version ID link to go to the Model Version lifecycle page.
Evaluation Job ID Specifies the evaluation job ID. You can click on the Job ID link to go to the Job's details.
Evaluation Date Specifies the evaluation date.
Dataset ID Displays the ID of the training dataset used for the training job. You can click on the Dataset ID link to see the dataset details.
Dataset Name Displays the name of the training dataset used for the training job.
Hyper Parameters Specifies the hyperparameters defined to run the job including eta and max_depth.

Key Metrics

Displays the key metrics that you mark in the Metrics area. The Metrics area can contain a large list of performance metrics. You can mark some of the metrics as Key Metrics to easily access them. All the key metrics will display in this area.

Metrics

Lists down the performance metrics and their values for the current model version. There can be a large list of metrics including Accuracy, Recall, Precision, F1 score. The Mark as Key Metric option allows you to mark the key metrics and they will display in the Key Metrics area.

A list of common performance metrics is:

Metric Description
Accuracy The ratio of the number of correct predictions to the total number of input samples.
Recall The number of correct positive results divided by the number of all relevant samples (all samples that should have been identified as positive).
Precision The number of correct positive results divided by the number of positive results predicted by the classifier.
F1-score The harmonic mean of precision and recall, with range [0, 1]. It tells you how precise your classifier is (how many of its positive predictions are correct) as well as how robust it is (how few actual positives it misses).
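
For reference, the sketch below computes these four metrics from the confusion-matrix counts; the counts themselves are made-up example values.

    # Illustrative arithmetic for the metrics above, using made-up counts.
    tp, fp, fn, tn = 80, 10, 20, 90  # example true/false positive/negative counts

    accuracy = (tp + tn) / (tp + tn + fp + fn)                 # correct predictions / all samples
    recall = tp / (tp + fn)                                    # correct positives / all actual positives
    precision = tp / (tp + fp)                                 # correct positives / all predicted positives
    f1_score = 2 * precision * recall / (precision + recall)   # harmonic mean of precision and recall

    print(accuracy, recall, precision, f1_score)               # ≈ 0.85, 0.80, 0.889, 0.842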

Performance Charts

Displays a number of performance charts based on different metrics including Confusion matrix, ROC curve and SHAP feature importance. These charts help you monitor model performance visually and decide if you want to mark the model as Champion.

Chart Description
Confusion Matrix A Confusion matrix is an N x N matrix used to evaluate model performance, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model.
ROC Curve ROC Curves summarize the trade-off between the true positive rate and false positive rate for a predictive model using different probability thresholds.
SHAP Feature Importance SHAP feature importance is based on the magnitude of feature attributions.
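
If you want to reproduce similar charts outside the platform, a minimal sketch with scikit-learn and matplotlib might look like the following; the dataset and model are placeholders, and SHAP plots would additionally require the shap package.

    # Minimal sketch: confusion matrix and ROC curve for a fitted classifier.
    # The dataset and model are placeholders, not the model evaluated in AnalyticOps.
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import ConfusionMatrixDisplay, RocCurveDisplay
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)

    ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)  # actual vs. predicted classes
    RocCurveDisplay.from_estimator(clf, X_test, y_test)         # TPR vs. FPR across thresholds
    plt.show()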

Actions

The model evaluation report allows you to perform a number of actions on the current model version.

Action Description
Approve Lets you approve the model version. For details, see Model Approval.
Reject Lets you reject the model version. For details, see Model Approval.
Mark/Unmark as Champion Lets you mark/unmark the model version as Champion based on its performance. For details, see Mark Model as Champion.
View Model Drift Allows you to go to the Model drift page and monitor the model performance. For details, see Model Drift.

Model Comparison

AnalyticOps lets you compare and assess two or more model versions. When you compare model versions, the comparison output includes model properties, hyperparameters, and performance metrics. You can also generate a model comparison report for two model versions; it includes the evaluation reports of both versions, so you can compare them and mark one of the versions as Champion.
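
Conceptually, the comparison output is a side-by-side table of the selected fields for each version. A rough sketch of that idea, with invented field names and values:

    # Rough sketch of a side-by-side comparison of two evaluated model versions.
    # The field names and values are invented for illustration only.
    import pandas as pd

    comparison = pd.DataFrame(
        {
            "version_1": {"eta": 0.1, "max_depth": 4, "Accuracy": 0.91, "F1-score": 0.89},
            "version_2": {"eta": 0.3, "max_depth": 6, "Accuracy": 0.88, "F1-score": 0.90},
        }
    )
    print(comparison)  # compare hyperparameters and metrics, then pick a champion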

To compare model versions:

  1. Go to the Model Versions list of a model and click the Compare Versions button.

    Note: The Compare Versions button displays only when two or more Evaluated versions are available in the list. Model comparison requires at least two evaluated model versions.

    The Select Comparison Fields dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_select_fields.png
  2. Hover over a field in the Available Fields list and click the Add icon.
    The field is added to the Selected Fields list.

    https://docs.tdaoa.com/images/ug_model_lifecycle_select_fields_2.png
  3. Add all the required fields to the Selected Fields list and click Compare.
    The Model Comparison page displays the values of all the selected fields for the evaluated versions of the model. You can compare the metrics and analyze the model performance here.

  4. Select two model versions from the list and click the View Comparison Report button.

    Note: The Model Comparison report can be generated for two model versions only.

    The Model Comparison Report displays for the selected model versions. This report is similar to the Model Evaluation report and you can compare the metrics of both the models and decide on a champion model.

Mark Model as Champion

Based on the model performance, you can mark a model version as Champion.

You can mark a model version as Champion either from the Model Evaluation report or Model Comparison report.

To mark a model as Champion:

  1. For an Evaluated model version, open either Model Evaluation Report or Model Comparison Report.

    Note: To open a Model Evaluation report, see Model Evaluation Report.

    To open Model Comparison report, see Model Comparison.

  2. Select the Mark as Champion button on the top.

    https://docs.tdaoa.com/images/ug_model_lifecycle_mark_champion_button.png

    The model version is marked as Champion, and a green star displays with it. You can view a Champion model version in both the Model Versions list and the Model Comparison list.

To unmark a model as Champion:

  1. For a Champion model version, select the Unmark as Champion button in the Model Evaluation or Model Comparison report.

    https://docs.tdaoa.com/images/ug_model_lifecycle_unmark_champion_button.png

    The model version no longer remains a Champion.

Model Approval

You can approve or reject a model version based on its performance. An approved model can be deployed in production; however, a rejected model cannot be deployed, and you need to evaluate it again.

To approve a model version:

  1. From the Model Version Lifecycle page, select the Approve button.
    The Approve Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_model_approval.png
  2. Insert Comments and click Approve.
    The model version status is changed to Approved. The Approval Details section displays.

    The following details display in the Approval Details section:

    Property Description
    Job ID Specifies the approval job ID. You can click on View Job Details to see the event details of the job. For details, see Jobs.
    Approval Date Specifies the approval date.
    Approved By Specifies the username who executed the approval job.
    Status Shows the status of the approval job as Completed.
    Comments Shows the approval comments.
    Job Progress Lists down all phases of the approval job, including the Status of each phase (Created, Completed) and the Start Date/Time, End Date/Time, and Duration of each phase.

To reject a model version:

  1. From the Model Version Lifecycle page, select the Reject button.
    The Reject Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_reject_model.png
  2. Insert Comments and click Reject.
    The model version status is changed to Rejected. The Approval Details section displays.

    The following details display in the Approval Details section:

    Property Description
    Job ID Specifies the rejection job ID. You can click on View Job Details to see the event details of the job. For details, see Jobs.
    Approval Date Specifies the rejection date.
    Approved By Specifies the username who executed the rejection job.
    Status Shows the status of the rejection job as Completed.
    Comments Shows the rejection comments.
    Job Progress Lists down all phases of the rejection job, including the Status of each phase (Created, Completed) and the Start Date/Time, End Date/Time, and Duration of each phase.

Model Deployment

AnalyticOps provides multiple environments where you can deploy your models in production. You can monitor each deployment and re-evaluate or retire it based on its performance.

To deploy a model version:

  1. From the Model Version Lifecycle page, select the Deploy button.

    The Deploy Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_deploy_model_dialog.png
  2. In the Basic tab, set the properties:

    Property Description
    Engine Type Lets you select the engine type for deployment: RESTful, Batch, In-Vantage.

    • For RESTful, set the properties:

    https://docs.tdaoa.com/images/ug_model_lifecycle_deploy_restful.png

    Property Description
    Replicas Specifies the number of container replicas to be deployed.

    • For Batch, set the properties:

    https://docs.tdaoa.com/images/ug_model_lifecycle_deploy_batch.png

    Property Description
    Dataset Template Specifies the required dataset template.
    Connection Specifies the connection settings for the deployment job.
    Schedule Lets you define an execution schedule for the batch job. The schedule can be Hourly, Daily, Weekly, Monthly and Yearly.
    Cron Lets you write a cron expression directly as input (see the sketch after these property lists).
    Run Once Allows the batch job to be run once only.

    • For In-Vantage, set the properties:

    https://docs.tdaoa.com/images/ug_model_lifecycle_deploy_vantage.png

    Property Description
    Connection Specifies the connection settings for the deployment job.
    Database Lets you select the database for the deployment job.
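
    The Schedule options correspond to standard cron expressions. The mappings below illustrate common cron syntax only; they are not necessarily the exact expressions AnalyticOps generates for each option.

        # Illustrative cron expressions for the Schedule options above.
        # These show standard cron syntax; the exact expressions generated by
        # AnalyticOps for each option may differ.
        schedules = {
            "Hourly": "0 * * * *",    # minute 0 of every hour
            "Daily": "0 2 * * *",     # 02:00 every day
            "Weekly": "0 2 * * 1",    # 02:00 every Monday
            "Monthly": "0 2 1 * *",   # 02:00 on the 1st of every month
            "Yearly": "0 2 1 1 *",    # 02:00 on January 1st
        }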

  3. In the Advanced tab, set the Engine, Docker Image, and Resource Template properties. You can also define Build Properties as key/value pairs.

    https://docs.tdaoa.com/images/ug_model_lifecycle_deploy_advanced.png

  4. Click Deploy.
    The model version deployment progress displays on the screen.

  5. Click Done when the deployment progress completes.
    The model version status changes to Deployed. The Deployment Details section displays on the page.

    The following details display in the Deployment Details section:

    Property Description
    Job ID Specifies the deployment job ID. You can click on View Job Details to see the event details of the job. For details, see Jobs.
    Deployment Date Specifies the deployment date.
    Deployed By Specifies the username who executed the deployment job.
    Status Shows the status of the deployment job as Completed.
    Engine Specifies the deployment engine.
    Job Progress Lists down all phases of the deployment job, including the Status of each phase (Created, Completed) and the Start Date/Time, End Date/Time, and Duration of each phase.
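
After a RESTful deployment is live, client applications typically score records over HTTP. The sketch below is purely illustrative: the endpoint URL, payload shape, and response format are hypothetical placeholders, so check your deployment's details for the actual values.

    # Hypothetical sketch of scoring against a RESTful deployment.
    # The URL and payload shape are placeholders, not the documented AnalyticOps API.
    import requests

    url = "https://aoa.example.com/deployments/<deployment-id>/predict"  # placeholder URL
    payload = {"data": [{"feature_1": 0.42, "feature_2": 1.7}]}          # placeholder features

    response = requests.post(url, json=payload, timeout=30)
    response.raise_for_status()
    print(response.json())  # predictions returned by the deployed model service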

Model Retiring

To retire the active deployments of a model version:

  1. From the Model Version Lifecycle page, click the Retire button.

    The Retire Model Version dialog displays.

    https://docs.tdaoa.com/images/ug_model_lifecycle_retire_dialog.png
  2. Select the Deployment ID.

    https://docs.tdaoa.com/images/ug_mode_lifecycle_retire_2.png
  3. Click Retire.
    The model version retirement progress displays on the screen.

  4. Click Done when the retirement progress completes.
    The model version status changes to Retired if all active deployments are retired. The Retirement Details display on the model version lifecycle page.
