Deploying and Inferencing an ML Model in DKubeX

After training and registering an ML model in DKubeX, you can deploy it for inferencing using Ray Serve. The following steps guide you through deploying a trained ML model and performing inferencing against it.

Prerequisites

  • You need an ML model trained and registered in DKubeX’s MLFlow. To learn how to train and register an ML model in DKubeX, refer to Training a ML Model in DKubeX / Running Ray Jobs (a minimal registration sketch is also shown at the end of these prerequisites).

  • You can provide the deployment script either directly from GitHub or from your local workspace. If you plan to use it locally, the script must be present on your DKubeX workspace; it is included in the dkubex-examples repo that you cloned in the previous Training a ML Model in DKubeX / Running Ray Jobs tutorial. If that repo is unavailable or has been deleted, use the following steps to download it again.

    • Note that the inferencing script used in this tutorial is also available in the dkubex-examples repo.

    1. On the My Workspace page of your DKubeX setup, click on the Terminal app to open the Terminal CLI.

    2. In the Terminal CLI, run the following command to clone the dkubex-examples repo, which contains the example files used in this guide:

      git clone https://github.com/dkubeio/dkubex-examples.git
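
  For reference, the model name mnistmodel and version used later in this tutorial come from the MLflow registration performed during training. The following is a minimal, illustrative sketch of how such a registration might look (assuming a PyTorch model; the actual training script in the dkubex-examples repo differs in detail):

    import mlflow
    import mlflow.pytorch
    import torch.nn as nn

    # Placeholder network standing in for the trained Fashion-MNIST model.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

    with mlflow.start_run():
        # Logging with registered_model_name creates (or adds a version to) the
        # model in the MLflow registry; version 1 is what this tutorial deploys.
        mlflow.pytorch.log_model(
            model,
            artifact_path="model",
            registered_model_name="mnistmodel",
        )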
      

Once both of the prerequisites are met, you can proceed with deploying the trained ML model using the following steps:

Deploying a Registered ML Model from MLFlow

  • Open the Deployments page on your DKubeX workspace. This page lists all the model deployments that are currently running.

  • To create a new model deployment, click on the “Create Deployment” button (shown as a “+” button on the top left corner of the Deployments page).

    Create Deployment Button (+)

  • On the General page, provide the following details:

    General Page – Deployment Creation

    • On the General section, select Serve as the type of deployment to be created and provide the following details:

      Name: Provide a unique name for the model deployment. For this tutorial, provide mnistdeploy.

      Description: Provide a brief description of the model deployment (Optional).

    • On the Framework section, select Ray as the type, since we are using Ray Serve for this deployment. Optionally, select the Publish checkbox to publish this deployment so that other users of the same DKubeX setup can access and use it.

    • Once done, click on the Next button to proceed to the Configuration page.

  • On the Configuration page, provide the following details:

    Configuration Page – Deployment Creation

    • On the Source Configuration section, provide the following details:

      Registry Type: Model registry from which the model is fetched. For this tutorial, select MLFlow.

      Model: Name of the model you registered earlier in MLFlow after training. For this tutorial, provide mnistmodel.

      Model Version: Version of the registered model to deploy. For this tutorial, provide 1.

      Deployment Filepath: Path to the deployment script that contains the model-serving code. To pull the script directly from the dkubex-examples repo on GitHub, provide fashion_mnist.deploy. To use the script locally from your workspace, provide its absolute path, e.g. /home/<your-username>/dkubex-examples/rayserve/mlflow/fashion_mnist.deploy (replace the <your-username> part with your DKubeX username). An illustrative sketch of such a deployment script is shown at the end of this Configuration section.

    • If you are pulling the deployment script from GitHub, provide the following details in the GitHub Settings section:

      Username: GitHub username of the owner of the repo from which the deployment script is pulled. For this tutorial, provide dkubeio.

      Repository: Name of the GitHub repo from which the deployment script is pulled. For this tutorial, provide dkubex-examples.

      Branch: Branch of the GitHub repo from which the deployment script is pulled. For this tutorial, provide mlflow.

      Commit Id: Commit ID to pull from the repo (Optional). For this tutorial, leave it blank to pull the latest commit.

      • Also, select the Private checkbox if the GitHub repo is private. For this tutorial, leave it unchecked, as dkubex-examples is a public repo.

    • Once done, click on the Next button to proceed to the Advanced page.
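
    The deployment script referenced above (fashion_mnist.deploy) contains the Ray Serve code that loads the registered model from MLflow and serves prediction requests. The following is an illustrative sketch of such a script, not the actual file from the dkubex-examples repo; it assumes a PyTorch model and a JSON request body carrying a flattened 28x28 grayscale image:

      import mlflow.pytorch
      import torch
      from ray import serve
      from starlette.requests import Request


      @serve.deployment
      class MnistClassifier:
          def __init__(self, model_uri: str):
              # Load the registered model from the MLflow registry,
              # e.g. "models:/mnistmodel/1" for version 1 of mnistmodel.
              self.model = mlflow.pytorch.load_model(model_uri)
              self.model.eval()

          async def __call__(self, request: Request) -> dict:
              payload = await request.json()
              image = torch.tensor(
                  payload["image"], dtype=torch.float32
              ).reshape(1, 1, 28, 28)
              with torch.no_grad():
                  logits = self.model(image)
              return {"prediction": int(logits.argmax(dim=1).item())}


      # Bind the deployment into a servable application; DKubeX supplies the
      # model URI based on the Model and Model Version fields provided above.
      app = MnistClassifier.bind("models:/mnistmodel/1")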

  • On the Advanced page, provide the following details:

    Advanced Page – Deployment Creation

    • In the Advanced Configuration section, select the No Resource Limit checkbox if you do not want to limit resource usage. For this tutorial, leave it unchecked and set resource limits by providing the following details (see the sketch after these fields for how they roughly map onto Ray Serve options):

      CPU (Min/Max): Minimum and maximum number of CPU cores allocated to the model deployment. Default: Min 1, Max 2.

      Memory (Min/Max): Minimum and maximum amount of memory (in GB) allocated to the model deployment. Default: Min 3, Max 4.

      GPU: Number of GPUs allocated to the model deployment (Optional).

      Replicas (Min/Max): Minimum and maximum number of replicas created for the model deployment. Default: Min 1, Max 2.

      Max Concurrent Requests: Maximum number of concurrent requests the model deployment can handle. Default: 3.
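
    These fields correspond, roughly, to standard Ray Serve deployment options, which DKubeX applies on your behalf. The following sketch shows the approximate mapping; exact option names vary across Ray versions, and DKubeX may set them differently:

      from ray import serve

      @serve.deployment(
          # Replicas (Min/Max)
          autoscaling_config={"min_replicas": 1, "max_replicas": 2},
          # CPU and GPU allocation per replica; a memory limit (in bytes)
          # can also be expressed here via the "memory" key.
          ray_actor_options={"num_cpus": 1, "num_gpus": 0},
          # Max Concurrent Requests maps to a per-deployment option as well
          # (max_ongoing_requests or max_concurrent_queries, depending on
          # the Ray version).
      )
      class MnistClassifier:
          ...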

  • Once done, click on the Submit button to start the model deployment.

    Model Deployment in Progress

    • To view the details of the model deployment, click on the deployment name.

    • To access the details and logs of the deployment from the Ray dashboard, click on the Ray Dashboard button on the right side of that deployment entry.

      Ray Dashboard for Model Deployment

  • Once the deployment is successful, the status of the deployment will change to Running on the Deployments page.

    Running Model Deployment

Inferencing using the Deployed ML Model

Once the model deployment is in the Running state, you can proceed to perform inferencing using the deployed model. The following is a short tutorial on performing an inferencing request using the deployed model:

  • On the My Workspace page of your DKubeX setup, click on the Terminal app to open the Terminal CLI.

  • In the Terminal CLI, run the following command to get the active DKubeX workspace profile name on which you are currently working:

    d3x profile list
    

    The profile marked with an asterisk (*) is your active DKubeX workspace profile. Note this name down, as it will be used in the next step.

  • Run the following command to perform an inferencing request using the deployed model. Make sure to replace the <profile-name> part with your active DKubeX workspace profile name that you noted down in the previous step.

    python3 ~/dkubex-examples/rayserve/mlflow/client/fmnist_client.py <profile-name> mnistdeploy ~/dkubex-examples/rayserve/mlflow/client/images/pull-over.png
    

    Here, mnistdeploy is the name of the model deployment created earlier in this tutorial, and pull-over.png is the sample input image on which inferencing is performed. A sketch of what such a client script does is shown at the end of this tutorial.

    Once the command is executed successfully, you will see the inferencing output in the Terminal CLI as shown below:

    Inferencing Output in Terminal CLI
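
    For reference, the client script's job is to send the sample image to the deployment endpoint and print the predicted Fashion-MNIST label. The sketch below is illustrative only: the endpoint URL is a placeholder, it assumes the JSON request format from the deployment sketch earlier, and the actual fmnist_client.py builds its request from the <profile-name> and deployment-name arguments shown above.

      import sys

      import numpy as np
      import requests
      from PIL import Image

      # Placeholder endpoint; the real client derives the URL from the profile
      # and deployment names and from your DKubeX setup's address.
      ENDPOINT = "https://<your-dkubex-address>/serve/<profile-name>/mnistdeploy/"

      # Fashion-MNIST class labels, in the standard order.
      LABELS = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

      def classify(image_path: str) -> str:
          # Convert the sample image to the 28x28 grayscale array the model expects.
          image = Image.open(image_path).convert("L").resize((28, 28))
          pixels = (np.asarray(image, dtype=np.float32) / 255.0).flatten().tolist()
          response = requests.post(ENDPOINT, json={"image": pixels})
          response.raise_for_status()
          return LABELS[response.json()["prediction"]]

      if __name__ == "__main__":
          print(classify(sys.argv[1]))  # e.g. .../client/images/pull-over.png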