DKube Developer’s Guide¶
This section provides instructions on how to develop code that will integrate with the DKube platform.
File Paths¶
For IDE & Run jobs, DKube provides a method to access the files in code, data, and model repositories without needing to know the exact folder within the DKube storage hierarchy. The repos are available at the following paths:
| Repo Type | Path |
|---|---|
| Code | Fixed path: /mnt/dkube/workspace |
| Dataset | Mount path as described at Mount Path |
| Model | Mount path as described at Mount Path |
The Dataset & Model repos are available at the following paths in addition to the user-configured mount paths:
| Repo Type | Path |
|---|---|
| Dataset | /mnt/dkube/datasets/<user name>/<dataset name> |
| Model | /mnt/dkube/models/<user name>/<model name> |
In the case of Amazon S3 and Amazon Redshift, the mount paths also include the metadata files with the endpoint configuration.
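For example, a training script can read dataset files directly from these fixed paths without knowing where DKube physically stores them. A minimal sketch, assuming placeholder user and dataset names:

```python
import os

# Fixed DKube paths inside an IDE/Run container (user and dataset names are placeholders)
CODE_DIR = "/mnt/dkube/workspace"
DATASET_DIR = "/mnt/dkube/datasets/<user name>/<dataset name>"

# Walk the dataset directory and print every file it contains
for root, _, files in os.walk(DATASET_DIR):
    for name in files:
        print(os.path.join(root, name))
```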
Configuration File¶
A general purpose configuration file can be uploaded into DKube, as described in the following sections:
| Job Type | Details |
|---|---|
| IDE | |
| Run | |
The configuration file can be accessed from the code at the following location:
/mnt/dkube/config/<config file name>
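A minimal sketch of reading an uploaded JSON configuration file from that location (the file name is a placeholder; for a YAML file, use a YAML parser instead):

```python
import json

# Name matches the configuration file uploaded at Job creation (placeholder)
config_path = "/mnt/dkube/config/<config file name>"

with open(config_path) as f:
    config = json.load(f)

print(config)
```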
Home Directory¶
DKube maintains a home directory for each user, at the location:
/home/<user name>
Files for all user-owned resources are created in this area, including metadata for Runs, IDEs, & Inferences. These can be accessed by an IDE.
The following folders are created within the home directory:
| Folder | Description |
|---|---|
| Workspace | Contains folders for each Code Repo owned by the user. These can be updated from a source git repo, edited, and committed back to the git repo. |
| Dataset | Contains folders for each Dataset Repo owned by the user. Each Dataset folder contains subdirectories for each version with the dataset files for that version. |
| Model | Contains folders for each Model Repo owned by the user. Each Model directory contains subdirectories for each version with the model files for that version. |
| Notebook | Contains metadata for user IDE instances |
| Training | Contains metadata for user Training Run instances |
| Preprocessing | Contains metadata for user Preprocessing Run instances |
| Inference | Contains metadata for user Inference instances |
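As an illustration, the versions of a user-owned Dataset can be listed from the home directory. A sketch, assuming the folder names shown in the table above and placeholder user and dataset names:

```python
import os

home = "/home/<user name>"
dataset_dir = os.path.join(home, "Dataset", "<dataset name>")

# Each subdirectory corresponds to one version of the Dataset
print(os.listdir(dataset_dir))
```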
Amazon S3¶
DKube has native support for Amazon S3. In order to use this within DKube, a Repo must first be created. This is described at Add a Dataset
This section describes how to access the data and integrate it into your program. The mount path for the S3 Dataset repo contains the config.json & credentials files.
config.json¶
{
"Bucket": "<bucket name>",
"Prefix": "<prefix>",
"Endpoint": "<endpoint>"
}
credentials¶
[default]
aws_access_key_id = xxxxxxx
aws_secret_access_key = xxxxxx
In addition, the path /etc/dkube/.aws contains the metadata and credentials for all of the S3 Datasets owned by the user.
/etc/dkube/.aws/config¶
[default]
bucket = <bucket name 1>
prefix = <prefix 1>
[dataset-2]
bucket = <bucket name 2>
prefix = <prefix 2>
[dataset-3]
bucket = <bucket name 3>
prefix = <prefix 3>
/etc/dkube/.aws/credentials¶
[default]
aws_access_key_id = xxxxxxx
aws_secret_access_key = xxxxxxxxx
[dataset-2]
aws_access_key_id = xxxxxxx
aws_secret_access_key = xxxxxxxxx
[dataset-3]
aws_access_key_id = xxxxxxx
aws_secret_access_key = xxxxxxxxx
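The configuration and credentials above can be consumed with any standard S3 client. The sketch below uses boto3 to list the objects of a mounted S3 Dataset, assuming boto3 is installed in the image and using a placeholder mount path:

```python
import configparser
import json

import boto3

mount_path = "<dataset mount path>"   # Mount Path configured for the S3 Dataset

# Endpoint, bucket, and prefix metadata written by DKube
with open(f"{mount_path}/config.json") as f:
    cfg = json.load(f)

# AWS-style credentials file written by DKube
creds = configparser.ConfigParser()
creds.read(f"{mount_path}/credentials")

s3 = boto3.client(
    "s3",
    endpoint_url=cfg["Endpoint"],
    aws_access_key_id=creds["default"]["aws_access_key_id"],
    aws_secret_access_key=creds["default"]["aws_secret_access_key"],
)

# List the objects belonging to this Dataset
response = s3.list_objects_v2(Bucket=cfg["Bucket"], Prefix=cfg["Prefix"])
for obj in response.get("Contents", []):
    print(obj["Key"])
```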
Amazon Redshift¶
DKube has native support for Amazon Redshift. In order to use Redshift within DKube, a Repo must first be created. This is described at Add a Dataset
This section describes how to access the data and integrate it into your program. Redshift-specific environment variables are listed at Redshift Variables. Redshift can be accessed with or without an API server.
Redshift Access Configuration¶
Redshift Access with an API Server¶
In order to configure the API server to fetch the metadata, a Kubernetes config map is configured with the following information:
| Variable | Description |
|---|---|
| TOKEN | Security token for the API server |
| ENDPOINT | URL for the API server |
DKube fetches the list of databases available and their associated configuration information, such as endpoints and availability region. Additionally, DKube fetches the schemas of the databases from the API server.
Redshift Access without an API Server¶
By default, DKube queries the Redshift cluster for its schemas and shows them as versions in the DKube UI when creating a Dataset.
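The exact query DKube issues is not reproduced here. A schema listing of this kind can be obtained with a standard catalog query; an illustrative sketch using psycopg2 with placeholder connection parameters:

```python
import psycopg2

# Placeholders -- taken from the Redshift Dataset definition
conn = psycopg2.connect(
    host="<endpoint>",
    port=5439,
    dbname="<database-name>",
    user="<user-name>",
    password="<password>",
)

with conn.cursor() as cur:
    # List the schemas of the database; DKube surfaces schemas as Dataset versions
    cur.execute("SELECT schema_name FROM information_schema.schemata")
    for (schema,) in cur.fetchall():
        print(schema)
```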
Accessing the Redshift Data from the Program¶
Redshift data can be accessed from any Notebook or Run.
The metadata to access the Redshift data for the current job is provided from the Mount Path specified when the Job is created.
redshift.json¶
{
"rs_name": "<name>",
"rs_endpoint": "<endpoint>",
"rs_database": "<database-name>",
"rs_db_schema": "<schema-name>",
"rs_user": "<user-name>"
}
Metadata for all of the selected Redshift datasets for the User is available at /etc/dkube/redshift.json for the Job.
[
  {
    "rs_name": "<name 1>",
    "rs_endpoint": "<endpoint 1>",
    "rs_database": "<database-name 1>",
    "rs_db_schema": "<schema-name 1>",
    "rs_user": "<user 1>"
  },
  {
    "rs_name": "<name 2>",
    "rs_endpoint": "<endpoint 2>",
    "rs_database": "<database-name 2>",
    "rs_db_schema": "<schema-name 2>",
    "rs_user": "<user 2>"
  }
]
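A minimal sketch of consuming this metadata from the program, using a placeholder Mount Path for the per-job file:

```python
import json

# Metadata for the Redshift Dataset mounted for this job (Mount Path is a placeholder)
with open("<mount path>/redshift.json") as f:
    rs = json.load(f)

print(rs["rs_endpoint"], rs["rs_database"], rs["rs_db_schema"])

# Metadata for all Redshift Datasets selected by the user
with open("/etc/dkube/redshift.json") as f:
    all_rs = json.load(f)

for entry in all_rs:
    print(entry["rs_name"], entry["rs_endpoint"])
```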
Redshift Password¶
The password for the Redshift data is stored encrypted within DKube. The unencrypted password can be retrieved programmatically through the DKube API, which returns the Redshift datasets together with their credentials.
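A sketch of such a call against the DKube REST API, using the endpoint and token that DKube injects into every job. The resource path shown is a placeholder, not a documented route; consult the DKube API reference or SDK for the exact call:

```python
import os

import requests

# DKube REST endpoint and JWT token provided to every Job
api_url = os.environ["DKUBE_URL"]
token = os.environ["DKUBE_USER_ACCESS_TOKEN"]

# NOTE: "<redshift datasets endpoint>" is a placeholder resource path
resp = requests.get(
    f"{api_url}/<redshift datasets endpoint>",
    headers={"Authorization": f"Bearer {token}"},
    verify=False,  # DKube installations commonly use self-signed certificates
)
resp.raise_for_status()
print(resp.json())
```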
Mount Path¶
The mount path provides a way for the code to access the repositories. This section describes the steps needed to enable this access.
Before accessing a dataset, featureset, or model from the code, it needs to be created within DKube, as described at Add a Dataset and Add a Model
This will enable DKube to access the resource. The Dataset detail screen for a GitHub dataset that has been uploaded to DKube storage shows the actual folder where the dataset resides.
DKube allows the code to access the Dataset, FeatureSet, or Model without needing to know the exact folder structure through the mount path. When creating an IDE or Run, the mount path field should be filled in to correspond to the code.
Environment Variables¶
This section describes the environment variables that allow the program code to access DKube-specific information. These are accessed from the program code through calls such as:
EPOCHS = int(os.environ.get('EPOCHS', 5))
Note
The variables and mount paths are available in the file /etc/dkube/config.json
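A minimal sketch of reading an individual variable and the complete per-job configuration file:

```python
import json
import os

# Individual variables are available directly from the environment
epochs = int(os.environ.get("EPOCHS", 5))

# The complete set of variables and mount paths is written to this file
with open("/etc/dkube/config.json") as f:
    job_config = json.load(f)

print(epochs, job_config)
```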
General Variables¶
| Name | Description |
|---|---|
| DKUBE_URL | API Server REST endpoint |
| DKUBE_USER_LOGIN_NAME | Login user name |
| DKUBE_USER_ACCESS_TOKEN | JWT token for DKube access |
| DKUBE_JOB_CONFIG_FILE | Configuration file specified at Job creation Configuration Screen |
| DKUBE_USER_STORE | Mount path for user-owned resources |
| DKUBE_DATA_BASE_PATH | Mount path for resources configured for an IDE/Run |
| DKUBE_NB_ARGS | JupyterLab command line arguments containing the auth token, base URL, and home dir, used in the entrypoint for JupyterLab |
| KF_PIPELINES_ENDPOINT | REST API endpoint for pipelines to authenticate pipeline requests. If not set, pipelines are created without authentication |
| DKUBE_JOB_CLASS | Type of Job (training, preprocessing, custom, notebook, rstudio, inference, tensorboard) |
| DKUBE_JOB_ID | Unique Job ID |
| DKUBE_JOB_UUID | Unique Job UUID |
| DKUBE_TENSORBOARD_DIR | Folder for TensorBoard event files |
Variables Passed to Jobs¶
The user can provide program variables when creating an IDE or Run, as described at Configuration Screen
These variables are available to the program based on the variable name. Some examples of these are shown here.
| Name | Description |
|---|---|
| STEPS | Number of training steps |
| BATCHSIZE | Batch size for training |
| EPOCHS | Number of training epochs |
Repo Variables¶
| Name | Description |
|---|---|
| S3 | |
| S3_BUCKET | Storage bucket |
| S3_ENDPOINT | URL of server |
| S3_VERIFY_SSL | Verify SSL in S3 Bucket |
| S3_REQUEST_TIMEOUT_MSEC | Request timeout for TensorFlow to storage connection in milliseconds |
| S3_CONNECT_TIMEOUT_MSEC | Connection timeout for TensorFlow to storage connection in milliseconds |
| S3_USE_HTTPS | Use https (1) or http (0) |
| AWS | |
| AWS_ACCESS_KEY_ID | Access key |
| AWS_SECRET_ACCESS_KEY | Secret key |
Redshift Variables¶
| Name | Description |
|---|---|
| DKUBE_DATASET_REDSHIFT_CONFIG | Redshift dataset metadata for user-owned Redshift datasets |
| DKUBE_DATASET_REDSHIFT_DB_SCHEMA | Schema |
| DKUBE_DATASET_REDSHIFT_ENDPOINT | Dataset URL |
| DKUBE_DATASET_REDSHIFT_DATABASE | Database name |
| DKUBE_DATASET_NAME | Dataset name |
| DKUBE_DATASET_REDSHIFT_USER | User name |
| DKUBE_DATASET_REDSHIFT_CERT | SSL certificate |
Hyperparameter Tuning Variables¶
| Name | Description |
|---|---|
| DKUBE_JOB_HP_TUNING_INFO_FILE | Configuration file specified when creating a Run |
| PARENT_ID | Unique identifier (uuid) |
| OBJECTIVE_METRIC_NAME | Objective metric |
| TRIAL | Count of trial runs |
DKube SDK¶
One Convergence provides an SDK to allow direct access to DKube actions. In order to make use of this, the SDK needs to be called at the start of the code. An SDK guide is available at DKube SDK
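A minimal sketch of initializing the SDK from inside a job, assuming the dkube SDK package is installed in the image; see the DKube SDK guide for the authoritative interface:

```python
import os

from dkube.sdk import DkubeApi

# The API endpoint and access token are injected into every DKube job
api = DkubeApi(
    URL=os.environ["DKUBE_URL"],
    token=os.environ["DKUBE_USER_ACCESS_TOKEN"],
)
```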
Writing Code for Metric Logging¶
Metric & artifact logging within DKube is handled through MLFlow APIs; the set of supported APIs is defined in the MLFlow documentation.
The following steps are required to save a model and its associated metrics and artifacts:
Create and/or set an MLFlow Experiment
Start an MLFlow Run
Perform the Model training within the MLFlow Run
Log the pertinent metrics & artifacts from within that Run
This section provides some code segments that show how to log metrics using Python and Tensorflow/Keras. Other training frameworks, such as scikit-learn, will differ in the details.
Import MLFlow Module¶
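The MLFlow tracking module is imported at the top of the training script:

```python
import mlflow
```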
Create & Set MLFlow Experiment¶
MLFlow Runs execute in an Experiment. The Experiment first needs to be created, then the Experiment needs to be set as the current one. In addition, the output folder for the MLFlow data needs to be created.
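A sketch of this setup; the experiment name and output folder are placeholders:

```python
import os

import mlflow

EXPERIMENT_NAME = "<experiment name>"   # placeholder
OUTPUT_DIR = "model"                    # placeholder output folder for MLFlow data

# Create the Experiment if it does not already exist, then make it the current one
if mlflow.get_experiment_by_name(EXPERIMENT_NAME) is None:
    mlflow.create_experiment(EXPERIMENT_NAME)
mlflow.set_experiment(EXPERIMENT_NAME)

# Create the output folder for the MLFlow data
os.makedirs(OUTPUT_DIR, exist_ok=True)
```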
Define Callback Function¶
The metric logging happens as part of the training process by using a callback function that will log the metric after each epoch.
Note
The callback is required for Tensorflow/Keras. Scikit-learn does not have this requirement.
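A sketch of such a callback for Keras, logging every metric to MLFlow at the end of each epoch. The name loggingCallback matches the callback referenced in the next section:

```python
import mlflow
import tensorflow as tf

class loggingCallback(tf.keras.callbacks.Callback):
    """Log the Keras training metrics to MLFlow after each epoch."""

    def on_epoch_end(self, epoch, logs=None):
        for metric, value in (logs or {}).items():
            mlflow.log_metric(metric, value, step=epoch)
```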
Train Model & Save Model Data¶
The model, metrics, and artifacts are saved as part of the MLFlow Run.
loggingCallback is called for each epoch to log the metrics
After the training is complete, the model is saved with model.save using the Mount Path /model
The Mount Path is set during an IDE or Run as described at Mount Path
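A sketch of the training step under these conventions. The synthetic dataset and the tiny model stand in for your real training code, and loggingCallback is the callback defined above:

```python
import numpy as np
import mlflow
import tensorflow as tf

# Tiny synthetic dataset and model, standing in for the real training code
x_train = np.random.rand(100, 4).astype("float32")
y_train = np.random.randint(0, 2, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

with mlflow.start_run():
    # loggingCallback logs the metrics to MLFlow after each epoch
    model.fit(x_train, y_train, epochs=5, callbacks=[loggingCallback()])

    # Save the trained model to the Model repo through its Mount Path.
    # A version subfolder (e.g. /model/1) is required when the model will be
    # served with TensorFlow Serving -- see the TensorFlow Deployment section.
    model.save("/model/1")

    # Additional artifacts can be saved from the output folder created earlier:
    # mlflow.log_artifacts(OUTPUT_DIR)
```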
Standard artifacts are provided for the Run in the Runs menu screen by selecting the Run name and the Artifacts tab. Additional artifacts can be saved by using the mlflow.log_artifacts function.
Note
The callback is required for Tensorflow/Keras. With scikit-learn you can use the mlflow.log_metric function directly in the training run.
Writing Code for Katib¶
Katib is a Kubeflow framework that executes hyperparameter optimization during training. More details on this can be found at Introduction to Katib
A description of how to use Katib within DKube is available at Hyperparameters
In order to use Katib, the code must be written to accept the tuning file and to output the metrics in the right format. In this example:
Epochs & Learning Rate will be varied
Training Loss will be minimized
Read the Katib Tuning Input¶
The Katib tuning file describes the tuning objective and the hyperparameters that will be varied to determine which combination best achieves that objective. This information is read by parsing the input arguments. The argparse module is used for this purpose.
The expected hyperparameters are read into arg variables
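A sketch of reading the hyperparameters passed by Katib as command-line arguments; the argument names follow the example above, where epochs and learning rate are varied:

```python
import argparse

parser = argparse.ArgumentParser()
# Hyperparameters that Katib varies for each trial
parser.add_argument("--epochs", type=int, default=5,
                    help="Number of training epochs for this trial")
parser.add_argument("--learning_rate", type=float, default=0.001,
                    help="Learning rate for this trial")
args, _ = parser.parse_known_args()

EPOCHS = args.epochs
LEARNING_RATE = args.learning_rate
```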
Log the Output Metrics¶
Katib runs the number of trials set by the tuning file, and after each trial it analyzes the output metric. The metric is read from stdout. The MLFlow logging code was shown in a previous section. In order to enable Katib, the metric also needs to be written to stdout after each completed trial.
Important
During model.fit, the option verbose=False must be set in order to get clean metric output that Katib can read
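A sketch of printing the objective metric in a form the Katib metrics collector can parse from stdout. The model and data are assumed to come from the surrounding training code, and the exact line format depends on the metrics collector configured for the experiment:

```python
# Train quietly so that only the metric lines appear on stdout
history = model.fit(x_train, y_train, epochs=EPOCHS, verbose=False)

# Print the objective metric (Training Loss) so that Katib can read it
final_loss = history.history["loss"][-1]
print("loss={:.6f}".format(final_loss))
```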
Writing Code for TensorBoard¶
In order to make use of TensorBoard within DKube, the code needs to be instrumented so that the event files are written to the right folder.
TensorBoard in a Notebook¶
When running your code within a Notebook, the TensorBoard UI expects the event logs to be in the folder defined by the environment variable DKUBE_TENSORBOARD_DIR.
Write the Event Files¶
The TensorBoard event files are written using a callback in the training function; an example is shown below.
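A sketch assuming a Keras training loop; the model and data come from the surrounding training code, and the fallback folder is a placeholder:

```python
import os
import tensorflow as tf

# DKube points the TensorBoard UI at this folder
log_dir = os.environ.get("DKUBE_TENSORBOARD_DIR", "/tmp/tensorboard")

tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)

model.fit(x_train, y_train, epochs=EPOCHS, callbacks=[tensorboard_callback])
```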
TensorBoard during a Training Run¶
For a Training Run, the TensorBoard event logs can be in one of two places:
Folder identified by DKUBE_TENSORBOARD_DIR
Folder defined within the Model directory when the Run is created
Write the Event Files¶
The TensorBoard event files are written using a callback in the training function; an example is shown below.
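The same callback pattern applies for a Training Run. A sketch in which the log folder comes either from DKUBE_TENSORBOARD_DIR or from a placeholder subfolder of the mounted Model directory:

```python
import os
import tensorflow as tf

# Either the DKube-provided folder or a folder inside the Model Mount Path
log_dir = os.environ.get("DKUBE_TENSORBOARD_DIR", "/model/tensorboard")

model.fit(x_train, y_train, epochs=EPOCHS,
          callbacks=[tf.keras.callbacks.TensorBoard(log_dir=log_dir)])
```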
Kubeflow Pipelines Template¶
Kubeflow Pipelines provide a powerful mechanism to automate your workflow. DKube supports this capability natively, as described at Kubeflow Pipelines
One Convergence offers templates and examples to make pipeline creation convenient.
One Convergence provides a set of component definition files for the functions needed to create a pipeline within DKube. The files include:
A description of the component
A list of inputs and outputs that the component accepts
Metadata that allow the component to be run within DKube as a pod
They are located in the folder /mnt/dkube/pipeline/components from within a JupyterLab notebook.
They can also be accessed from the GitHub location DKube Pipeline Components
These components are called by the DSL pipeline description, and allow the developer to focus on the specific inputs and outputs required by the Job rather than the details of how those fields get translated at the lower levels. The DSL compiler will convert the DSL into a pipeline YAML file, which can be passed to Kubeflow to run.
An example of using the templates to create a pipeline is found at DKube Training
The file pipeline.ipynb uses the template to create a pipeline within DKube.
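A minimal sketch of loading a DKube component definition and compiling a DSL pipeline with the Kubeflow Pipelines SDK. The component file name and its inputs are placeholders; see the DKube Training example above for a complete, working pipeline:

```python
import kfp
from kfp import dsl
from kfp.components import load_component_from_file

# Component definition files are available inside a JupyterLab notebook;
# the exact file name under the components folder is a placeholder here.
dkube_training_op = load_component_from_file(
    "/mnt/dkube/pipeline/components/training/component.yaml")

@dsl.pipeline(name="dkube-training-pipeline")
def train_pipeline():
    # Call the component with the inputs it declares (omitted in this sketch)
    dkube_training_op()

# Convert the DSL into a pipeline YAML file that Kubeflow can run
kfp.compiler.Compiler().compile(train_pipeline, "dkube-training-pipeline.yaml")
```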
Custom Container Images¶
DKube jobs run within container images containing framework and preloaded packages. The image is selected when the Job is created. The image can be from several sources:
DKube provides standard images based on the framework, version, and environment
An image can be manually created, as explained in this section, and stored within an Image Catalog, described at Images
An image can be used from a repo, either directly or after being stored in the Image Catalog
If the standard DKube Docker image does not provide the packages that are necessary for your code to execute, you can create a custom Docker image and use this for IDEs and Runs. There are several different ways that DKube enables the creation and use of custom images.
Manual Image Creation¶
This section describes the process to build a custom Docker image manually.
Getting the Base Image¶
In order to create a custom image for DKube, you can start with the standard DKube image for the framework and version, and add the packages that you need. The standard images are available from the Image dropdown field during IDE & Run creation.
Adding Your Packages¶
In order to add your packages to the standard DKube image, you create a Dockerfile that starts from the standard image and installs the additional packages.
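A sketch of such a Dockerfile; the base image name, tag, and package list are placeholders for your own choices:

```dockerfile
# Start from the standard DKube image for your framework and version (placeholder)
FROM <standard DKube image>:<tag>

# Add the packages your code needs (placeholders)
RUN pip install --no-cache-dir <package-1> <package-2>
```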
Building the Docker Image¶
The new image can then be built with the docker build command.
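For example, with a placeholder image name and tag:

```console
docker build -t <docker-username>/<image-name>:<tag> .
```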
Pushing the Image to Docker Hub¶
In order to push the image, log in to Docker Hub and push it with the docker push command.
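For example, with the same placeholder image name and tag:

```console
docker login
docker push <docker-username>/<image-name>:<tag>
```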
Using the Custom Image within DKube¶
When starting a Run or IDE, select a Custom Container and use the name of the image that was saved in the previous step.
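For an image pushed to Docker Hub as in the previous step, the image reference typically has the form:

```
<docker-username>/<image-name>:<tag>
```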
JupyterLab Custom Images¶
When creating a custom image for use in a JupyterLab notebook within DKube, you must include the steps that provide the jovyan user sudo permissions. This allows that user to install system packages within the notebook.
FROM jupyter/base-notebook:latest
ENV DKUBE_NB_ARGS ""
USER root
RUN echo "$NB_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/notebook
USER jovyan
CMD ["sh", "-c", "jupyter lab --ip=0.0.0.0 --port=8888 --allow-root $DKUBE_NB_ARGS"]
CI/CD¶
DKube provides an automated method to:
Build and push images to a Docker registry based on a code change
Execute an automated set of steps through DKube
Basic Setup for CI/CD¶
In order to make use of the CI/CD feature, certain files need to be created in the Code repo to define the necessary actions.
Setting up the Repository¶
In order for the CI/CD system to operate, the repository needs to be set up with the files that provide the action instructions. The directory structure should be as follows:
Repository Root
|
|--- .dkube-ci.yml
The other folders and files described in this section can be in any folder, since the .dkube-ci.yml file will identify them by their path.
Placement of .dkube-ci.yml¶
There can be a .dkube-ci.yml file in one branch, in several branches, or in all of them. When a code change is made on a branch, the .dkube-ci.yml for that branch will be used to execute the actions defined for that branch. For CI/CD triggering from the DKube UI, the branch is specified during the submission.
CI/CD Actions¶
The CI/CD can be triggered in 2 different ways:
From the DKube Builds screen, as described at Images
From a GitHub Webhook trigger, described in this section
In both cases, the file .dkube-ci.yml is used by the CI/CD system to find the other necessary files and execute the commands. The general format of the .dkube-ci.yml file is shown in the CI/CD Examples referenced below.
The following types of actions are supported by the CI/CD mechanism.
| Declaration | Description |
|---|---|
| Dockerfile: | Build and push a Docker image using a Dockerfile |
| conda-env: | Build and push a Docker image using the Conda environment |
| docker_envs: | Register existing Docker images with DKube |
| images: | Build other Docker images |
| jobs: | Add a DKube Jobs template or run Jobs |
| components: | Build a Kubeflow component |
| pipelines: | Compile, Deploy, and Run a Kubeflow pipeline |
Folder Path¶
The path: declaration can have a hierarchical designation. For example, if the file is in the hierarchy folder1/folder2, as referenced from the base of the repository, the path: declaration would include that hierarchy (path: folder1/folder2).
Combining Declarations¶
The declarations can be combined in any order.
Important
The actions from the declarations are run in parallel, except for the Pipeline step, which waits for the components to be built. For others, such as the jobs: declaration, the image must already have been built and be ready for use.
More details on the syntax of the actions are available at CI/CD Examples
Automated Execution Through GitHub Webhook¶
The CI/CD actions can be triggered automatically through a GitHub repo commit. The actions described above will be performed based on the .dkube-ci.yml file.
The Webhook is set up through the procedure in this section.
The Webhook is set up from the root level of the repository, within the branch that will be used for commits. Select the Settings tab.
Select the Webhook menu item on the left.
Select the Add webhook button on the top right.
The Webhook fields should be filled out as follows:
| Field | Description |
|---|---|
| Payload URL | URL used to access DKube, with /cicd/webhook at the end |
| Content type | application/json |
| Which events… | Just the push event |
| Active | Check this when ready to enable the trigger |
Important
When the Active checkbox is enabled, every commit to the repo will trigger the CI/CD. Leave this unchecked until you are ready to enable the CI/CD actions.
CI/CD Example¶
This section provides a basic example to demonstrate how to set up and use the CI/CD capability. This example creates and builds a Docker image. The repository that is used for this example is in the following GitHub repo within the training branch:
https://github.com/oneconvergence/dkube-examples/tree/training
Follow the readme instructions to execute the example.
Inference Deployment Requirements¶
Once the training is complete for a DKube model, it can be deployed on a test or production inference server, as described at Model Deployment Workflow
The model can be deployed with or without a Model Serving Transformer
A trained model can be deployed with the default DKube image, or with a custom image that the user can provide as described at Images
Optional Transformer¶
As described in the section referenced above, a transformer can optionally be included. If the model is deployed with a transformer, the transformer.py file needs to be written with the following prerequisites:
A class should be defined with preprocess and postprocess as member functions
The class will take the kfserving.KFModel as an argument, and initialize the predictor host.
An example is provided below:
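A sketch of such a class, assuming the kfserving Python package; the class name and payload handling are illustrative, and the full file is available at the Transform.py example referenced below:

```python
import kfserving

class Transformer(kfserving.KFModel):
    """Illustrative transformer with preprocess and postprocess member functions."""

    def __init__(self, name, predictor_host):
        super().__init__(name)
        self.predictor_host = predictor_host   # initialize the predictor host

    def preprocess(self, inputs):
        # Convert the incoming request dictionary into the predictor payload
        return {"instances": inputs["instances"]}

    def postprocess(self, inputs):
        # Return the processed model prediction output
        return inputs
```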
The preprocess function will accept a dictionary containing the data to be processed and return the payload
The postprocess function accepts a dictionary containing the output of the model prediction. The function will return the processed model output.
A main function is required to start. An example of this is:
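A sketch of a typical entry point, following the standard kfserving transformer examples; the default model name is a placeholder:

```python
import argparse

import kfserving

DEFAULT_MODEL_NAME = "model"   # placeholder

parser = argparse.ArgumentParser(parents=[kfserving.kfserver.parser])
parser.add_argument("--model_name", default=DEFAULT_MODEL_NAME,
                    help="Name of the model being served")
parser.add_argument("--predictor_host",
                    help="Host and port of the predictor service")
args, _ = parser.parse_known_args()

if __name__ == "__main__":
    transformer = Transformer(args.model_name, predictor_host=args.predictor_host)
    kfserving.KFServer().start(models=[transformer])
```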
For more details refer to the sample transformer.py file at Transform.py Example
Note
Currently a transformer component can only be written in Python
Accessing DKube MinIO Server¶
DKube includes an integrated MinIO server that can be used to supply data to Job executions. For example, the DKube Monitoring examples create synthesized datasets for live data and ground truth and serve them from the DKube MinIO server.
Note
The MinIO server is only available in a full DKube installation
In order to make use of the MinIO server, the access IP and credentials need to be added to a DKube Dataset.
Getting MinIO credentials¶
The credentials can be obtained from the DKube API server. This is accessed from the DKube access URL in the form:
https://<DKube Access IP>:32222/#/api
This will bring up the DKube API screen.
From within that screen, search for logstore
Expand the logstore entry
Select Try it Out
Execute
This will execute the curl command and provide a response that includes the AccessKey and AccessKeyId
Creating Dataset Repo¶
The credentials will be used to create a Dataset repo within DKube. Create a Dataset repo from the Datasets menu on the left, fill in the fields as follows, and submit.
| Field | Description |
|---|---|
| Name | User-selected name for the Dataset |
| Versioning | None |
| Dataset Source | S3 |
| Endpoint | DKube access URL of form http://<DKube IP Address>:32221 |
| Access Key ID | AccessKeyId from previous step |
| Secret Access Key | AccessKey from previous step |
| Bucket | cloudevents |
Note
The Endpoint field needs to be http (not https), and the port is 32221
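Once the Dataset has been created, the same endpoint and credentials can also be used directly from code with any S3-compatible client. A sketch using boto3, with placeholders for the address and keys obtained above:

```python
import boto3

# MinIO endpoint and credentials obtained from the logstore API call above
s3 = boto3.client(
    "s3",
    endpoint_url="http://<DKube IP Address>:32221",
    aws_access_key_id="<AccessKeyId>",
    aws_secret_access_key="<AccessKey>",
)

# List the objects in the cloudevents bucket served by the DKube MinIO server
for obj in s3.list_objects_v2(Bucket="cloudevents").get("Contents", []):
    print(obj["Key"])
```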
TensorFlow Deployment¶
DKube’s TensorFlow serving image uses TensorFlow Serving to serve models trained using the TensorFlow framework.
A TensorFlow trained model should be saved in protobuf format. Other file formats are not supported by TensorFlow Serving. TensorFlow’s model.save function can be used to save the trained model in the protobuf format.
The model should be saved under a version folder (such as <mount_path>/1). The save path follows a convention used by TensorFlow Serving where the last path component (/1 in the example) is a version number for your model. This allows tools like TensorFlow Serving to determine the relative freshness of the model.
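A sketch of saving a trained tf.keras model in the expected layout; the mount path is a placeholder and model comes from the surrounding training code:

```python
import os

# Mount Path configured for the Model repo (placeholder), plus a version folder
export_path = os.path.join("<mount_path>", "1")

# Saves the model in the TensorFlow SavedModel (protobuf) format
model.save(export_path)
```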
Refer to TensorFlow Saved Model for more details about saving the model
PyTorch Deployment¶
The DKube PyTorch serving image uses the standard torch.load and the predict methods to load and serve models
A PyTorch trained model should be saved with a model.pt file. The function torch.save can be used to save a PyTorch model into the .pt format. The file name should be model.pt only.
A net.py is also required to be saved within the same model directory. This defines the model class. An example of this can be found at Net.py Example
The net.py should contain a class named Net, and there should not be any other .py file in the model save directory. If there are other files, the serving execution will raise an exception.
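A sketch of the expected layout, assuming the model class in net.py is named Net and that the whole model object is saved so that torch.load can restore it; the mount path is a placeholder:

```python
import os

import torch

from net import Net   # net.py in the model directory defines class Net

model = Net()
# ... training loop ...

# Save the trained model as model.pt inside the Model Mount Path (placeholder)
torch.save(model, os.path.join("<mount_path>", "model.pt"))
```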
Refer to Saving and Loading Models for Inference for more details about saving the model
SKLearn Deployment¶
The DKube Sklearn serving image uses joblib to load the model and predict.
A scikit-learn model should be saved in joblib format and the file name should be model.joblib. Other formats are currently not supported.
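A sketch of saving a trained scikit-learn model in that format; the mount path is a placeholder:

```python
import os

from joblib import dump
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
# ... model.fit(X, y) ...

# The file must be named model.joblib, saved under the Model Mount Path (placeholder)
dump(model, os.path.join("<mount_path>", "model.joblib"))
```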
Refer to Model Persistence for more details about saving the model
Custom Deployment¶
In a custom deployment, the model can be saved in any user-specific format. A custom deployment also requires a custom serving image, which is user-defined.