Data Science Dashboard & Workflow

Data Science Menu

The Data Science Screen provides a GUI-based mechanism to navigate through the workflow.

_images/Data_Scientist_Menu_R33.png

Menu Item

Description

Guide

Dashboard

Overview of the current status of the workflow

Data Science Dashboard

Projects

Create, view, and manage Projects

Projects

Code

Create, view, and manage the Code repos

Repos

Datasets

Create, view, and manage the Dataset repos

Repos

Models

Create, view, and manage the Model repos

Repos

FeatureSets

Extracted Datasets for improved training

FeatureSets

Images

Catalog of images for use in IDEs & Runs

Images

IDEs

Create, view, and manage JupyterLab and RStudio

IDEs

Runs

Create, view, and manage training and preprocessing runs

Runs

Pipelines

Create, view, and manage Kubeflow pipelines & experiments

Kubeflow Pipelines

Deployments

View and manage deployed Models, including Monitors

Product Deployment

Storage

View the storage utilization for the user

Utilization

View the CPU, GPU, memory, and pod utilization for the user

Data Science Dashboard

The Data Science dashboard provides an overview of the current state of the workflow.

_images/Data_Scientist_Dashboard_R30.png

From the dashboard, the user can go directly to recent jobs.

Data Science Workflow

_images/Data_Scientist_Block_Diagram_R30.png

Function

Description

Code

Folder containing the program code for experimentation and training

Dataset

Folder containing the datasets for training

FeatureSet

Extracted Dataset

Model

Trained model that can be used for inference or transfer learning

DKube Run

Execution of DKube training or preprocessing code on dataset

IDE

JupyterLab or RStudio

Pipeline

Automated execution of a set of steps using Kubeflow pipelines

Pipeline Run

Single execution run of a Kubeflow Pipeline

Experiment

Group of Pipeline runs, used to enable better run management

Monitoring

Monitor Model results and provide Alerts if outcomes degrade

Shared Data

Users in a Group share Code, Datasets, IDEs, Runs, Models, Pipelines, and other resources. These are shown on the screens for each type of data. At the top left-hand side of the screen there is a dropdown menu that allows the User to select what is visible: just the User's data, just the shared data, or both.

_images/Data_Scientist_Ownership.png

Users can be part of more than one Group. Users that are part of multiple Groups have access to view all of the resources in all of the Groups that they are part of. Adding a User to Groups is described at Add (On-Board) User

Multicluster Operation

DKube supports operation across multiple clusters.

  • Jobs can be submitted to the same or a different cluster

  • Development, deployment, and monitoring can be on the same or different clusters

Remote Cluster Execution

_images/HPC_Block_Diagram_R31.png

DKube can execute Jobs on the local DKube cluster or send them to a remote cluster for execution. The workflow is as follows:

  • The external cluster is added and maintained as described at Multicluster Management

  • The user logs into the remote cluster (if required) as described below

  • When a Run is submitted, the user decides if it should be submitted to the local cluster or the remote cluster. This is described at Configuration Submission Screen

  • If the execution is directed at the remote cluster, the appropriate plug-in translates the data and execution image to run on the remote cluster

  • The execution is performed on the remote cluster, and the status of the Run is sent back to the local cluster

  • All metadata remains on the local cluster

Multicluster Workflow

_images/Multicluster_Workflow_Block_Diagram.png

DKube allows development, serving, and monitoring to execute on the same cluster, or on different clusters.

  • Development can be performed on one cluster, then a Model can be deployed to the same or a different cluster

  • A deployed Model can be monitored on the same cluster as the deployment, or on a different cluster

Deploying on a Different Cluster

_images/Multicluster_Deploy_Block_Diagram.png

Deploying a model on a different cluster than the development cluster is performed by:

  • Building a deployment image on the Development Cluster and storing it in an external repo

  • Importing the deployment from the external repo on the Serving Cluster

This is explained in more detail at Deploy on a Different Cluster

Projects

Projects allow the user to group resources into categories, and view them together. When a Project is selected, only the resources such as Code, Datasets, Models, Runs, etc for that Project will be shown.

A Project is associated with a Group when created.

  • Users who are part of a single Group create Projects within that Group

  • Users who are part of multiple Groups will select a Group when creating the Project, or allow the Project to be part of their default Group

The Project selection is made at the top of the screen for the resources that are filtered by Projects.

_images/Data_Scientist_Project_Select_R30.png

The user can select a specific Project, and see only the resources for that project, or select All Projects, and see all of the resources for all Projects.

Project List

_images/Data_Scientist_Project_List_R30.png

The Project Menu will provide a list of current Projects.

Create a Project

A Project is created by selecting the + Create Project button.

This will initiate a popup to fill in the details of the Project. Fill in the fields and select Submit

_images/Data_Scientist_Project_Create_R32.png

Field

Description

Name

User-defined name for the Project

Group

Group that will be associated with the Project

Description

Optional user-defined information field

Thumbnail URL

Optional URL for a thumbnail image used for the Project

Enable Leaderboard

Select to add the Leaderboard capability as described at Leaderboard

Each Project is associated with a Group.

  • For Users who are part of a single Group, that Group will be filled in and automatically selected

  • For Users who are part of multiple Groups, the Group selection can be left at the default, or a different Group can be selected

The Group that has been selected will determine which other Users can view the data from the resources that are part of the Project. Users in the same Group can access data from the resources that are part of the Project.

The details of the Project can be provided after creating the Project by selecting the Project name.

_images/Data_Scientist_Projects_Details_R22.png

The initial Project fields can be edited using the Edit action button to the right of the Project list.

Create and View Project Resources

Resources such as Repos, IDEs, Runs, etc are created in the Project that is selected at the time the resource is created.

_images/Data_Scientist_Project_Entity_Create_R30.png

Note

Leaving the Project selection at All Projects allows a user to operate without associating the resources with any Project

Only the resources that are part of the Project will be shown when a specific Project is selected at the top of the screen. Selecting All Projects will show resources that are associated with a Project as well as those that are not.

The Project that a resource is associated with can be changed using the Edit action button to the right of the resource name.

Leaderboard

Within a Project, a Leaderboard can be optionally enabled. This capability allows multiple users to cooperate based on a set of criteria set up by the owner of the Leaderboard. Users can run through the submission process more than once, and the best results for each participating user are shown in a table.

_images/Leaderboard_Block_Diagram.png

The workflow for the Leaderboard is as follows:

  • The Owner of the Project creates the training and test data repos for the Users to use

  • Each User creates the training and prediction code to achieve the goals

  • The Users submit their best models for evaluation

  • An evaluation service uses evaluation code created by the Project Owner to provide the outcome for the Leaderboard

  • The Leaderboard shows the best outcomes from each of the Users

Set up the Leaderboard

When creating or editing a Project, the Leaderboard is enabled by selecting the “Enable Leaderboard” checkbox.

_images/Data_Scientist_Projects_New_R22.png

After enabling the Leaderboard option, there are more tabs provided after the Project has been created to set it up. The tabs are visible by selecting the Project from the list.

_images/Data_Scientist_Projects_Evaluation_R22.png

Field

Description

Evaluation Source Repo

Evaluation code for the Leaderboard

Evaluation Commit ID

Optional Commit ID (defaults to most recent code repo commit)

Dataset/Version

Optional pointer to the Dataset to use for evaluation

Evaluation Image

Image to be used for evaluation (defaults to standard DKube image)

Evaluation Command

Program used to evaluate the results

There is also a text-based field that the owner can use to provide the details of the Leaderboard.
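To illustrate the kind of program the Evaluation Command typically points at, the sketch below compares a submitted prediction file against the owner's ground-truth labels and prints a score. The file names, formats, and the way results are returned are placeholders; the actual contract is defined by the DKube evaluation service and the Project Owner.

    # Hypothetical Leaderboard evaluation sketch. Paths and output format are
    # illustrative only; the real contract is defined by the evaluation service.
    import json

    def evaluate(predictions_path="predictions.json", labels_path="labels.json"):
        with open(predictions_path) as f:
            preds = json.load(f)
        with open(labels_path) as f:
            labels = json.load(f)
        correct = sum(1 for p, y in zip(preds, labels) if p == y)
        return correct / max(len(labels), 1)

    if __name__ == "__main__":
        print(json.dumps({"accuracy": evaluate()}))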

Note

The evaluation screen is only visible to the project owner. The collaborator users only submit their results for evaluation, but cannot see what is being used to perform the evaluation.

View the Results

The results of the Leaderboard can be viewed on the Leaderboard tab. The evaluation results can be compared and downloaded to a file for archiving or further analysis.

_images/Data_Scientist_Project_Leaderboard_R30.png

Images

DKube provides an image catalog that stores images for use in IDEs and Runs. The images can be added from an external source or built from a Code repo. The custom images will appear in the Image dropdown menu when creating an IDE or Run.

_images/Data_Scientist_Images_R30.png

Add an Image

An image can be added to the image catalog by selecting the + Add Image button on the top right. A popup appears with the necessary fields.

_images/Data_Scientist_Image_Add_Select.png

Field

Value

Name

Unique user-chosen identification

Image URL

URL of image

Description

Optional, user-chosen field to provide details about the Image

Private

Select if the Image requires a Username and Password

_images/Data_Scientist_Image_Add.png

The image will be available when submitting a Job from the Image dropdown.

_images/Data_Scientist_Run_Image_Dropdown.png

Build an Image

An image can be built from an existing Code repo using the CI/CD mechanism as described at CI/CD

Building an image is accomplished by selecting the + Build button on the top right. A popup appears with the necessary fields. The image will be added to the Image Catalog for use within DKube Jobs.

_images/Data_Scientist_Image_Build_Select.png

Field

Value

Code

Available Code repos

Git URL

Repository that contains the Code & Build instructions

Commit ID

Commit ID of the Code repo - latest if blank

_images/Data_Scientist_Image_Build.png

Repos

There are 4 types of repositories under the Repo menu item: Code, Datasets, FeatureSets, and Models.

In order to use them, they must first be accessible by DKube. There are different ways that the repositories can be accessed. These are described in more detail in the sections that describe each input source.

  • They can be uploaded to the internal DKube storage area. In this case, the data becomes part of DKube and cannot be updated externally.

  • They can be accessed as a reference pointer to a folder or file. In this case, the files can be updated by an external application and DKube will access the new version of the file on the next run.

Note

Resources (Code, Datasets, etc) will be created in the Project that is selected when the Repo is created. If “All Projects” is selected, the resource will not be associated with any Project.

Versioned vs Non-Versioned

Datasets & Models

The Dataset & Model repos can be optionally managed by the native DKube DVS versioning system, as explained at Versioning. When repos are created, the source selection determines whether the repo is part of the versioning system.

Note

The Datasets & Models need to be versioned with the DKube DVS system in order to be part of the full MLOps architecture. They can then be tracked in the lineage system, compared, etc.

Program Code

Code is not versioned within the native DKube system. It is normally developed and versioned in an IDE, then committed to a GitHub repository.

The behavior of which version of the code gets used depends upon whether it is being used by an IDE or a Run.

IDE

The latest version of the program code will always be used. This is the case even if there is a specific commit ID filled in. The commit ID will be ignored.

Run

The code is uploaded into the local DKube storage based on the Commit ID field. If the Commit ID field is blank, the latest version of the code will be uploaded. If there is a specific GitHub Commit ID in the field, that version of the code will be uploaded and used.

Code

_images/Data_Scientist_Code_R30.png

The program code is contained in folders that are available within DKube to train models using datasets.

The Code screen provides the details on the programs that have been downloaded or linked by the current User, and programs that have been downloaded by other Users in the same Group.

The details on the code can be accessed by selecting the name. The details screen shows the source of the code, as well as a table of versions.

_images/Data_Scientist_Code_Detail.png

Add Code

_images/Data_Scientist_Code_New_R22.png

In order to add a code folder, and use it within DKube:

  • Select the + Code button at the top right-hand part of the screen

  • Fill in the information necessary to access the code

  • Select the Add Code button


Fields for Adding Code

The input fields for adding code are described in this section.

Field

Value

Name

Unique user-chosen identification

Tags

Optional, user-chosen detailed field to allow grouping or later identification

Description

Optional, user-chosen field to provide details about the Code

Git

The Git selection creates a code folder from a GitHub, GitLab, or Bitbucket repo folder.

Access Type

Uploaded to local DKube storage

Field

Value

URL

url of the directory that contains the program code

Private

Select this option if you need additional credentials to access the repo

Bitbucket Access

When accessing a Bitbucket repository, the following url forms are supported.

url

Authentication

https://xxxx

Authentication with username and password

git@xxxx

Authentication with ssh key

ssh://git@xxxx

Authentication with ssh key

The following links provide a reference on how to access a Bitbucket repository:

Bitbucket Set Up Key Reference 1

Bitbucket Set Up Key Reference 2

Private Authentication

When selecting a Private repo, more options are provided to enter the credentials. There are 3 different methods to enter credentials:

  • ssh key

  • Username and password

  • Authentication token

_images/Data_Scientist_Code_New_private.png

When using an ssh key:

  • The public key should be copied to the repository server

  • The private key should be copied to the local workstation, and uploaded from there

Note

DKube supports the pem file format and the OpenSSH & RSA private key formats

The following link provides a reference on how to create a GitHub token: Create GitHub Token

Note

The last set of credentials for each host and credential type will be saved in an encrypted form internally, and prepopulated in the fields

Delete a Code Repo

  • Select the Code to be deleted with the left-hand checkbox

  • Click the Delete icon at top right-hand side of screen

  • Confirm the deletion

Datasets

_images/Data_Scientist_Datasets_R30.png

The Datasets contain the training data. Datasets can be added to DKube in the following ways:

  • Manually added for use as an input for a run with or without versioning capability, as explained in Versioning

  • Manually added as a placeholder for use as a versioned output

  • Created as an output from a preprocessing run

_images/Dataset_Block_Diagram_R22.png

DVS provides a way for DKube to track versions within the application, and is explained at Versioning. To use this option, the DVS name will need to be created, as described in DVS

This will set up the metadata and data storage locations. A default DVS storage is created when DKube is installed.

Dataset Details

The details for the Dataset contain information about where it came from, where it is stored, and the versions.

_images/Data_Scientist_Datasets_Detail.png

Dataset Version Detail

Selecting the version of a Dataset brings up a screen that provides more details on that version, including a list of the Runs that use that version.

_images/Data_Scientist_Datasets_Version_Detail.png

The Dataset can be downloaded to your local file system from this page from the Export button at the top right.

Dataset Version Lineage

If the Dataset was created as part of a Preprocessing Run, selecting the Lineage tab of the Dataset will provide the full lineage. This provides all of the inputs that were used to create the Dataset.

_images/Data_Scientist_Datasets_Detail_Lineage.png

Add a Dataset

The input fields for adding the Dataset depend upon the source.

Datasets Used for Training Input

A Dataset that is intended as the input to a training run can be added with or without versioning capability. For this type of usage, the fields should be filled in as follows:

Field

Value

Versioning

DVS or None, based on whether versioning is selected

Dataset Source

From dropdown menu

Note

The dataset source field will show different options depending upon whether versioning is selected. Some input sources are only compatible with versioning, and others are only compatible without it

_images/Data_Scientist_Datasets_Version.png

Datasets Used for Preprocessing Output

A new Dataset that is intended as an output for a preprocessing run must first be added in the Dataset repo as a blank (version 1) entry. A new version of the same Dataset will be created by the preprocessing run. For this usage, the fields should be filled in as follows:

Field

Value

Versioning

DVS

Dataset Source

None

_images/Data_Scientist_Datasets_Blank_Version.png

Adding the Dataset

In order to access a Dataset, and use it within DKube:

  • Select the + Dataset button at the top right-hand part of the screen

  • Fill in the information necessary to access the Dataset

  • Select the Add Dataset button

Fields for Adding a Dataset

For all inputs there are common fields.

Field

Value

Name

Unique user-chosen identification

Tags

Optional, user-chosen detailed field to allow grouping or later identification

Description

Optional, user-chosen field to provide details about the Dataset

Versioning

Select DVS to enable the automatic internal DKube DVS versioning as described at Versioning or None to use external versioning from the source

The fields for specific data sources are as follows:

The Git selection creates a Dataset from a GitHub, GitLab, or Bitbucket repo folder.

Access Type

Uploaded to local DKube storage

Versioned with DVS

Yes

Field

Value

URL

url of the directory that contains the Dataset

Branch

Branch within the Git repo

Private

Select this option if you need additional credentials to access the repo

Note

If the branch of the repo is contained within the url, the Branch input field can be left blank. If the Git repo does not have the branch in the url, the Branch input must be filled in to identify it.

Bitbucket Access

When accessing a Bitbucket repository, the following url forms are supported.

url

Authentication

https://xxxx

Authentication with username and password

git@xxxx

Authentication with ssh key

ssh://git@xxxx

Authentication with ssh key

The following links provide a reference on how to access a Bitbucket repository:

Bitbucket Set Up Key Reference 1

Bitbucket Set Up Key Reference 2

Private Authentication

When selecting a Private repo, more options are provided to enter the credentials. There are 3 different methods to enter credentials:

  • ssh key

  • Username and password

  • Authentication token

When using an ssh key:

  • The public key should be copied to the repository server

  • The private key should be copied to the local workstation, and uploaded from there

Note

DKube supports the pem file format and the OpenSSH & RSA private key formats

The following link provides a reference on how to create a GitHub token: Create GitHub Token


Delete a Dataset

  • Select the Dataset to be deleted with the left-hand checkbox

  • Click the Delete icon at top right-hand side of screen

  • Confirm the deletion

FeatureSets

_images/Features_Block_Diagram.png

DKube allows an extracted version of a Dataset to be used for training. The extraction identifies the features that are best correlated to an accurate prediction, and leads to a more accurate training run. The workflow for using a Feature Store within DKube is:

  • Create a FeatureSet from a Dataset by applying a Feature Specification in a Preprocessing Run

  • Use the FeatureSet in a Training Run rather than the original raw Dataset

FeatureSet Sharing

FeatureSets are global resources. Once created, they are available to other users.

Configure the Feature Store

Before using the Feature Store capability, the storage area needs to be configured. This is done through the Configuration menu in the Operator section. This is described at Configuration

Important

The Feature Store backend configuration should not be changed after being used by DKube

Feature Specification

_images/Features_Flow_Diagram.png

The Feature Specification contains metadata for the features in the FeatureSet. It provides details on each feature, and selects the features that will be used for training. The Feature Specification can also provide the metadata for additional features that are derived from the original Dataset.

The Feature Specification file includes the metadata for the features that will be included in the resulting FeatureSet. The Feature Specification format is as follows:

    - name: <Feature name>
      description: <Brief description of feature>
      schema: <Format of the feature>
    - name: <xxx>
      description: <xxx>
      schema: <xxx>

Create a FeatureSet

_images/Data_Scientist_FeatureSets_R30.png

The FeatureSet menu screen shows the FeatureSets that have been created. If a FeatureSet is associated with a Project, it will be identified. FeatureSets that are not associated with a Project will have the column blank.

Creating a new FeatureSet is accomplished by selecting the + FeatureSet button on the far right.

Note

The FeatureSet will be created in the Project that is selected when the + FeatureSet button is selected. If All Projects is selected, the FeatureSet will not be associated with any Project.

The + FeatureSet button will open a popup to fill in the required information.

_images/Data_Scientist_Feature_Name.png

After filling in the options, select the Next button to optionally upload a Feature Specification.

_images/Data_Scientist_Feature_Upload.png

A Feature Specification can also be uploaded to the FeatureSet after creation using the edit icons on the right.

This creates the first version of the FeatureSet, with the expectation that the Feature Specification will be applied to a Dataset through a Preprocessing run - creating the FeatureSet that will be used in the Training Run.

Create a new FeatureSet Version

A Preprocessing Run takes the initial FeatureSet and applies the Feature Specification to it. The program then writes the new features and creates a version of the FeatureSet (which is now an extracted Dataset) that will be used by the Training Run.

_images/Data_Scientist_Runs_Preprocessing_FeatureSet_R30.png

When creating a Preprocessing Run (explained in Preprocessing Runs), the input is a Dataset, and the output is a FeatureSet. The FeatureSet contains the Feature Spec that will be applied to create a new version of the FeatureSet.
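A minimal sketch of such a preprocessing step is shown below. The mount paths (/opt/dkube/input and /opt/dkube/output) and column names are example values only; the actual paths are whatever was entered on the Repos tab, and the selected and derived columns come from the Feature Specification.

    # Sketch only: read the raw Dataset from its mount path, derive/select the
    # features named in the Feature Specification, and write the result to the
    # FeatureSet output mount path. Paths and column names are placeholders.
    import pandas as pd

    RAW_DATASET = "/opt/dkube/input/raw.csv"           # example Dataset mount path
    FEATURESET_OUT = "/opt/dkube/output/features.csv"  # example FeatureSet mount path

    df = pd.read_csv(RAW_DATASET)
    df["bmi"] = df["weight_kg"] / (df["height_m"] ** 2)   # example derived feature
    df[["age", "bmi", "label"]].to_csv(FEATURESET_OUT, index=False)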

Use the FeatureSet in a Training Run

Once a new version of the FeatureSet has been created, based on the Feature Specification, it can be used in a Training Run.

_images/Data_Scientist_Runs_FeatureSet_R30.png

When creating a Training Run (explained in Create Training Run), the input is a FeatureSet, and the output is a Model.

Models

_images/Data_Scientist_Models_R30.png

The Models Repo contains all of the models that are available within DKube. Models are added to the DKube repo in the following ways:

  • Manually added for use as an input for a run that includes transfer learning with or without versioning capability, as explained in Versioning

  • Manually added as a placeholder for use as a versioned output

  • Created as an output from a training run

_images/Model_Block_Diagram_R22.png

The versions of each model can be viewed by selecting the expansion icon to the left of the Model name. This allows direct access to the version details, and for versions of different models to be compared.

_images/Data_Scientist_Models_Versions_R30.png

Model Details

_images/Data_Scientist_Models_Details_R30.png

Selecting the Model will call up a screen which provides more details. This includes the information used to create the model and a list of versions. See section Versioning to understand how Model versions are created.

Model Version Lineage

More detail on a specific model version can be obtained by selecting that version on the detail page. The complete Model lineage can be viewed by selecting the “Lineage” tab on the Model Version Details screen. This provides the complete set of inputs that were used to create the Model.

_images/Data_Scientist_Models_Version_Lineage_R30.png

Add a Model

Models are added manually for 2 different purposes:

  • The Model is going to be used as an input to a Run for transfer learning, where a partially trained model is further trained

  • The Model will become an output from a Training Run

The screens and fields for adding a Model are the same as for adding a Dataset, described at Add a Dataset

Model Workflow

_images/MLOps_Workflow_Diagram_R30.png

The overall workflow of a Model is described in section MLOps Concepts. A diagram of the flow is shown here. This describes the expected stages that a Model goes through from creation to production. This section provides the details of this workflow within DKube.

Compare Models

Model versions can be compared from:

  • The main model screen

  • The detailed model screen

To perform a comparison, the models to be compared should be selected, and the “Compare” button should be used. This will bring up a window with the comparison. The compare function is used by the Data Scientist and ML Engineer during their development.

_images/Data_Scientist_Models_Detail_Compare.png
_images/Data_Scientist_Models_Compare.png

Model Actions

Models can have a variety of different actions performed on them from the details screen. The actions depend upon the User role as described at DKube Roles & Workflow.

_images/Data_Scientist_Models_Actions_R30.png

Edit Model

The description and tag can be modified using the Edit button.

Publish Model

The next phase is handled by the ML Engineer, who will optimize, automate, and productize the model using larger datasets and whatever other configurations and parameters are necessary.

The Model that the ML Engineer believes is the best fit for the goal is "Published".

Publishing a Model changes its Stage in the Model repo, which makes it available to the Product Engineer for testing and eventual Deployment to a production server.

A Model version is published from the Model details page by selecting the Publish button on the far right. This will initiate a popup menu to enter the publish details.

_images/Data_Scientist_Models_Publish_Popup_R30.png

Field

Value

Description

Optional user-chosen name to provide more details for the instance

Serving Image

Defaults to the training image, but a different image can be used if required

Private

Select if the image needs credentials

Transformer

Select if the inference requires preprocessing or postprocessing

Transformer Image

Image used for the transformer code

Transformer Code

Defaults to the training code repo, but a different repo can be used if required

Transformer Script

Program used for the Transformer
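As a rough illustration of what a Transformer Script does, the sketch below shows framework-agnostic preprocess and postprocess steps around a prediction. The function names and payload shapes are illustrative only and do not represent a specific DKube or serving-framework API.

    # Illustrative only: a transformer typically reshapes incoming payloads before
    # prediction and reformats the raw outputs afterwards.
    import numpy as np

    def preprocess(instances):
        # Scale raw pixel values (0-255) into the 0-1 range the model expects.
        return [(np.asarray(x, dtype=np.float32) / 255.0).tolist() for x in instances]

    def postprocess(predictions):
        # Convert probability vectors into class labels for the caller.
        return [{"label": int(np.argmax(p))} for p in predictions]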

After publishing, the Model version will change to the Published stage.

_images/Data_Scientist_Models_Publish_Stage_R30.png

Deploy Model

A Model is deployed by the Production Engineer from the Models repo. This will create an endpoint that can be exposed for inference. This is described in Production Engineer

IDEs

DKube supports opening an IDE natively. It can include the code and dataset information, as well as the hyperparameters.

_images/Data_Scientist_Notebooks_R37.png

The status messages are described in section Status Field of IDEs & Runs

TensorBoard

TensorBoard can be accessed from the Notebook screen.

  • When an IDE instance is created, TensorBoard is in the “stopped” state. In order to use it, TensorBoard must be started by selecting the play icon (item 2 in the screenshot above).

  • The TensorBoard event files are expected to be in the folder identified in the environment variable DKUBE_TENSORBOARD_DIR. This is explained in more detail at Writing Code for TensorBoard
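A minimal sketch of writing event files to that folder is shown below, using the TensorFlow 2 summary API as an example; any writer that targets the same directory works.

    # Write TensorBoard event files to the directory DKube expects, taken from
    # the DKUBE_TENSORBOARD_DIR environment variable.
    import os
    import tensorflow as tf

    logdir = os.environ.get("DKUBE_TENSORBOARD_DIR", "/tmp/tensorboard")
    writer = tf.summary.create_file_writer(logdir)

    with writer.as_default():
        for step in range(100):
            tf.summary.scalar("loss", 1.0 / (step + 1), step=step)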

Note

It can take several minutes for TensorBoard to be active after being started

Note

It is good practice to stop TensorBoard instances if they are not going to be used in order to conserve system resources

IDE Actions

There are actions that can be performed on an IDE instance (item 1 in the screenshot above).

  • An IDE can be started

  • A Training Run can be submitted based on the current configuration and parameters in the IDE.

In addition, there are actions that can be performed that are selected from the icons above the list of instances (item 3 in the screenshot). For these actions, the instance checkbox is selected, and the action is performed by choosing the appropriate icon.

  • An instance can be stopped or started

  • Instances can be deleted or archived (see Delete and Archive )

  • A new instance can be created by either cloning it (clone icon), or starting from a blank configuration from one of the IDE icons

IDE Details

More information can be obtained on the Notebook by selecting the name. This will open a detailed window.

_images/Data_Scientist_Notebook_Detail.png

Create IDE

_images/Data_Scientist_Notebook_Create_R37.png

There are 2 methods to create an IDE instance:

  • Create a new IDE instance by selecting the + <Instance Type> button at the top right-hand side of the screen, and fill in the fields manually

    • This is typically done for the first instance, since there is nothing to clone

  • Clone an instance from an existing instance

    • This will open the same new submission dialog screen, but the fields are pre-loaded from the existing instance. This is convenient when a new instance will have only a few different fields, such as hyperparameters, from an existing one.

In both cases, the new instance submission screen will appear. Once the fields have been entered or changed, select “Submit”

Note

The IDE will be created in the Project that is selected when the IDE is created. If “All Projects” is selected, it will not be associated with any Project.

_images/Data_Scientist_Notebook_Create_Popup.png

There are 3 sections that provide input to the new instance.

  • The Basic tab allows the selection of:

    • The name of the instance and any other pertinent information

    • The program code

    • The framework and version

    • The docker image to use: either the standard DKube image or your own custom image

  • The Repo tab selects the input datafiles required

  • The Configuration tab selects:

    • Inputs related to configuration files & hyperparameters

    • GPU requests

Note

It is not required to make changes on every tab. The mandatory fields are highlighted on the screen; once those have been filled in, the instance can be created through the Submit button. The tabs can be selected directly, or the user can go back and forth using the navigation buttons at the bottom of the screen.

Note

The first Run or instance load will take extra time to start due to the image being pulled prior to initiating the task. The message might be “Starting” or “Waiting for GPUs”. It will not happen after the first run of a particular framework version.

File Paths for Datasets and Models

The Dataset & Model repos that are added as part of the submission are saved as described at File Paths

Basic Submission Screen

Field

Value

Name

Unique user-chosen identification

Description

Free-form user-chosen text to provide details

Tags

Optional, user-chosen detailed field to allow grouping or later identification

Code

Project code repo

Framework

Framework type

Framework Version

Framework version

Image

Docker image to use - this can be left at the default, or a custom image can be selected

Code Repo

The code is uploaded into the local DKube storage and used for the IDE.

Custom Containers

Custom containers are supported to extend the capabilities of DKube. In order to use a custom container within DKube, select "Custom" from the Framework dropdown menu. This will provide more options.

  • Enter the image location in the field labeled Docker Image URL in the format registry/<repo>/<image>:<tag>

  • If the image is in a private registry, enable the Private option, and fill in the username and password

_images/Data_Scientist_Notebooks_Container_Custom.png

Repo Submission Screen

The Repos submission screen selects the repositories required for training or experimentation:

  • Dataset repo(s)

  • Model repo(s) for use in transfer learning

  • FeatureSets repo(s)

_images/Data_Scientist_Notebook_mnist_Repo_Dataset.png

A repo is chosen by selecting the + beside the repo type, and choosing the repo(s) from the list provided. The repo is required to be made available to DKube through the process described in Repos

  • The version of the repo can also be chosen

  • A mount path should be selected for the repo, which should correspond to the expected path in the Project code. This is described in more detail at Mount Path

Mount Path for Datasets and Models

The Dataset, FeatureSet, & Model repos that are added as part of the submission contain a field called the “Mount Path”. This is the path that is used by the code to access the repo. This is described in more detail at File Paths

_images/Mount_Point_Diagram_R22.png
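For example, if a Dataset was added with the Mount Path /opt/dkube/input (a user-chosen value, not a fixed DKube location), the program code would read it as shown in this sketch.

    # Read the Dataset from the Mount Path entered on the Repos tab.
    # "/opt/dkube/input" is only an example value.
    import os

    DATASET_DIR = "/opt/dkube/input"
    for name in sorted(os.listdir(DATASET_DIR)):
        print("found training file:", name)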

Configuration Screen

Configuration File

A configuration file can be uploaded and provided to the program. There is no DKube-enforced formatting for this file. It can be any information that needs to be used during program execution. It can be a set of hyperparameters, or configuration details, or anything else. The program needs to be aware of the formatting so that the file can be correctly unpacked during execution.

The file can be used within DKube as described in Configuration File
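Since the format is user-defined, unpacking is entirely up to the program. A minimal sketch, assuming the file was written as YAML and is available at a path the program knows (the path and keys below are placeholders):

    # Unpack a user-defined configuration file. YAML and the file path are
    # assumptions for this example; any format the program understands works.
    import yaml

    with open("config.yaml") as f:      # placeholder path
        cfg = yaml.safe_load(f)

    print(cfg.get("batch_size"), cfg.get("optimizer"))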

_images/Data_Scientist_Notebooks_Config.png
Hyperparameters

The configuration section allows the user to input the hyperparameters for the instance. The use of the hyperparameters is based on the program code. Hyperparameters can be added by selecting the highlighted + icon. This allows an additional "Key" and "Value" to be entered. More parameters can be added by repeated use of this option.

GPUs

The number of GPUs can be selected for the instance. The GPUs in a Group are shared with all Users in the Group. If there are currently not enough GPUs to satisfy the request, the instance will be queued until enough GPUs are available.

The GPU selection area shows how many GPUs are available in the group. Selecting more GPUs than are available in the group will cause an error.

Note

The screen shows how many GPUs are available in the Group, but these are shared with other instances and users. The actual number of GPUs available when the instance is submitted may be fewer than what is shown.

  • Just below the GPU selection is a checkbox that allows the instance to start with the number of GPUs that are available upon submission, including none, if all of the GPUs are currently in use. This will guarantee that the instance does not queue.

Delete IDE

  • Select the instance name from the left-hand checkbox

  • Click Delete icon at top right-hand side of screen

  • Confirm the deletion

Runs

_images/Run_Type_Diagram_R22.png

A Run is the execution of code using:

  • Datasets

  • Optional FeatureSets (extracted Datasets)

  • Optional pre-trained models

  • Hyperparameters

  • Resources

The status messages are described in section Status Field of IDEs & Runs

The Run screen allows the user to manage training and preprocessing based on the inputs selected. The primary differences between the functions are:

  • Once the Training Run is complete, it creates a trained Model

  • When a Preprocessing Run is complete, it creates a new Dataset or FeatureSet

_images/Data_Scientist_Runs_R30.png

Templates

Run Templates are a way to simplify the submission of Runs. They allow many of the fields to be pre-filled, similar to cloning a run from another run. For example, the user may want to do a number of runs with different hyperparameters or resources. A Template can be used to fill in the fields, then the updated hyperparameters or resources can be selected before submitting the new Run.

Training Runs

Training runs create Models as their output.

TensorBoard

TensorBoard can be accessed from the Runs screen.

  • When a Run instance is created, TensorBoard is in the “stopped” state. In order to use it, TensorBoard must be started by selecting the play icon.

  • The TensorBoard event files are expected to be in the folder identified in the environment variable DKUBE_TENSORBOARD_DIR. This is explained in more detail at Writing Code for TensorBoard

Note

It can take several minutes for TensorBoard to be active after being started

Note

It is good practice to stop TensorBoard instances if they are not going to be used in order to conserve system resources

Training Run Actions

There are actions that can be performed on a Training Run instance. For these actions, the Run instance checkbox is selected, and the action is performed by choosing the appropriate icon.

Hyperparameter Optimization

DKube supports Katib-based hyperparameter optimization. This enables automated tuning of hyperparameters for a program and dataset, based upon target objectives. An optimization study is initiated by uploading a configuration file during the Training Run submission as described in Configuration Submission Screen

The study initiates a set of trials, which run through the parameters in order to achieve the objectives, as provided in the configuration file. After all of the trial Runs have completed, DKube provides a graph of the trial results, and lists the best hyperparameter combinations for the required objectives.

The optimization study is ready for viewing when the status is shown as “complete”. That indicates that the trials associated with the study are all complete. The output results of the study can be viewed and downloaded by selecting the Katib icon at the far right hand side of the Run line.

As described in section Configuration Submission Screen , an optimization Run is initiated by providing a YAML configuration file in the Hyperparameter Optimization field when submitting a Run.

A study that has been initiated using Hyperparameter Optimization is identified by the Katib icon on the far right.

_images/Data_Scientist_Runs_Katib_R22.png

Selecting the icon opens up a window that shows a graph of the trials, and lists the best trials based on the objectives.

Training Run Details

More information can be obtained on the Run by selecting the name. This will open a detailed window.

_images/Data_Scientist_Runs_Detail_R37.png

Metrics

The metrics for the Run can be viewed from the Metrics tab.

_images/Data_Scientist_Runs_Detail_Metrics_R37.png

Lineage

DKube provides the complete set of inputs that are used to create a Model from a Training Run. The overall concept is described in section Tracking and Lineage. The lineage is accessed from the details screen for a Run.

_images/Data_Scientist_Runs_Detail_Lineage_R37.png

Logs

The Run log can be viewed from the Logs tab. This is useful when debugging issues with the Run. You can also download the logs from the icon at the far right of the screen.

_images/Data_Scientist_Runs_Detail_Logs_R37.png

Compare Runs

Training runs can be compared in a similar manner as Model versions. This is accomplished by selecting the Runs to compare.

_images/Data_Scientist_Runs_Compare_R22.png

Create Training Run

A Training Run can be created in the following ways:

  • Create a Run from an IDE instance, using the Create Run icon on the right-hand side of the selected instance

    • This will pre-load the parameters from the instance

  • Create a new Training Run by selecting the + Run button at the top right-hand side of the screen, and fill in the fields manually

    • The type of run, training or preprocessing, is chosen

    • A Run can be created from a Template, as described at Templates, which will pre-fill many of the fields

  • Clone a Training Run from an existing instance

    • This will open the same new Training Run dialog screen, but most of the fields are pre-loaded from the existing Run. This is convenient when a new Run will have only a few fields that differ from the existing Run, such as hyperparameters.

  • A Run is automatically created as part of a Pipeline

Note

The Run will be created in the Project that is selected when the Run is created. If All Projects is selected, it will not be associated with any Project.

For the cases where a Run is created by the User, the New Training Run screen will appear. Once the fields have been filled in, select Submit

_images/Data_Scientist_Runs_New_R22.png

Note

The first Run will take additional time to start due to the image being pulled prior to initiating the task. The message might be “Starting” or “Waiting for GPUs”. Each time a new version of the framework is run for the first time, the delay will occur. It will not happen after the first run.

File Paths for Datasets and Models

The Dataset & Model repos that are added as part of the submission are saved as described at File Paths


Basic Submission Screen

_images/Data_Scientist_Runs_New_Basic_R22.png

Field

Value

Name

Unique user-chosen identification

Description

Free-form user-chosen text to provide details

Tags

Optional, user-chosen detailed field to allow grouping or later identification

Code

Program code repo

Framework

Framework type

Framework Version

Framework version

Image

Docker image to use - this can be left at the default, or a custom image can be selected

Start-up Command

Program and options that need to run in order to initiate training

Code Repo

The code is uploaded into the local DKube storage and used for the Run.

The program code will be used based on the Commit ID field.

Blank

The latest version of the code will be used

Value

The version of the code with that value will be used

Custom Containers

Custom containers are supported to extend the capabilities of DKube. In order to use a custom container within DKube, select Custom from the Framework dropdown menu. This will provide more options.

  • Enter the image location in the field labeled Docker Image URL in the format registry/<repo>/<image>:<tag>

  • If the image is in a private registry, enable the Private option, and fill in the username and password

_images/Data_Scientist_Runs_Custom_Container_R30.png
Build From Code Repo

By default, DKube will choose a standard image when creating a new training Run. If a different image is required, it can be selected from the dropdown menu in the Image field. In addition, the image can be created from the Code and then used for the Run.

In order to use this capability, the GitHub folder is required to have a .dkube-ci.yml file as described at CI/CD

Selecting the “build-from-code-repo” Image option will cause the image to first be built and saved, then used for the Run execution.

_images/Data_Scientist_Runs_Build_From_Code.png

Repo Submission Screen

Output Model Repo

The Repos screen has an output Model section for the trained model in addition to the input Model section (for transfer learning).

_images/Data_Scientist_Runs_New_Repos_R22.png

The format for the output Model is similar to the input Model. Even though the field is for an output trained model, there still needs to be an entry in the Models repo so that the model can be properly tracked and versioned. The new trained model will become the next version of the model that is added to the submission.

In order to create a completely new model - with Ver 1 - a new DVS model should be created as explained in the section Models


Configuration Submission Screen

Target Cluster

A Run can be executed on the primary Kubernetes cluster, or executed on a remote cluster. Available execution clusters will appear in a dropdown in the Cluster field. Based on the cluster type, the remaining fields will be different.

_images/Data_Scientist_Runs_New_Config_Cluster.png

If the cluster choice is an external cluster, there are additional fields that are required prior to the common fields explained below. The field definitions are available at:

Cluster Type

Field Definitions

Slurm

Slurm sbatch Definitions

_images/Data_Scientist_Runs_New_Config_Slurm.png
Configuration File

A configuration file can be uploaded and provided to the program. There is no DKube-enforced formatting for this file. It can be any information that needs to be used during program execution. It can be a set of hyperparameters, or configuration details, or anything else. The program needs to be aware of the formatting so that the file can be correctly unpacked during execution.

_images/Data_Scientist_Runs_New_Config_File_R22.png
Hyperparameters
_images/Data_Scientist_Runs_New_Config_Env.png

The configuration section allows the user to input the hyperparameters for the instance. The use of the hyperparameters is based on the program code. Hyperparameters can be added by selecting the highlighted + icon. This allows an additional Key and Value to be entered. More parameters can be added by repeated use of this option.
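A sketch of consuming the submitted Key/Value pairs inside the program is shown below. It assumes the pairs are exposed to the process as environment variables, which is an assumption rather than a documented guarantee; verify how your DKube release passes them. The key names are examples.

    # Read hyperparameters submitted as Key/Value pairs, assuming they are
    # exposed as environment variables (key names below are examples).
    import os

    learning_rate = float(os.environ.get("learning_rate", "0.001"))
    epochs = int(os.environ.get("epochs", "5"))
    print(f"training with lr={learning_rate} for {epochs} epochs")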

Hyperparameter Tuning

The Configuration screen has additional fields that allow more actions for the Run beyond what is possible with an IDE.

_images/Data_Scientist_Runs_New_Config_HP_R22.png

In addition to the ability to add or upload hyperparameters, the Training Run can also initiate a Hyperparameter Optimization run.

In order to specify that the Run should be managed as a Hyperparameter Optimization study, a yaml file must be uploaded that includes the configuration for the experiment.

Leaving this field blank (no file uploaded) will indicate to DKube that this is a standard (non-hyperparameter optimization) Run.

The format of the configuration file is explained at Katib Introduction.
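As a rough illustration, the sketch below generates a small experiment configuration in the upstream Katib style (random search over a learning-rate range with an accuracy objective) and writes it to a YAML file for upload. The field names follow the public Katib experiment spec and are not guaranteed to match what a particular DKube release expects, so check them against Katib Introduction before use.

    # Generate an illustrative Katib-style experiment config and save it as YAML.
    # Field names follow the upstream Katib spec; confirm against your release.
    import yaml

    experiment = {
        "parallelTrialCount": 2,
        "maxTrialCount": 6,
        "objective": {
            "type": "maximize",
            "goal": 0.95,
            "objectiveMetricName": "accuracy",
        },
        "algorithm": {"algorithmName": "random"},
        "parameters": [
            {
                "name": "--learning_rate",
                "parameterType": "double",
                "feasibleSpace": {"min": "0.001", "max": "0.05"},
            }
        ],
    }

    with open("katib_config.yaml", "w") as f:
        yaml.safe_dump(experiment, f)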

GPU Distribution

Runs can be submitted with GPUs distributed across the cluster. The Project code needs to be written to take advantage of this option. In order to enable this, the “Distributed workloads” option needs to be selected.

_images/Data_Scientist_Runs_New_Config_GPUs.png

The distribution can be accomplished automatically or manually.

  • If the automatic distribution option is selected, DKube will determine the most effective way to use the GPUs across the cluster.

  • If the manual distribution option is selected, the user needs to tell DKube how the GPUs should be distributed. For this option, the user needs to understand the topology of the cluster, and know where the GPUs are located.

When distributing the workload manually across nodes in the cluster, the number of workers needs to be specified. DKube takes the number of GPUs specified in the GPU field, and requests that number of GPUs for each worker.

So, for example, if the number of GPUs is 4 and the number of workers is 2, then 8 GPUs will be requested, spread across 2 nodes.


Stop Run

  • Select the Run to be stopped with the left-hand checkbox

  • Click the Stop icon at the top right-hand side of the screen

Preprocessing Runs

A Preprocessing Run outputs a Dataset resource when it is complete. This is typically done in order to modify a raw dataset such that it can be used for Training.

_images/Data_Scientist_Preprocessing_R30.png

Create Preprocessing Run

A Preprocessing Run can be created in the following ways:

_images/Data_Scientist_Runs_Preprocessing_Create_R22.png
  • Create a new Preprocessing Run by selecting the + Run button at the top right-hand side of the screen, and selecting Preprocessing

  • A Run is automatically created as part of a Pipeline

Note

The Run will be created in the Project that is selected when the Run is created. If All Projects is selected, it will not be associated with any Project.

For the cases where a Preprocessing Run is created by the User, the New Preprocessing Run screen will appear. Once the fields have been filled in, select Submit

Note

The first Run will take additional time to start due to the image being pulled prior to initiating the task. Each time a new version of a framework is run for the first time, the delay will occur. It will not happen after the first run.


Basic Submission Screen

In addition to the standard fields, including the name of the run, tags, and start-up command, the Preprocessing Basic screen includes a docker image field that points to the image created by the user.

_images/Data_Scientist_Runs_Preprocessing_Basic_R22.png

Repo Submission Screen

The Repos screen is filled in similarly to the Training Run, but instead of a Model output, there is a Dataset or FeatureSet output.

_images/Data_Scientist_Runs_Preprocessing_Repos_R22.png

Kubeflow Pipelines

DKube supports Kubeflow Pipelines. Pipelines describe a workflow structure graphically, identifying the flow from one step to the next, and define the inputs and outputs between the steps.

An introduction to Kubeflow Pipelines can be found at Kubeflow Pipelines

One Convergence provides templates and examples for pipeline creation described at Kubeflow Pipelines Template

The steps of the pipeline use the underlying components of DKube in order to perform the required actions.

The following sections describe what is necessary to create and execute a Pipeline within DKube.

Upload a New Pipeline

_images/Data_Scientist_Pipelines_R30.png

The pipelines that have been uploaded or created within DKube are available from the Pipelines menu. There are a number of pipelines that come with a standard DKube installation. New pipelines can be uploaded to DKube by selecting the + Upload Pipeline button and entering the access information.

_images/Data_Scientist_Pipelines_Upload_New.png
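The file that gets uploaded is typically produced by compiling a Python pipeline definition with the Kubeflow Pipelines SDK. A minimal sketch is shown below; the container image and command are placeholders, and the One Convergence templates referenced above show the DKube-specific components to use in practice.

    # Compile a minimal Kubeflow pipeline into a file that can be uploaded with
    # the + Upload Pipeline button. Image and command are placeholders.
    import kfp
    from kfp import dsl

    def train_op():
        return dsl.ContainerOp(
            name="train",
            image="registry.example.com/team/train:latest",   # placeholder image
            command=["python", "train.py"],
        )

    @dsl.pipeline(name="minimal-pipeline", description="Single training step")
    def minimal_pipeline():
        train_op()

    if __name__ == "__main__":
        kfp.compiler.Compiler().compile(minimal_pipeline, "minimal_pipeline.tar.gz")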

Create a Pipeline Run

A new Run can be created from a Pipeline by selecting the Pipeline name and choosing either a Run or an Experiment. The Experiment choice will put the Run into that Experiment.

_images/Data_Scientist_Pipelines_Select_R30.png
_images/Data_Scientist_Pipelines_Create_Run_R30.png

The top input fields are common to any pipeline. The Run Parameters are specific to the Pipeline.

_images/Data_Scientist_Pipelines_Run_New_R30.png

Field

Description

Pipeline

Name of the Pipeline - defaults to Pipeline selected in previous step

Pipeline Version

Version of the Pipeline - defaults to the Pipeline selected in the previous step

Run name

User-selected name for use in tracking

Description

Optional, user-selected field for providing more details

Experiment

Experiment to use for this Run

Run Type

Choose One-off or Recurring

Run Parameters

Input fields specific to the Pipeline

Important

If the Run Type is selected as Recurring, the Run name has a maximum of 9 characters

Manage Experiments and Runs

Experiments that have been created from pipelines are managed from the Experiments screen.

_images/Data_Scientist_Pipelines_Experiments_R30.png

Within each experiment, the Runs that are part of that Pipeline are visible by selecting the Experiment name.

Selecting the name of the Run brings up the current state of the Pipeline. Selecting a Pipeline box within the “current state” screen brings up a window that provides more details on the configuration and status of the Run.

_images/Data_Scientist_Pipelines_Run_Details_R30.png

Sharing Experiments and Runs

_images/Data_Scientist_Pipelines_Contributors_Select.png

Pipeline resources can be shared with others by adding them from the Contributors tab. Selecting the + Add Contributor button will bring up the screen that allows you to choose the users to share with, as well as the type of access they will have.

_images/Data_Scientist_Pipelines_Contributors_Add.png

Archiving and Deleting Experiments & Runs

Experiments and Runs can be archived by selecting the resource and using the Archive button. This will not delete any of the data or metadata. Archived resources can be viewed through the Archive tab.

Runs can be deleted from the Archive tab.

_images/Data_Scientist_Pipelines_Archive.png

Production Engineer

_images/Workflow_Block_Diagram_R30.png

The Production Engineer Deploys Models for Inference Serving, and Creates & Analyzes Model Monitors to ensure that the Served Models continue to achieve organizational goals.

Models that have been Deployed for Serving are available from the Deployment menu.

Models can be deployed on the same cluster as development, or on a different cluster.

The Deployment workflow is explained in this section.

Once a Model has been Deployed:

  • It can be Monitored on the same cluster as it is being Served, or

  • It can be imported from the Serving Cluster and Monitored on a separate cluster

The Monitoring workflow is described at Monitoring

Model Deployment Workflow

A Model is Deployed by the Production Engineer from the Models repo. This will create an endpoint that can be exposed for inference.

Deploy on the Same Cluster

A Model can be deployed on the same cluster as development in the following ways:

  • Select the Deploy icon on the Model versions screen

  • Select the Deploy button on the Model version details screen

_images/Data_Scientist_Models_Deploy_Icon_R30.png

Deploying from Versions Summary Screen


_images/Data_Scientist_Models_Deploy_Button_R37.png

Deploying from Version Details Screen

A popup will allow you to fill in the necessary deployment details.

_images/Data_Scientist_Models_Deploy_Popup_R37.png

Field

Value

Name

User-chosen name for the Deployment

Description

Optional user-chosen name to provide more details for the Deployment

Serving Image

Defaults to the training image, but a different image can be used if required

Private

Select if credentials are required for the serving image

Serving Port

Port where predictor service is listening - will be filled in by default for supported deployment types, described in more detail at Serving Port & Prefix

Serving url Prefix

Predictor service url prefix - will be filled in by default for supported deployment types

Deployment

Type of deployment - if Production is selected, the instances are launched on the nodes with a Production Affinity defined during installation. If no production Node Affinity is defined, the instance will be run on any node in the cluster. This is described at Node Affinity

Deploy Using

Type of inference - if GPU is selected, the instances are launched on the nodes with a GPU Affinity

Transformer

Select if the deployment requires preprocessing or postprocessing

Transformer Image

Image used for the transformer code

Transformer Code

Defaults to the training code repo, but a different repo can be used if required

Commit ID

Commit ID for the Transformer Code - if left blank it will choose the latest

Transformer Script

Program used for the Transformer - referenced from the top level of the GitHub Code repository

Minimum Replicas

Minimum number of inference pods that will run in the idle state with no inference requests, as described at Configuring Scale Bounds

Maximum Concurrent Requests

Soft target for the number of concurrent requests that a single inference pod can serve for the Model, as described at Configuring Concurrency

Minimum CPU

Optional minimum CPU requirements

Maximum CPU

Optional maximum CPU requirements

Minimum Memory

Optional minimum memory requirements

Maximum Memory

Optional maximum memory requirements

Event Source

Select if the deployment will be triggered by a specific event

Node Affinity

During installation of DKube, cluster nodes can be assigned to a specific type of Node Affinity for purposes of execution. When a node has been assigned an Affinity, certain types of Jobs will only be run on those nodes.

Serving Port & Prefix

The fields Serving Port and Serving url Prefix will be automatically filled in for supported serving frameworks. The deployed endpoint, as described at Deployment Status, will use these fields to properly handle the live inference data. They should be left in their default state in those cases.

If an unsupported or custom serving framework is used for deployment, these fields should be modified to reflect the different expectations when traffic is sent to the deployed endpoint.

Deployment Status

_images/Data_Scientist_Deployments_R37.png

The deployment status will be one of 2 types:

Status

Description

Running

Locally deployed model as described in this section

Imported

Remotely imported model as described at Deploy on a Different Cluster

A locally deployed model will also include the endpoint where the deployment is being served.
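As a quick way to verify a local deployment, a request can be sent to that endpoint. The sketch below assumes a KFServing-style v1 predict path and a JSON "instances" payload; the URL, token, and payload shape are placeholders and depend on the serving image and model.

    # Send a test inference request to a deployed endpoint. URL, token, and
    # payload are placeholders; adjust them to the endpoint shown on the
    # Deployments screen and to the model's expected input shape.
    import requests

    ENDPOINT = "https://<serving-endpoint>/v1/models/<model-name>:predict"
    HEADERS = {"Authorization": "Bearer <auth-token-if-required>"}

    payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}
    resp = requests.post(ENDPOINT, json=payload, headers=HEADERS, timeout=30)
    print(resp.status_code, resp.json())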

A summary of the deployments is provided at the top of the screen.

Note

If any deployments have an active monitor set up, the other tabs (Monitors, Dashboard…) are used as described at Monitoring

Change Model Deployment

After a Model has been deployed, it is associated with an endpoint URL. The Model associated with that endpoint can be changed from the Deployments screen. Select the Edit Action button to the right of the Model name. This will cause a Popup to appear that allows the Model version and other associated information to be changed for that endpoint.

_images/Data_Scientist_Deployments_Change.png

Deployment Details

_images/Data_Scientist_Deployments_Select_R37.png

Selecting the Deployment name will provide the details of the deployment. This will bring up a screen that provides the inputs and options, a log of the deployment activity, and the resource utilization. If there is a monitor associated with the deployment, this can also be viewed from the details screen.

_images/Data_Scientist_Deployments_Details_R33.png

Deploy on a Different Cluster

_images/Multicluster_Deploy_Block_Diagram.png

DKube provides a way to deploy a model on a different cluster than the development cluster. The steps are:

  • Build a deployment image on the Development Cluster and store it in an external repo

  • Import the deployment from the external repo on the Serving Cluster

Build and Save the Image on Development Cluster

_images/Data_Scientist_Models_Deploy_Build_R37.png

The Model image is first built and saved to an external repo.

The Build function is available from the Images tab on the detailed screen after selecting the version of the Model that you want to deploy. Select the Build Image button to start the process. Fill in the details of the repo and select Submit

_images/Prod_Eng_Deploy_Build_Image_R37.png

After the build has been completed and pushed to the repo, the Image Name will appear under the Image Name column.

_images/Data_Scientist_Models_Build_Image_Name_R37.png

Selecting the Image Name will bring up a screen providing the details of the Image. The Image field will contain the information required when importing the image on the Serving cluster.

_images/Data_Scientist_Images_Name_Details_R37.png

Import the Image on Serving Cluster

_images/Prod_Eng_Deployments_Import_Deployment_R37.png

The image that was built on the Development cluster is imported into the Serving cluster from the Deployments menu by selecting the + Deployment button.

_images/Prod_Eng_Deployments_Import_Deploy_Popup_R37.png

Complete the fields in the popup to import the model.

Field

Value

Name

User-chosen name for the Deployment

Description

Optional user-chosen name to provide more details for the Deployment

Serving Image

Image field from the Image Name screen on the Development cluster

Private

Select if credentials are required for the serving image

Serving Port

Port where predictor service is listening - will be filled in by default for supported deployment types, described at Serving Port & Prefix

Serving url Prefix

Predictor service url prefix - will be filled in by default for supported deployment types

Deployment

Type of deployment - if Production is selected, the instances are launched on the nodes with a Production Affinity defined during installation. If no production Node Affinity is defined, the instance will be run on any node in the cluster. This is described at Node Affinity

Deploy Using

Type of inference - if GPU is selected, the instances are launched on the nodes with a GPU Affinity

Transformer

Select if the deployment requires preprocessing or postprocessing

Transformer Image

Image used for the transformer code

Transformer Code

Choose the Code repo for the Deployment

Commit ID

Commit ID for the Transformer Code - if left blank it will choose the latest

Transformer Script

Program used for the Transformer - referenced from the top level of the GitHub Code repository

Minimum Replicas

Minimum number of inference pods that will run in the idle state with no inference requests, as described at Configuring Scale Bounds

Maximum Concurrent Requests

Soft target for the number of concurrent requests that a single inference pod can serve for the Model, as described at Configuring Concurrency

Minimum CPU

Optional minimum CPU requirements

Maximum CPU

Optional maximum CPU requirements

Minimum Memory

Optional minimum memory requirements

Maximum Memory

Optional maximum memory requirements

Event Source

Select if the deployment will be triggered by a specific event