Getting Started¶
Overview¶
DKube™ is a portable, end-to-end, Kubeflow-based MLOps platform that enables data scientists to develop, tune, and deploy complex models. It is based on Kubernetes, and runs on-premises and on the most popular cloud platforms. It has the same look, feel, and workflow on all of them, and migrating between providers is fast and simple.
This guide describes how to install and manage DKube on a cluster. It assumes that a supported version of Kubernetes is installed on the cluster before DKube is installed, as described in Prerequisites.
DKube Configuration¶
The cluster can include one or more master nodes and optional worker nodes.
The master node coordinates the cluster, and can optionally contain GPUs.
Each worker node provides additional resources, and is a way to expand the capability of the cluster.
At least one master node must be running for the cluster to be active. Worker nodes can be added and removed while the cluster continues to operate.
Installation Configuration¶
The installation can be run:
From the master node in the cluster, or
From a remote node that is not part of the cluster
The overall flow of installation is as follows:
Copy the required files to the installation node through Docker
Ensure that the installation node has passwordless access to all of the DKube cluster nodes
Execute the platform-specific setup steps as described in this document
Install DKube using Helm
Access DKube through a browser
Important
Even when the installation is run from the master node, passwordless access is still required to every node in the cluster, including the master node itself.
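The passwordless-access requirement can be checked before the installation starts. The following is a minimal sketch, not part of the DKube tooling: it assumes the OpenSSH `ssh` client is on the installation node, and the function names are hypothetical. `BatchMode=yes` makes `ssh` fail instead of prompting for a password, so a zero exit code indicates key-based login works.

```python
from __future__ import annotations

import subprocess


def ssh_check_command(user: str, host: str) -> list[str]:
    """Build an ssh command that fails rather than prompting for input."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never fall back to interactive prompts
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable nodes
        f"{user}@{host}",
        "true",                     # trivial remote command
    ]


def has_passwordless_access(user: str, host: str) -> bool:
    """Return True if a non-interactive ssh login to the node succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(user, host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing, or the connection hung
        return False
    return result.returncode == 0
```

Calling `has_passwordless_access` once per cluster node, including the master node, mirrors the requirement above. A node whose host key is not yet in `known_hosts` is also reported as failing, because batch mode cannot ask for confirmation.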
DKube and Kubernetes¶
DKube requires Kubernetes to operate. This guide assumes that a supported version of Kubernetes has been installed on the cluster, as listed in the prerequisites section.
Prerequisites¶
Supported Platforms¶
The following software and versions are supported:
Software Platform | Version(s)
---|---
Ubuntu | 18.04
CentOS | 7.9
Rook/Ceph | 1.4
The following Kubernetes platforms and versions are supported:
Platform | Version
---|---
Kubernetes | 1.18
Amazon EKS | 1.18
Rancher | 2.4
VMWare vSphere with Tanzu | 1.2.1
The following software and versions are supported:
Software Platform | Version(s)
---|---
Ubuntu | 18.04, 20.04
CentOS | 7.9
Rook/Ceph | 1.7
The following Kubernetes platforms and versions are supported:
Platform | Version
---|---
Kubernetes | 1.20
Amazon EKS | 1.20
Rancher | 2.5
VMWare vSphere with Tanzu | 1.4
Node Requirements¶
Installation Node Requirements¶
The installation node has the following requirements:
Docker CE
Kubectl
Help with installing some of the required packages is provided in Software Package Help.
DKube Cluster Node Requirements¶
The DKube Cluster nodes have the following requirements:
Docker CE
Nodes should all have static IP addresses, even if the VM exists on a cloud
All nodes must be on the same subnet
All nodes must have the same user name and ssh key
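The same-subnet requirement can be verified with Python's standard `ipaddress` module. This is an illustrative sketch only: the function name is hypothetical, and the prefix length is an assumption that must be replaced with the cluster's actual subnet mask.

```python
from __future__ import annotations

import ipaddress


def all_on_same_subnet(addresses: list[str], prefix_len: int) -> bool:
    """Return True if every address falls in the same network."""
    networks = {
        # ip_interface keeps the host bits; .network masks them off
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical node addresses, assuming a /24 subnet
print(all_on_same_subnet(["192.168.100.10", "192.168.100.11"], 24))
print(all_on_same_subnet(["192.168.100.10", "192.168.101.11"], 24))
```

The first call prints `True` and the second prints `False`, because `192.168.101.11` lies outside `192.168.100.0/24`.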
Each node on the cluster should have the following minimum resources:
16 CPU cores
64GB RAM
At least 400GB of storage. The actual size depends on the programs and datasets, and must be large enough to hold the required data.
8 CPU cores
16GB RAM
At least 100GB of storage. The actual size depends on the programs and datasets, and must be large enough to hold the required data.
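A node can be compared against the minimums listed above with a short script. The sketch below is an assumption-laden illustration, not a DKube tool: the function names are hypothetical, RAM is read from the Linux `/proc/meminfo` file, and sizes are treated as GiB. The thresholds are passed in as parameters so that whichever set of minimums applies can be supplied.

```python
from __future__ import annotations

import os
import shutil


def total_ram_gib(meminfo_text: str) -> float:
    """Parse MemTotal (reported in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 * 1024)
    raise ValueError("MemTotal not found in meminfo text")


def shortfalls(cpu_cores: int, ram_gb: float, disk_gb: float,
               min_cpu: int, min_ram_gb: float, min_disk_gb: float) -> list[str]:
    """Return a description of every resource below its minimum."""
    problems = []
    if cpu_cores < min_cpu:
        problems.append(f"CPU cores: {cpu_cores} < {min_cpu}")
    if ram_gb < min_ram_gb:
        problems.append(f"RAM: {ram_gb:.1f} GB < {min_ram_gb} GB")
    if disk_gb < min_disk_gb:
        problems.append(f"Storage: {disk_gb:.1f} GB < {min_disk_gb} GB")
    return problems


def local_node_resources(path: str = "/") -> tuple[int, float, float]:
    """Collect CPU count, RAM, and disk size for the local Linux node."""
    with open("/proc/meminfo") as f:
        ram = total_ram_gib(f.read())
    disk = shutil.disk_usage(path).total / 1024 ** 3
    return os.cpu_count() or 0, ram, disk
```

On a cluster node, `shortfalls(*local_node_resources(), 16, 64, 400)` returns an empty list when all three minimums are met. The kernel reserves some memory, so `MemTotal` is slightly lower than the installed RAM; a small tolerance may be needed when comparing against a nominal value such as 64GB.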
Note
NVIDIA A100 GPUs in MIG mode are supported within DKube. Please refer to the platform-specific instructions for nodes that include this GPU
Important
All GPUs installed on a node must be of exactly the same type. For example, an NVIDIA V100 and a P100 cannot be mixed on the same node. Even GPUs of the same class must have the same configuration (e.g. memory).
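One way to confirm that a node's GPUs match is to compare the name and memory size that `nvidia-smi` reports for each device. The sketch below assumes the NVIDIA driver and `nvidia-smi` are installed on the node; the function names are hypothetical, and name plus total memory is used here as a stand-in for "same configuration".

```python
from __future__ import annotations

import subprocess

# Query one CSV line per GPU, e.g. "Tesla V100-SXM2-16GB, 16160 MiB"
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_configs(csv_text: str) -> set[tuple[str, str]]:
    """Return the distinct (name, memory) pairs found in the query output."""
    configs = set()
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        name, memory = (field.strip() for field in line.split(",", 1))
        configs.add((name, memory))
    return configs


def gpus_are_uniform(csv_text: str) -> bool:
    """True if the node has no GPUs or all GPUs share one configuration."""
    return len(parse_gpu_configs(csv_text)) <= 1


def query_local_gpus() -> str:
    """Run nvidia-smi on the local node and return its raw CSV output."""
    return subprocess.run(
        QUERY, capture_output=True, text=True, check=True
    ).stdout
```

On a GPU node, `gpus_are_uniform(query_local_gpus())` returns `False` when two different models, or two memory sizes of the same model, are present.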
Important
The Nouveau driver should not be installed on any of the nodes in the cluster. If the driver is installed, you can follow the instructions in the section Removing Nouveau Driver
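A quick way to spot the Nouveau driver on a Linux node is to look for the `nouveau` kernel module in `/proc/modules`. This sketch is only a proxy for the requirement above: it detects a loaded module, not an installed but unloaded one, and the function name is hypothetical.

```python
def nouveau_loaded(modules_text: str) -> bool:
    """Check /proc/modules contents; the first field is the module name."""
    return any(
        line.split()[0] == "nouveau"
        for line in modules_text.splitlines()
        if line.strip()
    )


def nouveau_loaded_on_this_node() -> bool:
    """Read the local module list (Linux only)."""
    with open("/proc/modules") as f:
        return nouveau_loaded(f.read())
```

If `nouveau_loaded_on_this_node()` returns `True`, the steps in Removing Nouveau Driver apply to that node.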
Cluster Access¶
To run DKube both during and after installation, a minimum level of security access must be provided from any system that needs to use the node. This includes access to the URL used to open DKube from a browser.
Protocol | Port Range | Source
---|---|---
TCP | 30002 | Access IP
TCP | 32222 | Access IP
TCP | 32223 | Access IP
TCP | 32323 | Access IP
TCP | 32224 | Access IP
TCP | 32225 | Access IP
TCP | 6443 | Access IP
TCP | 443 | Access IP
TCP | 22 | Access IP
All | 0-65535 | Private Subnet
ICMP | 0-65535 | Access IP
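Reachability of the TCP ports in the table above can be probed from an access system with a plain socket connection. This is a minimal sketch with hypothetical function names, and it has a limitation: a port with nothing listening behind it looks the same as a blocked port, so the NodePort entries are only meaningful once DKube is running.

```python
from __future__ import annotations

import socket

# TCP ports taken from the access table above
DKUBE_TCP_PORTS = [30002, 32222, 32223, 32323, 32224, 32225, 6443, 443, 22]


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, filtered, timed out, or unresolvable host
        return False


def unreachable_ports(host: str, ports: list[int] = DKUBE_TCP_PORTS,
                      timeout: float = 3.0) -> list[int]:
    """List the ports on the host that could not be reached."""
    return [p for p in ports if not tcp_port_open(host, p, timeout)]
```

Calling `unreachable_ports` with a node's address returns the ports that still need attention; an empty list means every listed TCP port accepted a connection.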
The source IP access range is in CIDR format. It consists of an IP address and mask combination. For example:
`192.168.100.14/24` would allow IP addresses in the range 192.168.100.x
`192.168.100.14/16` would allow IP addresses in the range 192.168.x.x
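The two CIDR examples can be reproduced with Python's standard `ipaddress` module, which is a convenient way to check what a given source range covers. The helper name below is hypothetical; `strict=False` is needed because the examples have host bits set (the trailing `.14`).

```python
import ipaddress


def allowed_range(cidr: str):
    """Return the network covered by an address/mask combination."""
    return ipaddress.ip_network(cidr, strict=False)


net24 = allowed_range("192.168.100.14/24")
net16 = allowed_range("192.168.100.14/16")

print(net24)  # 192.168.100.0/24
print(net16)  # 192.168.0.0/16

# Membership tests against each range
print(ipaddress.ip_address("192.168.100.200") in net24)  # True
print(ipaddress.ip_address("192.168.101.1") in net24)    # False
print(ipaddress.ip_address("192.168.101.1") in net16)    # True
```

The `/24` mask fixes the first three octets, and the `/16` mask fixes only the first two, which matches the ranges described above.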
Getting the DKube Files¶
The files necessary for installation are pulled from Docker, using the following commands:
Note
The Docker credentials and DKube version number (x.y.z) are provided separately.
This will create the folder $HOME/.dkube and copy the necessary files to the folder.
Note
Which tools and files are used depends on the platform-specific instructions described in this document.
Platform-Specific Installation Instructions¶
Based on the platform and Kubernetes type, specific setup is required prior to installing DKube.
Kubernetes | Instructions
---|---
EKS |
Rancher |
Tanzu |