Deploying Embedding Models on DKubeX
Embedding models registered in the DKubeX LLM Catalog can be deployed on the DKubeX platform using local resources. The steps to deploy them are given below.
For a detailed guide to deploying embedding models in DKubeX using SkyPilot, see Deploying Embedding Models with SkyPilot.
Deploying Embedding Models on DKubeX UI
To deploy an Embedding Model through the DKubeX UI, use the following steps:
Open the Deployments page on your DKubeX workspace. This page lists all the model deployments that are currently running or have been executed previously.
To create a new Embedding Model deployment, click on the “Create Deployment” button (shown as a “+” button on the top left corner of the Deployments page).
On the General page, provide the following details:
Select Embedding Model as the type of deployment to be launched.
Provide a unique name for the Embedding Model deployment in the Name field.
Select KServe as the deployment framework if you want to deploy the Embedding Model with local resources, or select Sky if you want to deploy the Embedding Model with SkyPilot. In case of Sky, you can also provide deployment configuration details.
Once done, click on the Next button to proceed to the Configuration page.
On the Configuration page, provide the following details:
In the Model Configuration section, provide the following details:
Field            Description
Source           Select the source from where the Embedding Model has to be deployed.
Provider         Select the provider of the Embedding Model.
Embedding Model  Select the Embedding Model to be deployed from the provided catalog.
Token            Provide the access token for the Embedding Model (if required).
Once done, click on the Next button to proceed to the Advanced page.
On the Advanced page, provide the following details:
In the Advanced Configuration section, provide the following details:
Field     Description
CPU       Provide the number of CPU cores to be allocated for the Embedding Model deployment.
Memory    Provide the amount of memory to be allocated for the Embedding Model deployment.
GPU       Provide the number of GPUs to be allocated for the Embedding Model deployment.
QPS       Provide the Queries Per Second (QPS) limit for the Embedding Model deployment.
Replicas  Provide the minimum and maximum number of replicas for the Embedding Model deployment.
Once done, click on the Submit button to create the Embedding Model deployment.
Once the deployment is in the Running state, it is ready to be used. You can access the deployment details by clicking on the deployment name on the Deployments page.
Deploying Embedding Models from CLI
To list all embedding models registered in the DKubeX LLM Catalog, use the following command.
d3x emb list
Deploying Embedding Models from DKubeX Embedding Model Catalog
To deploy an embedding model from the DKubeX embedding model catalog using KServe, use the command given below.
d3x emb deploy --name <name of the deployment> --model <emb Name> --token <access token> --publish
For example:
d3x emb deploy -n bge-large --model BAAI--bge-large-en-v1-5 --token hf_afsrh*********bdism --publish
Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command.
Provide the name of the embedding model from the DKubeX catalog after the --model flag, replacing <emb Name> in the command.
Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
Use the --kserve flag to deploy the model using KServe.
Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
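The flags described above can be combined in a small helper. The following is a minimal sketch, not an official DKubeX script: the function name build_deploy_cmd, the HF_TOKEN environment variable, and the replica values 1 and 2 are illustrative assumptions, and whether all of these flags can be passed in a single invocation should be confirmed against d3x emb deploy on your setup. The helper only prints the assembled command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: print a catalog deployment command built from its arguments.
#   $1 = deployment name, $2 = catalog model name, $3 = access token,
#   $4 = minimum replicas, $5 = maximum replicas
build_deploy_cmd() {
  printf 'd3x emb deploy --name %s --model %s --token %s --kserve --min_replicas %s --max_replicas %s --publish\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example call using the model from this guide. The token is read from the
# HF_TOKEN environment variable if it is set; otherwise the placeholder is kept.
build_deploy_cmd "bge-large" "BAAI--bge-large-en-v1-5" "${HF_TOKEN:-<access token>}" 1 2
```

Printing rather than executing keeps the sketch safe to run on a machine without the d3x CLI; once the printed command looks right, it can be copied into the terminal.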
Deploying Embedding Models with Custom Configuration File
To deploy an embedding model from a Hugging Face repository with a custom configuration file using KServe, use the command given below.
d3x emb deploy --name <name of the deployment> --config <path to config file> --token <access token> --publish
For example:
d3x emb deploy -n bge-large --config /home/demo/bge_config.yaml --token afsgt*******hduis --publish
Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command.
Provide the absolute path of the embedding model configuration file in your workspace after the --config flag, replacing <path to config file> in the command.
Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
Use the --kserve flag to deploy the model using KServe.
Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
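Because the --config flag expects an absolute path, it can help to check the path before deploying. The following is a minimal sketch under stated assumptions: the function name is_absolute_path and the CONFIG_PATH variable are hypothetical, and the check only tests whether the path starts with "/" (it does not validate the file contents, whose format is not covered here). As above, the command is printed for review rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only if the path starts with "/".
is_absolute_path() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

CONFIG_PATH="/home/demo/bge_config.yaml"   # example path from this guide

if is_absolute_path "$CONFIG_PATH"; then
  # Print the command for review instead of executing it.
  echo "d3x emb deploy --name bge-large --config $CONFIG_PATH --token <access token> --publish"
else
  echo "Config path must be absolute: $CONFIG_PATH" >&2
fi
```

A relative path such as bge_config.yaml fails the check, so the error message is printed instead of the command.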