Deploying Embedding Models on DKubeX
Embedding models registered in the DKubeX LLM Catalog can be deployed on the DKubeX platform using local resources with KServe. The steps to deploy them are given below.
For a detailed guide regarding deploying embedding models in DKubeX using SkyPilot, visit Deploying Embedding Models with SkyPilot.
To list all embedding models registered in the DKubeX LLM Catalog, use the following command.
d3x emb list
Deploying Embedding Models from DKubeX Embedding Model Catalog
To deploy an embedding model from the DKubeX embedding model catalog using KServe, use the command given below.
d3x emb deploy --name <name of the deployment> --model <emb Name> --token <access token> --publish
For example:
d3x emb deploy -n bge-large --model BAAI--bge-large-en-v1-5 --token hf_afsrh*********bdism --publish
Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command.
Replace <emb Name> after the --model flag with the name of the embedding model from the DKubeX catalog.
Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
Use the --kserve flag to deploy the model using KServe.
Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
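As a rough sketch of how the optional flags described above might be combined, the following snippet assembles a deploy command with KServe and replica limits and prints it for review instead of running it. The deployment name, model name, and replica values are illustrative, the shell variable names are hypothetical, and the token is left as a placeholder. That d3x accepts all of these flags together in a single invocation is an assumption based on the flag descriptions above.

```shell
# Illustrative values only; replace them with your own.
DEPLOY_NAME="bge-large"
MODEL_NAME="BAAI--bge-large-en-v1-5"
TOKEN_PLACEHOLDER="<access token>"   # substitute your Hugging Face token
MIN_REPLICAS=1
MAX_REPLICAS=2                       # hypothetical upper bound

# Basic sanity check: the minimum must not exceed the maximum.
if [ "$MIN_REPLICAS" -gt "$MAX_REPLICAS" ]; then
  echo "min_replicas must not exceed max_replicas" >&2
  exit 1
fi

# Assemble the command as text so it can be reviewed before it is run.
CMD="d3x emb deploy --name $DEPLOY_NAME --model $MODEL_NAME --token $TOKEN_PLACEHOLDER --publish --kserve --min_replicas $MIN_REPLICAS --max_replicas $MAX_REPLICAS"
echo "$CMD"
```

Printing the command first makes it easy to confirm the replica bounds and flag spelling before pasting it into a DKubeX terminal with a real token.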
Deploying Embedding Models with Custom Configuration File
To deploy an embedding model from a Hugging Face repository with KServe using a custom configuration file, use the command given below.
d3x emb deploy --name <name of the deployment> --config <path to config file> --token <access token> --publish
For example:
d3x emb deploy -n bge-large --config /home/demo/bge_config.yaml --token afsgt*******hduis --publish
Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command.
Provide the absolute path of the embedding model configuration file in your workspace after the --config flag, replacing <path to config file> in the command.
Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
Use the --kserve flag to deploy the model using KServe.
Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
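Because the --config flag expects an absolute path, a small pre-check can catch a relative path before the deploy command is issued. The snippet below is a minimal sketch under that assumption: it only tests whether the path starts with a slash and then prints the command rather than running it. The variable names are hypothetical, the path and deployment name are illustrative, and the token is left as a placeholder.

```shell
# Illustrative values only; replace them with your own.
DEPLOY_NAME="bge-large"
CONFIG_PATH="/home/demo/bge_config.yaml"

# An absolute path begins with "/"; anything else is treated as relative.
case "$CONFIG_PATH" in
  /*) IS_ABSOLUTE=yes ;;
  *)  IS_ABSOLUTE=no ;;
esac

if [ "$IS_ABSOLUTE" = yes ]; then
  # Assemble the command as text so it can be reviewed before it is run.
  CMD="d3x emb deploy --name $DEPLOY_NAME --config $CONFIG_PATH --token <access token> --publish"
  echo "$CMD"
else
  echo "config path must be absolute: $CONFIG_PATH" >&2
fi
```

The check does not confirm that the file exists or that its contents are valid; it only guards against passing a relative path.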