Deploying Embedding Models on DKubeX

Embedding models registered in the DKubeX LLM Catalog can be deployed on the DKubeX platform using local resources with KServe. The steps to deploy them are given below.

For a detailed guide to deploying embedding models on DKubeX using SkyPilot, see Deploying Embedding Models with SkyPilot.

To list all embedding models registered in the DKubeX LLM Catalog, use the following command.
```
d3x emb list
```

Deploying Embedding Models from DKubeX Embedding Model Catalog

To deploy an embedding model from the DKubeX embedding model catalog using KServe, use the command below. The first line shows the general syntax and the second line shows an example.
```
d3x emb deploy --name <name of the deployment> --model <emb Name> --token <access token> --publish
d3x emb deploy -n bge-large --model BAAI--bge-large-en-v1-5 --token hf_afsrh*********bdism --publish
```
- Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command. The example uses the short form -n.
- Replace <emb Name> after the --model flag with the name of the embedding model from the DKubeX catalog.
- Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
- Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
- Use the --kserve flag to deploy the model using KServe.
- Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
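
The flags described above can be combined in a single invocation. The following is a minimal sketch, not an official script: it assumes the Hugging Face token is stored in an environment variable named HF_TOKEN, that --max_replicas 2 is a suitable upper bound for your cluster, and that all listed flags may be passed together. The RUN switch is a hypothetical convenience so the command is only executed when explicitly requested.

```shell
#!/bin/sh
# Sketch: assemble a catalog-based embedding deployment with replica bounds.
set -eu

DEPLOY_NAME="bge-large"                  # unique deployment name (from the example above)
MODEL_NAME="BAAI--bge-large-en-v1-5"     # model name as listed by `d3x emb list`
HF_TOKEN="${HF_TOKEN:-<access token>}"   # assumption: token supplied via environment
MIN_REPLICAS=1
MAX_REPLICAS=2                           # illustrative value, adjust to your setup

# Store the full command in the positional parameters so it can be
# inspected before it is executed.
set -- d3x emb deploy \
  --name "$DEPLOY_NAME" \
  --model "$MODEL_NAME" \
  --token "$HF_TOKEN" \
  --kserve \
  --min_replicas "$MIN_REPLICAS" \
  --max_replicas "$MAX_REPLICAS" \
  --publish

# Print a summary that leaves the token out of the terminal output.
echo "Deployment: $DEPLOY_NAME, model: $MODEL_NAME, replicas: $MIN_REPLICAS to $MAX_REPLICAS"

# Hypothetical switch: execute only when RUN=1 is set by the caller.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Run the script once without RUN to review the summary, then again with RUN=1 and HF_TOKEN set to perform the deployment.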

Deploying Embedding Models with Custom Configuration File

To deploy an embedding model from a Hugging Face repository with KServe using a custom configuration file, use the command below. The first line shows the general syntax and the second line shows an example.
```
d3x emb deploy --name <name of the deployment> --config <path to config file> --token <access token> --publish
d3x emb deploy -n bge-large --config /home/demo/bge_config.yaml --token afsgt*******hduis --publish
```
- Provide a unique name for the embedding model deployment after the --name flag, replacing <name of the deployment> in the command. The example uses the short form -n.
- Provide the absolute path of the embedding model configuration file in your workspace after the --config flag, replacing <path to config file> in the command.
- Provide the Hugging Face access token for the embedding model after the --token flag, replacing <access token> in the command.
- Use the --publish flag to make the deployment details available to all users on the DKubeX setup.
- Use the --kserve flag to deploy the model using KServe.
- Use the --min_replicas and --max_replicas flags to specify the minimum and maximum number of replicas for the deployment. For example, --min_replicas 1.
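
Because the --config flag expects an absolute path, it can help to check the path before deploying. The sketch below illustrates one way to do that. The helper validate_config_path is hypothetical (it is not part of d3x), and reading the token from an HF_TOKEN environment variable is an assumption. The deployment name and config path are taken from the example above.

```shell
#!/bin/sh
# Sketch: verify the config path, then deploy from the custom configuration.
set -eu

# Hypothetical helper: succeeds only for an absolute path to an existing file.
validate_config_path() {
  case "$1" in
    /*) ;;
    *) echo "config path must be absolute: $1" >&2; return 1 ;;
  esac
  if [ ! -f "$1" ]; then
    echo "config file not found: $1" >&2
    return 1
  fi
}

CONFIG_PATH="${CONFIG_PATH:-/home/demo/bge_config.yaml}"

if validate_config_path "$CONFIG_PATH"; then
  # Assumption: HF_TOKEN holds the Hugging Face access token.
  d3x emb deploy --name bge-large --config "$CONFIG_PATH" \
    --token "${HF_TOKEN:?set HF_TOKEN before deploying}" --publish
else
  echo "Skipping deployment until the config path is fixed." >&2
fi
```

Checking the path first surfaces a mistyped or relative path on the workstation instead of partway through the deployment.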