Deploying LLMs in DKubeX

Both base and finetuned LLMs can be deployed in DKubeX. The steps to deploy them are given below.

Note

  • To make the LLM deployment accessible to all users on a particular DKubeX setup, add the --public flag to the deployment command.

  • To deploy an LLM on DKubeX with SkyPilot and Sky-Serve, visit Deploying LLMs with SkyPilot.

Deploying Base LLMs

You can deploy base LLMs that are registered with the DKubeX LLM Registry, as well as base LLMs available in the Hugging Face repository.

  • To list all base LLMs registered with DKubeX, use the following command.

    d3x llms list
    

    Information

    To see the full list of LLMs registered with DKubeX LLM Registry, please visit the List of LLMs in DKubeX LLM Catalog page.

To deploy a base LLM registered with the DKubeX LLM registry, use the following command. Replace the parts enclosed within <> with the appropriate details.

Note

If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the following command.

d3x llms deploy --name <name of the deployment> --model <LLM Name> --type <GPU Type> --token <access token for the model (if required)>

  • You can check the status of the deployment from the Deployment page in DKubeX or by running the following command.

    d3x serve list
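
As an illustration of filling in the placeholders of the base LLM deployment command, the following is a minimal sketch rather than an official DKubeX script. The helper name build_deploy_cmd, the deployment name my-base-llm, and the model name example-llm are hypothetical. Only the d3x flags shown on this page (--name, --model, --type, --token, --public) are used. The function prints the assembled command instead of running it, so you can review it first.

```shell
# Sketch: assemble a base LLM deployment command from its parts.
# build_deploy_cmd is a hypothetical helper, not part of the d3x CLI.
# Arguments: 1 = deployment name, 2 = model name, 3 = GPU type,
#            4 = access token (empty if not required), 5 = "yes" to add --public
build_deploy_cmd() {
  name=$1 model=$2 gpu=$3 token=${4:-} public=${5:-no}
  # Rebuild the function's positional parameters as the command words.
  set -- d3x llms deploy --name "$name" --model "$model" --type "$gpu"
  if [ -n "$token" ]; then set -- "$@" --token "$token"; fi
  if [ "$public" = "yes" ]; then set -- "$@" --public; fi
  # Print the command for review; nothing is executed here.
  echo "$*"
}

# Hypothetical values: take the model name from `d3x llms list`,
# and use g5.4xlarge instead of a10 on an EKS setup.
build_deploy_cmd "my-base-llm" "example-llm" "a10" "" "no"
```

Once the printed command looks right, run it in your DKubeX terminal and check the deployment with d3x serve list.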
    

Deploying Finetuned LLMs

You can deploy LLMs that were finetuned and saved in your workspace, as well as finetuned LLMs registered in MLflow in your workspace.

To deploy a finetuned LLM saved in your workspace, use the following command. Replace the parts enclosed within <> with the appropriate details.

Note

If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the following command.

d3x llms deploy -n <name of the deployment> --base_model <base LLM name> -m <absolute path to the finetuned model> --type <GPU type> --token <access token for the model (if required)>

  • You can check the status of the deployment from the Deployment page in DKubeX or by running the following command.

    d3x serve list
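
The finetuned deployment command expects an absolute path to the model, which is easy to get wrong. The sketch below shows one way to assemble the command and reject a relative path before anything is submitted. It is an illustration under stated assumptions: the helper name build_finetuned_deploy_cmd, the deployment name, the base model name, and the model path are all hypothetical, and only the flags shown on this page (-n, --base_model, -m, --type, --token) are used.

```shell
# Sketch: assemble a finetuned LLM deployment command.
# build_finetuned_deploy_cmd is a hypothetical helper, not part of the d3x CLI.
# Arguments: 1 = deployment name, 2 = base LLM name, 3 = absolute model path,
#            4 = GPU type, 5 = access token (empty if not required)
build_finetuned_deploy_cmd() {
  name=$1 base=$2 path=$3 gpu=$4 token=${5:-}
  # The command requires an absolute path, so refuse anything else.
  case "$path" in
    /*) ;;
    *) echo "error: model path must be absolute: $path" >&2; return 1 ;;
  esac
  set -- d3x llms deploy -n "$name" --base_model "$base" -m "$path" --type "$gpu"
  if [ -n "$token" ]; then set -- "$@" --token "$token"; fi
  # Print the command for review; nothing is executed here.
  echo "$*"
}

# Hypothetical values; replace them with your own workspace details.
build_finetuned_deploy_cmd "my-finetuned-llm" "example-base-llm" "/home/user/finetuned-model" "a10" ""
```

The function only prints the command. Run the printed line yourself after checking it, replacing a10 with g5.4xlarge on an EKS setup.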