List of LLMs in DKubeX LLM Catalog

The following table lists the base LLMs registered with the DKubeX LLM Catalog in the current version of DKubeX (v0.8.3).

LLMs registered on the DKubeX LLM Catalog
LLM Name	Required GPU Type	GPU Memory Utilization	Max number of batched tokens	Max number of sequences	Trust remote code	Max number of total tokens	Number of GPUs needed
`casperhansen/mixtral-instruct-awq`	a10	0.8	4096	64	true	4096	4
`google/gemma-7b`	a10	0.85	4096	64	true	2048	1
`HuggingFaceH4/zephyr-7b-beta`	a10	0.7	4096	64	true	4096	1
`meta-llama/Llama-2-13b-chat-hf`	a10	0.8	4096	64	true	4096	4
`meta-llama/Llama-2-7b-chat-hf`	a10	0.85	4096	64	true	4096	1
`microsoft/phi-2`	a10	0.85	4096	64	true	2048	1
`mistralai/Mistral-7B-Instruct-v0.2`	a10	0.7	4096	64	true	4096	1
`Open-Orca/Mistral-7B-OpenOrca`	a10	0.7	4096	64	true	4096	1
`TheBloke/Llama-2-70B-Chat-AWQ`	a10	0.8	4096	64	true	4096	4

These LLMs can be deployed on the DKubeX platform without any config files, provided the resource and permission requirements are met. You can deploy them either on local resources or on cloud resources through SkyPilot.

Note

For more information regarding deploying LLMs on DKubeX, refer to Deploying LLMs in DKubeX.

To deploy an LLM using local resources, you can use the command given below. Replace the following fields in the command with the appropriate values:

<name of the deployment>: Unique name for the LLM deployment
<LLM Name>: Name of the LLM provided in the list above
<GPU Type>: GPU type required for the LLM
<access token>: Access token required for the LLM (if any)

Note

If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the following command.
Use the --publish flag if you want the deployment to be accessible to every user on the same setup.

d3x llms deploy --name <name of the deployment> --model <LLM Name> --type <GPU Type> --token <access token> --publish

d3x llms deploy --name llama27b --model meta-llama/Llama-2-7b-chat-hf --type a10 --token hf_Ahq**********************WmO --publish
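
The --type substitution described in the note above can be scripted. The following is a minimal sketch, not part of the d3x CLI: `gpu_type_for_setup` and `build_deploy_cmd` are hypothetical helper names, the setup label `eks` is an assumed convention, and the script only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helpers (not part of d3x): assemble the local-resource
# deploy command shown above and print it for review.

# Map the setup kind to the --type value: "eks" uses g5.4xlarge,
# anything else keeps the a10 value from the catalog table.
gpu_type_for_setup() {
  if [ "$1" = "eks" ]; then
    echo "g5.4xlarge"
  else
    echo "a10"
  fi
}

# Arguments: <deployment name> <LLM name> <setup kind> <access token>
build_deploy_cmd() {
  echo "d3x llms deploy --name $1 --model $2 --type $(gpu_type_for_setup "$3") --token $4 --publish"
}

# Placeholder token; substitute your own before running the printed command.
build_deploy_cmd llama27b meta-llama/Llama-2-7b-chat-hf eks "<access token>"
```

Printing the command instead of executing it keeps the sketch safe to try on a machine where d3x is not installed.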

To deploy an LLM on cloud resources through SkyPilot, you can use the command given below. Replace the following fields in the command with the appropriate values:

<name of the deployment>: Unique name for the LLM deployment
<LLM Name>: Name of the LLM provided in the list above
<access token>: Access token required for the LLM (if any)

Note

Use the --publish flag if you want the deployment to be accessible to every user on the same setup.

d3x llms deploy --name <name of the deployment> --model <LLM Name> -sky --token <access token> --publish

d3x llms deploy --name llama27b --model meta-llama/Llama-2-7b-chat-hf -sky --token hf_Ahq**********************WmO --publish
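
Because the access token applies only to models that require one and --publish is optional, the SkyPilot command can be assembled conditionally. This is a minimal sketch under stated assumptions: `build_sky_deploy_cmd` is a hypothetical helper, not part of d3x, and it assumes that --token may simply be left out when no token is needed, which the text above implies but does not state. The script only prints the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of d3x): print the SkyPilot deploy command,
# adding --token and --publish only when they are wanted.
# Assumption: --token may be left out for models that need no access token.

# Arguments: <deployment name> <LLM name> <access token or ""> <publish: yes|no>
build_sky_deploy_cmd() {
  cmd="d3x llms deploy --name $1 --model $2 -sky"
  if [ -n "$3" ]; then
    cmd="$cmd --token $3"
  fi
  if [ "$4" = "yes" ]; then
    cmd="$cmd --publish"
  fi
  echo "$cmd"
}

# Placeholder token, unpublished deployment; substitute your own values.
build_sky_deploy_cmd llama27b meta-llama/Llama-2-7b-chat-hf "<access token>" no
```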