List of LLMs in DKubeX LLM Registry

The following table lists the base LLMs currently registered with the DKubeX LLM Registry in DKubeX v0.7.9.1.

LLMs registered on the DKubeX LLM Registry

| LLM Name | Required GPU Type | GPU Memory Utilization | Max number of batched tokens | Max number of sequences | Trust remote code | Max number of total tokens |
| --- | --- | --- | --- | --- | --- | --- |
| amazon–LightGPT | a10 | 0.85 | 4096 | 64 | true | 2048 |
| huggingFaceH4–zephyr-7b-beta | a10 | 0.7 | 4096 | 64 | true | 4096 |
| meta-llama–Llama-2-13b-chat-hf | a10 | 0.8 | 4096 | 64 | true | 4096 |
| meta-llama–Llama-2-7b-chat-hf | a10 | 0.85 | 4096 | 64 | true | 4096 |
| mistralai–mistral-7b-v0.2 | a10 | 0.7 | 4096 | 64 | true | 4096 |
| open-orca–mistral-7b-openorca | a10 | 0.7 | 4096 | 64 | true | 4096 |

These LLMs can be deployed on the DKubeX platform without a config file, provided the resource and permission requirements are met. To deploy one of these LLMs, use the following command:

d3x llms deploy --name <name of the deployment> --model <LLM Name> --type <GPU Type> --token <access token for the model (if required)>
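
For example, a minimal sketch of deploying the meta-llama–Llama-2-7b-chat-hf model from the table above might look as follows. The deployment name llama2-chat is an illustrative placeholder, and the model name must be passed exactly as it appears in the registry; a Hugging Face access token is supplied here on the assumption that this model requires one:

d3x llms deploy --name llama2-chat --model meta-llama–Llama-2-7b-chat-hf --type a10 --token <your Hugging Face access token>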

Note

If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the above command.
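
For example, the same illustrative deployment sketched above would, on an EKS setup, use the g5.4xlarge instance type (llama2-chat remains a placeholder name):

d3x llms deploy --name llama2-chat --model meta-llama–Llama-2-7b-chat-hf --type g5.4xlarge --token <your Hugging Face access token>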