List of LLMs in DKubeX LLM Catalog

The base LLMs listed below are registered with the DKubeX LLM Catalog in the current version of DKubeX (v0.7.9.1).

LLMs registered on the DKubeX LLM Catalog

| LLM Name | Required GPU Type | GPU Memory Utilization | Max number of batched tokens | Max number of sequences | Trust remote code | Max number of total tokens |
|---|---|---|---|---|---|---|
| amazon–LightGPT | a10 | 0.85 | 4096 | 64 | true | 2048 |
| huggingFaceH4–zephyr-7b-beta | a10 | 0.7 | 4096 | 64 | true | 4096 |
| meta-llama–Llama-2-13b-chat-hf | a10 | 0.8 | 4096 | 64 | true | 4096 |
| meta-llama–Llama-2-7b-chat-hf | a10 | 0.85 | 4096 | 64 | true | 4096 |
| mistralai–mistral-7b-v0.2 | a10 | 0.7 | 4096 | 64 | true | 4096 |
| open-orca–mistral-7b-openorca | a10 | 0.7 | 4096 | 64 | true | 4096 |

These LLMs can be deployed on the DKubeX platform without any config files, provided the resource and permission requirements are met. To deploy one of these LLMs, use the following command:

d3x llms deploy --name <name of the deployment> --model <LLM Name> --type <GPU Type> --token <access token for the model (if required)>

Note

If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the command above.
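
As a rough illustration of how the command and the EKS note fit together, the sketch below assembles the deploy command line and prints it for review rather than running it. The helper build_deploy_cmd, its argument order, and the deployment name my-deployment are hypothetical and not part of the d3x CLI; only the flags and GPU type values shown above are taken from this page.

```shell
#!/usr/bin/env bash
# Sketch only: build_deploy_cmd is a hypothetical helper, not a d3x feature.
# It prints a `d3x llms deploy` command line so it can be reviewed before use.

# Usage: build_deploy_cmd <deployment name> <LLM name> <platform> [access token]
# <platform> is "eks" for an EKS setup; any other value keeps the default type.
build_deploy_cmd() {
  local name="$1" model="$2" platform="$3" token="${4:-}"
  local gpu_type="a10"              # GPU type listed in the catalog table
  if [ "$platform" = "eks" ]; then
    gpu_type="g5.4xlarge"           # substitution described in the note for EKS
  fi
  local cmd="d3x llms deploy --name $name --model $model --type $gpu_type"
  if [ -n "$token" ]; then
    cmd="$cmd --token $token"       # appended only when a token is supplied
  fi
  printf '%s\n' "$cmd"
}

# Replace <LLM Name> with a name copied from the catalog table.
build_deploy_cmd "my-deployment" "<LLM Name>" "eks"
```

Printing the command first makes it easy to confirm the --type value before anything is deployed; once it looks right, the printed line can be run as-is.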