List of LLMs in DKubeX LLM Registry
The following table lists the base LLMs currently registered with the DKubeX LLM Registry as of DKubeX v0.8.1.
| LLM Name | Required GPU Type | GPU Memory Utilization | Max Batched Tokens | Max Sequences | Trust Remote Code | Max Total Tokens | GPUs Needed |
|---|---|---|---|---|---|---|---|
| | a10 | 0.85 | 4096 | 64 | true | 2048 | 1 |
| | a10 | 0.8 | 4096 | 64 | true | 4096 | 4 |
| | a10 | 0.85 | 4096 | 64 | true | 2048 | 1 |
| | a10 | 0.7 | 4096 | 64 | true | 4096 | 1 |
| | a10 | 0.8 | 4096 | 64 | true | 4096 | 4 |
| | a10 | 0.85 | 4096 | 64 | true | 4096 | 1 |
| | a10 | 0.85 | 4096 | 64 | true | 2048 | 1 |
| | a10 | 0.7 | 4096 | 64 | true | 4096 | 1 |
| | a10 | 0.7 | 4096 | 64 | true | 4096 | 1 |
| | a10 | 0.8 | 4096 | 64 | true | 4096 | 4 |

(The LLM Name column was not preserved in this copy of the page.)
These LLMs can be deployed on the DKubeX platform without a config file, provided the resource and permission requirements are met. To deploy one of these LLMs, use the following command:

d3x llms deploy --name <name of the deployment> --model <LLM Name> --type <GPU Type> --token <access token for the model (if required)>

For example:

d3x llms deploy --name llama27b --model meta-llama--Llama-2-7b-chat-hf --type a10 --token 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
Note
If you are using an EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the above command.
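As a sketch, the EKS variant of the example above would look like the following. Only the --type value changes from the earlier example; the deployment name and token placeholder are illustrative:

```shell
# Deploying the same model on an EKS cluster: the GPU type is specified
# as the EC2 instance type (g5.4xlarge, which carries an NVIDIA A10G GPU)
# instead of the bare GPU name "a10".
d3x llms deploy --name llama27b \
  --model meta-llama--Llama-2-7b-chat-hf \
  --type g5.4xlarge \
  --token <access token for the model (if required)>
```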