List of LLMs in DKubeX LLM Catalog


The following list contains the base LLMs currently registered with the DKubeX LLM Catalog as of DKubeX v0.8.6.3. These LLMs can be deployed on the DKubeX platform without any config files, provided the resource and permission requirements are met. You can deploy them on DKubeX using local resources, or on cloud resources via SkyPilot.
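For cloud deployments, the Accelerator Type and Number of GPUs columns in the table below translate directly into a SkyPilot resource specification. The sketch below is a rough illustration only: it is a generic SkyPilot example, not the DKubeX integration, and the serve command and cluster name are hypothetical. The actual DKubeX workflow is covered in the SkyPilot deployment page linked at the end of this section.

```python
import sky

# Hypothetical illustration: the catalog row for
# meta-llama/Meta-Llama-3-8B-Instruct calls for 1x A10.
task = sky.Task(
    setup="pip install vllm",  # assumption: a vLLM-based serving environment
    run="vllm serve meta-llama/Meta-Llama-3-8B-Instruct --max-model-len 4096",
)
task.set_resources(sky.Resources(accelerators="A10:1"))

# Provision a cloud VM with the requested GPU and start serving.
sky.launch(task, cluster_name="llama3-8b-serve")  # cluster name is illustrative
```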

LLMs registered in the DKubeX LLM Catalog

| LLM Name | Accelerator Type | Provider | Deployment Config | Number of GPUs needed |
| --- | --- | --- | --- | --- |
| casperhansen/llama-3-70b-instruct-awq | A10 | dkubex | cpu-offload-gb: 25, gpu-memory-utilization: 0.9, max-model-len: 4096, max-num-batched-tokens: 8192, max-num-seqs: 64, max_total_tokens: 4096 | 1 |
| meta/llama-3.1-405b-instruct | A100 | nim | — | 16 |
| meta/llama-3.1-70b-instruct | A100 | nim | — | 4 |
| meta/llama-3.1-8b-base | A10G | nim | — | 4 |
| meta/llama-3.1-8b-instruct | A10G | nim | — | 4 |
| meta/llama3-70b-instruct | A100 | nim | — | 4 |
| meta/llama3-8b-instruct | A10G | nim | — | 1 |
| meta-llama/Meta-Llama-3-8B-Instruct | A10 | dkubex | gpu-memory-utilization: 0.9, max-model-len: 4096, max-num-batched-tokens: 8192, max-num-seqs: 64, max_total_tokens: 4096 | 1 |
| meta-llama/Meta-Llama-3.1-8B-Instruct | A10 | dkubex | gpu-memory-utilization: 0.9, max-model-len: 4096, max-num-batched-tokens: 8192, max-num-seqs: 64, swap-space: 10, max_total_tokens: 8192 | 1 |
| microsoft/Phi-3-small-8k-instruct | A10 | dkubex | gpu-memory-utilization: 0.85, max-model-len: 4096, max-num-batched-tokens: 8192, max-num-seqs: 64, max_total_tokens: 2048 | 1 |
| mistralai/mistral-7b-instruct-v03 | A100 | nim | — | 1 |
| mistralai/Mistral-7B-Instruct-v0.2 | A10 | dkubex | gpu-memory-utilization: 0.9, max-model-len: 4096, max-num-batched-tokens: 8192, max-num-seqs: 64, swap-space: 10, max_total_tokens: 8192 | 1 |
| mistralai/mixtral-8x22b-instruct-v01 | A100 | nim | — | 8 |
| mistralai/mixtral-8x7b-instruct-v01 | A100 | nim | — | 4 |
| mistralai/Pixtral-12B-2409 | A10 | dkubex | config-format: mistral, gpu-memory-utilization: 0.9, load-format: mistral, max-model-len: 4096, tensor-parallel-size: 4, tokenizer-mode: mistral, max_total_tokens: 4096 | 1 |
| nvidia/nemotron-4-340b-instruct | A100 | nim | — | 16 |
| tokyotech-llm/llama-3-swallow-70b-instruct-v0.1 | A10G | nim | — | 8 |
| yentinglin/llama-3-taiwan-70b-instruct | A10G | nim | — | 8 |
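The Deployment Config keys for dkubex-provider models mirror vLLM engine arguments (gpu-memory-utilization, max-model-len, max-num-seqs, and so on), which suggests the dkubex provider wraps a vLLM engine. Assuming that mapping holds (an assumption, not a documented guarantee), the Meta-Llama-3-8B-Instruct row above corresponds roughly to the following standalone vLLM configuration; max_total_tokens has no vLLM counterpart and is presumably consumed by DKubeX itself.

```python
from vllm import LLM, SamplingParams

# Rough standalone equivalent of the catalog's dkubex config for
# meta-llama/Meta-Llama-3-8B-Instruct (assumes the dkubex provider wraps vLLM).
llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    gpu_memory_utilization=0.9,   # fraction of GPU memory vLLM may claim
    max_model_len=4096,           # maximum context length
    max_num_batched_tokens=8192,  # token budget per scheduler step
    max_num_seqs=64,              # maximum concurrent sequences per batch
)

# Quick smoke test of the engine.
outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```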

For more information on deploying LLMs, go to the appropriate page below.

- Deploying LLMs using Local Resources: ./serving/llm_deploy.html
- Deploying LLMs using SkyPilot: ./skypilot/llm-deployment-with-skypilot.html
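However a model is deployed, both runtimes behind this catalog expose OpenAI-compatible HTTP APIs (vLLM for the dkubex provider, NVIDIA NIM for the nim provider), so a deployed endpoint can typically be queried with any OpenAI client. A minimal sketch follows; the base URL and API key are placeholders for whatever your DKubeX deployment reports, and the exact URL shape is an assumption, not documented DKubeX behavior.

```python
from openai import OpenAI

# Placeholder values: substitute the endpoint URL and key shown for your
# deployment in DKubeX; the model name matches the deployed catalog entry.
client = OpenAI(
    base_url="https://<your-dkubex-host>/<your-deployment>/v1",  # hypothetical
    api_key="<your-api-key>",
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize what DKubeX does."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because both runtimes follow the OpenAI wire format under this assumption, the same client code works regardless of which provider backs the deployment.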