List of LLMs in DKubeX LLM Catalog

See also: Serving Models and Open-source LLMs, Deploying LLMs in DKubeX, Deploying LLMs with SkyPilot

The following table lists the base LLMs currently registered with the DKubeX LLM Catalog in the current version of DKubeX (DKubeX v0.8.8.1). These LLMs can be deployed on the DKubeX platform without any config files, provided the resource and permission requirements are met. You can deploy them using local resources or, through SkyPilot, using cloud resources.

LLMs registered on the DKubeX LLM Registry
LLM Name	Accelerator Type	Provider	Deployment Config	Number of GPUs needed
`casperhansen/llama-3-70b-instruct-awq`	`A10`	`dkubex`	`cpu_offload_gb`: 25, `gpu-memory-utilization`: 0.9, `max-model-len`: 4096, `max-num-batched-tokens`: 8192, `max-num-seqs`: 64, `max_total_tokens`: 4096	1
`meta/llama-3.1-405b-instruct`	`A100`	`nim`		16
`meta/llama-3.1-70b-instruct`	`A100`	`nim`		4
`meta/llama-3.1-8b-base`	`A10G`	`nim`		2
`meta/llama-3.1-8b-instruct`	`A10G`	`nim`		2
`meta/llama3-70b-instruct`	`A100`	`nim`		4
`meta/llama3-8b-instruct`	`A10G`	`nim`		1
`meta-llama/Meta-Llama-3-8B-Instruct`	`A10`	`dkubex`	`gpu-memory-utilization`: 0.9, `max-model-len`: 4096, `max-num-batched-tokens`: 8192, `max-num-seqs`: 64, `max_total_tokens`: 4096	1
`meta-llama/Meta-Llama-3.1-8B-Instruct`	`A10`	`dkubex`	`gpu_memory_utilization`: 0.9, `max-model-len`: 8192, `max-num-batched-tokens`: 12288, `max-num-seqs`: 64, `swap-space`: 10, `max_total_tokens`: 8192	1
`microsoft/Phi-3-small-8k-instruct`	`A10`	`dkubex`	`gpu-memory-utilization`: 0.85, `max-model-len`: 4096, `max-num-batched-tokens`: 8192, `max-num-seqs`: 64, `max_total_tokens`: 2048	1
`mistralai/mistral-7b-instruct-v03`	`A100`	`nim`		1
`mistralai/Mistral-7B-Instruct-v0.2`	`A10`	`dkubex`	`gpu-memory-utilization`: 0.9, `max-model-len`: 8192, `max-num-batched-tokens`: 12288, `max-num-seqs`: 64, `swap-space`: 10, `max_total_tokens`: 8192	1
`mistralai/mixtral-8x22b-instruct-v01`	`A100`	`nim`		8
`mistralai/mixtral-8x7b-instruct-v01`	`A100`	`nim`		4
`mistralai/Pixtral-12B-2409`	`A10`	`dkubex`	`config-format`: mistral, `gpu-memory-utilization`: 0.9, `load_format`: mistral, `max-model-len`: 8192, `tensor-parallel-size`: 2, `tokenizer-mode`: mistral, `max_total_tokens`: 4096	2
`nvidia/nemotron-4-340b-instruct`	`A100`	`nim`		16
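
When planning a deployment, it can help to check which catalog entries fit the accelerators you have. The following is a minimal sketch, not a DKubeX API: it copies a few rows from the table above into a local Python list and filters them by accelerator type, GPU count, and provider. The names `CatalogEntry`, `CATALOG`, and `deployable` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CatalogEntry:
    """One row of the table above (hypothetical local representation)."""
    name: str
    accelerator: str
    provider: str
    gpus: int


# A few rows copied by hand from the table above; this is not the full catalog
# and is not read from DKubeX itself.
CATALOG = [
    CatalogEntry("casperhansen/llama-3-70b-instruct-awq", "A10", "dkubex", 1),
    CatalogEntry("meta/llama-3.1-70b-instruct", "A100", "nim", 4),
    CatalogEntry("meta/llama-3.1-8b-instruct", "A10G", "nim", 2),
    CatalogEntry("meta-llama/Meta-Llama-3-8B-Instruct", "A10", "dkubex", 1),
    CatalogEntry("mistralai/Pixtral-12B-2409", "A10", "dkubex", 2),
    CatalogEntry("nvidia/nemotron-4-340b-instruct", "A100", "nim", 16),
]


def deployable(
    catalog: List[CatalogEntry],
    accelerator: str,
    available_gpus: int,
    provider: Optional[str] = None,
) -> List[str]:
    """Return the names of entries that match the accelerator type, need no
    more than `available_gpus` GPUs, and (optionally) match the provider."""
    return [
        entry.name
        for entry in catalog
        if entry.accelerator == accelerator
        and entry.gpus <= available_gpus
        and (provider is None or entry.provider == provider)
    ]


if __name__ == "__main__":
    # Which of the sampled models fit on a single A10?
    for name in deployable(CATALOG, "A10", 1):
        print(name)
```

The filter only compares the Accelerator Type and Number of GPUs columns. It does not check memory, permissions, or any other requirement that DKubeX may enforce at deployment time.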

Note

Provider `dkubex` indicates that the LLM configuration is provided by DKubeX.
Provider `nim` indicates that the LLM configuration is provided by NVIDIA NIM (NVIDIA Inference Microservices).

For more information on deploying these LLMs, see the appropriate page below.

Deploying LLMs using Local Resources

./serving/llm_deploy.html

Deploying LLMs using SkyPilot

./skypilot/llm-deployment-with-skypilot.html