Quickstart
In this tutorial we will explore the basic features of DKubeX by deploying an embedding model and an LLM, ingesting a dataset from our workspace, and building a RAG-based chat application. The steps are laid out below.
Prerequisites
1. Setup and Access Key
Make sure that the current version of DKubeX is installed and running on your cluster. For detailed instructions on installing and logging in to DKubeX, please refer to Installing DKubeX.
Make sure that at least one worker node in the cluster running DKubeX contains an NVIDIA A10 GPU (at minimum an AWS g5.4xlarge-class instance, with at least 16 vCPU cores and 64 GiB of memory).
In case of an RKE2 cluster, make sure the node is labeled as a10 during installation.
In case of an AWS EKS cluster, make sure that the cluster contains a g5.4xlarge type nodegroup with a maximum size of 1 or more.
Make sure that you have an active Huggingface access token which has access to the BAAI/bge-large-en-v1.5 and meta-llama/Llama-3.1-8B-Instruct models on Huggingface. You can generate these tokens on the Access Tokens page on Huggingface. For more information, refer to the Huggingface documentation.
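If you want to verify the token before proceeding, the standard Huggingface Hub whoami endpoint gives a quick sanity check (this uses the public Huggingface API and is not DKubeX-specific):

curl -s https://huggingface.co/api/whoami-v2 -H "Authorization: Bearer <access-token>"
# A JSON body containing your Huggingface username means the token is valid;
# an error response means it is missing, mistyped, or expired.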
2. Configuration Files, Dataset and Environment Variables
We will need a few configuration files and a dataset for this tutorial. To get these files into your workspace, log in to DKubeX, open a terminal from your workspace page, and go through the steps provided below.
Export the following environment variables on the terminal. Replace <username> with your DKubeX username, and <access-token> with your Huggingface access token.

export HOMEDIR=/home/<username>
export HF_TOKEN=<access-token>

For example:

export HOMEDIR=/home/demo
export HF_TOKEN=hf_aJ0eX**************WJlIn0
Run the following command to download the required configuration files and dataset to your workspace.
wget -P ${HOMEDIR}/ \
  https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/ingestion/ingest-quickstart.yaml \
  https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/query/query-quickstart.yaml \
  https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/securechat/securechat-quickstart.yaml \
  https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/sample-datasets/contract-nli.zip \
  && unzip ${HOMEDIR}/contract-nli.zip -d ${HOMEDIR}/ \
  && rm -f ${HOMEDIR}/contract-nli.zip
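Once the command completes, a quick listing should show the three configuration files and the extracted dataset directory (names taken from the download command above):

ls ${HOMEDIR}
# Expect to see: contract-nli/  ingest-quickstart.yaml  query-quickstart.yaml  securechat-quickstart.yaml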
Step 1: Deploying Models
In this step, we will deploy an embedding model and an LLM locally on DKubeX.
To deploy the BAAI/bge-large-en-v1.5 embedding model locally on DKubeX, run the following command:
d3x emb deploy --name=bgelarge --model=BAAI--bge-large-en-v1-5 --token ${HF_TOKEN} --kserve
To deploy the meta-llama/Llama-3.1-8B-Instruct LLM model locally on DKubeX, run the following command:
d3x llms deploy --name=llama31 --model=meta-llama/Meta-Llama-3.1-8B-Instruct --token ${HF_TOKEN} --type=a10 --kserve
Note
In case you are using an AWS EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the command.

You can check the status of the deployments from the Deployments page in DKubeX or by running the following command.

d3x serve list

Wait until the deployments are in the running state.
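If you would rather wait on the terminal than refresh the Deployments page, a small polling loop over d3x serve list can help. This is a minimal sketch: it assumes the status column of the list output literally contains the word running, so adjust the pattern to match what your CLI actually prints.

until d3x serve list | grep -q "running"; do
  echo "Deployments not ready yet; checking again in 30s..."
  sleep 30
done
# Note: the loop exits once any deployment reports running; re-run
# d3x serve list to confirm that both bgelarge and llama31 are up.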
Step 2: Ingesting Data
In this step, we will configure the data ingestion pipeline and ingest the dataset.
Run the following command to start editing the ingestion configuration file.
vim ${HOMEDIR}/ingest-quickstart.yaml
Update the following fields in the configuration file and save it:
dkubex > embedding_url: Provide the endpoint URL of the embedding model deployment (bgelarge). You can get the endpoint URL from the Deployments page in DKubeX by clicking on the deployment name to open the deployment details page.
dkubex > embedding_key: Provide the serving token of the embedding model deployment (bgelarge). You can get the serving token from the same deployment details page.
file > inputs > loader_args > input_dir: Replace the <username> part with your DKubeX username.
Example:
ingest-quickstart.yaml
embedding: dkubex
dkubex:
  embedding_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/bgelarge/v1/ # Provide serving url of local embedding model deployment
  embedding_key: eyJhjhsdbfkjwbfekwjhbfeiuwefhiCJ9.eyJ1kjfnskfsjbdfiubweifuibefujwibbfekjdncksjnxveW1lbnQvYmdlbGFyZ2VjcHVsb2MvIn0.pVIkjdfsnklB5Ac--EOjfsdkFsk-Fct-1Hjndf2Q # Provide service token of local embedding model deployment
  batch_size: 10
reader:
  - file
file:
  inputs:
    loader_args:
      input_dir: /home/demo/contract-nli # Replace <username> with your DKubeX username
      recursive: true
      exclude_hidden: true
      raise_on_error: true
splitter: sentence_text_splitter
sentence_text_splitter:
  chunk_size: 256
  chunk_overlap: 0
metadata:
  - default
adjacent_chunks: true
mlflow:
  experiment: demo-ingestion-quickstart
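Before triggering the ingestion, you may want to confirm that the embedding endpoint in the config is reachable. The sketch below is only a hypothetical sanity check: it assumes the KServe deployment exposes an OpenAI-compatible /v1/embeddings route (suggested by the /v1/ suffix in the serving URL, but not confirmed by this guide), and EMB_URL and EMB_KEY are placeholders for your embedding_url and embedding_key values.

# Hypothetical connectivity check; assumes EMB_URL ends with /v1/
curl -s "${EMB_URL}embeddings" \
  -H "Authorization: Bearer ${EMB_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "bgelarge", "input": "connectivity check"}'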
Trigger the ingestion by running the following command. This will ingest the provided dataset files and create a new dataset called contracts.

d3x dataset ingest --dataset contracts --config ${HOMEDIR}/ingest-quickstart.yaml
Check the dataset status by running the following command. Once data ingestion finishes successfully, the status of the new dataset will show completed.

d3x dataset list
To check the list of documents that have been ingested into the contracts dataset, use the following command.

d3x dataset show --dataset contracts
Step 3: Building a RAG-based Chat Application
In this step, we will configure the RAG pipeline and the chat application, and then build a RAG-based chat application using the ingested dataset and the deployed models.
Run the following command to start editing the RAG pipeline configuration file.
vim ${HOMEDIR}/query-quickstart.yaml
Update the following fields in the configuration file and save it:
dkubex > embedding_url: Provide the endpoint URL of the embedding model deployment (bgelarge). You can get the endpoint URL from the Deployments page in DKubeX by clicking on the deployment name to open the deployment details page.
dkubex > embedding_key: Provide the serving token of the embedding model deployment (bgelarge). You can get the serving token from the same deployment details page.
dkubex > llm_url: Provide the endpoint URL of the LLM deployment (llama31). You can get the endpoint URL from the Deployments page by clicking on the deployment name.
dkubex > llm_key: Provide the serving token of the LLM deployment (llama31). You can get the serving token from the deployment details page.
Example:
query-quickstart.yaml
dataset: contracts
embedding: dkubex
dkubex:
  embedding_key: eyJhjhsdbfkjwbfekwjhbfeiuwefhiCJ9.eyJ1kjfnskfsjbdfiubweifuibefujwibbfekjdncksjnxveW1lbnQvYmdlbGFyZ2VjcHVsb2MvIn0.pVIkjdfsnklB5Ac--EOjfsdkFsk-Fct-1Hjndf2Q # Provide serving token of local embedding model deployment in DKubeX
  embedding_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/bgelarge/v1/ # Provide service endpoint of local embedding model deployment in DKubeX
synthesizer:
  llm: dkubex
  llm_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/llama31/v1/ # Provide service endpoint of LLM deployment in DKubeX.
  llmkey: eyJhjhbsfkjdbfjkjcbskjsdbefhiCJ9.eyJ1kjfnskfjkbsfkjbdsuibefujwibbfekjdncksjnxvekbfsdkjdbslkjfnslkndflkbGFyZ2cbdgcgsnsks.pVIkjdgdtcjcjdo--EOhdcydvbed-jhd-hdyvnflr # Provide serving token of LLM deployment in DKubeX.
  prompt: "default"
  window_size: 2
  max_tokens: 1024
faq:
  enabled: false
  threshold: 0.90
  cache_suggestion_threshold: 0.85
vectorstore: weaviate_vectorstore
weaviate_vectorstore:
  vectorstore_provider: dkubex
  url: ""
  auth_key: ""
  textkey: 'paperdocs'
search: vector_search
vector_search:
  top_k: 3
  rerank: true
  rerank_topk: 5
  max_sources: 3
parallel_query:
  batch_size: 16
context_combiner:
  use_adj_chunks: true
mlflow:
  experiment: demo-ragquery-quickstart
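Similarly, you can sanity-check the LLM endpoint before building the app. As above, this is a hypothetical sketch that assumes an OpenAI-compatible chat completions route under the /v1/ serving URL, which this guide does not confirm; LLM_URL and LLM_KEY stand in for the llm_url and llmkey values in your config, and the model name is an assumption.

# Hypothetical connectivity check; assumes LLM_URL ends with /v1/
curl -s "${LLM_URL}chat/completions" \
  -H "Authorization: Bearer ${LLM_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama31", "messages": [{"role": "user", "content": "Say hello."}], "max_tokens": 16}'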
Run the following command to start editing the chat application configuration file.
vim ${HOMEDIR}/securechat-quickstart.yaml
Update the following fields in the configuration file and save it:
env > FMQUERY_ARGS: Replace <username> with your DKubeX username.
Example:
securechat-quickstart.yaml
name: "ndabase"
ingressprefix: "/ndabase"
env:
  SECUREAPP_ACCESS_KEY: "allow"
  FMQUERY_ARGS: "llm --dataset contracts --config /home/demo/query-quickstart.yaml" # Replace <username> with your DKubeX username
image: "dkubex123/llmapp:securechat-0.8.7.1"
cpu: 1
gpu: 0
memory: 4
dockerserver: "DOCKER_SERVER"
dockeruser: "dkubex123"
dockerpsw: "dckr_pat_dE90DkE9bzttBinnniexlHdPPgI"
publish: "false"
port: "3000"
description: "Chat Application"
rewritetarget: "false"
configsnippet: ""
output: "yaml"
mount_home: "all"
metrics:
  port: 8877
Deploy the chat application by running the following command. This will create a chat application named ndabase on your workspace.

d3x apps create --config ${HOMEDIR}/securechat-quickstart.yaml
Check the status of the application deployment by running the following command. You can also check the status from the Applications page in DKubeX UI.
d3x apps list
Once the application status becomes running, you can access the chat application from the Applications page on the DKubeX UI. When prompted, provide the access key allow, which will enable you to start using the chat application.

Sample questions for this chat application:
What is personal and confidential information?
Briefly explain what a termination clause is.
What is a non-circumvention and non-disclosure agreement?
How do I frame a confidential information clause?
What is the difference between a unilateral and mutual NDA?
What are some common exceptions to confidential information clauses?
Tutorials and More Information
For more examples, including how to train and register models and deploy user applications, please visit the following pages and go through the tutorials provided:
Training Fashion MNIST model in DKubeX
Training Fashion MNIST model with GPU in DKubeX
Deploying models from MLFlow
Deploying models from Huggingface
Deploying LLMs in DKubeX
Deploying embedding models in DKubeX
Creating a Securechat App using BGE-large Embeddings and Llama3-8b Summarisation Models
Evaluating LLMs in DKubeX
Deploying Embedding models with SkyPilot
Deploying LLMs with SkyPilot
Data ingestion with SkyPilot
Wine and Llama2 Finetuning using SkyPilot