Quickstart

In this tutorial we will explore the basic features of DKubeX. We will start by deploying an embedding model and an LLM, then ingest a dataset from our workspace, and finally build a RAG-based chat application. The steps are as follows:

  1. Step 1: Deploying Models

  2. Step 2: Ingesting Data

  3. Step 3: Building a RAG-based Chat Application

Prerequisites

1. Setup and Access Key

  • Make sure that the current version of DKubeX is installed and running on your cluster. For detailed instructions on installing and logging in to DKubeX, please refer to Installing DKubeX.

  • Make sure that at least one worker node in the cluster running DKubeX contains an NVIDIA A10 GPU (at minimum an AWS g5.4xlarge type instance, with at least 16 vCPU cores and 64 GiB of memory).

    • In case of an RKE2 cluster, make sure the node is labeled as a10 during installation.

      Command to label a node as a10 on an RKE2 cluster
      Command
      kubectl label node <node-name> node.kubernetes.io/nodetype=a10
      
      Example
      kubectl label node demo-worker-node node.kubernetes.io/nodetype=a10
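
      To confirm the label was applied, you can list the nodes with the label value shown as an extra column (-L is a standard kubectl flag):
      kubectl get nodes -L node.kubernetes.io/nodetype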
      
    • In case of an AWS EKS cluster, make sure that the cluster contains a g5.4xlarge type nodegroup with a maximum size of at least 1.

  • Make sure that you have an active Huggingface access token with access to the BAAI/bge-large-en-v1.5 and meta-llama/Llama-3.1-8B-Instruct models on Huggingface. You can generate these tokens on the Access Tokens page on Huggingface. For more information, refer to the Huggingface documentation.
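
    You can optionally confirm the token's access before proceeding. A common check is to request a file from one of the gated repositories using the token; a 200 (or redirect) response indicates access, while 401/403 indicates a token or permission problem:

    curl -sI -H "Authorization: Bearer <access-token>" https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/resolve/main/config.json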

2. Configuration Files, Dataset and Environment Variables

We will need a few configuration files and a dataset for this tutorial. To get these files into your workspace, log in to DKubeX, open a terminal from your workspace page, and follow the steps below.

  • Export the following environment variables in the terminal. Replace <username> with your DKubeX username and <access-token> with your Huggingface access token.

    export HOMEDIR=/home/<username>
    export HF_TOKEN=<access-token>
    
  • Run the following command to download the required configuration files and dataset to your workspace.

    wget -P ${HOMEDIR}/ \
      https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/ingestion/ingest-quickstart.yaml \
      https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/query/query-quickstart.yaml \
      https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/securechat/securechat-quickstart.yaml \
      https://raw.githubusercontent.com/dkubeio/dkubex-examples/refs/tags/v0.8.7.1/rag/sample-datasets/contract-nli.zip && \
    unzip ${HOMEDIR}/contract-nli.zip -d ${HOMEDIR} && \
    rm -f ${HOMEDIR}/contract-nli.zip
    
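  • After the command completes, your workspace should contain the three configuration files and the extracted contract-nli directory. You can confirm with:

    ls ${HOMEDIR}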

Step 1: Deploying Models

In this step, we will deploy an embedding model and an LLM locally on DKubeX.

  • To deploy the BAAI/bge-large-en-v1.5 embedding model locally on DKubeX, run the following command:

    d3x emb deploy --name=bgelarge --model=BAAI--bge-large-en-v1-5 --token ${HF_TOKEN} --kserve
    
  • To deploy the meta-llama/Llama-3.1-8B-Instruct LLM locally on DKubeX, run the following command:

    d3x llms deploy --name=llama31 --model=meta-llama/Meta-Llama-3.1-8B-Instruct --token ${HF_TOKEN} --type=a10 --kserve
    

    Note

    If you are using an AWS EKS setup, change the value of the --type flag from a10 to g5.4xlarge in the command above.

  • You can check the status of the deployments from the Deployments page in DKubeX or by running the following command.

    d3x serve list
    
  • Wait until both deployments are in the running state.
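
  • To poll the status from the terminal instead, you can wrap the list command in the standard watch utility:

    watch -n 30 d3x serve list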

Step 2: Ingesting Data

In this step, we will configure the data ingestion pipeline and ingest the dataset.

  • Run the following command to start editing the ingestion configuration file.

    vim ${HOMEDIR}/ingest-quickstart.yaml
    
  • Update the following fields in the configuration file and save it:

    dkubex > embedding_url
      The endpoint URL of the embedding model deployment (bgelarge). To find it, open the Deployments page in DKubeX and click the deployment name to open its details page.

    dkubex > embedding_key
      The serving token of the embedding model deployment (bgelarge), also shown on the deployment details page.

    file > inputs > loader_args > input_dir
      Replace the <username> part with your DKubeX username.

    Example- ingest-quickstart.yaml
    embedding: dkubex
    dkubex:
      embedding_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/bgelarge/v1/                         # Provide serving url of local embedding model deployment
      embedding_key: eyJhjhsdbfkjwbfekwjhbfeiuwefhiCJ9.eyJ1kjfnskfsjbdfiubweifuibefujwibbfekjdncksjnxveW1lbnQvYmdlbGFyZ2VjcHVsb2MvIn0.pVIkjdfsnklB5Ac--EOjfsdkFsk-Fct-1Hjndf2Q                              # Provide service token of local embedding model deployment
      batch_size: 10
    reader:
      - file
    file:
      inputs:
        loader_args:
          input_dir: /home/demo/contract-nli                              # Replace <username> with your DKubeX username
          recursive: true
          exclude_hidden: true
          raise_on_error: true
    splitter: sentence_text_splitter
    sentence_text_splitter:
      chunk_size: 256
      chunk_overlap: 0
    metadata:
      - default
    adjacent_chunks: true
    mlflow:
      experiment: demo-ingestion-quickstart
    
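  • If you prefer not to edit the file by hand, the two values can also be filled in with sed. This is a minimal sketch; the URL and token are placeholders for your own deployment's values, and each substitution overwrites the whole line, including its trailing comment:

    EMB_URL="https://<your-dkubex-host>/deployment/<username>/bgelarge/v1/"
    EMB_KEY="<serving-token>"
    sed -i "s|embedding_url:.*|embedding_url: ${EMB_URL}|" ${HOMEDIR}/ingest-quickstart.yaml
    sed -i "s|embedding_key:.*|embedding_key: ${EMB_KEY}|" ${HOMEDIR}/ingest-quickstart.yaml
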
  • Trigger the ingestion by running the following command. This will ingest the provided dataset files and create a new dataset called contracts.

    d3x dataset ingest --dataset contracts --config ${HOMEDIR}/ingest-quickstart.yaml
    
  • Check the dataset status by running the following command. Once data ingestion finishes successfully, the status of the new dataset will show completed.

    d3x dataset list
    
  • To check the list of documents that have been ingested into the contracts dataset, use the following command.

    d3x dataset show --dataset contracts
    
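  • The show command prints the list of ingested documents; you can filter it with standard shell tools if needed (the exact output format may vary between DKubeX releases):

    d3x dataset show --dataset contracts | grep -i nda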

Step 3: Building a RAG-based Chat Application

In this step, we will configure the RAG pipeline, provide chat application configuration, and build a RAG-based chat application using the ingested dataset and the deployed models.

  • Run the following command to start editing the RAG pipeline configuration file.

    vim ${HOMEDIR}/query-quickstart.yaml
    
  • Update the following fields in the configuration file and save it:

    dkubex > embedding_url
      The endpoint URL of the embedding model deployment (bgelarge). To find it, open the Deployments page in DKubeX and click the deployment name to open its details page.

    dkubex > embedding_key
      The serving token of the embedding model deployment (bgelarge), also shown on the deployment details page.

    synthesizer > llm_url
      The endpoint URL of the LLM deployment (llama31), available on its deployment details page.

    synthesizer > llmkey
      The serving token of the LLM deployment (llama31), also shown on its deployment details page.

    Example- query-quickstart.yaml
    dataset: contracts
    
    embedding: dkubex
    dkubex:
      embedding_key: eyJhjhsdbfkjwbfekwjhbfeiuwefhiCJ9.eyJ1kjfnskfsjbdfiubweifuibefujwibbfekjdncksjnxveW1lbnQvYmdlbGFyZ2VjcHVsb2MvIn0.pVIkjdfsnklB5Ac--EOjfsdkFsk-Fct-1Hjndf2Q                 # Provide serving token of local embedding model deployment in DKubeX
      embedding_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/bgelarge/v1/               # Provide service endpoint of local embedding model deployment in DKubeX
    
    synthesizer:
      llm: dkubex
      llm_url: https://ajndksljdiuwneifune8-1sjdnfkjdjnsfu.elb.us-east-2.amazonaws.com/deployment/demo/llama31/v1/                     # Provide service endpoint of LLM deployment in DKubeX.
      llmkey: eyJhjhbsfkjdbfjkjcbskjsdbefhiCJ9.eyJ1kjfnskfjkbsfkjbdsuibefujwibbfekjdncksjnxvekbfsdkjdbslkjfnslkndflkbGFyZ2cbdgcgsnsks.pVIkjdgdtcjcjdo--EOhdcydvbed-jhd-hdyvnflr                        # Provide serving token of LLM deployment in DKubeX.
      prompt: "default"
      window_size: 2
      max_tokens: 1024
    
    faq:
      enabled: false
      threshold: 0.90
      cache_suggestion_threshold: 0.85
    
    vectorstore: weaviate_vectorstore
    weaviate_vectorstore:
      vectorstore_provider: dkubex
      url: ""
      auth_key: ""
      textkey: 'paperdocs'
    
    search: vector_search
    vector_search:
      top_k: 3
      rerank: true
      rerank_topk: 5
      max_sources: 3
    
    parallel_query:
      batch_size: 16
    
    context_combiner:
      use_adj_chunks: true
    
    mlflow:
      experiment: demo-ragquery-quickstart
    
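  • The /v1/ suffix on the example endpoint URLs suggests an OpenAI-compatible serving API, though you should verify this against your DKubeX release. Assuming it is, you can smoke-test the LLM deployment directly; the URL and token below are placeholders:

    curl -s "<llm-endpoint-url>chat/completions" \
      -H "Authorization: Bearer <llm-serving-token>" \
      -H "Content-Type: application/json" \
      -d '{"model": "llama31", "messages": [{"role": "user", "content": "Hello"}]}'
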
  • Run the following command to start editing the chat application configuration file.

    vim ${HOMEDIR}/securechat-quickstart.yaml
    
  • Update the following fields in the configuration file and save it:

    env > FMQUERY_ARGS
      Replace the <username> part with your DKubeX username.

    Example- securechat-quickstart.yaml
    name: "ndabase"
    ingressprefix: "/ndabase"
    env:
      SECUREAPP_ACCESS_KEY: "allow"
      FMQUERY_ARGS: "llm --dataset contracts --config /home/demo/query-quickstart.yaml"    # Replace <username> with your DKubeX username
    image: "dkubex123/llmapp:securechat-0.8.7.1"
    cpu: 1
    gpu: 0
    memory: 4
    dockerserver: "DOCKER_SERVER"
    dockeruser: "dkubex123"
    dockerpsw: "dckr_pat_dE90DkE9bzttBinnniexlHdPPgI"
    publish: "false"
    port: "3000"
    description: "Chat Application"
    rewritetarget: "false"
    configsnippet: ""
    output: "yaml"
    mount_home: "all"
    metrics:
      port: 8877
    
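  • Because the nesting under env and metrics matters, you can optionally validate the edited file before deploying. This assumes Python with PyYAML is available in the workspace:

    python3 -c "import yaml; yaml.safe_load(open('${HOMEDIR}/securechat-quickstart.yaml')); print('OK')"
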
  • Deploy the chat application by running the following command. This will create a chat application named ndabase on your workspace.

    d3x apps create --config ${HOMEDIR}/securechat-quickstart.yaml
    
  • Check the status of the application deployment by running the following command. You can also check the status from the Applications page in DKubeX UI.

    d3x apps list
    
  • Once the application status becomes running, you can access the chat application from the Applications page in the DKubeX UI. When prompted, provide the access key allow to start using the chat application.

    Sample questions for this chat application

    • What are personal and confidential information?

    • Briefly explain what a termination clause is.

    • What is a non-circumvention and non-disclosure agreement?

    • How do I frame a confidential information clause?

    • What is the difference between a unilateral and mutual NDA?

    • What are some common exceptions to confidential information clauses?

Tutorials and More Information

For more examples, including how to train and register models and deploy user applications, please visit the following pages and go through the tutorials provided:

Training Example (./training/index.html)
  • Training Fashion MNIST model in DKubeX

  • Training Fashion MNIST model with GPU in DKubeX

Serving Example (./serving/index.html)
  • Deploying models from MLFlow

  • Deploying models from Huggingface

  • Deploying LLMs in DKubeX

  • Deploying embedding models in DKubeX

RAG (./rag/rag.html)
  • Creating a Securechat App using BGE-large Embeddings and Llama3-8b Summarisation Models

Evaluation (./eval.html)
  • Evaluating LLMs in DKubeX

SkyPilot (./skypilot/skypilot.html)
  • Deploying Embedding models with SkyPilot

  • Deploying LLMs with SkyPilot

  • Data ingestion with SkyPilot

  • Wine and Llama2 Finetuning using SkyPilot