Quickstart¶
In this example we will explore the basic features of DKubeX. We will ingest a dataset from our workspace, deploy a base LLM model, build a RAG chat application, finetune the LLM with a custom dataset, deploy the finetuned model, and finally build a RAG chat application backed by the finetuned model.
Prerequisites¶
You must have the current version of DKubeX installed into your system. For detailed instructions regarding installation and logging in to DKubeX, please refer to Installation.
For this example, you ideally need an a10 GPU (g5.4xlarge) node attached to your cluster.
Attention
In case of an RKE2 setup, please make sure that you have labeled the node as “a10”. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
Make sure you have access to the particular model you are going to deploy (in this example, Llama2-7B), along with its secret token (if required).
Note
If the model you want to deploy is not in the DKubeX LLMs registry, you can also deploy it directly from the HuggingFace model hub using the configuration file of that particular model. Make sure you have access to that model (in case it is a private model).
For instructions on how to deploy the model from HuggingFace repository, please refer to Deploying Base LLMs.
This example uses the ContractNLI dataset throughout where required. You need to download the dataset to your DKubeX workspace.
Attention
Although the ContractNLI dataset is distributed in accordance with the terms and conditions of the Creative Commons Attribution 4.0 International Public License, it is recommended that you go through the terms and conditions of the dataset before using it. You can read them here: https://stanfordnlp.github.io/contract-nli/#download
To download the dataset, open the Terminal application from the DKubeX UI and run the following command:
wget https://stanfordnlp.github.io/contract-nli/resources/contract-nli.zip
Unzip the downloaded file using the following command. A folder called contract-nli will be created containing the entire dataset.
unzip contract-nli.zip
Export the following variables to your workspace by running the following commands on your DKubeX Terminal. Replace the <username> part with your DKubeX workspace name.

export PYTHONWARNINGS="ignore"
export OPENAI_API_KEY="dummy"
export NAMESPACE="<username>"
export HOMEDIR=/home/${NAMESPACE}
Ingesting Data¶
Note
For detailed information regarding this section, please refer to Data ingestion and creating dataset.
Important
This example uses the BAAI/bge-large-en-v1.5 embeddings model for data ingestion.
Configuring .yaml files for ingestion¶
You need to provide a few .yaml files to be used during the ingestion process.
On the Terminal application in DKubeX UI, run the following commands:
git clone https://github.com/dkubeio/dkubex-examples.git
cd dkubex-examples
git checkout llamaidx
cd rag/ingestion
cp ingest.yaml ${HOMEDIR}/ingest.yaml && cp custom_sdr.py ${HOMEDIR}/custom_sdr.py && cd
You need to provide proper details in the ingest.yaml file. Run vim ingest.yaml and make the following changes.

In the reader:inputs:loader_args:input_dir: section, provide the absolute path to your dataset folder, i.e. in this case, "/home/<your username>/contract-nli/" (provide your DKubeX username in place of <your username>).

In the reader:pyloader: section, provide the absolute path to the custom_sdr.py file, i.e. in this case, /home/<your username>/custom_sdr.py (provide your DKubeX username in place of <your username>).
You can also modify and customize several other options in the ingest.yaml file according to your needs, including the splitter class, chunk size, and embedding model to be used; a sketch of what these settings can look like is shown below.
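For orientation, a rough sketch of the relevant parts of ingest.yaml follows. The reader: keys match the descriptions above, while the splitter: and embedding: key names and values are illustrative assumptions; keep whatever the repo's ingest.yaml actually ships with unless you need to change it.

reader:
  inputs:
    loader_args:
      input_dir: "/home/<your username>/contract-nli/"
  pyloader: /home/<your username>/custom_sdr.py
splitter:                # assumed key name; illustrative values
  chunk_size: 512
  chunk_overlap: 20
embedding:               # assumed key name
  model: BAAI/bge-large-en-v1.5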
Triggering ingestion and creating dataset¶
Open the Terminal application in DKubeX UI.
Use the following command to perform data ingestion and create the dataset. A dataset named contracts will be created.
d3x dataset ingest -d contracts --config ${HOMEDIR}/ingest.yaml
Note
A few documents from the ContractNLI dataset may show errors during the ingestion process. This is expected behaviour, as those documents' format is not suitable for ingestion.
The time taken for the ingestion process to complete depends on the size of the dataset. The ContractNLI dataset contains 605 documents, and the ingestion process may take around 30 minutes to complete. Please wait patiently for the process to complete.

If the terminal shows a timed-out error, the ingestion is still in progress; run the command provided on the CLI after the error message to continue getting the ingestion logs.
The records of the ingestion and related artifacts are also stored in the MLFlow application on the DKubeX UI.
To check whether the dataset has been created, stored, and is ready to use, use the following command:
d3x dataset list
To check the list of documents that have been ingested into the dataset, use the following command:
d3x dataset show -d contracts
Deploying LLMs from Model Catalog¶
Note
For detailed information regarding this section, please refer to Deploying LLMs in DKubeX.
Here we will deploy the base Llama2-7B model, which is already pre-registered with DKubeX.
Note
This workflow requires an a10 GPU node; make sure your cluster is equipped with one. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
To list all LLM models registered with DKubeX, use the following command.
d3x llms list
Export the access token for the Llama2-7B model. Replace the <Huggingface token for Llama2-7B> part with the token for the Llama2-7B model.

export HF_TOKEN="<Huggingface token for Llama2-7B>"
Deploy the base Llama2-7B model using the following command.
d3x llms deploy --name=llama27bbase --model=meta-llama--Llama-2-7b-chat-hf --token ${HF_TOKEN} --type=a10 --publish
Note
In case you are using an EKS setup, please change the value of the --type flag from a10 to g5.4xlarge in the command above. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
You can check the status of the deployment from the Deployments page in DKubeX or by running the following command.
d3x serve list
Wait until the deployment is in the running state.
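Optionally, once the deployment is running, you can smoke-test it from the DKubeX Terminal. A minimal sketch follows, assuming the deployment exposes an OpenAI-compatible completions endpoint under the /v1 path on port 8000 (the same endpoint format the chat applications use later in this example); the exact route, model name, and payload accepted by your deployment may differ.

curl http://llama27bbase-serve-svc.<your username>:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama--Llama-2-7b-chat-hf", "prompt": "What is an NDA?", "max_tokens": 64}'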
Building your first RAG chat application¶
Note
For detailed information regarding this section, please refer to Creating and accessing the chatbot application.
From the DKubeX UI, open and log into the SecureLLM application. Once open, click on the Admin Login button and log in using the admin credentials provided during installation.
Hint
In case you do not have the credentials for logging in to SecureLLM, please contact your administrator.
On the left sidebar, click on the Keys menu and go to the Application Keys tab on that page.
To create a new key for your application, use the following steps:
A pop-up window will show up on your screen containing the application key for your new application. Alternatively, you can also access your application key from the list of keys in the Application Key tab.
Copy this application key for further use, as it will be required to create the chatbot application. Also make sure that you copy the entire key, including the sk- part.
From the DKubeX UI, go to the Terminal application.
You will need to configure and use the query.yaml file from the dkubex-examples repo in the query process of the Securechat application. Run the following command to put the query.yaml file in your workspace.

cd && cp dkubex-examples/rag/query/query.yaml ./query.yaml
Provide the following details in the query.yaml file, then save the file; a sketch of the edited fields follows the list below.

In the chat_engine:url: section, provide the endpoint URL of the deployed model to be used. The syntax for the URL is provided below; replace the <your username> part with your username.

"http://llama27bbase-serve-svc.<your username>:8000"
Note

You are providing your own username here because the llama27bbase deployment was done from your workspace earlier. If you are going to use a model deployed by any other user, you will need to provide the proper deployment name in place of llama27bbase and the username of that user.

In the tracking:experiment: section, provide a name for the experiment under which the query records and artifacts will be stored in MLFlow.
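For reference, a minimal sketch of the edited portion of query.yaml, assuming the key layout implied by the descriptions above; the experiment name is an arbitrary example, and every other key in the file can be left as shipped in the repo.

chat_engine:
  url: "http://llama27bbase-serve-svc.<your username>:8000"
tracking:
  experiment: contracts-rag-base    # any experiment name of your choice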
Create a new file in your workspace called securechat.yaml. This file will be used to create the securechat application.

cd && touch securechat.yaml

Provide the following content in the file. Replace both <your username> parts with your DKubeX username, and the <your secureLLM API key> part with the API key that was generated earlier in SecureLLM.

image: dkubex123/llmapp:securechat-081
name: ndabase
cpu: 1
gpu: 0
memory: 4
dockerserver: DOCKER_SERVER
dockeruser: dkubex123
dockerpsw: dckr_pat_dE90DkE9bzttBinnniexlHdPPgI
publish: "true"
env:
  OPENAI_API_KEY: ""
  SECUREAPP_ACCESS_KEY: allow
  FMQUERY_ARGS: "llm --dataset contracts -e http://llama27bbase-serve-svc.<your username>:8000/v1/ --config /home/<your username>/query.yaml --securellm.appkey=<your secureLLM API key>"
port: "3000"
description: "Contracts"
rewritetarget: "false"
configsnippet: ""
ingressprefix: /ndabase
output: yaml
Note

You are providing your own username here because the llama27bbase deployment was done from your workspace earlier. If you are going to use a model deployed by any other user, you will need to provide the proper deployment name in place of llama27bbase and the username of that user.
Launch the app deployment with the following command:
d3x apps create -c securechat.yaml
To check the status of the app deployment, use the following command:
d3x apps list
Once the app deployment status becomes running, you can access the application from the Apps page of the DKubeX UI. Provide the application key that you set in the SECUREAPP_ACCESS_KEY field earlier to start using the chat application.
Hint
You can ask the chatbot the following questions when using the ContractNLI dataset:
How do I frame a confidential information clause?
What is the difference between a unilateral and mutual NDA?
What are some common exceptions to confidential information clauses?
Finetuning LLM with custom dataset¶
Note
For detailed information regarding this section, please refer to Finetuning Open Source LLMs.
You will need to use a custom Python script to extract the chunks that will be used to finetune your LLM model.
Create a new script called extract_chunks.py in your workspace using the following command:

cd && touch extract_chunks.py
Provide the following content in the extract_chunks.py script. Once done, save the script.

import argparse
import json
import os
import subprocess

import pandas as pd
import requests
from mlflow.tracking import MlflowClient

# Set the MLflow tracking URI before creating the client so that it is picked up
os.environ['MLFLOW_TRACKING_URI'] = "http://d3x-controller.d3x.svc.cluster.local:5000"
client = MlflowClient()


def retrieve_chunks(vector_id_list, experiment_name, no_of_chunks, cleaned_chunks_dir):
    """Fetch chunk text from Weaviate and prepare it for finetuning."""
    # experiment_name is kept for interface parity; chunk artifacts are not logged to MLflow here
    # Weaviate server URL inside the DKubeX cluster
    weaviate_url = "http://weaviate.d3x.svc.cluster.local"
    chunks_ft_path = "./temp_out/"
    all_data_json = []

    # Process every vector ID, or only the first no_of_chunks if given
    vector_ids_to_process = vector_id_list if no_of_chunks is None else vector_id_list[:no_of_chunks]

    for vector_id in vector_ids_to_process:
        # Retrieve the stored object for this vector ID from Weaviate
        response = requests.get(f"{weaviate_url}/v1/objects/{vector_id}")
        if response.status_code == 200:
            retrieved_object = response.json()
            paper_chunks = retrieved_object.get('properties', {}).get('paperchunks', '')
            all_data_json.append({'chunks': paper_chunks})
        else:
            print(f"Failed to retrieve object. Status code: {response.status_code}, "
                  f"Response: {response.text}")

    # Write the chunk text to JSON files, 500 chunks per directory
    chunk_size = 500
    batches = [all_data_json[i:i + chunk_size] for i in range(0, len(all_data_json), chunk_size)]
    for i, batch in enumerate(batches):
        output_json_dir = f"./temp_out/0-{str(i).zfill(6)}/"
        os.makedirs(output_json_dir, exist_ok=True)
        with open(os.path.join(output_json_dir, "text_chunks.json"), 'w', encoding='utf-8') as jsonfile:
            json.dump(batch, jsonfile)
        print(f"Chunk batch {i} written to {output_json_dir}")

    # Clean the extracted chunks and place them in the destination directory for finetuning
    try:
        result = subprocess.run(
            f"d3x fm trainchunks --source {chunks_ft_path} --destination {cleaned_chunks_dir}",
            shell=True, check=True, stdout=subprocess.PIPE)
        print(result.stdout.decode('utf-8'))
    except subprocess.CalledProcessError as e:
        print(f"Error creating train chunks: {e}")


def extract_column_values(csv_file_path, column_name):
    """Extract the values of a column in a CSV file and return them as a list."""
    try:
        df = pd.read_csv(csv_file_path)
        return df[column_name].tolist()
    except Exception as e:
        print(f"Error: {e}")
        return None


def artifacts_download(run_id, local_dir):
    """Download all artifacts of an MLflow run to a local directory."""
    client.download_artifacts(run_id, "", local_dir)
    print(f"Artifacts downloaded to: {local_dir}/chunks/")
    return local_dir


def main():
    parser = argparse.ArgumentParser(description='Extract ingested chunks from Weaviate for finetuning.')
    parser.add_argument('--experiment_name', type=str, required=True, help='MLflow experiment name')
    parser.add_argument('--run_id', type=str, required=True, help='MLflow run ID for artifact download')
    parser.add_argument('-d', '--destination', type=str, required=True,
                        help='The path where chunks will be kept for training')
    parser.add_argument('--no_of_chunks', type=int,
                        help='Retrieve the chunk text for only the first given number of chunks')
    args = parser.parse_args()

    # The ingestion run logs a chunks.csv artifact listing the vector ID of every chunk
    csv_file_path = artifacts_download(args.run_id, ".") + "/chunks/chunks.csv"
    print(csv_file_path)
    vector_ids_list = extract_column_values(csv_file_path, "chunk_id")
    retrieve_chunks(vector_ids_list, args.experiment_name, args.no_of_chunks,
                    cleaned_chunks_dir=args.destination)


if __name__ == "__main__":
    main()
Generate the chunks using the following command. Replace the <ingestion run ID on MLFlow> with the run ID of the ingestion run for your dataset on the MLFlow application.
python3 extract_chunks.py --experiment_name chunk-generation --run_id <ingestion run ID on MLFlow> -d ${HOMEDIR}/chunks_for_finetuning/
Train the LLM with the chunks using the following command:
Note
In case you are using an EKS setup, please change the value of the -t flag from a10 to g5.4xlarge. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
d3x fm tune model finetune -j llama27bfinetune -e 1 -b 20 -l ${HOMEDIR}/chunks_for_finetuning -o ${HOMEDIR}/ft-output/ -c 8 -m 64 -g 1 -t a10 -n meta-llama--Llama-2-7b-chat-hf --ctx-len 512
Attention
The time taken by the finetuning process depends on the size of the dataset. The ContractNLI dataset contains 605 documents, and the finetuning process may take around one hour to complete. Please wait patiently for the process to complete.

If the terminal shows a timed-out error, the finetuning is still in progress; run the command provided on the CLI after the error message to continue getting the finetuning logs.
You will need the absolute path to the finetuned model checkpoint to merge the finetuned model with the base model. Use the following command to get the absolute path to the finetuned model checkpoint:
echo ${HOMEDIR}/ft-output/meta-llama--Llama-2-7b-chat-hf/TorchTrainer_*/TorchTrainer_*/checkpoint*/
Export the absolute path to the finetuned model checkpoint to be used during the merge process with the following command. Replace the <checkpoint absolute path> part with the absolute path to the finetuned model checkpoint you got in the previous step.

export CHECKPOINT="<checkpoint absolute path>"
Merge the finetuned model checkpoint with the base model to create the final finetuned model. The generic command syntax is shown first, followed by the command used in this example:
Note
In case you are using an EKS setup, please change the value of the -t flag from a10 to g5.4xlarge. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
d3x fm tune model merge -j <merge job name> -n <full HF path to the base model> -cp <absolute path to the finetuned checkpoint> -o <absolute path to merged finetuned model output folder> -t a10
d3x fm tune model merge -j llama27bmerge -n meta-llama--Llama-2-7b-chat-hf -cp ${CHECKPOINT} -o ${HOMEDIR}/merge_output -t a10
To quantize the finetuned model, use the following command:
Note
In case you are using an EKS setup, please change the value of the -t flag from a10 to g5.4xlarge. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
d3x fm tune model quantize -j llama27bquantize -p ${HOMEDIR}/merge_output/ -o ${HOMEDIR}/quantize_result -t a10
Attention
The time taken by the quantization process depends on the size of the dataset. The ContractNLI dataset contains 605 documents, and the quantization process may take around 30 minutes to complete. Please wait patiently for the process to complete.

If the terminal shows a timed-out error, the quantization is still in progress; run the command provided on the CLI after the error message to continue getting the quantization logs.
Importing the finetuned LLM to MLFlow¶
Note
For detailed information regarding this section, please refer to Deploying LLMs in DKubeX.
To import the finetuned LLM model to MLFlow, use the following command:
d3x models import llama27bft custom_model ${HOMEDIR}/quantize_result
Deploying the finetuned LLM¶
Note
For detailed information regarding this section, please refer to Deploying LLMs in DKubeX.
Deploy the finetuned LLM model using the following command:
Note
In case you are using an EKS setup, please change the value of the --type flag from a10 to g5.4xlarge in the following command. Also, if you are using any other type of GPU node, use the label that you assigned to that node during the DKubeX installation process.
d3x llms deploy -n llama27bft --mlflow llama27bft:1 --type a10 --base_model meta-llama--Llama-2-7b-chat-hf --token ${HF_TOKEN} --publish
Check the status of the deployment from the Deployments page in DKubeX or by running the following command. Wait until the deployment is in the Running state.
d3x serve list
Building a RAG chat application with your finetuned LLM¶
Note
For detailed information regarding this section, please refer to Creating and accessing the chatbot application.
From the DKubeX UI, open and log into the SecureLLM application. Once open, click on the Admin Login button and log in using the admin credentials provided during installation.
Hint
In case you do not have the credentials for logging in to SecureLLM, please contact your administrator.
On the left sidebar, click on the Keys menu and go to the Application Keys tab on that page.
To create a new key for your application, use the following steps:
A pop-up window will show up on your screen containing the application key for your new application. Alternatively, you can also access your application key from the list of keys in the Application Key tab.
Copy this application key for further use, as it will be required to create the chatbot application. Also make sure that you copy the entire key, including the sk- part.
From the DKubeX UI, go to the Terminal application.
You will need to configure and use the query.yaml file from the dkubex-examples repo in the query process of the Securechat application. Run the following command to put the query.yaml file in your workspace, renaming it to query_ft.yaml (as one instance of query.yaml is already in your workspace from earlier).

cd && cp dkubex-examples/rag/query/query.yaml ./query_ft.yaml
Provide the following details in the query_ft.yaml file, then save the file; a sketch of the edited fields follows the list below.

In the vectorstore_retriever:dataset: section, provide the name of your ingested dataset, i.e. contracts.

In the chat_engine:url: section, provide the endpoint URL of the deployed model to be used. The syntax for the URL is provided below; replace the <your username> part with your username.

"http://llama27bft-serve-svc.<your username>:8000"
Note

You are providing your own username here because the llama27bft deployment was done from your workspace earlier. If you are going to use a model deployed by any other user, you will need to provide the proper deployment name in place of llama27bft and the username of that user.

In the tracking:experiment: section, provide a name for the experiment under which the query records and artifacts will be stored in MLFlow.
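As before, a minimal sketch of the edited fields in query_ft.yaml, assuming the same key layout; the experiment name is again an arbitrary example, and the remaining keys can be left as shipped.

vectorstore_retriever:
  dataset: contracts
chat_engine:
  url: "http://llama27bft-serve-svc.<your username>:8000"
tracking:
  experiment: contracts-rag-ft    # any experiment name of your choice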
Create a new file in your workspace called securechat_ft.yaml. This file will be used to create the securechat application.

cd && touch securechat_ft.yaml

Provide the following content in the file. Replace both <your username> parts with your DKubeX username, and the <your secureLLM API key> part with the API key that was generated earlier in SecureLLM.

image: dkubex123/llmapp:securechat-081
name: ndaft
cpu: 1
gpu: 0
memory: 4
dockerserver: DOCKER_SERVER
dockeruser: dkubex123
dockerpsw: dckr_pat_dE90DkE9bzttBinnniexlHdPPgI
publish: "true"
env:
  OPENAI_API_KEY: ""
  SECUREAPP_ACCESS_KEY: allow
  FMQUERY_ARGS: "llm --dataset contracts -e http://llama27bft-serve-svc.<your username>:8000/v1/ --config /home/<your username>/query_ft.yaml --securellm.appkey=<your secureLLM API key>"
port: "3000"
description: "Contracts"
rewritetarget: "false"
configsnippet: ""
ingressprefix: /ndaft
output: yaml
Note

You are providing your own username here because the llama27bft deployment was done from your workspace earlier. If you are going to use a model deployed by any other user, you will need to provide the proper deployment name in place of llama27bft and the username of that user.
Launch the app deployment with the following command:
d3x apps create -c securechat_ft.yaml
To check the status of the app deployment, use the following command:
d3x apps list
Once the app deployment status becomes running, you can access the application from the Apps page of DKubeX UI. Provide the application key that you set in the SECUREAPP_ACCESS_KEY field earlier to start using the chat application.
Hint
You can ask the chatbot the following questions when using the ContractNLI dataset:
How do I frame a confidential information clause?
What is the difference between a unilateral and mutual NDA?
What are some common exceptions to confidential information clauses?
Tutorials and More Information¶
For more examples, including how to train and register models and deploy user applications, please visit the following pages and go through the table provided: