Deploying a RAG-based Chatbot

You can deploy a RAG-based chat application from the DKubeX UI by following this guide.

Prerequisites

You should have a pre-ingested dataset, which will be used to create the RAG application. For the steps to ingest a dataset, refer to the Data Ingestion and Creating Datasets guide.
If you are going to use custom deployments for your chat application, the embedding model deployment and the LLM deployment should be created beforehand and be in the running state. For the steps to create a deployment, refer to the Deploying Embedding Models on DKubeX and Deploying LLMs in DKubeX guides.
If you are going to use SecureLLM along with your chat application, you should have a SecureLLM application key created beforehand. For the steps to create a SecureLLM application key, refer to the SecureLLM guide.

Configuring and deploying the RAG Application

To configure and deploy the RAG-based chat application, follow the steps below:

On the DKubeX UI, click on the Applications button in the left sidebar. The Applications page will be displayed.
Click on the + (create application) button at the top left of the Applications window. The Application Create window will open.
On the General page, under the General Configuration section, provide the following details:

RAG Application - General Configuration
Field	Description
`Name`	Provide a name for the RAG application.
`Description`	(OPTIONAL) Provide a description for the RAG application.

Once done, click on the Next button to go to the Image Configuration page.

Here, under the Image Configuration section, provide the following details:

RAG Application - Image Configuration
Field	Description
`Type`	Select the type of application to deploy. For RAG-based chat applications, select `Secure Chat`; to deploy a custom application, select `Custom`.

Note

If you select Custom as the type of the application, you will need to provide the docker image and account details in the respective fields that appear.

Under the Access Control section, provide the following details:

RAG Application - Access Control
Field	Description
`Secure Access Key`	Provide a custom password, which will be used when logging in to the RAG application.
`Publish`	Select this option to make the RAG application available to all users on the DKubeX setup. If you do not select this option, the RAG application will be available only to you.

Under the Resource Settings dropdown, provide the following details:

RAG Application - Resource Settings
Field	Description
`CPU Count`	Number of CPUs to be allocated for the RAG application. The default value is `1`.
`GPU Count`	Number of GPUs to be allocated for the RAG application. The default value is `0`.
`Memory`	Amount of memory to be allocated for the RAG application. The default value is `4`.
`Port`	Port number on which the RAG application will be running. The default value is `3000`.
`Mount Home`	Select the location you want to be mounted. The default value is `All`.
`MLFlow Experiment`	(OPTIONAL) Provide a unique MLFlow experiment in which you want to log the RAG records.
`Cache Enabled`	(OPTIONAL) Select this option to enable caching for the RAG application. This improves performance and response time by storing responses temporarily for repeated identical requests.
`Rewrite Target`	(OPTIONAL) Select this option to rewrite the target of the RAG application. This modifies the path of incoming requests before they are routed to the backend service (usually your LLM or inference server).
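
The caching behavior described for `Cache Enabled` can be pictured with a small sketch. This is not DKubeX's implementation: the `ResponseCache` class, the `ttl_seconds` parameter, and the `fake_backend` function are hypothetical names used only to illustrate how temporarily storing responses avoids repeating work for identical requests.

```python
import time
from typing import Callable, Dict, List, Tuple


class ResponseCache:
    """Tiny in-memory cache keyed on the exact request text (illustrative only)."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        # ttl_seconds is an assumed expiry; the guide does not state how long
        # responses are kept.
        self.ttl_seconds = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_compute(self, query: str, compute: Callable[[str], str]) -> str:
        now = time.monotonic()
        hit = self._store.get(query)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]  # identical request seen recently: reuse the answer
        answer = compute(query)  # cache miss: call the (slow) backend
        self._store[query] = (now, answer)
        return answer


calls: List[str] = []


def fake_backend(query: str) -> str:
    """Hypothetical stand-in for the retrieval and LLM pipeline."""
    calls.append(query)
    return f"answer to: {query}"


cache = ResponseCache(ttl_seconds=60.0)
first = cache.get_or_compute("What is RAG?", fake_backend)
second = cache.get_or_compute("What is RAG?", fake_backend)  # served from cache
```

In this sketch the backend is called once for the two identical requests; a request with different text would miss the cache and reach the backend again.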

In the Environment Variables section, you can define environment variables by clicking on + Add ENV and then entering each variable's key and value in the respective fields.

Once done, click on the Next button to go to the Advanced page.

Here, under the Advanced Configuration section, provide the following details:

RAG Application - Advanced Configuration
Field	Description
`Enable RAG`	Select this option to enable the RAG functionality for the application to be deployed. For Secure Chat applications, this option is enabled by default.
`Dataset Name`	Select the dataset that you want to use for the RAG application from the dropdown. The dataset must be ingested before you deploy the RAG application.
`Secure LLM`	(OPTIONAL) If you want to use SecureLLM with the RAG application, provide the SecureLLM application key in this field. If you do not want to use SecureLLM, leave this field blank.
`Use Adjacent Chunks`	(OPTIONAL) Select this option to use adjacent chunks for the RAG application. This helps to improve the performance of the RAG application by using adjacent chunks of data.

Under the Embedding section, provide the following details:

RAG Application - Embedding Configuration
Field	Description
`Provider`	Select the embedding model provider to be used from the dropdown. The available options are `DkubeX`, `Sky`, and `OpenAI`.
`Model`	Select the embedding model/deployment to be used from the dropdown.
`Key`	If you have selected `OpenAI` as the embedding model provider, provide the OpenAI API key in this field.

Under the Retrieval section, provide the following details:

RAG Application - Retrieval Configuration
Field	Description
`Provider`	Select the retrieval LLM provider to be used from the dropdown. The available options are `DkubeX`, `Sky`, and `OpenAI`.
`Model`	Select the retrieval LLM/model deployment to be used from the dropdown.
`Key`	If you have selected `OpenAI` as the retrieval LLM provider, provide the OpenAI API key in this field.

Under the Vector Store Settings section, provide the following details:

RAG Application - Vector Store Configuration
Field	Description
`Vector Store Provider`	Select the vector store provider to be used from the dropdown. By default, `dkubex` will be selected.
`Vector Store`	Select the vector store to be used from the dropdown. By default, `weaviate_vectorstore` will be selected.
`URL`	(OPTIONAL) Provide the URL of the vector store to be used.
`Auth Key`	(OPTIONAL) Provide the authentication key for the vector store to be used.
`Text Key`	(OPTIONAL) Provide the text key for the vector store to be used.

Under the Search section, provide the following details:

RAG Application - Search Configuration
Field	Description
`Search Type`	Select the vector search type to be used from the dropdown. By default, `vector_search` will be selected.
`Top K`	Provide the number of top results to be returned from the vector search. The default value is `3`.
`Rerank`	(OPTIONAL) Select this option to enable reranking of the search results. This helps to improve the relevance of the search results by reordering them based on their relevance to the query.
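
To make `Top K` and `Rerank` concrete, here is a minimal sketch of a two-stage search: a vector similarity pass that keeps the top 3 candidates, followed by a reordering pass. It is an illustration under assumptions, not the DKubeX search code. The toy 2-dimensional vectors, the `top_k` and `rerank` helpers, and the word-overlap scoring are all hypothetical stand-ins; a real reranker would typically use a dedicated model.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def rerank(query: str, texts: List[str]) -> List[str]:
    """Stand-in reranker: order candidates by shared words with the query."""
    words = set(query.lower().split())
    return sorted(texts,
                  key=lambda t: len(words & set(t.lower().split())),
                  reverse=True)


# Toy data: (chunk text, made-up embedding).
chunks = [
    ("pods run containers", [0.9, 0.1]),
    ("gpu count defaults to zero", [0.8, 0.3]),
    ("weather is sunny", [0.0, 1.0]),
    ("gpu memory settings", [0.7, 0.2]),
]

candidates = top_k([1.0, 0.0], chunks, k=3)   # stage 1: vector search
reranked = rerank("gpu count", candidates)    # stage 2: reorder by relevance
```

Here the vector pass drops the unrelated chunk, and the rerank pass moves the chunk that mentions both query words to the front of the 3 candidates.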

Under the Query Settings section, provide the following details:

RAG Application - Query Settings Configuration
Field	Description
`Batch Size`	Enter the batch size for query processing. The default value is `32`.
`Acronym Expander`	(OPTIONAL) Select this option to enable acronym expansion in the query processing. This helps to expand acronyms in the query to their full form.
`Query Rewrite`	(OPTIONAL) Select this option to enable query rewriting. This helps to rewrite the query to improve the relevance of the search results.

Note

If you enable the Acronym Expander or Query Rewrite options, provide the relevant details in the form that appears.
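
As a rough picture of what acronym expansion does to a query, the sketch below appends the full form after each known acronym. The `expand_acronyms` function and the acronym table are hypothetical; the guide does not specify how DKubeX stores or applies acronym definitions.

```python
import re
from typing import Dict


def expand_acronyms(query: str, acronyms: Dict[str, str]) -> str:
    """Append the full form after every all-caps token found in `acronyms`."""

    def replace(match: "re.Match[str]") -> str:
        token = match.group(0)
        full = acronyms.get(token)
        return f"{token} ({full})" if full else token

    # Only whole words made of two or more capital letters are considered.
    return re.sub(r"\b[A-Z]{2,}\b", replace, query)


# Assumed acronym table, for illustration only.
table = {
    "RAG": "Retrieval-Augmented Generation",
    "LLM": "Large Language Model",
}

expanded = expand_acronyms("How does RAG use an LLM?", table)
```

The expanded query carries both the acronym and its full form, which gives the search step more words to match against the ingested documents.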

Under the Synthesizer Settings section, provide the following details:

RAG Application - Synthesizer Settings Configuration
Field	Description
`Chat Memory Tokens`	Enter the number of chat memory tokens to be used for the synthesizer. The default value is `6000`.
`Max Tokens`	Enter the maximum number of tokens to be used for the synthesizer. The default value is `1024`.
`Window Size`	Enter the window size for the synthesizer. The default value is `2`.
`User Prompt`	Enter a custom user prompt to be used for the synthesizer. The default value is `default`.
`System Prompt`	(OPTIONAL) Enter a custom system prompt to be used for the synthesizer.
`Use Adjacent Chunks`	(OPTIONAL) Select this option to use adjacent chunks for the synthesizer. This helps to improve the performance of the synthesizer by using adjacent chunks of data.
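
The idea behind using adjacent chunks can be sketched as follows: when a chunk matches the query, its neighbors in the original document are passed along with it so the synthesizer sees more surrounding context. This is an assumption-laden illustration, not the DKubeX implementation. In particular, the `with_adjacent_chunks` helper is hypothetical, and treating its `window` argument as the number of neighbors on each side is a guess at how a setting such as `Window Size` might relate to this option.

```python
from typing import List


def with_adjacent_chunks(chunks: List[str], hit_index: int, window: int = 2) -> List[str]:
    """Return the matched chunk plus up to `window` neighbors on each side."""
    start = max(0, hit_index - window)                # clamp at document start
    end = min(len(chunks), hit_index + window + 1)    # clamp at document end
    return chunks[start:end]


# A document split into six chunks; suppose retrieval matched chunk index 3.
document_chunks = ["c0", "c1", "c2", "c3", "c4", "c5"]

context = with_adjacent_chunks(document_chunks, hit_index=3, window=1)
edge_context = with_adjacent_chunks(document_chunks, hit_index=0, window=2)
```

With a window of 1 the match at index 3 is returned together with `c2` and `c4`; a match at the start of the document only has neighbors on one side.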

Under the FAQ Settings section, provide the following details:

RAG Application - FAQ Settings Configuration
Field	Description
`FAQ`	(OPTIONAL) Select this option to enable FAQ functionality in the RAG application. This helps to provide answers to frequently asked questions.

Once done, click on the Submit button on the bottom right of the page. The RAG application will be created and deployed on the DKubeX platform. You can open and manage the application from the Applications page.
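
Before filling in the form, it can help to note down the values you plan to enter. The sketch below collects the defaults listed in the tables above into a single Python structure and checks a few inputs that the guide marks as needed. It is only a planning aid: `RagAppSettings`, its field names, and `missing_inputs` are hypothetical and do not correspond to a DKubeX API or file format. Treating Name and Dataset Name as required is inferred from their not being marked optional.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RagAppSettings:
    # Values without a documented default must be chosen by the user.
    name: str
    dataset_name: str
    embedding_provider: str              # `DkubeX`, `Sky`, or `OpenAI`
    retrieval_provider: str              # `DkubeX`, `Sky`, or `OpenAI`
    embedding_key: Optional[str] = None  # only needed for `OpenAI`
    retrieval_key: Optional[str] = None  # only needed for `OpenAI`
    # Defaults as stated in the tables of this guide.
    cpu_count: int = 1
    gpu_count: int = 0
    memory: int = 4
    port: int = 3000
    mount_home: str = "All"
    vector_store_provider: str = "dkubex"
    vector_store: str = "weaviate_vectorstore"
    search_type: str = "vector_search"
    top_k: int = 3
    batch_size: int = 32
    chat_memory_tokens: int = 6000
    max_tokens: int = 1024
    window_size: int = 2
    user_prompt: str = "default"


def missing_inputs(s: RagAppSettings) -> List[str]:
    """List inputs that still need a value before the form can be completed."""
    problems: List[str] = []
    if not s.name.strip():
        problems.append("Name is required")
    if not s.dataset_name.strip():
        problems.append("Dataset Name is required")
    if s.embedding_provider == "OpenAI" and not s.embedding_key:
        problems.append("Embedding Key is required when the provider is OpenAI")
    if s.retrieval_provider == "OpenAI" and not s.retrieval_key:
        problems.append("Retrieval Key is required when the provider is OpenAI")
    return problems


example = RagAppSettings(
    name="docs-chat",                 # hypothetical application name
    dataset_name="product-docs",      # hypothetical pre-ingested dataset
    embedding_provider="OpenAI",
    retrieval_provider="DkubeX",
)
problems = missing_inputs(example)    # flags the missing OpenAI embedding key
```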