Data Ingestion and Deployment of a RAG-based Chatbot

Retrieval Augmented Generation (RAG) is a powerful way to leverage your private corpus with generative AI; when combined with fine-tuned LLMs, it can achieve GPT-like results for your custom use cases.

Figure: RAG workflow details (../_images/rag_details.png)

In this workflow, datasets (PDF, DOCX, TXT, HTML) are first divided into smaller segments, or chunks. These chunks are then transformed into embeddings by an embedding model and stored in a vector database for efficient retrieval, as sketched below.
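The following is a minimal sketch of the ingestion step: chunking raw text, embedding the chunks, and indexing them for similarity search. It assumes the sentence-transformers and faiss-cpu packages are available; the model name "all-MiniLM-L6-v2", the file "my_dataset.txt", and the chunking parameters are illustrative choices, not requirements of this workflow.

```python
# Sketch: chunk a document, embed the chunks, and index them for retrieval.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character-based chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# 1. Chunk the raw text (extracted upstream from PDF/DOCX/TXT/HTML).
document_text = open("my_dataset.txt").read()  # hypothetical input file
chunks = chunk_text(document_text)

# 2. Transform the chunks into embeddings.
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice
embeddings = embedder.encode(chunks, convert_to_numpy=True).astype("float32")

# 3. Store the embeddings in a vector index for efficient similarity search.
faiss.normalize_L2(embeddings)                   # normalize for cosine similarity
index = faiss.IndexFlatIP(embeddings.shape[1])   # inner-product index
index.add(embeddings)
```

A production deployment would typically use a persistent vector database rather than an in-memory index, but the chunk-embed-store pattern is the same.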

When a user submits a query in the chat application, the backend searches the vector database for relevant context, combines that context with a custom prompt, and passes it to a Large Language Model (LLM) acting as a summarization agent. The LLM generates a human-like response to the query, ensuring the answer is coherent and grounded in the retrieved context.
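A minimal sketch of this query path, continuing from the indexing sketch above, is shown below. The OpenAI client and the "gpt-4o-mini" model name are illustrative stand-ins for whichever LLM backend your deployment uses; the prompt wording is likewise only an example.

```python
# Sketch: embed the query, retrieve relevant chunks, and ask an LLM to answer.
from openai import OpenAI

def answer_query(query: str, k: int = 4) -> str:
    # 1. Embed the query and search the vector index for the top-k chunks.
    query_vec = embedder.encode([query], convert_to_numpy=True).astype("float32")
    faiss.normalize_L2(query_vec)
    _, ids = index.search(query_vec, k)
    context = "\n\n".join(chunks[i] for i in ids[0])

    # 2. Combine the retrieved context with a custom prompt.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

    # 3. The LLM acts as the summarization agent that produces the answer.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer_query("What does the dataset say about deployment?"))
```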

These examples walk you through deploying an LLM chatbot that can answer questions based on the specific dataset(s) you provide. Two workflow examples are provided below; click the link in the following table to go to the example you want to follow.