RAG model for a QA bot


I made this following a blog post: replit.com/@nhegday/RAG-From-Scratch . Here is what I learnt. RAG stands for Retrieval Augmented Generation; it is a technique that enhances and personalizes an LLM's responses by retrieving relevant context and adding it to the prompt at query time, without fine-tuning the model's weights. I used OpenAI and Pinecone. Pinecone is a vector database.

So here's the step-by-step process:

Read the input data that will serve as the bot's knowledge base (RAG retrieves from this data rather than training on it).

Split the data into smaller chunks.
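A chunking step can be as simple as a fixed-size character split. This is a minimal sketch; the 500-character size and the `chunk_text` helper are my own illustrative choices, not from the post (real pipelines often split on sentences or add overlap):

```python
def chunk_text(text, chunk_size=500):
    # Split the text into fixed-size character chunks.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = chunk_text("some long document about our product, policies, FAQs... " * 50)
```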

Embed the chunks: this generates a vector for each chunk. I did that using OpenAI's text-embedding-ada-002 model.
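The embedding call might look like this, assuming the pre-1.0 `openai` Python library that was current when most RAG tutorials were written (the import is inside the function so the sketch loads without the package installed):

```python
def embed_chunks(chunks):
    # Requires `pip install "openai<1.0"` and the OPENAI_API_KEY env var.
    # text-embedding-ada-002 returns one 1536-dimensional vector per input.
    import openai  # lazy import: keeps this sketch loadable without the package
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=chunks)
    return [item["embedding"] for item in resp["data"]]
```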

Store the embeddings in Pinecone by creating an index and upserting the vectors.
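A sketch of the index-and-upsert step, assuming the v2-era `pinecone-client` API (`pinecone.init` / `create_index`); the index name `qa-bot` and the metadata layout are my own assumptions:

```python
def store_embeddings(chunks, vectors, index_name="qa-bot"):
    # Requires `pip install pinecone-client` and your Pinecone credentials.
    import pinecone  # lazy import: keeps this sketch loadable without the package
    pinecone.init(api_key="YOUR_PINECONE_KEY", environment="YOUR_ENV")
    if index_name not in pinecone.list_indexes():
        # ada-002 embeddings are 1536-dimensional, so the index must match
        pinecone.create_index(index_name, dimension=1536, metric="cosine")
    index = pinecone.Index(index_name)
    # Upsert (id, vector, metadata) tuples; metadata keeps the chunk text
    # alongside the vector so it comes back with query results.
    index.upsert(vectors=[
        (str(i), vec, {"text": chunk})
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ])
    return index
```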

Create a mapping from each vector's unique ID back to its chunk, so retrieved IDs can be turned into readable text.
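The mapping is just a dictionary keyed by the same IDs used in the upsert (the example chunks here are made up):

```python
# Map each chunk's unique ID to its text, using the same IDs as the upsert.
chunks = ["RAG retrieves context at query time.", "Pinecone stores the vectors."]
id_to_chunk = {str(i): chunk for i, chunk in enumerate(chunks)}
```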

Then retrieve the chunks most similar to the user's question with a Pinecone query: embed the question with the same model, then search by vector similarity.
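A retrieval sketch under the same API assumptions as above (pre-1.0 `openai`, a Pinecone index object, chunk text stored in metadata); `top_k=3` is my own choice:

```python
def retrieve_similar(question, index, top_k=3):
    # Embed the question with the same model used for the chunks,
    # then ask Pinecone for the nearest stored vectors.
    import openai  # lazy import: keeps this sketch loadable without the package
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[question])
    q_vec = resp["data"][0]["embedding"]
    result = index.query(vector=q_vec, top_k=top_k, include_metadata=True)
    return [match["metadata"]["text"] for match in result["matches"]]
```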

Then construct the prompt by combining the instructions, the most similar chunks, and the user's question.
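Prompt construction is plain string formatting; the exact wording below is my own illustration, not the post's:

```python
def build_prompt(question, context_chunks):
    # Join the retrieved chunks into a context block and instruct the
    # model to answer only from that context.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```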

Then get the GPT response with the ChatCompletion.create function, using the gpt-3.5-turbo model.
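The final call, again assuming the pre-1.0 `openai` library's `ChatCompletion.create` interface:

```python
def answer(prompt):
    # Requires `pip install "openai<1.0"` and the OPENAI_API_KEY env var.
    import openai  # lazy import: keeps this sketch loadable without the package
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]
```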

Here's the Colab notebook: colab.research.google.com/drive/1cZbfnz731N..