I have some general questions about RAG systems in n8n.
I fill my Qdrant vector DB externally with Docling (a PDF parser) in Python,
and I want to make that data accessible in n8n for Mail, WhatsApp, ElevenLabs, and so on.
Here is a simple example that is working:
But I also saw workflows with the "Answer question with vector store" node, like this:
Can someone maybe explain the differences to me, and the benefits of using one or the other?
But the main question is: both workflows make two requests to the chat model during execution, and that shouldn't be necessary.
In my Python chat, where I communicate directly with the Qdrant database, the workflow looks like this:
- The question gets vectorized by the embedding model.
- The DB is searched with this vector.
- The question is sent to the LLM together with the results from the DB.
- The result appears in the chat.
So normally only one request to the LLM chat model should be needed.
Can I realize that in n8n?
Or do I have to use a webhook and do it externally in this way?
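For reference, here is a minimal sketch of that single-LLM-call flow in plain Python. The `embed()` and `call_llm()` functions are placeholders (in a real setup they would be an embedding model and a chat-completion request, and the in-memory search would be a `qdrant-client` query against the real collection); only the overall shape of the pipeline matters here:

```python
import math

# Tiny in-memory "document store" standing in for the Qdrant collection.
DOCS = [
    "n8n can call external services via webhooks.",
    "Qdrant stores vectors and supports similarity search.",
]

def embed(text: str) -> list[float]:
    # Placeholder embedding: a character-frequency vector (NOT a real model).
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], top_k: int = 1) -> list[str]:
    # Stand-in for a Qdrant similarity search over the collection.
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_k]

def call_llm(prompt: str) -> str:
    # Placeholder for the single chat-model request.
    return f"LLM answer based on: {prompt}"

def answer(question: str) -> str:
    qvec = embed(question)              # 1. vectorize the question (embedding model)
    context = search(qvec)              # 2. similarity search in the vector DB
    prompt = f"Context: {context}\nQuestion: {question}"
    return call_llm(prompt)             # 3. exactly ONE request to the chat model

print(answer("Where are my vectors stored?"))
```

Note that the embedding call is separate from the chat model, so only step 3 hits the LLM.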
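If the webhook route turns out to be the answer, a sketch of the external side could look like this: a small HTTP endpoint that n8n's HTTP Request node calls, with the whole embed → search → single LLM call happening in Python. `rag_answer()` is a placeholder for that logic; everything else is standard-library code:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def rag_answer(question: str) -> str:
    # Placeholder: embed the question, query Qdrant, make ONE chat-model call.
    return f"Answer to: {question}"

class RagHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body n8n would send, e.g. {"question": "..."}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(
            {"answer": rag_answer(payload.get("question", ""))}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

if __name__ == "__main__":
    # Port 0 = pick any free port; a real deployment would use a fixed one.
    server = HTTPServer(("127.0.0.1", 0), RagHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Simulate the call n8n's HTTP Request node would make.
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/",
        data=json.dumps({"question": "ping"}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["answer"])
    server.shutdown()
```

With this shape, n8n only orchestrates (Mail, WhatsApp, ElevenLabs triggers) and the RAG logic stays in one place.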