RAG Metadata - Difference between Vector Store Tool node and Vector Store nodes?

laschmu · January 8, 2025, 9:39pm

Hi there,

I’m playing around with RAG and vector stores and looking into the metadata.

The goal

My primary goal is to get the source (and further metadata) of a piece of information retrieved from the vector store. For this example I’ve simplified that to just the file name.

In my first try (upper workflow) I couldn’t get it to work.

Luckily I found the 2 posts mentioned in the questions below, which led me to the lower workflow. That solution returns the file name and all of the metadata.

My questions:

I don’t understand the difference between the Vector Store Tool and the Vector Store node itself. Why does the lower workflow return the metadata while the tool doesn’t?
I found the solution in the thread “Get metadata from Vector Store Tool”.
The main question here is: what are the pros and cons of using the second approach?
I found another solution in “RAG setup with Contextual Summaries, Sparse Vectors and Reranking” and was thinking that it could also help by storing additional information (like the file name) in the vectorized data. This topic might be too new to be widely adopted yet. I can’t get it to work currently because my poor dev Raspberry Pi has issues running Qdrant properly… -.-
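
To make the last idea concrete: one way to store the file name in the vectorized data is to prepend it to each chunk before embedding. This is only a minimal sketch of that idea, not the setup from the linked post; the function name `withSourceHeader` and the header wording are hypothetical.

```typescript
// Hypothetical helper: prefix a chunk with its source file name so that
// the name becomes part of the text that gets embedded and retrieved.
function withSourceHeader(chunk: string, fileName: string): string {
  return `Source file: ${fileName}\n\n${chunk}`;
}

const prepared = withSourceHeader("Invoices are due within 30 days.", "terms.pdf");
console.log(prepared);
```

Because the file name is then part of the chunk text, it can survive a summarising step, although it also slightly changes what gets embedded.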

Big thanks here already to @Jim_Le for sharing his insights in these 2 posts.

My workflow

Information on your n8n setup

n8n version: 1.72.1
Database (default: SQLite): Postgres
n8n EXECUTIONS_PROCESS setting (default: own, main): Own
Running n8n via (Docker, npm, n8n cloud, desktop app): Docker
Operating system: Raspbian

Jim_Le · January 9, 2025, 12:03pm

Hey @laschmu

The only difference is that the Vector Store Tool uses an LLM to summarise the documents before passing the response back to the agent. This means you don’t get the raw documents back as the tool response, which is why the agent is unable to see the metadata.
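
A rough way to picture the difference, as a sketch only and not n8n’s actual implementation: the document shape, the `fileName` key and both function names are assumptions, and the “summary” is a simple placeholder for the LLM call.

```typescript
// Assumed shape of a retrieved chunk (text plus metadata).
interface RetrievedDoc {
  pageContent: string;
  metadata: Record<string, string>;
}

// Placeholder for the summarising step: only free text leaves the tool.
// A real setup would call an LLM here instead of joining the text.
function summarisedToolResponse(docs: RetrievedDoc[]): string {
  return docs.map((d) => d.pageContent).join(" ");
}

// Raw response: each chunk is returned together with its metadata.
function rawToolResponse(docs: RetrievedDoc[]): string {
  return JSON.stringify(docs);
}

const docs: RetrievedDoc[] = [
  { pageContent: "Invoices are due within 30 days.", metadata: { fileName: "terms.pdf" } },
];

const summary = summarisedToolResponse(docs);
const raw = rawToolResponse(docs);
console.log(summary.includes("terms.pdf")); // false: metadata never reaches the agent
console.log(raw.includes("terms.pdf"));     // true: metadata is part of the response
```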

The second approach (“the custom workflow tool”) is fine, and there are no “real” cons other than (1) it obviously takes a lot more nodes and (2) you’ll get more execution logs, since subworkflows run in a separate execution. One nice thing about doing it this way is that you have much more freedom to manipulate the tool response before it goes back, e.g. handling empty responses or applying special filtering logic.
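
As an illustration of that freedom, here is a minimal sketch of the kind of post-processing a subworkflow could do before answering the agent. The `Match` shape, the `score` field, the `fileName` key and the 0.5 threshold are all assumptions for the example, not something n8n prescribes.

```typescript
// Assumed shape of one vector store match inside the subworkflow.
interface Match {
  text: string;
  score: number;
  metadata: { fileName?: string };
}

// Filter weak matches, handle the empty case, and keep the source visible.
function formatToolResponse(matches: Match[], minScore = 0.5): string {
  const relevant = matches.filter((m) => m.score >= minScore);
  if (relevant.length === 0) {
    return "No relevant documents were found.";
  }
  return relevant
    .map((m) => `[source: ${m.metadata.fileName ?? "unknown"}] ${m.text}`)
    .join("\n");
}

console.log(
  formatToolResponse([
    { text: "Invoices are due within 30 days.", score: 0.9, metadata: { fileName: "terms.pdf" } },
    { text: "Unrelated chunk.", score: 0.1, metadata: {} },
  ]),
);
```

Returning an explicit message for the empty case gives the agent something definite to work with instead of an empty tool response.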

You’ll be glad to know that this is changing in a big way in release 1.74. See the pull request “feat: Allow using Vector Stores directly as Tools” by mutdmour (Pull Request #12311, n8n-io/n8n on GitHub).

Regarding your second question, I wouldn’t recommend using the LangChain Code node in most scenarios, especially if you’re just starting out with vector stores. Stick with the custom workflow tool for now!

Ryan_Shillington · January 13, 2025, 4:40pm

I updated to 1.74.1 but I don’t understand how to use a Vector Store as a tool directly. I’m still only presented with “Vector Store Tool” as an option.

laschmu · January 13, 2025, 7:12pm

It is not possible to use it directly; you have to route it through a subworkflow, as in my example.

system · January 20, 2025, 7:13pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.