How to include metadata when inserting a PDF into a Supabase vector store?

I’ve built a system where users can upload files (PDFs) that I later want to search via an AI Agent that I’m chatting with. Each uploaded PDF should belong to a specific user or entity, so I need to pass this information along with the file when it’s processed and put into my vector store on Supabase.

When the user uploads the file, I store it in a Supabase bucket and then send the URL, together with the identification information, to a webhook on n8n. The file is downloaded as binary data and then passed to the Supabase node. So far so good.

But when the file is processed by the text splitter, all metadata is stripped and all I’m left with is the text content of the PDF. The file gets correctly split, vectorised and added to the vector store, but the metadata is nowhere to be found.

So the question is: how can I pass along metadata that will also be inserted into the rows of the vector table, so that the AI agent can find the correct information?
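To make the goal concrete: the splitter only produces plain text chunks, so the per-file metadata has to be copied onto every chunk before it is embedded and inserted. Below is a minimal Python sketch of that idea. It is not how n8n implements it; the fixed-size character splitter, the names `Chunk`, `split_text` and `chunks_with_metadata`, and the `user_id` key are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One piece of the document plus the metadata that should travel with it."""
    content: str
    metadata: dict = field(default_factory=dict)


def split_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Naive fixed-size character splitter (a stand-in for a real text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def chunks_with_metadata(text: str, metadata: dict,
                         chunk_size: int = 200, overlap: int = 20) -> list[Chunk]:
    """Split the text, then copy the file-level metadata onto every chunk."""
    pieces = split_text(text, chunk_size, overlap)
    return [
        Chunk(content=piece, metadata={**metadata, "chunk_index": index})
        for index, piece in enumerate(pieces)
    ]


if __name__ == "__main__":
    # Hypothetical identification values sent along with the file URL.
    file_metadata = {"user_id": "u1", "document_type": "pdf"}
    for chunk in chunks_with_metadata("a" * 450, file_metadata):
        print(len(chunk.content), chunk.metadata)
```

Each chunk carries its own copy of the metadata, so whatever inserts the rows later still has the identity information available.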

The table has a NOT NULL constraint on the identity fields, so the database insert fails because it’s not receiving any values for these fields, and it throws this error:

Problem in node ‘Upload Knowledge base’

Error inserting: null value in column “document_type” of relation “documents” violates not-null constraint 400 Bad Request

Also, there’s no way to map a value directly to a column in the table, so am I just betting that giving the metadata key the same name as the column will make it end up in that column?
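On the column-mapping point: as far as I know, LangChain-style Supabase vector stores write each row as `content`, `embedding` and a single JSON `metadata` object, so a metadata key named `document_type` would land inside `metadata` rather than in a `document_type` column. That is an assumption about the node’s behaviour, not something confirmed in this thread. The Python sketch below only contrasts the two row shapes; `nested_row` and `promoted_row` are hypothetical helpers, not part of n8n or Supabase.

```python
def nested_row(content: str, embedding: list[float], metadata: dict) -> dict:
    """Row shape where all metadata lives in one JSON column (assumed default)."""
    return {"content": content, "embedding": embedding, "metadata": dict(metadata)}


def promoted_row(content: str, embedding: list[float], metadata: dict,
                 columns: list[str]) -> dict:
    """Row shape a table with dedicated NOT NULL columns would need.

    The keys listed in `columns` are lifted out of the metadata and placed at
    the top level; a missing value is rejected here instead of by the database.
    """
    missing = [name for name in columns if metadata.get(name) is None]
    if missing:
        raise ValueError(f"missing required metadata: {missing}")
    remaining = {k: v for k, v in metadata.items() if k not in columns}
    row = nested_row(content, embedding, remaining)
    row.update({name: metadata[name] for name in columns})
    return row


if __name__ == "__main__":
    # Hypothetical metadata; only document_type appears in the error above.
    meta = {"document_type": "pdf", "user_id": "u1", "source": "file.pdf"}
    print(nested_row("hello", [0.1, 0.2], meta))
    print(promoted_row("hello", [0.1, 0.2], meta, ["document_type", "user_id"]))
```

If the insert really uses the nested shape, the NOT NULL column never receives a value regardless of how the metadata key is named, which would match the error above.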

I’m running on n8n cloud.



The best RAG solution I’ve seen is from Cole Medin, on YouTube:

In the third video, I think, he uses a method for referencing the sources.

You’ll find the answer to your question in the part of the workflow that creates the vector store, which pulls the files from Google Drive and stores them in Supabase.

He stores metadata, and in the third version of the workflow he also stores the URL of the file.

Additionally, he has implemented many improvements over normal RAG functionality that make this RAG solution well above average.

His workflows are available on his GitHub.

In every video he links everything in the description.

I hope this helps! It’s a deep dive into the RAG topic, but I assure you it’s worth it!

