Hi,
I am rather new to n8n; I've been using the cloud option for a couple of months now, and I LOVE IT!
Most of the automation/AI ideas I have in mind involve creating solutions that combine RAG with relational database knowledge for organizations.
When researching workflows, I came across the Postgres node, which is brilliant. However, I'm confused about how it works.
On a small table, the node seems to operate on the entire dataset; at least, that's what I see when I check the input/output JSON of the various nodes after an execution: ALL rows are being passed around. How would this work on a large table?
Is there a difference between wanting an AI model to really understand underlying trends in the data, by processing a large chunk of it before providing feedback, versus ad-hoc queries that target specific rows and information?
I've seen videos on YouTube that show amazing AI agent capabilities with SQL databases, like identifying general trends within a table, but I can't see how that works on a large table. Would the agent need to process the entire dataset?
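For example, I imagine the alternative being a targeted or aggregate query, where the database does the heavy lifting and the agent only sees a small summary. A minimal sketch of what I mean (the `orders` table and its columns are just made-up examples):

```typescript
// Hypothetical example of an "ad-hoc / targeted" query, using node-postgres.
// The "orders" table and its columns are made up purely for illustration.
import { Client } from "pg";

async function monthlyTrend(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();

  // The database does the aggregation, so only a handful of summary rows
  // come back instead of the whole table.
  const { rows } = await client.query(`
    SELECT date_trunc('month', created_at) AS month,
           count(*)                        AS order_count,
           sum(total)                      AS revenue
    FROM orders
    GROUP BY 1
    ORDER BY 1
  `);

  console.log(rows); // a few summary rows an LLM could easily reason over
  await client.end();
}

monthlyTrend().catch(console.error);
```

But I'm not sure whether that is the right mental model for how these agents actually work.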
I know I am so confused that I can't even ask the question clearly, but I am looking for a scalable, workable way of plugging agents into databases that works in real life. Any advice here is appreciated…
Thanks
Ashraf
It depends on the data, but most people would extract it into RAG-based storage for AI agent consumption.
Thank you @King_Samuel_David …
But do you mean that the entire content of a relational database, or say a few important tables, should be converted into text and loaded into a RAG vector database? Isn't that challenging?
The issue is that loading all the SQL data into n8n can be execution-heavy, so batching or breaking the data down is best.
Your setup (CPU/RAM on the server) determines how much you can load without seeing performance degradation.
So normally you grab only the required data. If you want to use an AI agent on embedded data, a lot of people use Supabase, but Pinecone and Qdrant are also available depending on scaling needs, performance, etc.
You could have one AI agent that retrieves the SQL data and stores it in RAG if the data is big, then another AI agent that accesses it; or, if the data is small, you can just feed the SQL results straight back into the LLM. With large data that gets slower, crashes more often, hits token limits, etc.
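Roughly, the batching + embedding idea could look like the sketch below outside of n8n (the table names, the `documents` schema, the batch size and the embedding model are just assumptions to illustrate; inside n8n you would build the same flow with a Postgres node feeding your embeddings and vector store nodes):

```typescript
// Rough sketch only: pull rows from Postgres in batches, embed each batch,
// and insert them into a Supabase table with a pgvector column.
// "source_table", "documents", the batch size and the model are example choices.
import { Client } from "pg";
import { createClient } from "@supabase/supabase-js";
import OpenAI from "openai";

const BATCH_SIZE = 500;

async function syncToVectorStore(): Promise<void> {
  const pg = new Client({ connectionString: process.env.DATABASE_URL });
  const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

  await pg.connect();

  for (let offset = 0; ; offset += BATCH_SIZE) {
    // Only one batch of rows is held in memory at a time.
    const { rows } = await pg.query(
      "SELECT * FROM source_table ORDER BY id LIMIT $1 OFFSET $2",
      [BATCH_SIZE, offset]
    );
    if (rows.length === 0) break;

    // One text per row, so each row becomes exactly one chunk/vector.
    const texts = rows.map((r) => JSON.stringify(r));
    const { data } = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: texts,
    });

    const { error } = await supabase.from("documents").insert(
      rows.map((r, i) => ({
        content: texts[i],
        embedding: data[i].embedding,
        metadata: { source: "source_table", row_id: r.id },
      }))
    );
    if (error) throw error;
  }

  await pg.end();
}

syncToVectorStore().catch(console.error);
```

The key point is that only one batch of rows is ever in memory, and each row ends up as its own vector in the store.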
Hope this helps
Thank you.
I actually tried querying the database and embedding the results into a vector database in Supabase. The problem is that I am not able to get this workflow to work properly because I'm new to this.
I need every row from the database to be its own chunk in the vector database. I am not sure how to do that!
I also don't know how to configure the embeddings node and, most importantly, the text splitter node with a proper chunk size, since I want a complete row as a chunk, and rows can vary in size.
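What I'm picturing is something like the sketch below (just the row-to-text step; the column names are invented), where each row is serialised into one document, so no splitter and no fixed chunk size would be needed:

```typescript
// Sketch of turning one database row into one text chunk, so a text
// splitter isn't needed at all. Column names are invented examples.
type Row = Record<string, unknown>;

function rowToChunk(row: Row): string {
  // "column: value" per line keeps the chunk readable for the LLM
  // and keeps one row = one chunk, whatever its length.
  return Object.entries(row)
    .map(([column, value]) => `${column}: ${value ?? ""}`)
    .join("\n");
}

// Example
const example: Row = { id: 42, customer: "Acme", total: 199.5, status: "paid" };
console.log(rowToChunk(example));
// id: 42
// customer: Acme
// total: 199.5
// status: paid
```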
Could anyone guide me through the setup for something like this?
Thanks
Ashraf