I’m trying to build a flow in n8n where I use an AI agent (like OpenAI) to help generate some structured data, for example a list of job positions in a department.
But I don’t want the agent to just make stuff up. I already have a master list (could be in Google Sheets or a DB) that has valid data. So ideally, I want the agent to look at that list, compare it with the input (like “HR department” + a short description), and return only the valid positions from the master.
For example:
Input: Department = “HR”, Description = “deals with internal people stuff”
Output: Only positions from the HR section of the master list (e.g., HR Manager, Recruiter, etc.)
Not sure what’s the best approach here. Should I just get all the master data and feed it into the prompt? Or maybe use a vector store? Or something else entirely?
Would love to see if anyone’s done something like this. Open to ideas or rough direction. Thanks in advance!
You should use a vector store with embeddings for this. With embeddings, the agent can retrieve the most similar items from the vector store and use them when writing its output. Give me a min and I’ll upload an example here.
Here’s an example workflow that moves data from Google Sheets into Supabase, then has an agent use the vector store for information.
If this response helped you, please click the heart to show that it is useful. If this response solved your issue, mark it as the solution to help the community.
Since vector stores usually chunk the data, I’m wondering, when retrieving info, will it bring in all the relevant context properly?
Like, what if the info is split across chunks or isn’t stored together nicely? Will it still work reliably?
Feels a bit different from pulling clean, structured lists from something like Google Sheets or a database, where you know the whole row or group comes together.
The workflow is built to store the data in two ways.
First, it stores each row of the Google Sheet exactly as it is, as a document row. The AI Agent has a tool to query those document rows: it writes SQL and gets back the actual rows stored in Supabase, so a whole row always comes back together.
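To make the exact-row lookup concrete, here is a rough sketch of the idea, not the workflow itself. It assumes a hypothetical table `master_positions` with `department` and `position` columns, uses Python's built-in `sqlite3` as a stand-in for Supabase, and the sample rows are made up for illustration. The second helper filters whatever the agent proposes against the rows that came back, so nothing outside the master list can reach the output.

```python
import sqlite3


def valid_positions(conn, department):
    """Return the master-list positions stored for one department."""
    rows = conn.execute(
        "SELECT position FROM master_positions "
        "WHERE department = ? ORDER BY position",
        (department,),
    ).fetchall()
    return [row[0] for row in rows]


def keep_only_valid(candidates, allowed):
    """Drop anything the model proposed that is not in the master list.

    Matching is case-insensitive; the master list's spelling is returned.
    """
    allowed_lookup = {p.lower(): p for p in allowed}
    return [
        allowed_lookup[c.lower()]
        for c in candidates
        if c.lower() in allowed_lookup
    ]


# Hypothetical master data (stand-in for the rows stored in Supabase).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_positions (department TEXT, position TEXT)")
conn.executemany(
    "INSERT INTO master_positions VALUES (?, ?)",
    [("HR", "HR Manager"), ("HR", "Recruiter"), ("Finance", "Accountant")],
)

allowed = valid_positions(conn, "HR")

# Pretend this list came back from the agent; the last item is invented.
agent_output = ["Recruiter", "hr manager", "Chief Happiness Officer"]
print(keep_only_valid(agent_output, allowed))
```

The same check could sit in a step after the agent, so the final list stays a subset of the master data even when the model drifts.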
In addition, the Google Sheet is chunked and embedded in the documents table as well. The AI Agent can reference the chunked embeddings as additional data to support the rows returned by the query.
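For the embedding side, this is roughly what "similar items" means. It is a toy sketch only: the three-number vectors below are invented for illustration, real embeddings come from an embedding model and have many more dimensions, and in practice the vector store does this ranking for you. The names `cosine_similarity` and `top_chunks` are hypothetical helpers, not part of n8n or Supabase.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(query_vec, chunks, k=2):
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return [c["text"] for c in ranked[:k]]


# Toy data: one chunk per sheet row, with made-up embeddings.
chunks = [
    {"text": "HR | Recruiter", "embedding": [0.9, 0.1, 0.0]},
    {"text": "HR | HR Manager", "embedding": [0.8, 0.2, 0.1]},
    {"text": "Finance | Accountant", "embedding": [0.1, 0.9, 0.2]},
]

# Made-up embedding for a query like "deals with internal people stuff".
query_vec = [1.0, 0.0, 0.0]
print(top_chunks(query_vec, chunks))
```

Keeping one chunk per row, as in the toy data, is one way to avoid a row being split across chunks. Either way, the SQL rows stay the source of truth and the embeddings only add supporting context.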