Passing a large dataset to an OpenAI node - Vector store

Describe the problem/error/question

Here is my use case. I’m crawling a large website and storing all the data in a vector database (Pinecone). I then want to run an OpenAI agent on that dataset. For the purposes of this example, let’s assume I want to ask the OpenAI agent to create certain technical how-to steps based on reading the dataset.

If I use an agent with the Pinecone database as a tool, it works, except for the size limitation:

The challenge is that I can only pass a certain amount of data to the OpenAI agent.
In the vector store node, I specify a limit on the number of results to return. If that number is too high, the results exceed the amount of data the OpenAI agent can accept.

What is the error message (if any)?

Too much data is passed to the OpenAI agent node if I set the limit too high in the Pinecone vector store node.

If I don’t set a high limit, the OpenAI agent doesn’t process the entire dataset.

How can I instruct the OpenAI agent to iterate through the dataset, so that I pass the maximum allowed amount of data in each iteration and the agent eventually goes through the entire dataset to complete its task? Right now it’s a single execution, bounded by the result limit set in the vector store node.
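
To make the iteration being asked about concrete, here is a minimal sketch in plain Python, outside of n8n. It packs text chunks into batches that stay under a token budget and calls the model once per batch. Everything here is an assumption for illustration: `estimate_tokens` uses a rough 4-characters-per-token heuristic, and `ask_model` is a hypothetical callable standing in for whatever sends a prompt to the OpenAI agent.

```python
from typing import Callable, Iterable, Iterator, List


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def batch_by_token_budget(chunks: Iterable[str], max_tokens: int) -> Iterator[List[str]]:
    """Yield lists of chunks whose estimated token total stays within max_tokens."""
    batch: List[str] = []
    used = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        # Close the current batch if adding this chunk would exceed the budget.
        if batch and used + cost > max_tokens:
            yield batch
            batch, used = [], 0
        # A single chunk larger than the budget still ends up alone in a batch.
        batch.append(chunk)
        used += cost
    if batch:
        yield batch


def process_in_batches(
    chunks: Iterable[str],
    max_tokens: int,
    ask_model: Callable[[str], str],  # hypothetical: sends one prompt, returns one answer
) -> List[str]:
    """Call ask_model once per batch and collect the partial answers."""
    return [
        ask_model("\n\n".join(batch))
        for batch in batch_by_token_budget(chunks, max_tokens)
    ]
```

The partial answers would still need a final pass to merge them into one result, and that final pass is subject to the same token limit.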

Same exact question!
It makes me wonder whether I should train a model and self-host it, or keep trying to raise this dataset limit when using the vector database node…

The model has a token limit, and there’s not much we can do about it.

But you can also have a multi-agent workflow that splits the task into chunks:

  • Agent 1 searches and writes the topics
  • Agent 2 does topic A
  • Agent 3 does topic B

And so on.
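
The split described above can be sketched roughly as follows, in plain Python rather than as n8n nodes. All three callables are hypothetical placeholders: `plan_topics` stands in for Agent 1, `retrieve` for a vector store query limited to `top_k` results, and `write_section` for the per-topic agents. Because each topic triggers its own small retrieval, no single call needs the whole dataset.

```python
from typing import Callable, Dict, List


def run_multi_agent(
    task: str,
    plan_topics: Callable[[str], List[str]],         # hypothetical "Agent 1"
    retrieve: Callable[[str, int], List[str]],       # hypothetical vector store query
    write_section: Callable[[str, List[str]], str],  # hypothetical per-topic agent
    top_k: int = 5,
) -> Dict[str, str]:
    """Plan topics once, then handle each topic with its own limited retrieval."""
    sections: Dict[str, str] = {}
    for topic in plan_topics(task):
        # Each topic only pulls the top_k chunks relevant to it,
        # which keeps every individual model call under the token limit.
        context = retrieve(topic, top_k)
        sections[topic] = write_section(topic, context)
    return sections
```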

What are you trying to achieve that would require the agent to retrieve 100% of the vector store?
