AI Agent generates answer before using Pinecone Vector Store (RAG order seems reversed)

Describe the problem/error/question

I’m using n8n AI Agent with Pinecone Vector Store as a Tool for a RAG workflow.

What I am experiencing is very strange and seems opposite to how RAG should work.

Here is the exact sequence I observe from the logs (using “Return Intermediate Steps”):

  1. The LLM reads the news article input

  2. The LLM starts generating an answer and even says something like “no relevant past records found”

  3. Only after that, the Pinecone Vector Store tool is called

So the order is effectively:

Generate → then Search

instead of:

Search → then Generate

Because of this, the LLM makes a decision without having seen the Pinecone search results.

The logs clearly show:

  • First, the model produces reasoning/output

  • Then the tool call to Pinecone happens

  • The search result comes too late to influence the decision

This makes the RAG workflow unreliable, since the vector search is not used during the reasoning step.

I expected the AI Agent to:

  1. Call Pinecone first

  2. Use the retrieved context

  3. Then generate the decision

But the actual behavior is the opposite.

Is this expected behavior of AI Agent + Vector Store Tool?
Or am I missing some configuration that forces the Agent to use the vector search before generating?

Would appreciate any clarification.

What is the error message (if any)?

Please share your workflow

(Select the nodes on your canvas and use the keyboard shortcuts CMD+C/CTRL+C and CMD+V/CTRL+V to copy and paste the workflow.)

Share the output returned by the last node

Information on your n8n setup

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

Hey @Parksinwoo, welcome to the n8n community!

The AI Agent works as designed: it dynamically decides when to call tools like Pinecone. This can lead to a “Generate → then Search” order. For a guaranteed “Search → then Generate” order, you should use a different workflow structure.

Here is the solution:
Build a deterministic RAG chain instead of using the AI Agent with a tool.

  1. Use a “Question and Answer Chain” node as your main LLM node.
  2. Connect its “Retriever” input to a “Vector Store Retriever” node.
  3. Connect that retriever to your Pinecone Vector Store node (set to “Retrieve Documents” mode).

This chain will always search Pinecone first and pass the results to the LLM, giving you the reliable, fixed-order workflow you expect.
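The fixed-order chain can be sketched in plain Python. This is a conceptual sketch only, not n8n internals: `search_pinecone` and `call_llm` are hypothetical stand-ins, stubbed here so the retrieve-then-generate flow is visible.

```python
# Conceptual sketch of the deterministic "Search -> then Generate" chain.
# search_pinecone and call_llm are hypothetical stand-ins for the real
# Pinecone query and LLM call; both are stubbed for illustration.

def search_pinecone(question: str, top_k: int = 3) -> list[str]:
    # Stand-in for a Pinecone similarity search ("Retrieve Documents" mode).
    corpus = {
        "refund policy": "Refunds are processed within 14 days.",
        "shipping": "Orders ship within 2 business days.",
    }
    return [text for key, text in corpus.items() if key in question.lower()][:top_k]

def call_llm(prompt: str) -> str:
    # Stand-in for the chat-model call in the Q&A Chain node.
    return f"Answer based on context:\n{prompt}"

def qa_chain(question: str) -> str:
    # Step 1: retrieval ALWAYS happens first...
    docs = search_pinecone(question)
    # Step 2: ...the retrieved context is injected into the prompt...
    context = "\n".join(docs) if docs else "No relevant records found."
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    # Step 3: ...and only then does the model generate.
    return call_llm(prompt)

print(qa_chain("What is your refund policy?"))
```

The key point is structural: the model's input is built *after* the search returns, so there is no way for generation to run ahead of retrieval.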

Hello,

This is expected behavior for an AI agent: it always reasons first, and only then decides whether it needs to call the RAG database (Pinecone, in this case). That is what produces the Generate → then Search order you are seeing.
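To see why the reasoning comes first, here is a rough sketch of a tool-calling agent loop. The names are illustrative, not n8n's actual implementation: the model emits text on every step and *may* attach a tool request, so its first generation can land before any search runs.

```python
# Minimal tool-calling agent loop (illustrative; not n8n's actual code).
# The model's first output may already contain an answer, and the tool
# (e.g. a Pinecone search) is only invoked if the model requests it.

def model_step(messages: list[str]) -> dict:
    # Stand-in for the LLM: it "reasons" first, then may request a tool.
    if not any(m.startswith("TOOL:") for m in messages):
        return {"text": "No relevant past records found so far.",
                "tool_call": "pinecone_search"}  # requested AFTER generating text
    return {"text": "Final answer using the retrieved records.",
            "tool_call": None}

def pinecone_search(query: str) -> str:
    # Stand-in for the Pinecone Vector Store tool.
    return "3 similar past articles found."

def run_agent(question: str) -> list[str]:
    messages = [question]
    trace = []
    while True:
        step = model_step(messages)
        trace.append(f"GENERATE: {step['text']}")  # generation happens first
        if step["tool_call"] is None:
            return trace
        result = pinecone_search(question)         # search happens second
        trace.append(f"SEARCH: {result}")
        messages.append(f"TOOL: {result}")

print(run_agent("Is this news article similar to past records?"))
```

Note the trace always starts with a GENERATE entry; the SEARCH entry only appears if the model chose to call the tool. That matches the intermediate-steps log in the original question.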

To make it search the database every time, run the vector search explicitly: call the Vector Store node/tool in document-retrieval mode first, inject the retrieved documents into the prompt, and only then call the LLM, so it does its reasoning with the context already in place.

The agent-plus-tool pattern is best reserved for cases where retrieval is optional, so the model can decide whether the search is actually necessary.