[BUG?] AI Agent repeats old Qdrant tool calls when Memory is enabled

Describe the problem/error/question

When I enable Simple Memory (memoryBufferWindow) or PostgreSQL Chat Memory in my AI Agent workflow, the agent sometimes repeats old Qdrant tool calls from previous messages — even when they are not relevant to the current user input.
I see entries like this in the execution log:

Calling Qdrant_Vector_Store with input: {"input":"<old query from previous message>","id":"389d137c-..."}

This does NOT happen when Memory is disabled.

Steps to reproduce

  1. Set up AI Agent with Ollama LLM + Qdrant as tool + memoryBufferWindow
  2. Send a first message → agent queries Qdrant correctly
  3. Send a second, unrelated message → agent replays the Qdrant tool call from message 1

Expected behavior
The agent should only call Qdrant based on the current user input, not repeat tool calls from memory.

Actual behavior
The agent replays old tool calls stored in the memory buffer, leading to wrong or confusing answers.

My theory
It seems like the memory buffer stores not just Human/AI messages but also the tool call history. When the LLM sees these old tool calls in context, it imitates them instead of generating a fresh one.

Is there a way to store only Human/AI messages in memory and exclude tool call history?

Please share your workflow

Share the output returned by the last node

Calling Qdrant_Vector_Store with input: {"input":"","id":"389d137c-…"}

Information on your n8n setup

  • n8n version: 2.11.4
  • Database: Qdrant
  • LLM: Ollama – ministral-3:8b-instruct-2512-q4_K_M
  • Memory: memoryBufferWindow (contextWindowLength: 3)
  • Vector Store: Qdrant (retrieve-as-tool, topK: 10)
  • Embeddings: Ollama – embeddinggemma:latest

Hi @Wisam_Faiun Welcome!
This is known. You can try the Chat Memory node's delete-messages option so that it automatically strips the tool-call entries that aren't needed. Also, memory does not need to be persistent across multiple agents — try removing memory from your sub-agent — and make sure your system prompt is written to handle these tasks.

Great fixes in this thread. One addition for those who can’t change the model: if you switch to PostgreSQL Chat Memory, you can insert a Code node in the memory pipeline to filter out tool type messages and only pass human and ai messages to the LLM context. It’s more setup than the system prompt workaround, but it solves the root cause — the model never sees old tool calls in its context window at all.
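As a rough illustration of that Code-node filter (not an official n8n snippet — the `type` field name is an assumption; adapt it to whatever your PostgreSQL Chat Memory schema actually stores):

```javascript
// Hypothetical helper for a Code node inserted into the memory pipeline.
// Assumes history entries shaped like
//   { type: 'human' | 'ai' | 'tool', content: string }
// — adjust field names to match your actual chat-memory schema.
function keepConversationOnly(messages) {
  // Drop tool-call entries so the LLM never sees stale Qdrant queries.
  return messages.filter((m) => m.type === 'human' || m.type === 'ai');
}
```

In an n8n Code node you would apply this to the items holding your history before they reach the agent, so only human/ai turns land in the context window.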

Do you always tag yourself in your answers and speak in the third person? lol

Hey everyone, following up on my earlier post about the AI Agent repeating old Qdrant tool calls when Memory is enabled.

I tried both suggested fixes:

Fix 1 – System Prompt Constraint
I added the instruction in both English and German. Translated, the German version reads:

IMPORTANT INSTRUCTION REGARDING CONVERSATION HISTORY:
The conversation history contains past tool calls for reference only.
Evaluate every new user question completely independently of the previous history.
NEVER repeat tool calls from earlier messages unless the current user question explicitly requires it.

Result: Bug still occurs. The agent still replays old Qdrant tool calls from memory.

Fix 2 – Model Switch to qwen2.5:7b
I switched from ministral-3:8b-instruct-2512-q4_K_M to qwen2.5:7b via Ollama.
Result: New problem. The AI Agent no longer accesses the Qdrant Vector Store at all. It seems to ignore the tool completely and answers without retrieving any knowledge base data.


Hi @Wisam_Faiun welcome to the n8n community!

I’d reduce the memory window further, test with memory disabled vs. enabled exactly as you did, and try the same workflow with another supported chat model to separate an n8n issue from Ollama/model behavior.

If you can, please share the workflow JSON.

good debugging @Wisam_Faiun — two things explain what you’re seeing:

why Fix 1 didn’t work: contextWindowLength: 3 means the buffer holds 3 turns = 6 messages (human + ai) + all tool call history from those turns. with only 3 turns the old tool calls are still very close in context, so the model sees them and imitates. try dropping to contextWindowLength: 1 as a test — if the repeating stops, the window size is the culprit.
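to see why a window of 3 keeps old tool calls in play, here's a rough sketch (not n8n's actual implementation) of how a buffer-window memory trims by turns rather than by message type — every tool entry inside a kept turn survives the trim:

```javascript
// Rough sketch of buffer-window trimming, NOT n8n's real code.
// A "turn" starts at a human message and includes everything the agent
// produced in response — tool calls included — so tool entries are kept
// alongside the ai reply for every retained turn.
function trimToWindow(history, contextWindowLength) {
  // Indexes of all human messages, i.e. the starts of turns.
  const humanIndexes = history
    .map((m, i) => (m.type === 'human' ? i : -1))
    .filter((i) => i !== -1);
  // Keep everything from the first of the last N turns onward.
  const keepFrom = humanIndexes.slice(-contextWindowLength)[0] ?? 0;
  return history.slice(keepFrom);
}
```

with `contextWindowLength: 3` and only a couple of turns so far, the trim keeps everything — old Qdrant calls included — which is why dropping to 1 is a useful isolation test.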

why qwen2.5:7b ignores the tool: your system prompt was written in German. qwen2.5:7b has much weaker instruction-following in non-English languages, especially for tool call decisions. keep the Qdrant tool description in English even if the rest of your workflow is German — the model reads the tool description to decide whether to call it. something like: “Search the knowledge base for relevant information based on the user’s query. Always call this tool when the user asks a question that requires factual information.”

also make sure your Qdrant tool’s Tool Name field is filled in — qwen can silently skip tools with empty descriptions or names.


[UPDATE 2] Progress made but qwen2.5:7b still ignores Qdrant tool

I applied all the suggested fixes:

  • Translated the full system prompt to English
  • Kept the Qdrant tool description in English
  • Reduced contextWindowLength to 1
  • Added explicit tool usage instruction in the system prompt
  • Temperature stays at 0

Results:

ministral-3:8b – Good news here! The repeating-tool-call bug is gone. No more replaying of old Qdrant calls from memory. However, I'm keeping an eye on it to see if it stays stable.

qwen2.5:7b – Still not calling the Qdrant Vector Store at all. The model answers questions completely from its own knowledge without ever touching the tool, even though the tool description explicitly says to always use it.


Has anyone successfully got qwen2.5:7b to reliably call tools via Ollama in n8n’s AI Agent? Is there a specific way the tool needs to be configured or named for qwen to pick it up?

@Wisam_Faiun with qwen2.5:7b the Ollama model tag matters a lot — qwen2.5:7b sometimes pulls a non-instruct quantized variant without reliable tool-call support. try pulling qwen2.5:7b-instruct explicitly in the Ollama node model field. if it still skips the tool, llama3.1:8b is the most reliable option for tool calling via Ollama in n8n at this size class.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.