N8N tool calling sucks

N8N doesn’t work well with tools. Sometimes it simply doesn’t call them, for no obvious reason. When I test the same system prompt in OpenRouter, for example, the function call appears regardless of which LLM I use, but n8n starts ignoring it after a few loops. If you have 10 questions to ask, say, and each answer should trigger a function call to write to a spreadsheet, by the third or fourth answer it no longer calls the tool. It seems to get lost. I’ve tried EVERYTHING: I’ve rewritten the prompt several times, rebuilt it with the help of several LLMs, and I’ve come to the conclusion that the problem is in N8N. I tested Dify and it worked first time, with every call and every iteration. N8N was not made for this; very weak.

Other failed attempts:

I have already tried creating an MCP server, describing the tool in more detail, changing the structure of the agent’s output, and injecting something into the user input.
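
For context, this is roughly the loop I’m describing, sketched directly against an OpenAI-compatible chat completions endpoint. The model name, tool name, and hard-coded “answers” below are placeholders for illustration, not my real workflow:

```typescript
// Rough sketch only: a multi-turn loop where every user answer should
// trigger a tool call. Model, tool name, and answers are placeholders.
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: { id: string; type: string; function: { name: string; arguments: string } }[];
  tool_call_id?: string;
};

const tools = [
  {
    type: "function",
    function: {
      name: "write_spreadsheet_row", // placeholder tool
      description: "Append one question/answer pair to the spreadsheet",
      parameters: {
        type: "object",
        properties: {
          question: { type: "string" },
          answer: { type: "string" },
        },
        required: ["question", "answer"],
      },
    },
  },
];

async function chat(messages: Message[]): Promise<Message> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "gpt-4.1", messages, tools }),
  });
  const data = await res.json();
  return data.choices[0].message;
}

async function main() {
  const messages: Message[] = [
    {
      role: "system",
      content:
        "Ask the user 10 questions, one at a time. After every answer, call write_spreadsheet_row with the question and the answer.",
    },
  ];

  for (let turn = 1; turn <= 10; turn++) {
    messages.push({ role: "user", content: `placeholder answer ${turn}` });
    const reply = await chat(messages);
    messages.push(reply);

    // The interesting part: does the model still emit a tool call this turn?
    console.log(`turn ${turn}: tool_calls = ${reply.tool_calls?.length ?? 0}`);

    for (const call of reply.tool_calls ?? []) {
      // Feed back a dummy tool result so the conversation can continue.
      messages.push({ role: "tool", tool_call_id: call.id, content: "ok" });
    }
  }
}

main();
```

Driven directly like this, the function call keeps showing up every turn; inside n8n it stops after the third or fourth answer.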

Are you using a memory node?
I sometimes find that when the context window becomes too big, the agent becomes unreliable with tool calls.
You can check in the logs whether it’s building up a massive input.
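
To illustrate what I mean by the window, here is a rough sketch (not n8n’s actual memory code) of what a windowed memory does:

```typescript
// Keep the system prompt plus only the last few messages, so the input
// sent to the model stops growing on every loop.
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string };

function trimToWindow(history: Msg[], maxMessages = 10): Msg[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-maxMessages)];
}

// With 40 accumulated messages, only the system prompt + the last 10 get sent.
```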

Yes, I am using memory, but by my 5th or 6th response it no longer calls the tool. What bothers me is not being able to see what the LLM actually returned; that isn’t visible. And the memory being large shouldn’t affect this. Is this a bug in N8N?

Thanks for the reply.

I’ve found the same issue. GPT-4.1 frequently fails, Sonnet-4 is more reliable, and the new Kimi K2, which is supposed to be one of the best tool-calling models in the world, works maybe 1 time in 10. These models work nearly 100% of the time in Cursor. It’s clearly an n8n issue, not a model or context window issue.