According to a few AIs I’ve been using to debug my system, the Agent node’s internal ReAct logic (LangChain) is allegedly “polluting” my instructions, causing the LLM to ignore my System Prompt’s tone and formatting rules. I have tried multiple architectures, prompts, and models, and the behavior appears consistently whenever the Agent node is used, while the Basic LLM node respects my prompt perfectly. However, I need an on-demand RAG tool, which the basic node doesn’t support natively.
It’s been suggested I switch to the direct OpenAI/Anthropic agent nodes, but their direct APIs do not meet my project’s legal compliance requirements, which force me to stay with Azure/GCP/AWS. How can I create an Agent to call tools (RAG) using Azure while maintaining 100% adherence to my System Prompt?
Is there a Community Node that handles Azure Tool Calling without the ReAct “bloat”?
If I use the AI Agent + Azure Chat Model, how do I force it into “Tools/Function Calling” mode to stop it from ignoring my instructions?
Or is it better to abandon the Agent node entirely and build a manual “Router” using only Basic LLM nodes?
The issue you’re facing is due to how the Agent node (ReAct / LangChain) works internally. It adds its own reasoning steps and hidden instructions, which can override or “pollute” your System Prompt. That’s why your tone and formatting rules are not being followed consistently, while the Basic LLM node works perfectly.
Best solution (recommended)
Instead of using the Agent node, build a manual RAG + LLM pipeline using Azure. This gives you full control and removes the ReAct interference completely.
How to structure it
1. User Input: receive the user query (Slack, webhook, etc.).
2. RAG Retrieval (your data source): query your vector database, documents, or API to fetch the relevant context.
3. Prepare the final prompt manually: combine everything into one structured prompt, for example:
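A minimal sketch of that assembly step as an n8n Code node (TypeScript-flavored; `$input.first()` is the Code node built-in, and the incoming field names `systemPrompt`, `retrievedContext`, and `userQuery` are assumptions about how your earlier nodes pass data):

```typescript
// Code node sketch: merge the system prompt, retrieved context, and user
// query into one prompt for the downstream Azure model node.
// Field names are illustrative; adapt them to your workflow.
const { systemPrompt, retrievedContext, userQuery } = $input.first().json;

const finalPrompt = [
  systemPrompt,                                   // your tone/formatting rules, verbatim
  "Relevant context (retrieved for this query):",
  retrievedContext,                               // filtered RAG output only
  "User question:",
  userQuery,
].join("\n\n");

return [{ json: { finalPrompt } }];
```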
Would that imply having 2 LLMs, one to write the queries and another to generate the final product?
The RAG is a legal repository, so it contains a lot of content I can’t dump onto an LLM without filtering it down to only what is needed. I was using it as an on-demand tool so the LLM could check whether the laws it intended to cite from its internal knowledge were still factually correct.
Would having 2 LLMs like this give precision and token usage equivalent to a single LLM calling the RAG as a tool?
Hi @luizedu, welcome!
This problem is very situation-based, and I understand the concern about data being leaked for no reason from the vector store during retrieval. What I would recommend is an architecture where two separate AI agents work on a specific piece of writing: the first one gets the data and writes the content, whatever it is, and the second agent verifies its output. If the output is not relevant or not actually linked to what you need, a text classifier defines the output stream and the process repeats until the second agent approves. This can be expensive if your requirements are very strict. Also, when working with vector databases, make sure you control what you inject: regulate what goes in as documents instead of dumping everything. I think this will work seamlessly if you follow this plan and use models with a large context window. Let me know how this goes.
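To make the control flow concrete, here is a rough TypeScript sketch of the loop those nodes implement (`writerAgent`, `verifierAgent`, and `classifyRelevance` are hypothetical stand-ins for the two agents and the text classifier, not real n8n APIs):

```typescript
// Hypothetical stand-ins for your n8n nodes:
declare function writerAgent(query: string, feedback: string): Promise<string>;
declare function verifierAgent(query: string, draft: string): Promise<string>;
declare function classifyRelevance(verdict: string): "approved" | "rejected";

// Writer/verifier loop: agent 1 retrieves and drafts, agent 2 checks the
// draft, and the classifier routes it until approved or rounds run out.
async function writeWithVerification(query: string, maxRounds = 3): Promise<string> {
  let feedback = "";
  for (let round = 0; round < maxRounds; round++) {
    const draft = await writerAgent(query, feedback);       // retrieve + write
    const verdict = await verifierAgent(query, draft);      // independent check
    if (classifyRelevance(verdict) === "approved") {
      return draft;
    }
    feedback = verdict; // feed the critique back into the next round
  }
  throw new Error("Draft not approved within the allowed rounds");
}
```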
Great discussion here. I ran into a similar issue building chatbots with Gemini on n8n - the ReAct agent kept ignoring my tone instructions despite a well-crafted system prompt.
The manual RAG + Basic LLM approach suggested above is exactly what I ended up using too. But one thing worth adding:
When you build the manual prompt in a Code node or Set node, try being very explicit about the prompt structure. Something like putting your system instructions as a clearly labeled block with a separator, then the context, then the user message. Models respond better when the sections are visually distinct, not just concatenated.
Also - one trick I found useful when using Gemini (and should work for Azure OpenAI too): repeat the key formatting instructions at the END of the prompt as well, right before the user message. Something like “Remember: [your key rules]”. This counteracts the tendency of models to weight earlier tokens less as the context grows.
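Combining both tips, the assembled prompt could look something like this (just a sketch: the `===` separators and section labels are arbitrary conventions, and the variables are placeholders for your workflow’s actual fields):

```typescript
// Assumed inputs, standing in for your workflow's actual fields:
const systemPrompt = "...";     // full tone/formatting rules
const retrievedContext = "..."; // output of the RAG retrieval step
const keyRules = "...";         // one-line restatement of the critical rules
const userQuery = "...";        // the user's message

// Visually distinct, labeled sections, with the key rules repeated
// right before the user message.
const prompt = `
=== SYSTEM INSTRUCTIONS ===
${systemPrompt}

=== RETRIEVED CONTEXT ===
${retrievedContext}

Remember: ${keyRules}

=== USER MESSAGE ===
${userQuery}
`.trim();
```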
For the Azure compliance requirement specifically - the Azure OpenAI node in n8n works well for this. It gives you full control to send a structured messages array (system + user roles), which bypasses the ReAct overhead entirely.
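In practice the request body is just the standard Chat Completions shape, something like this sketch (placeholders are illustrative):

```typescript
// Structured messages array: the system rules travel in their own message,
// with nothing interleaved by an agent layer.
declare const systemPrompt: string;     // your rules, untouched
declare const retrievedContext: string; // RAG output
declare const userQuery: string;        // the user's message

const body = {
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: `${retrievedContext}\n\n${userQuery}` },
  ],
  temperature: 0.2, // illustrative; tune for your use case
};
```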
Agent nodes sometimes change your instructions because they add their own reasoning, while basic model nodes follow your instructions exactly. To use Azure for retrieving documents or calling tools without altering your instructions, it’s best to avoid the Agent node. Instead, you can use a basic model node with function or tool calling, and if needed, add a simple router to decide which tool to use. This approach keeps your instructions fully respected.
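If you do want the model itself to decide when to hit the RAG, here is a rough sketch of direct tool calling against Azure OpenAI using the official `openai` SDK (the endpoint, deployment name, API version, and `searchLegalRepository` helper are all assumptions; adapt them to your setup). Your System Prompt goes in verbatim, with no agent scaffolding around it:

```typescript
import { AzureOpenAI } from "openai";

// Hypothetical stand-ins; replace with your own values/implementations:
declare const SYSTEM_PROMPT: string;  // your full tone/formatting rules
declare const userQuestion: string;   // the incoming user message
declare function searchLegalRepository(query: string): Promise<string>; // your RAG lookup

const client = new AzureOpenAI({
  endpoint: "https://<your-resource>.openai.azure.com", // placeholder
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  apiVersion: "2024-06-01", // pick the version your deployment supports
});

// One tool the model may call on demand to verify a legal claim.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "search_legal_repository",
      description: "Look up current statutes to verify a legal claim",
      parameters: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
  },
];

async function answer(): Promise<string | null> {
  const messages: any[] = [
    { role: "system", content: SYSTEM_PROMPT }, // sent verbatim, nothing injected
    { role: "user", content: userQuestion },
  ];

  const first = await client.chat.completions.create({
    model: "gpt-4o", // your Azure *deployment* name
    messages,
    tools,
  });

  const reply = first.choices[0].message;
  const call = reply.tool_calls?.[0];
  if (!call) return reply.content; // model answered directly, no lookup needed

  // The model asked to verify something: run the RAG lookup ourselves,
  // then send the result back for the final, prompt-faithful answer.
  const args = JSON.parse(call.function.arguments);
  const result = await searchLegalRepository(args.query);

  messages.push(reply, { role: "tool", tool_call_id: call.id, content: result });
  const second = await client.chat.completions.create({ model: "gpt-4o", messages, tools });
  return second.choices[0].message.content;
}
```

The “simple router” mentioned above is then just a switch on `call.function.name` once you register more than one tool.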
the Agent node’s ReAct logic adds its own reasoning steps that can override your System Prompt. we hit this too when trying to enforce strict tone/formatting — the LLM keeps ignoring the rules because Agent is injecting its own instructions. best solution: build a manual RAG + Azure LLM pipeline instead. skip the Agent node entirely, use Basic LLM node with your System Prompt + a separate retrieval step. gives you full control and removes the ReAct pollution. for Azure compliance, this pattern works great — seen teams do exactly this to keep their System Prompts intact.