I have a self-hosted n8n instance using the Azure OpenAI chat model for my AI agent, but for some of the workflows the chat model fails intermittently with a connection error.
Error:
2026-02-25T19:19:37.955Z | debug | Workflow execution finished with error {"error":{"level":"warning","tags":{"reWrapped":true},"timestamp":1772047177953,"context":{},"functionality":"regular","name":"NodeOperationError","node":{"parameters":{"aiAgentStarterCallout":"","promptType":"auto","text":"={{ $json.chatInput }}","hasOutputParser":false,"needsFallback":false,"options":{"systemMessage":"**System Instructions:** \nprompt"}},"type":"@n8n/n8n-nodes-langchain.agent","typeVersion":3.1,"position":[560,0],"id":"id","name":"AI Agent"},"messages":["Connection error."],"message":"Connection error.","stack":"NodeOperationError: Connection error.\n at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/@n8n+n8n-nodes-langchain@file+packages+@n8n+nodes-langchain_25556d2df217b3ecea67b49c8dcb0d5a/node_modules/@n8n/n8n-nodes-langchain/nodes/agents/Agent/agents/ToolsAgent/V3/helpers/executeBatch.ts:113:11\n at Array.forEach ()\n at executeBatch (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/@n8n+n8n-nodes-langchain@file+packages+@n8n+nodes-langchain_25556d2df217b3ecea67b49c8dcb0d5a/node_modules/@n8n/n8n-nodes-langchain/nodes/agents/Agent/agents/ToolsAgent/V3/helpers/executeBatch.ts:102:15)\n at processTicksAndRejections (node:internal/process/task_queues:105:5)\n at ExecuteContext.toolsAgentExecute (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/@n8n+n8n-nodes-langchain@file+packages+@n8n+nodes-langchain_25556d2df217b3ecea67b49c8dcb0d5a/node_modules/@n8n/n8n-nodes-langchain/nodes/agents/Agent/agents/ToolsAgent/V3/execute.ts:46:66)\n at ExecuteContext.execute (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/@n8n+n8n-nodes-langchain@file+packages+@n8n+nodes-langchain_25556d2df217b3ecea67b49c8dcb0d5a/node_modules/@n8n/n8n-nodes-langchain/nodes/agents/Agent/V3/AgentV3.node.ts:139:10)\n at WorkflowExecute.executeNode (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@opentelemetry+exporter-trace-otlp_4dbefa9881a7c57a9e05a20ce4387c10/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1039:8)\n at WorkflowExecute.runNode (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@opentelemetry+exporter-trace-otlp_4dbefa9881a7c57a9e05a20ce4387c10/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1218:11)\n at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@opentelemetry+exporter-trace-otlp_4dbefa9881a7c57a9e05a20ce4387c10/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1653:27\n at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@opentelemetry+exporter-trace-otlp_4dbefa9881a7c57a9e05a20ce4387c10/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:2296:11"},"workflowId":"cV3IXtIFJdzLwYCJ","file":"logger-proxy.js","function":"exports.debug"}
Hi @tej I recommend using the official OpenAI node and connection for a more reliable service. Azure OpenAI can sometimes become unstable under high usage.
since you aren’t hitting rate limits, the issue is most likely a timeout or a dropped connection between your docker container and azure. looking at your workflow json, you have a Grafana MCP tool connected to that agent, and grafana queries can be slow. if the agent is waiting on that tool to return heavy data, the underlying http connection to azure may simply time out and drop while it waits.
i’d try flipping on the Retry On Fail toggle in the AI Agent node. if it’s just a random network blip or a slightly slow tool execution, a quick retry usually forces it right through without breaking the whole workflow.
as for that 404 error, you’re putting https://resource-name.openai.azure.com/openai as your endpoint in the credentials. n8n actually builds the final api path for you, so it’s taking your url and automatically appending another /openai or deployment string to the end of it. that’s exactly why azure is throwing a 404 back at you: the path doesn’t exist.
just strip it back to the absolute base url: https://resource-name.openai.azure.com/.
once you save that clean url in your credentials, it should connect perfectly.
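to make the double-path problem concrete, here’s a rough sketch of how the final request url gets assembled. the resource name, deployment name, and api-version below are placeholders, not values from your setup:

```shell
# Placeholder values: substitute your own resource, deployment, and api-version.
RESOURCE="resource-name"
DEPLOYMENT="my-deployment"
API_VERSION="2024-02-01"

# Credential field: the bare base URL, with NO /openai suffix.
BASE_URL="https://${RESOURCE}.openai.azure.com"

# n8n then appends the Azure path itself, roughly like this:
FULL_URL="${BASE_URL}/openai/deployments/${DEPLOYMENT}/chat/completions?api-version=${API_VERSION}"

echo "$FULL_URL"
# If BASE_URL already ended in /openai, the result would contain
# /openai/openai/ and Azure would return a 404.
```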
Hi @tej Have you tried using a different model on that service? Does it give the same error? This shouldn’t be happening, but a 404 is a critical one; I recommend switching to OpenRouter if this is in production.
I was not able to replicate the Azure node connection error locally, so I suspect an issue with my load balancer (LB) configuration related to its timeout settings.
Hi @tej
Can you confirm the exact endpoint format you’re using, including api-version, and whether the same deployment works consistently outside of n8n under concurrent load?
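To check the deployment outside of n8n under concurrent load, something like the following could work. Every value here (resource, deployment, api-version, and the `AZURE_OPENAI_KEY` variable) is a placeholder to swap for your own:

```shell
# Placeholders throughout: use your real resource, deployment, api-version, and key.
URL="https://resource-name.openai.azure.com/openai/deployments/my-deployment/chat/completions?api-version=2024-02-01"

# Fire 10 requests in parallel to approximate the concurrent load n8n generates.
for i in $(seq 1 10); do
  curl -s -o /dev/null -w "request $i -> HTTP %{http_code}\n" \
    -X POST "$URL" \
    -H "Content-Type: application/json" \
    -H "api-key: $AZURE_OPENAI_KEY" \
    -d '{"messages":[{"role":"user","content":"ping"}]}' &
done
wait
```

If these all return 200 consistently while n8n keeps failing, that points back at the network path between the container and Azure rather than the deployment itself.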
ok!
the cleanest solution is to fix DNS behavior at the cluster level, either by stabilizing CoreDNS or configuring the cluster or node resolver to prefer IPv4, rather than trying to solve this inside n8n @tej
this command uses a Kubernetes feature called hostAliases. think of it as giving your n8n pod a local phonebook. instead of asking the cluster’s flaky DNS server where Azure is (which is what’s causing your timeouts), n8n will bypass DNS entirely and use your working IPv4 address to connect directly.
just swap in your actual deployment name and that working IP address, then run this:
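as a sketch of what that patch could look like, assuming your n8n deployment and namespace are both named `n8n` (the IP and hostname below are placeholders too):

```shell
# All values are placeholders: use your real deployment name, namespace,
# the IPv4 address you confirmed works, and your actual Azure endpoint hostname.
kubectl patch deployment n8n -n n8n --type=strategic -p '{
  "spec": {
    "template": {
      "spec": {
        "hostAliases": [
          {
            "ip": "20.0.0.1",
            "hostnames": ["resource-name.openai.azure.com"]
          }
        ]
      }
    }
  }
}'
# The pod restarts with this entry written into its /etc/hosts, so lookups
# for the Azure hostname skip cluster DNS entirely.
```

keep in mind azure IPs can change over time, so a hostAliases pin is a stopgap while you fix the cluster DNS properly.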