Intermittent Telegram Node Failures After RAG Agent Node

Hello n8n community,

I’m hoping for some guidance on a very strange, intermittent issue I’m facing with a self-hosted setup. The core problem is that the Telegram “Send Message” node fails unpredictably, but only when it receives input from a LangChain RAG Agent node.

My Environment

  • n8n Version: 1.119.2

  • Deployment: Docker Compose on a dedicated Ubuntu Server 24.04.

  • Networking: All traffic is routed through Nginx Proxy Manager (also in Docker).

  • Special Configuration: My OpenAI nodes are configured to route through a custom Nginx proxy on a separate server. This proxy uses a self-signed certificate. I have correctly configured n8n to trust this certificate by mounting it into /opt/custom-certificates with the required hashed symlink. This part works – the RAG agent successfully gets responses from OpenAI.
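For reference, this is roughly how I produced the hashed symlink next to the CA file inside the mounted ./pki directory (the filename proxy-ca.crt is just a placeholder for my actual certificate):

# run inside ./pki, which is mounted read-only to /opt/custom-certificates
openssl x509 -noout -hash -in proxy-ca.crt   # prints the subject hash, e.g. a1b2c3d4
ln -s proxy-ca.crt a1b2c3d4.0                # OpenSSL-style <hash>.0 symlink pointing at the cert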

The Detailed Problem

My workflow is based on the official “Company RAG” template: Telegram Trigger → (process voice/text) → RAG Agent → Telegram Send Message (Company Knowledge Base Agent (RAG) | n8n workflow template)

The workflow fails about 50% of the time, and the failure always happens at the final “Send Message” node. Sometimes it works perfectly, other times the execution just hangs or errors out at that final step.

Here’s what I’ve discovered through debugging:

  1. The problem is specific to the RAG Agent node. If I replace the RAG Agent with a simple Set node that just outputs static text, the Telegram node works flawlessly, 100% of the time.

  2. The pattern is repeatable. I’ve built other simple test workflows. As soon as I place any AI Agent node before a Telegram node, the connection becomes unreliable.

This leads me to believe the issue is some kind of incompatibility or race condition between the output of the agent and the input of the Telegram node.

Things I’ve Already Investigated (to save time)

I’ve spent a lot of time debugging potential “red herrings” and want to rule them out:

  • ValidationError: X-Forwarded-For: I do see this error in my logs once when I first load the UI after a container restart. However, I am almost certain this is a separate, cosmetic issue because:

    • It only happens on the first UI request and does not appear when the Telegram node fails.

    • I have already tried setting N8N_TRUSTED_PROXIES, N8N_RATELIMIT_TRUST_PROXY, and even N8N_DISABLE_RATE_LIMIT=true. These variables are correctly passed into the container (verified with docker exec env), but they don’t fix the intermittent Telegram failures.
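For reference, here is how those variables are passed in via the compose file (the subnet is a placeholder for my actual Docker proxy network):

    environment:
      - N8N_TRUSTED_PROXIES=172.18.0.0/16
      - N8N_RATELIMIT_TRUST_PROXY=true
      - N8N_DISABLE_RATE_LIMIT=true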

My Configuration

Here is the structure of my docker-compose.yml:

services:
  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    restart: always
    ports:
      - "0.0.0.0:5678:5678"
    env_file: ./.env
    environment:
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      # ... other standard variables
    volumes:
      - n8n_data:/home/node/.n8n
      - ./pki:/opt/custom-certificates:ro # For my OpenAI proxy's self-signed cert
      # ... other data volumes
    networks:
      - proxy
# ...

My Core Questions

  1. What are the recommended advanced debugging techniques? The standard execution view is not sufficient for these intermittent failures. How can I capture the exact data payload, internal state, or any potential errors being passed between the RAG Agent and the Telegram node precisely at the moment of failure?

  2. How can I enable more verbose, internal logging? The standard logs don’t show any specific error from the Telegram node when it fails. Are there specific NODE_DEBUG flags or other n8n-specific logging levels I can enable that would provide a detailed trace of the outbound HTTPS request made by the Telegram node? (A rough sketch of what I have in mind follows this list.)

  3. What information would be most helpful for you to see? I am fully prepared to provide whatever diagnostics are needed to get to the bottom of this. Please let me know which logs, configurations, or specific tests you would like me to run; I can share anything required (anonymized, of course).
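On question 2, the kind of thing I had in mind (these are assumptions on my part, not settings I know to be correct) is raising n8n’s own log level and enabling Node’s built-in NODE_DEBUG tracing for the socket/TLS layers:

    environment:
      - N8N_LOG_LEVEL=debug         # n8n's own logging level
      - N8N_LOG_OUTPUT=console      # write logs to stdout so docker logs n8n captures them
      - NODE_DEBUG=http,tls,net     # low-level Node.js tracing of outbound requests and sockets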

I’m convinced my environment is set up correctly and this is a more nuanced application-level problem. Any help or ideas would be greatly appreciated. Thank you!

Could you share some screenshots, along with a failed execution as a reference?

You said that the node works if you use a Set node, which implies that the issue is not in your n8n setup but in the input items. I think it’s something about the expression you are using, and how you reference the agent’s output inside the Text parameter of your Telegram node.

Hello!

Thanks for the reply. Your suggestion to check the input data was logical. I’ve since isolated the specific error from the Telegram node, and it points to a network-level issue, not an input data problem.

The exact error from a failed execution is ECONNRESET (socket hang up).

Here is the full error output:

{
  "errorMessage": "The connection to the server was closed unexpectedly, perhaps it is offline. You can retry the request immediately or wait and retry later.",
  "errorDetails": {
    "rawErrorMessage": [
      "socket hang up",
      "socket hang up"
    ],
    "httpCode": "ECONNRESET"
  },
  "n8nDetails": {
    "nodeName": "Response",
    "nodeType": "n8n-nodes-base.telegram",
    "nodeVersion": 1.2,
    "resource": "message",
    "operation": "sendMessage",
    "time": "15.11.2025, 16:46:51",
    "n8nVersion": "1.119.2 (Self Hosted)",
    "binaryDataMode": "default",
    "stackTrace": [
      "NodeApiError: The connection to the server was closed unexpectedly, perhaps it is offline. You can retry the request immediately or wait and retry later.",
      "    at ExecuteContext.apiRequest (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-nodes-base@file+packages+nodes-base_@[email protected]_asn1.js@5_afd197edb2c1f848eae21a96a97fab23/node_modules/n8n-nodes-base/nodes/Telegram/GenericFunctions.ts:230:9)",
      "    at processTicksAndRejections (node:internal/process/task_queues:105:5)",
      "    at ExecuteContext.execute (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-nodes-base@file+packages+nodes-base_@[email protected]_asn1.js@5_afd197edb2c1f848eae21a96a97fab23/node_modules/n8n-nodes-base/nodes/Telegram/Telegram.node.ts:2198:21)",
      "    at WorkflowExecute.executeNode (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1093:8)",
      "    at WorkflowExecute.runNode (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1274:11)",
      "    at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1708:27",
      "    at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:2324:11"
    ]
  }
}

Based on this, I conducted further tests to determine the trigger, and the results are inconsistent.

  1. Input Data Ruled Out: I modified the Telegram node to send a hardcoded, static string (“hello world”), completely ignoring the RAG Agent’s output. The intermittent ECONNRESET error still occurs. This confirms the issue is not related to the data payload from the agent.

  2. Inconsistent Failure Rate: I’ve observed that the failure rate appears correlated with the complexity of the initial query to the RAG Agent. Simpler queries that resolve faster have a higher success rate than more complex ones that take longer.

  3. Looping Test: To test stability, I placed the Telegram node (sending a static string) in a loop to fire 15 times after the RAG Agent completes. The behavior is unpredictable:

    • Sometimes, all 15 requests succeed.

    • Sometimes, it fails on the first request.

    • Sometimes, it sends 1-3 requests and then fails with ECONNRESET.

This evidence strongly suggests the issue is that the resource-intensive RAG Agent node is leaving the container’s networking stack in an unstable state, which then causes subsequent outbound HTTPS requests from the Telegram node to fail.

Or perhaps the problem lies somewhere else entirely… I just can’t pinpoint what it could be.
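To test this hypothesis further, my plan is to probe outbound connectivity from inside the container while the looping test runs, using the container’s own Node.js runtime (the container name n8n matches my compose file; the loop count is arbitrary):

# run on the Docker host while the 15x looping workflow executes
for i in $(seq 1 20); do
  docker exec n8n node -e "fetch('https://api.telegram.org')
    .then(r => console.log('$i ok, HTTP', r.status))
    .catch(e => console.log('$i FAILED:', e.cause?.code || e.message))"
  sleep 1
done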

I’m starting to think the issue is with my configuration after all. Since no one else seems to be reporting this specific problem with a common combination like AI + Telegram, it must be something unique to my setup that I need to fix.

After your RAG Agent, add a Wait node for a couple of seconds and check what happens.

Thanks, I’ve already tried that. Adding a Wait node (even up to 5 mins) after the RAG Agent doesn’t change the outcome. The ECONNRESET error still happens intermittently.

How about replacing the Telegram node with a hand-crafted HTTP Request node? Maybe the Telegram node itself is the problem.
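Something like a direct POST to the Bot API, which is the same endpoint the Telegram node calls under the hood, so you can see the raw request and response (<BOT_TOKEN> and <CHAT_ID> are placeholders):

curl -sS -X POST "https://api.telegram.org/bot<BOT_TOKEN>/sendMessage" \
  -H "Content-Type: application/json" \
  -d '{"chat_id": "<CHAT_ID>", "text": "static test message"}'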

I wanted to follow up and close this topic with a solution. Thank you all for your time and suggestions.

After extensive debugging, I have located the root cause of the ECONNRESET errors. The problem was indeed in my specific configuration, but at a much lower level than expected.

The core issue was related to how my n8n container was making outbound HTTPS requests to the OpenAI API through my custom proxy. My initial proxy setup (a simple Nginx reverse proxy) was causing network instability under the specific load generated by the RAG Agent node. It seems this instability was leading to dropped connections, resulting in the socket hang up errors in the downstream Telegram node.

The Solution:

I have completely re-architected my approach to proxying the traffic. Instead of using Nginx, I implemented a more robust, dedicated proxy chain on my host server using Privoxy (as an HTTP proxy) forwarding to a SOCKS5 proxy.

I then configured my n8n container to use this new proxy via the standard HTTP_PROXY/HTTPS_PROXY environment variables.
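In case it helps anyone else, the relevant pieces look roughly like this (ports, addresses, and the SOCKS5 endpoint are specific to my host and shown here as placeholders):

# /etc/privoxy/config on the host: Privoxy listens as a plain HTTP proxy
# and forwards everything to the SOCKS5 proxy
listen-address  0.0.0.0:8118
forward-socks5t / 127.0.0.1:1080 .

# docker-compose.yml: route n8n's outbound traffic through Privoxy
    environment:
      - HTTP_PROXY=http://<host-ip>:8118
      - HTTPS_PROXY=http://<host-ip>:8118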

This new, more stable networking layer has completely resolved the issue. The ECONNRESET errors are gone, and the workflow now runs with 100% reliability, no matter how complex the RAG query is.

The key takeaway is that the RAG Agent’s resource usage was exposing an underlying weakness in my previous network proxy setup. The problem wasn’t n8n itself, but the environment it was running in.

Thanks again for helping me troubleshoot this. I’m marking this as solved.
