Self-Hosted n8n Help

Describe the problem/error/question

So, the issue is that I am getting a "connection lost" error in the top right of our two different self-hosted platforms. I have done everything I can think of to fix it. I am going crazy.

We are using: ALB → FARGATE → RDS

Right now, the frontend seems to be acting very randomly and just not working well compared to our cloud model. Another strange thing is that sometimes the execution does not start right away: the execution button is spinning, but you don't see the workflow moving from one box to another, if that makes sense.

I have also seen that when the frontend shows it running, it's 50/50 whether the execution is truly running or not.

I have configured 1 vCPU and 2 GB of RAM on our ECS Fargate task. I have not tried adding more, but it's not running that hard.

The ALB, ECS, and RDS are all in the same VPC, so there is no need for peering or anything like that.

I am so lost at this point that I just need ideas.

Please share your workflow

Don't have any.

Share the output returned by the last node

Don't have any.

Information on your n8n setup

  • n8n version: 2.4.4
  • Database (default: SQLite): Postgres (RDS)
  • n8n EXECUTIONS_PROCESS setting (default: own, main): own
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker in ECS Fargate
  • Operating system: ECS

Hi, @JoshuaS!

This behavior is typically caused by unstable WebSocket connections or resource constraints in a load-balanced setup. n8n’s UI relies on WebSockets for live execution updates, and if the ALB is not correctly configured for WebSockets or sticky sessions, the frontend can show “connection lost” and executions may appear to hang or run inconsistently.

The recommended fix is to ensure the ALB supports WebSockets, enable sticky sessions on the target group, and verify timeouts (idle timeout ≥ 60s). If issues persist, increase CPU/RAM slightly and confirm that only one n8n instance is serving the UI (no horizontal scaling without queue mode).
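As a rough sketch of the ALB side of this, the idle timeout and sticky-session settings could be applied with boto3 along these lines. The attribute keys are the standard ELBv2 ones; the 120-second timeout, the one-day cookie duration, and the function names are illustrative assumptions, not values taken from this setup. `boto3` is a third-party package and is only imported when `apply_settings` is actually called.

```python
from typing import Dict, List


def load_balancer_attributes(idle_timeout_seconds: int = 120) -> List[Dict[str, str]]:
    """Build the ALB attribute list that raises the idle timeout.

    120 is an assumed example value; the ALB accepts 1 to 4000 seconds.
    """
    if not 1 <= idle_timeout_seconds <= 4000:
        raise ValueError("ALB idle timeout must be between 1 and 4000 seconds")
    return [
        {"Key": "idle_timeout.timeout_seconds", "Value": str(idle_timeout_seconds)},
    ]


def target_group_attributes(cookie_duration_seconds: int = 86400) -> List[Dict[str, str]]:
    """Build the target group attribute list that enables sticky sessions.

    Uses the load-balancer-generated cookie; 86400 (one day) is an assumed example.
    """
    if not 1 <= cookie_duration_seconds <= 604800:
        raise ValueError("Cookie duration must be between 1 and 604800 seconds")
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {
            "Key": "stickiness.lb_cookie.duration_seconds",
            "Value": str(cookie_duration_seconds),
        },
    ]


def apply_settings(load_balancer_arn: str, target_group_arn: str) -> None:
    """Push both attribute lists to AWS. Requires boto3 and valid credentials."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    elbv2 = boto3.client("elbv2")
    elbv2.modify_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn,
        Attributes=load_balancer_attributes(),
    )
    elbv2.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=target_group_attributes(),
    )
```

The same keys can be set in the console or in infrastructure-as-code; the point is only to show which two resources (the load balancer and the target group) each setting lives on.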

I understand that. The issue is that it's kind of random. I have F12 open and I can see the socket connections being made and dropped from one workflow to the next. They are being made.

That is why I am confused

I have only one instance. Let me look into stickiness.

Sorry for the edit

Can you ensure the ALB supports WebSockets?
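One way to check this outside the browser is to send a raw WebSocket upgrade request through the ALB and see whether a valid `101 Switching Protocols` response comes back. Below is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the path you pass in is an assumption: copy the real WebSocket URL (and session cookie, if the endpoint requires login) from the F12 Network tab. A 401 or 403 would suggest the request reached n8n but was not authorised, which is a different problem from the ALB dropping the upgrade.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a compliant server sends for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str, cookie: str = "") -> bytes:
    """Build a bare HTTP/1.1 WebSocket upgrade request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Origin: https://{host}",
    ]
    if cookie:
        lines.append(f"Cookie: {cookie}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def is_upgrade_response(raw: bytes, key: str) -> bool:
    """True if `raw` is a 101 response carrying the correct accept header."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)


def check_upgrade(host: str, path: str, port: int = 443,
                  cookie: str = "", timeout: float = 10.0) -> bool:
    """Open a TLS connection to the ALB and attempt the upgrade once."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key, cookie))
            raw = b""
            # Read until the end of the response headers or the peer closes.
            while b"\r\n\r\n" not in raw:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                raw += chunk
    return is_upgrade_response(raw, key)
```

Running `check_upgrade("n8n.example.com", "/rest/push")` a few dozen times in a loop (host and path here are placeholders) should show whether the upgrade fails intermittently at the ALB or succeeds every time, in which case the drops are more likely happening after the connection is established.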