Hi, I’ve been running n8n locally in Docker without issues. I then deployed it to AWS Fargate, and executions now sit on ‘executing’ for around 10 minutes. Sometimes it fails with "Problem running workflow: Can’t connect to n8n."
On local, the same basic workflow takes around 1 second.
n8n support suggested enabling stickiness on the ALB. I’ve done this (it’s also set on the target group) and the problem is still there. I’ve looked through the flow logs and can’t spot anything obvious. My guess is that websockets are causing the problem, but I’m out of ideas on how to fix it.
Infrastructure is basic:
ALB (port 443 with cert) → Fargate (port 5678)
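One way to test the websocket guess outside the browser is to send a bare WebSocket upgrade request through the ALB and look at the status code that comes back. The following is a minimal sketch using only the Python standard library. The host and path are placeholders (not taken from this setup), and the function names are made up for illustration. A `101` would suggest the upgrade survives the ALB. Any other code only says the upgrade did not complete, which could also be n8n rejecting an unauthenticated request.

```python
import base64
import os
import socket
import ssl


def build_upgrade_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 WebSocket upgrade request (RFC 6455 handshake)."""
    # The key is 16 random bytes, base64-encoded, as the handshake requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(response: bytes) -> int:
    """Return the HTTP status code from the first line of a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def check_upgrade(host: str, path: str, port: int = 443, timeout: float = 10.0) -> int:
    """Open a TLS connection, send the upgrade request, return the status code."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path))
            return parse_status_code(tls.recv(4096))


if __name__ == "__main__":
    # Placeholder values: replace with the ALB hostname and the push path
    # that the browser's network tab shows for the n8n editor.
    HOST = os.environ.get("CHECK_HOST", "")
    PATH = os.environ.get("CHECK_PATH", "/")
    if HOST:
        print(check_upgrade(HOST, PATH))
```

Running the same check against the task directly (bypassing the ALB) and comparing the two results would help narrow down where the connection is being dropped.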
What is the error message (if any)?
Problem running workflow
Can’t connect to n8n.
Information on your n8n setup
n8n version: latest and previous versions over the last month
Hey @nmeneil, hope you are doing great! Could you share more details about the workflow you are trying to run, and maybe more about your AWS configuration? I have some experience running self-hosted installations on ECS, so maybe I can help. Happy to read your comments.
Hey @Gonzalo_Romero_Herna, thanks for the reply. The workflow I’ve used for testing is about as basic as possible: a trigger node and one HTTP Request node. The HTTP node executes locally in around 2 seconds.
I’ve been trying various combinations since I posted. I’ve now installed nginx in the container and switched from ‘websockets’ to ‘sse’, with nginx proxying traffic from the ALB on port 8080 to n8n, all within the same container. It now works 40% of the time and just hangs for the other 60%. Saving the workflow also hangs at random. There are no errors in the logs.
For AWS, again it’s a simple config. The ALB has a cert, terminates the HTTPS connection, and forwards traffic on to the target group over port 8080. We block all ports apart from 443 going to the ALB. I have used the same setup for many other ECS deployments and they all work fine, just not n8n.
I like n8n and want to put a business case forward for obtaining the business license but sadly, I’m reaching the end now. It’s getting impossible to justify investing the time in it. I’ll give it one last go and then call it a day.
@nmeneil yeah, that seems pretty weird. I ran some tests on two of my self-hosted installations (different infra) but was not able to reproduce the error: the workflow you described ran successfully over and over again. My suspicion is that it could be unexpected behavior caused by an environment variable. I have been looking at other cases reporting the same issue (in other scenarios), but there are no concrete solutions so far. Would it be possible to share the environment variables you currently have set?
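If it helps, here is a minimal sketch for dumping the relevant variables from inside the container without leaking secrets. It is only an illustration: the prefix list (`N8N_`, `WEBHOOK_`, `DB_`) and the name patterns treated as sensitive are assumptions, so check the output by eye before posting it.

```python
import os
import re

# Assumption: any variable whose name contains one of these words holds a secret.
SENSITIVE = re.compile(r"KEY|SECRET|PASSWORD|TOKEN", re.IGNORECASE)


def shareable_env(environ, prefixes=("N8N_", "WEBHOOK_", "DB_")):
    """Return matching variables, sorted by name, with secret values redacted."""
    result = {}
    for name in sorted(environ):
        if not name.startswith(prefixes):
            continue  # skip variables unrelated to n8n
        value = environ[name]
        result[name] = "<redacted>" if SENSITIVE.search(name) else value
    return result


if __name__ == "__main__":
    for name, value in shareable_env(os.environ).items():
        print(f"{name}={value}")
```

Comparing this output between the local Docker container and the Fargate task would show quickly whether any setting differs between the two environments.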