Scale n8n master nodes

sulabhsuneja · March 29, 2023, 10:12am

Describe the issue/error/question

I am facing downtime issues with n8n. I have implemented n8n queue mode. I have checked the metrices i.e CPU utilization and memory utilization of containers is not more than 20%. Is there any way to scale up n8n master nodes or reduce downtime? Any help will be appreciated. Thanks

What is the error message (if any)?

Information on your n8n setup

n8n version: 0.192.1
Database you’re using (default: SQLite): MySQL
Running n8n with the execution process [own(default), main]: main
Running n8n via [Docker, npm, n8n.cloud, desktop app]: Docker

MutedJam · March 29, 2023, 10:37am

Hi @sulabhsuneja, what exactly do you mean by downtime issues? Does your main n8n instance crash or something? If so, which error exactly are you seeing?

There is no supported scenario for having multiple main instance I am afraid. The idea would be to have one main instance and multiple workers.

sulabhsuneja · March 29, 2023, 10:44am

Hi @MutedJam
Yes, main n8n instance gets crashed. I have attached a screenshot of the Cloudwatch logs above. I am not able to find any error statement. It just gives 502 error. I am assuming it is not able to connect with RDS server during that time but I am not sure of the reason behind this issue.

MutedJam · March 29, 2023, 10:47am

The logs you have shared don’t suggest a crash I am afraid. Even without any error you’d see the instance restarting in such cases, meaning there’d be log lines such as n8n ready on 0.0.0.0, port 5678. Perhaps you can search for these and investigate the log entries just before? These should give you an indicator as to what might be happening here.

sulabhsuneja · March 29, 2023, 10:59am

Here in the below attached logs screenshot, I am able to see logs of “Stopping n8n…” and n8n getting restarted but not able to figure out if the workflow i.e (workflow ID: 470) listed is the one causing crash of main instance.

MutedJam · March 29, 2023, 11:05am

Thanks for sharing @sulabhsuneja! The Stopping n8n… line suggests an intentional graceful stop rather than a crash. So this would suggest that something in your environment is sending a SIGTERM (or SIGINT) signal rather than n8n crashing, and you might want to investigate in that direction.

sulabhsuneja · March 29, 2023, 11:08am

Thanks @MutedJam. I will start looking into environment configuration. Is there any way you can help in identifying SIGTERM or SIGINT signal? What could be the potential reasons behind this?

MutedJam · March 29, 2023, 11:14am

Unfortunately not, as I am not familiar with AWS. It might be worth checking with their support team directly. Is there perhaps some auto scaling at play (as in one container shutting down and a new one with additional resources starting instead or vice versa)?

sulabhsuneja · March 29, 2023, 12:57pm

Oh, Ok. Yes I have implemented autoscaling group for the ECS instances which run the docker containers. Usually, there is only one ECS instance on which only one master node is running in the container.

system · June 27, 2023, 12:57pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.