Help Needed with n8n Pod Restarts in Docker/Kubernetes Deployment

Hey n8n Community,

Hope you’re all doing well! We’re facing a puzzling issue with our n8n installation deployed using Docker and Kubernetes on AWS. Our setup includes a main pod, 2 webhook pods, and 2 worker pods. Here’s the problem: the pods keep restarting unexpectedly, but we can’t find any clues in the n8n logs as to why it’s happening.

Basically, everything seems fine, and then out of the blue, the pods start restarting without any clear triggers or error messages in the n8n logs. We’ve scoured the logs but haven’t been able to pinpoint the cause of these restarts.

So, we’re reaching out to the awesome n8n community for some guidance and experiences you might have had in similar situations. If any of you have encountered similar issues with n8n in a scalable Docker/Kubernetes setup or have insights to share, we’d really appreciate your help.

Here are a few specific questions we have:

  1. Have any of you faced challenges or known issues when deploying n8n in a scalable mode using Docker and Kubernetes on AWS?
  2. Is it possible that external factors like resource constraints, network instability, or interactions with other services are triggering these pod restarts?
  3. Do you know of any additional logs or debugging techniques that could shed light on the restart process and help us identify the root cause?
  4. Are there any specific configuration settings or best practices for deploying n8n in a scalable mode that we should be aware of?
  5. Can you recommend any tools or approaches for effectively monitoring and tracking pod restarts in a Kubernetes environment?

We’d be extremely grateful for any advice, experiences, or suggestions you can offer to help us get to the bottom of this issue. Your input will not only assist us in solving our problem but also contribute to the collective knowledge of the n8n community.

Thanks a ton for your time and support. Looking forward to hearing from you and sharing our progress on this matter.

Best regards,
Pavel

Information on your n8n setup

  • **n8n version:** 0.222.1
  • **Database (default: SQLite):** Postgres
  • **n8n EXECUTIONS_PROCESS setting:** own
  • **Running n8n via:** Docker

We have actually experienced the same problem but couldn't find the answer. We'd be happy if anyone could share any thoughts.

Hi folks, I'm sorry you're having trouble. Perhaps @krynble can help with these queue-mode-specific problems?

On a general note, when n8n's own logs aren't particularly helpful, I often find it useful to keep an eye on generic stats such as memory usage, which n8n itself doesn't track. Out-of-memory situations are a typical cause of crashes, and the exact way the problem manifests can vary.
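One quick way to check whether Kubernetes itself is killing the pods for exceeding their memory limits is to look at the last termination reason of each container. A sketch (the `app=n8n` label selector is an assumption, adjust it to whatever labels your deployments use):

```shell
# Print each pod's name and the reason its previous container
# instance terminated (e.g. "OOMKilled" for out-of-memory kills)
kubectl get pods -l app=n8n \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'

# For a single pod, show exit code and termination details
kubectl describe pod <n8n-pod-name> | grep -A 6 "Last State"
```

If you see `OOMKilled` here, the fix is usually to raise the container's memory limit rather than to dig further into n8n's logs.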


Hello people.

So when you mentioned logs, have you tried increasing the log level to debug? You can do that by setting `N8N_LOG_LEVEL` to `debug`, which makes n8n much more verbose and might help. This applies to all containers/pods.
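For reference, the variable can be set like this, both for a plain Docker container and across Kubernetes deployments (the `app=n8n` label selector is an assumption, substitute your own labels or deployment names):

```shell
# Plain Docker: pass the variable when starting the container
docker run -it --rm -e N8N_LOG_LEVEL=debug n8nio/n8n

# Kubernetes: set it on all n8n deployments matching a label,
# so the main, webhook, and worker pods all pick it up
kubectl set env deployment -l app=n8n N8N_LOG_LEVEL=debug
```

Note that `kubectl set env` triggers a rolling restart of the affected deployments, so the new log level takes effect immediately.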

About your questions @PRybakov , here are my considerations:

  1. Depending on the database you are using, you might face issues with the number of simultaneous connections to the database. This might cause your pods to fail.
  2. Yes, if pods lose connectivity to Redis, they will retry connecting a few times (for 30 seconds) and then exit, but this is accompanied by warn-level log messages, which should be visible by default.
  3. Set the log level to debug as described above. This might give you more insights about it.
  4. The crucial steps are: redis and database are considered shared resources and must be available to all pods. Also you must make sure that the encryptionKey is set correctly in all pods to the same value, so that all pods can read the credentials properly.
  5. Not really, sorry about this one
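On points 3 and 5, a couple of generic Kubernetes techniques can help beyond n8n's own logs. A sketch, assuming you know the affected pod's name:

```shell
# Logs from the container instance that ran *before* the restart --
# crash output often only appears here, not in the current logs
kubectl logs <n8n-pod-name> --previous

# Cluster events sorted by time (shows OOM kills, failed liveness
# probes, node evictions, and other restart triggers)
kubectl get events --sort-by=.metadata.creationTimestamp
```

Failed liveness probes in particular restart pods without leaving anything in the application's own logs, so the events output is worth checking first.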

I hope this helps.
