Describe the problem/error/question
n8n becomes completely unresponsive at intervals ranging from a few minutes to a few hours. The instance freezes and cannot respond to any requests until the container is restarted.
What is the error message (if any)?
No error messages. n8n becomes unresponsive silently.
Workflow Details
Information on your n8n setup
- n8n version: 2.1.4
- Database: PostgreSQL RDS (AWS)
- n8n EXECUTIONS_PROCESS setting (default: own, main): main
- Running n8n via (Docker, npm, n8n cloud, desktop app): Docker (AWS ECS Fargate)
- Operating system: Linux (Container on AWS ECS)
Check the executions list and your AWS dashboard.
If the server is used only for n8n, one or more workflows are probably consuming too many resources (most likely RAM or CPU).
Once you fix or disable that workflow, the instance should be fine.
Update after deep investigation:
I did extensive research and here’s what I found:
Infrastructure is NOT the bottleneck:
- ECS Task: 8 vCPU, 16GB RAM - barely utilized (CPU 4-7%)
- RDS: CPU 4-7%, memory fine, connections 3-13 out of 401
- Cache hit ratio: 99.93%
What I found - Database bloat:
- execution_data table is 4.6GB (99% of entire database)
- Only ~10,600 executions but averaging 446KB per execution
- Some workflows store 2-19MB per execution
- PostgreSQL does 145 million row sequential scans on execution_entity when loading UI
- Pruning is enabled (7 days) but data within those 7 days is massive
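As a sanity check on the numbers above, here is a minimal Python sketch that multiplies the reported execution count by the reported average size, then projects the table size under shorter retention windows. The projection assumes executions arrive at a roughly constant rate and size across the 7 days, which is my simplification and not something measured; the function names are made up for illustration.

```python
# Figures reported in the list above.
EXECUTIONS = 10_600          # executions currently retained
AVG_KB_PER_EXECUTION = 446   # reported average size per execution
RETENTION_DAYS = 7           # current pruning window


def total_gb(executions: int, avg_kb: float) -> float:
    """Total stored size in GB (decimal units: 1 GB = 1,000,000 KB)."""
    return executions * avg_kb / 1_000_000


def projected_gb(current_gb: float, current_days: int, new_days: int) -> float:
    """Projected size under a different retention window.

    Assumption: constant execution rate and size over the window.
    """
    return current_gb * new_days / current_days


current = total_gb(EXECUTIONS, AVG_KB_PER_EXECUTION)
print(f"Estimated execution data: {current:.2f} GB")
for days in (7, 3, 1):
    size = projected_gb(current, RETENTION_DAYS, days)
    print(f"{days}-day retention: ~{size:.2f} GB")
```

The estimate lands close to the 4.6GB reported for execution_data, so the average and the table size are consistent with each other. If the constant-rate assumption roughly holds, shortening the retention window should shrink the table about proportionally, although the few workflows storing 2-19MB per execution would still dominate.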
Any ideas on how to handle this? Is this a known issue with large execution data?
I’m facing a similar issue in my setup. I have two primary environments: beta and production, where a self-hosted n8n instance is available to users.
My current configuration includes:
- One main service and one worker service, both with auto-scaling enabled (up to 5 instances each).
- Each main and worker instance is provisioned with 4 vCPUs and 8 GiB RAM.
Initially, I suspected the issue was related to the lack of caching (for example, via CloudFront or ALB caching). However, when I performed load testing in the beta environment using production-equivalent configurations and without caching enabled, the n8n instance handled the load without any issues. In other words, I was not able to reproduce the “Page Unresponsive” issue in beta.
Are there any recent updates or known changes related to this behavior? @Ohad_Cohen
@Ohad_Cohen @jeimuzu18 @darrell_tw
Similar to what @Ohad_Cohen saw, my DB’s CPU utilization was high, so I increased the DB configuration, and now it’s down to 20%.
The “Page Unresponsive” issue still persists.
Any updates from you guys? That would be helpful.