Hey - I’m seeing a slow memory leak on my production n8n instance; after several weeks it causes the container to fall over and restart. I’ve deployed the observability environment variables and can see the problem building, with memory and heap usage steadily increasing over the two days since I enabled them.
I’m really looking for help on the correct approach to troubleshooting this. I’d assume it’s related to a specific workflow (maybe a Code node?) but I’m not sure how to approach debugging it. Any suggestions?
We are planning to upgrade to v2 soon but have substantial testing to carry out before we can.
What is the error message (if any)?
The logs show this error message:
Pruning old insights data
timeout of 3000ms exceeded
Error while fetching community nodes: timeout of 3000ms exceeded
Deregistered all crons for workflow
Received tool input did not match expected schema
Pruning old insights data
Deregistered all crons for workflow
<--- Last few GCs --->
[7:0x7f6362ac1000] 3265907810 ms: Mark-Compact 2006.1 (2086.9) -> 1990.5 (2087.2) MB, pooled: 1 MB, 577.83 / 0.11 ms (average mu = 0.310, current mu = 0.355) task; scavenge might not succeed
[7:0x7f6362ac1000] 3265908554 ms: Mark-Compact 2006.6 (2087.4) -> 1991.2 (2087.4) MB, pooled: 1 MB, 641.79 / 0.08 ms (average mu = 0.230, current mu = 0.137) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
Last session crashed
Initializing n8n process
Error tracking disabled because this release is older than 6 weeks.
n8n ready on ::, port 5678
[license SDK] Skipping renewal on init: license cert is not due for renewal
Information on your n8n setup
n8n version: 1.123.19 - I was running 1.120.4 when it last crashed
Your n8n instance is hitting the hard memory limit of the Node.js process, which matches the ~2087 MB heap ceiling visible in your GC log.
You can instantly increase the memory ceiling by setting this environment variable in your Docker/deployment config:
Variable: NODE_OPTIONS
Value: --max-old-space-size=4096 (or higher, e.g. 8192, depending on your server’s available RAM).
This tells Node.js, “It’s okay to use up to 4GB of RAM before you panic.”
It doesn’t fix the leak, but it will extend the time between crashes from days to weeks.
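For a docker-compose deployment, that looks something like this (a sketch; the service layout and the 4096 value are illustrative, size it to your host’s RAM):

```yaml
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    environment:
      # Raise the V8 old-space ceiling to 4 GB before the process OOMs.
      - NODE_OPTIONS=--max-old-space-size=4096
    ports:
      - "5678:5678"
```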
SQLite is probably your biggest issue. On a production instance that runs for weeks, SQLite holds a lot in memory and doesn’t release it well. The execution data piles up. Switch to PostgreSQL when you do your v2 upgrade — this alone fixes the memory leak for a lot of people.
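If you do make that switch, the database connection is also configured through environment variables; a minimal sketch (hostname, database name, and credentials are placeholders for your own setup):

```yaml
environment:
  - DB_TYPE=postgresdb
  - DB_POSTGRESDB_HOST=postgres      # placeholder hostname
  - DB_POSTGRESDB_PORT=5432
  - DB_POSTGRESDB_DATABASE=n8n       # placeholder database name
  - DB_POSTGRESDB_USER=n8n           # placeholder credentials
  - DB_POSTGRESDB_PASSWORD=changeme
```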
Until then, make sure execution pruning is actually working. Check your EXECUTIONS_DATA_PRUNE and EXECUTIONS_DATA_MAX_AGE settings. If you’ve got weeks of execution data sitting in SQLite, that’s a ton of memory. Set EXECUTIONS_DATA_SAVE_ON_SUCCESS to none if you don’t need to keep successful runs — that’s usually the biggest volume.
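For reference, a pruning setup along those lines (the values are examples; 168 hours keeps one week of executions):

```yaml
environment:
  - EXECUTIONS_DATA_PRUNE=true
  - EXECUTIONS_DATA_MAX_AGE=168        # hours; keep one week of executions
  - EXECUTIONS_DATA_SAVE_ON_SUCCESS=none
  - EXECUTIONS_DATA_SAVE_ON_ERROR=all  # still keep failed runs for debugging
```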
For finding a specific leaky workflow — look at which workflows run the most frequently (cron triggers, polling triggers). Those are the ones where a small leak compounds. Code nodes that build up arrays or objects without clearing them between runs are a common culprit. Same with HTTP Request nodes inside loops that don’t have timeouts set — if they hang, the data sits in memory.
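Since Code nodes run JavaScript, here’s a generic Node.js sketch of that accumulation pattern and its fix (the names are illustrative, not taken from the poster’s workflow):

```javascript
// Leaky pattern: a collection declared in a scope that outlives a single
// run keeps growing, because nothing ever clears it between executions.
const cache = [];

function processItemsLeaky(items) {
  for (const item of items) {
    cache.push(item); // survives this run; memory grows on every execution
  }
  return cache.length;
}

// Fixed pattern: keep working state local to the run, so the garbage
// collector can reclaim it as soon as the function returns.
function processItems(items) {
  const local = [];
  for (const item of items) {
    local.push(item);
  }
  return local.length;
}

// Two simulated runs of each: the leaky version accumulates across runs,
// the fixed version starts fresh every time.
const runA = processItemsLeaky([1, 2, 3]); // 3
const runB = processItemsLeaky([4, 5, 6]); // 6 — previous run's data still held
const runC = processItems([1, 2, 3]);      // 3
const runD = processItems([4, 5, 6]);      // 3
```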
You can also check the docker stats output right after a restart vs a few days later to get a baseline, then disable your highest-frequency workflows one at a time and see if the growth rate changes. Not elegant but it works.
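A quick way to capture that baseline, assuming your container is named `n8n` (adjust to your setup; these commands need a running Docker container):

```shell
# One-off snapshot (no streaming) right after a restart, then again
# a few days later, and compare the two readings.
docker stats --no-stream --format "{{.Name}}: {{.MemUsage}}" n8n

# Or append an hourly reading to a file for a simple growth trend:
while true; do
  date >> n8n-mem.log
  docker stats --no-stream --format "{{.MemUsage}}" n8n >> n8n-mem.log
  sleep 3600
done
```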
Also — binary data mode matters. If you’re processing files (images, PDFs, etc.) and your binary data mode is set to default (in-memory), switch it to filesystem with N8N_DEFAULT_BINARY_DATA_MODE=filesystem. That alone can cut memory usage significantly if any of your workflows handle files.
We think we have identified the workflow. It is a regular job that works with a reasonably sized data set and some Code nodes. We’ve seen step increases in memory that never come back down, starting at the moment this workflow runs.