Self-hosted instance crashing repeatedly out of memory

Describe the issue/error/question

My instances are crashing almost every day now due to out-of-memory errors. All of a sudden, memory usage jumps from about 500-700 MB to 2.4 GB, and my nohup log fills up very quickly before the crash occurs. I've been using n8n for about a year now, and the instability started around December. Back then I also noticed a lot of problems with the pruning of old executions, which also resulted in crashes when trying to start the nodes.

Note that I have two separate instances. They are independent of each other but use the same managed Postgres server, with different databases of course.
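
For context on the pruning side, these are the standard EXECUTIONS_DATA_* environment variables that control it; a minimal sketch with illustrative values, not necessarily the exact settings on my instances:

# Illustrative pruning settings (example values, not my exact configuration)
export EXECUTIONS_DATA_PRUNE=true            # prune old execution data automatically
export EXECUTIONS_DATA_MAX_AGE=168           # keep execution data for at most 168 hours (7 days)
export EXECUTIONS_DATA_SAVE_ON_SUCCESS=none  # skip storing full data for successful runs
export EXECUTIONS_DATA_SAVE_ON_ERROR=all     # keep full data for failed runs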

Information on your n8n setup

  • **n8n version:** 0.218.0
  • **Database:** Google Managed PGSQL 13
  • **Running n8n with the execution process:** main
  • **Running n8n via:** npm

What is the error message (if any)?

<--- Last few GCs --->

[1776:0x5c34060] 23977754 ms: Mark-sweep 1942.3 (1993.7) -> 1935.8 (1995.7) MB, 2119.2 / 0.1 ms (average mu = 0.183, current mu = 0.071) allocation failure scavenge might not succeed
[1776:0x5c34060] 23979963 ms: Mark-sweep 1943.7 (1995.7) -> 1937.9 (1997.2) MB, 2187.4 / 0.1 ms (average mu = 0.100, current mu = 0.010) allocation failure scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb08e80 node::Abort() [node]
2: 0xa1b70e [node]
3: 0xce1890 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xce1c37 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xe992a5 [node]
6: 0xe99d86 [node]
7: 0xea82ae [node]
8: 0xea8cf0 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
9: 0xeabc6e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
10: 0xe6d1aa v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node]
11: 0x11e5f96 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
12: 0x15d9c19 [node]
query is slow: SELECT "SharedWorkflow"."workflowId" AS "SharedWorkflow_workflowId", "SharedWorkflow"."userId" AS "SharedWorkflow_userId", "SharedWorkflow__SharedWorkflow_workflow"."createdAt" AS "SharedWorkflow__SharedWorkflow_workflow_createdAt", "SharedWorkflow__SharedWorkflow_workflow"."updatedAt" AS "SharedWorkflow__SharedWorkflow_workflow_updatedAt", "SharedWorkflow__SharedWorkflow_workflow"."id" AS "SharedWorkflow__SharedWorkflow_workflow_id", "SharedWorkflow__SharedWorkflow_workflow"."name" AS "SharedWorkflow__SharedWorkflow_workflow_name", "SharedWorkflow__SharedWorkflow_workflow"."active" AS "SharedWorkflow__SharedWorkflow_workflow_active", "SharedWorkflow__SharedWorkflow_workflow"."nodes" AS "SharedWorkflow__SharedWorkflow_workflow_nodes", "SharedWorkflow__SharedWorkflow_workflow"."connections" AS "SharedWorkflow__SharedWorkflow_workflow_connections", "SharedWorkflow__SharedWorkflow_workflow"."settings" AS "SharedWorkflow__SharedWorkflow_workflow_settings", "SharedWorkflow__SharedWorkflow_workflow"."staticData" AS "SharedWorkflow__SharedWorkflow_workflow_staticData", "SharedWorkflow__SharedWorkflow_workflow"."pinData" AS "SharedWorkflow__SharedWorkflow_workflow_pinData", "SharedWorkflow__SharedWorkflow_workflow"."versionId" AS "SharedWorkflow__SharedWorkflow_workflow_versionId", "SharedWorkflow__SharedWorkflow_workflow"."triggerCount" AS "SharedWorkflow__SharedWorkflow_workflow_triggerCount", "SharedWorkflow__SharedWorkflow_role"."createdAt" AS "SharedWorkflow__SharedWorkflow_role_createdAt", "SharedWorkflow__SharedWorkflow_role"."updatedAt" AS "SharedWorkflow__SharedWorkflow_role_updatedAt", "SharedWorkflow__SharedWorkflow_role"."id" AS "SharedWorkflow__SharedWorkflow_role_id", "SharedWorkflow__SharedWorkflow_role"."name" AS "SharedWorkflow__SharedWorkflow_role_name", "SharedWorkflow__SharedWorkflow_role"."scope" AS "SharedWorkflow__SharedWorkflow_role_scope" FROM "public"."shared_workflow" "SharedWorkflow" LEFT JOIN "public"."workflow_entity" "SharedWorkflow__SharedWorkflow_workflow" ON "SharedWorkflow__SharedWorkflow_workflow"."id"="SharedWorkflow"."workflowId" LEFT JOIN "public"."role" "SharedWorkflow__SharedWorkflow_role" ON "SharedWorkflow__SharedWorkflow_role"."id"="SharedWorkflow"."roleId"
execution time: 10338

<--- Last few GCs --->

[27352:0x4fca060] 47493 ms: Scavenge 1950.8 (1980.4) -> 1948.6 (1981.4) MB, 6.4 / 0.0 ms (average mu = 0.184, current mu = 0.146) allocation failure
[27352:0x4fca060] 47508 ms: Scavenge 1951.7 (1981.4) -> 1949.4 (1982.1) MB, 6.1 / 0.0 ms (average mu = 0.184, current mu = 0.146) allocation failure
[27352:0x4fca060] 47523 ms: Scavenge 1952.5 (1982.1) -> 1950.3 (1990.9) MB, 6.5 / 0.0 ms (average mu = 0.184, current mu = 0.146) allocation failure

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb08e80 node::Abort() [node]
2: 0xa1b70e [node]
3: 0xce1890 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xce1c37 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xe992a5 [node]
6: 0xe99d86 [node]
7: 0xea82ae [node]
8: 0xea8cf0 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
9: 0xeabbe5 v8::internal::Heap::HandleGCRequest() [node]
10: 0xe39287 v8::internal::StackGuard::HandleInterrupts() [node]
11: 0x11e56e5 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [node]
12: 0x15d9c19 [node]

Hey @Federico_White,

Welcome to the community :tada:

What are your workflows doing, and how much data is in the database? Both of these can have an impact on memory usage. What is the nohup file being filled with, or is it the examples you have posted? As you are using npm, which Node.js version are you currently running?

I would start with the workflows and see what is running around the time of the crash, and how much data is being loaded and worked with.
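
If you want a quick way to see how much data has built up, something like the sketch below will list the largest tables in the database. It assumes you can reach the managed Postgres instance with psql and that the connection string is in a DATABASE_URL variable, which is just a placeholder here; you can check the Node.js version at the same time:

node --version
# List the ten largest tables; execution_entity is usually the one that grows
psql "$DATABASE_URL" -c "
  SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 10;"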

Hi Jon, thanks for coming back! I was able to fix it. The issue was indeed about the memory limit: I noticed spikes when removing old executions and when running some high-intensity workflows, so increasing the limit resolved the crashes.

The environment variable I used was:
export NODE_OPTIONS="--max-old-space-size=4000"
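
In case it helps anyone else, here is a minimal sketch of how that fits into starting the instance. It assumes n8n is launched directly with the CLI under nohup (the log file name is just an example), and the 4000 MB heap should be sized to the machine's available RAM:

# Give the Node.js process a ~4 GB old-space heap, then start n8n in the background
export NODE_OPTIONS="--max-old-space-size=4000"
nohup n8n start > nohup.out 2>&1 &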

Thank you!


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.