n8n Workflow Memory Bloat: Processing Daily Sales Data Causes Exponential Slowdown and Stalls

Hi all,

I’m running into a persistent issue with a workflow that processes daily sales data and updates a menu registry in a Postgres database. I’ve tried several approaches, but keep hitting a wall with memory bloat and workflow stalls. I’d appreciate any advice or best practices from the community!

Workflow Overview

  • Step 1: The workflow queries for a single day that needs processing (status pending or in_progress).

  • Step 2: It fetches all sales for that day (typically 500–1000 items).

  • Step 3: It aggregates and processes those sales (a simplified sketch follows after this list), then updates the menu_registry table (which only has a few hundred items total).

  • Step 4: It updates a few other related tables.

  • Step 5: The workflow loops to process the next day.
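For reference, the aggregation in step 3 is roughly this shape. This is a simplified sketch of a Function/Code node; the field names (menuItemId, quantity, price) are placeholders, not my real schema:

// Function/Code node: roll the day's 500–1000 sale items up into
// one total per menu item (field names are placeholders)
const totals = {};
for (const item of items) {
  const { menuItemId, quantity, price } = item.json;
  if (!totals[menuItemId]) {
    totals[menuItemId] = { menuItemId, qty: 0, revenue: 0 };
  }
  totals[menuItemId].qty += quantity;
  totals[menuItemId].revenue += quantity * price;
}
// one output item per menu entry (a few hundred at most)
return Object.values(totals).map(t => ({ json: t }));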

The Problem

  • When I first start the workflow, it runs very quickly: 13 days in the first minute, 7 in the second, and by minute 6 it's only getting through 2 a minute.

  • As it continues, it slows down dramatically. Eventually, it gets stuck and can’t move to the next day.

  • I can see in the n8n UI that one node (usually after aggregation or registry update) is holding hundreds of thousands of items in memory.

  • The database itself is not growing out of control; the menu_registry table remains a few hundred rows.

What I've Tried

1. Single-Workflow Approach

  • The workflow loops internally, processing one day at a time.

  • Memory usage grows with each loop, even though each day’s data is small.

  • Eventually, the workflow stalls and must be killed.

2. Two-Workflow Approach

  • I split the process:

  • Orchestrator Workflow: Finds the next day to process and calls the “day processor” workflow.

  • Day Processor Workflow: Receives a date, processes that day’s sales, updates the registry and other tables.

  • The orchestrator would sometimes trigger the day processor for hundreds of days at once, causing a massive spike in executions and making the system unresponsive. I had to restart the server to recover.

What I’m Looking For

  • How can I design this workflow (or pair of workflows) so that each day is processed independently, memory is released after each day, and the system doesn’t get bogged down over time?

  • Is there a best practice for batch processing in n8n to avoid memory bloat when looping over many days?

  • How should I structure the data passing between workflows to avoid accidental accumulation?

  • Are there any workflow patterns or node configurations that can help with this scenario?

Extra Context

  • I’ve checked the database: no runaway growth or duplication.

  • The problem seems to be with how n8n accumulates items in memory as the workflow loops.

  • I’m using Postgres nodes, Code nodes, and some aggregation logic.

Any advice, examples, or references to similar issues would be greatly appreciated!

Thanks in advance,

Michael

Information on your n8n setup

  • n8n 1.91.2, self-hosted
  • Supabase Postgres
  • Running n8n via Docker
  • Ubuntu

You could potentially use the Loop Over Items node and set its batch size; however, with 500k+ records in play you're likely running into memory issues that slow everything down. I would personally consider a pub/sub / event-driven architecture here: pump the records into a queue and then process them either serially or concurrently in batches.

Yes, I completely agree here as well. I’m also wondering if adding worker nodes could help distribute the load more effectively and avoid bottlenecks over time.

Configuring queue mode | n8n Docs

@Michael_McReynolds Have you already set up any monitoring for memory, CPU, and execution duration? It might give some early insight into where the bottleneck starts.

For production-grade workflows like this, having dedicated workers is definitely a best practice. I’m also working on setting up a Grafana dashboard to track these metrics more closely.

Regarding the pub/sub or event-driven approach—I think this could be a game-changer for your use case.
Since your sales data per day is static after creation, decoupling the fetching and processing into separate executions using event producers and consumers would make the system much more resilient and scalable.

In essence:

The Orchestrator could push each day as a job/message into a queue (Kafka, Redis Streams, etc.).

Worker workflows would pick up the jobs independently, process them, and exit cleanly.

This ensures that each day is processed in its own isolated execution, releasing memory immediately after processing, avoiding the bloat seen in long-running loops.

Using Kafka or Redis Streams would also give you built-in scaling, retries, and backpressure handling.

Redis might even be enough here if you're looking for low latency and simpler operations.
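For illustration, here's the bare Redis Streams version of that producer/consumer split. This is a minimal sketch using ioredis; the stream name day-jobs, the group day-workers, and the processDay() function are assumptions for the example, not anything n8n ships with:

// producer (orchestrator side): enqueue one small job per day
const Redis = require('ioredis');
const redis = new Redis();

async function enqueueDay(day) {
  // XADD appends a message to the "day-jobs" stream
  await redis.xadd('day-jobs', '*', 'day', day);
}

// consumer (worker side), assuming the group was created once with:
//   XGROUP CREATE day-jobs day-workers $ MKSTREAM
async function consumeOne() {
  const res = await redis.xreadgroup(
    'GROUP', 'day-workers', 'worker-1',
    'COUNT', 1, 'BLOCK', 5000,
    'STREAMS', 'day-jobs', '>'
  );
  if (!res) return; // nothing pending
  for (const [id, fields] of res[0][1]) {
    const day = fields[1];       // fields is ['day', '<value>']
    await processDay(day);       // your day-processor logic (hypothetical)
    await redis.xack('day-jobs', 'day-workers', id); // acknowledge when done
  }
}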

Additionally, if you still need to keep running totals or tallies across multiple executions, you can implement an accumulator pattern:

Each worker can update an external accumulator (e.g., Postgres via ON CONFLICT updates, Redis counters, or even Kafka Streams).

This pattern lets your workflows stay stateless while still maintaining system-wide aggregates efficiently.

It also protects you from memory bloat because all aggregation/state is handled outside of n8n’s execution memory.
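As a concrete sketch, the Postgres flavour of that accumulator could be a single upsert per day. The daily_totals table and the amount field here are hypothetical, just to show the shape:

// Code node: collapse the day's items into one set of query parameters
const day = items[0].json.day;
const total = items.reduce((sum, i) => sum + i.json.amount, 0);
return [{ json: { day, total } }];

// Query for the Postgres node that follows, with $1/$2 mapped from
// the fields above (parameter syntax depends on the node settings):
// INSERT INTO daily_totals (day, total)
// VALUES ($1, $2)
// ON CONFLICT (day) DO UPDATE
//   SET total = daily_totals.total + EXCLUDED.total;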


I was literally just thinking about queue mode as well; however, I haven't personally used it yet, so I can't confirm whether it would work.


From what I've heard, it uses round-robin scheduling (not stated officially in the docs, but per an n8n employee), so each execution takes place on a worker; if you have two workers and 3 executions, it processes like w1: e1, e3 | w2: e2.

I haven't processed high volumes of data through it yet, but I likely will soon with trading data, so I must say thanks for the event-driven architecture idea.

I've been looking into how it processes data and flow. One thing to note: it uses Bull for a message queue built on Redis, which I believe stores the execution IDs; the workers then pull an ID from the queue and look it up against the database to fetch the actual workflow to run.
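Conceptually that's just the standard Bull pattern. This is a sketch of the library pattern only, not n8n's actual internals; loadExecution() and runExecution() are hypothetical stand-ins:

// generic Bull pattern: the queue carries only a small ID,
// the worker fetches the real payload from the database
const Queue = require('bull');
const q = new Queue('jobs', 'redis://localhost:6379');

// producer side
async function enqueue(executionId) {
  await q.add({ executionId });
}

// worker side
q.process(async (job) => {
  const execution = await loadExecution(job.data.executionId); // hypothetical DB lookup
  await runExecution(execution);                               // hypothetical runner
});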

I really want to dig deep into n8n's architecture; any pointers would be amazing.

@Wouter_Nigrini I'll share the dashboard once it's done so you can see if it's useful. It should hopefully be done today; it's really aimed at production / enterprise customers who run multi-node setups.


I also need to look at this a little deeper. I decided that I would only use n8n for simple integrations and for its AI agent nodes, and not so much for processing large amounts of critical data. I come from an integration background, processing bank transactions and high-volume telecom transactions, where specialised platforms and tools are used and pub/sub was a central architectural decision.

Another similar topic that came up recently, potentially using queue mode with multiple instances:


Thanks, there is some good information in here. Workers may be the right answer. It’s too bad you can’t just run a workflow on a loop and not have to worry about it.

The frustrating part is that building the menu registry only has to happen once for each restaurant. After that, the workflow just needs to run once a day, after the daily sales data comes in, to update the registry. Even the biggest restaurant won't have enough sales to bog it down.

It's very specifically how n8n handles this particular loop. You've got me thinking about other options, though. There's no reason the initialization of the menu registry has to take place in n8n. Are there other tools you could recommend?

Hi, I would first troubleshoot your logic / setup before you start doing anything with queue mode. IMHO there is a problem somewhere in your data flow, as your screenshot shows 550k items output from one node (which explains the exponential slowdown and stalls after a short while). Since the problem is most likely incorrect logic and data flow in one big main loop, queue mode will not help: the data cannot be divided between workers, and each individual worker has the same resource limit as the main instance. My advice: break up your workflow into logical sub-workflows and make sure that each sub-workflow only gets the minimum required data as input and only outputs what is required (see the sketch below). In addition, make sure that any reads/writes to external systems (like the DB) are optimized, e.g. don't insert record by record but combine into batches.
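For example, the last Code node of a sub-workflow can strip the output down to a small summary instead of handing all processed rows back to the parent (a sketch; the field names are examples):

// final node of the sub-workflow: return a one-item summary,
// not the hundreds of rows that were processed
return [{ json: { day: items[0].json.day, processed: items.length, status: 'done' } }];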

There are many things you can do before resorting to queue mode. And as mentioned, even by going all in on queue mode, you need to divide things up before it can run anything efficiently.

Reg
J.

I think this is good advice. Thanks for your help. After chasing down some nasty performance issues, I wanted to share what finally fixed things for us. Maybe it’ll save someone else a few head-scratching hours.

1) Don't pour a list straight into a PostgreSQL node

The problem: Every item in an incoming array triggers its own query, so a single run can explode from 10 → 100 → 600 → 10 000 individual calls.

Quick fix: Collapse the array to one item before the SQL node.

// Function node: reduce all items to a single item
// (n8n items must be wrapped in a json key)
return [{ json: { payload: items.map(i => i.json) } }];

With one consolidated item, the SQL node fires only once per workflow run.
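From there, one Code node can turn the consolidated payload into a single multi-row INSERT instead of hundreds of single-row calls. A sketch; the sales_agg table and its columns are placeholders for whatever you're actually writing:

// Code node: build one multi-row INSERT from the consolidated payload
const rows = items[0].json.payload; // the array produced by the snippet above
// NOTE: string interpolation is only OK for trusted internal data;
// otherwise use the Postgres node's query parameters
const values = rows
  .map(r => `('${r.menuItemId}', ${Number(r.qty)})`)
  .join(', ');
const query = `INSERT INTO sales_agg (menu_item_id, qty) VALUES ${values};`;
return [{ json: { query } }];

The Postgres node after it can then execute {{ $json.query }} once per run.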

2) Internal “while-loops” inflate memory and runtime

Even after trimming 90% of the logic, a self-looping workflow still grew slower on every cycle. My guess: each pass keeps more execution data in memory until it drags.

[Chart: execution time per day; the pair of lower dots are the days the restaurant was closed.]

However, instead of looping the workflow directly, I linked the end of the workflow to a scheduler workflow and had the scheduler just re-trigger the main workflow, and that flattened the line.

So the solution really is to not loop workflows.

Alternatively, and probably the better solution, is to not process the data in n8n at all, but rather use some traditional SQL scripting. However, I didn't really think about this until after I'd spent days and days building the workflow. D'oh!


Hi, or use sub-workflows. I think they will do a better job at memory management (each runs as a new execution, so I imagine it has the same effect as a scheduled trigger, which is also just a new execution). Also, did you try disabling the saving of execution data on the workflow itself?

Reg
J.
