Performance issues with Agent node and memory in Redis/Postgres

Hi everyone! :wave:

I’m experiencing a performance issue and I’d love to see if anyone else has encountered something similar or can point me in the right direction. :thinking:

I have a simple scenario with a webhook input, followed by an agent node with memory configured in Redis. I’m stress testing the endpoint, and everything works fine at first, but then it starts to slow down significantly.

I’ve checked several things to make sure everything is configured correctly:

  1. Redis: CPU and RAM usage never exceed 1%, so resources don't seem to be the issue.
  2. The maximum number of Redis clients is set to 100,000, so that shouldn't be a limitation either (I've included a small check script after this list).
  3. I’ve also tested using PostgreSQL instead of Redis, and the same issue happens: it works fine at first, but then starts to slow down.
  4. I even deployed a database on Google Cloud Platform (using their configuration and hardware), and the problem persists.
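For reference, here's a minimal sketch of how those Redis numbers can be double-checked from a script (assuming the redis-py client; the host and port are placeholders):

```python
import redis

# Connect to the same Redis instance the memory node uses
# (host/port are placeholders in this sketch).
r = redis.Redis(host="localhost", port=6379)

# Configured client limit vs. clients actually connected right now.
print("maxclients:", r.config_get("maxclients")["maxclients"])
print("connected_clients:", r.info("clients")["connected_clients"])

# Memory and CPU counters reported by Redis itself.
print("used_memory_human:", r.info("memory")["used_memory_human"])
print("used_cpu_sys:", r.info("cpu")["used_cpu_sys"])

# Commands processed per second as a quick throughput sanity check.
print("instantaneous_ops_per_sec:", r.info("stats")["instantaneous_ops_per_sec"])
```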

The weird part is that when I disconnect the memory, the stress tests run smoothly without any performance issues.

Has anyone experienced something similar or have any ideas on what could be causing this bottleneck? Any suggestions would be greatly appreciated! :pray:

Thanks in advance for your help. Looking forward to hearing from you all! :blush:

Hi,

Just to clarify:
You are not using queue mode?
How are you defining the stress test? How many requests per second?
Where is everything running?
After approximately how much time / how many requests does it slow down?

Reg
J.

Hi J.,

Thanks again for following up! :blush: Here’s some more detail based on the latest tests:

  • I’m not using queue mode (at least not explicitly — just a webhook trigger followed by an agent node with memory).
  • We’re running a stress test that lasts 2 minutes, starting with 10 concurrent users and ramping up to 100 users in steps of 10 (roughly the pattern sketched after this list).
  • During the first test run, right after the memory is connected, the results are fairly decent: around 3s to 6s per request with ~15 RPS.
  • However, on subsequent runs, performance drops significantly — it doesn’t go above 5 RPS, and response times spike to 20s to 40s per request :grimacing:
  • Setup-wise, everything is running in Docker on a 16-core, 62GB RAM VM. The n8n container has 16GB RAM and 10 CPUs, and the Redis container has 8GB RAM and 4 CPUs.
  • I also tried swapping Redis for Postgres, and even tested with managed databases on Google Cloud, and the issue remains the same.
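For context, here's roughly how the ramp is driven, as a minimal Python sketch (the webhook URL, payload fields, and session IDs are placeholders, so treat it as an approximation of the real test rather than the exact tool we use):

```python
import time
import statistics
import requests
from concurrent.futures import ThreadPoolExecutor

WEBHOOK_URL = "http://localhost:5678/webhook/test"  # placeholder URL
STEP_SECONDS = 12  # 10 steps of ~12 s ≈ the 2-minute ramp described above

def one_request(user_id: int) -> float:
    """Fire one request against the webhook and return its latency in seconds."""
    start = time.monotonic()
    requests.post(WEBHOOK_URL,
                  json={"sessionId": f"user-{user_id}", "message": "ping"},
                  timeout=120)
    return time.monotonic() - start

for users in range(10, 101, 10):  # ramp: 10, 20, ..., 100 concurrent users
    latencies = []
    deadline = time.monotonic() + STEP_SECONDS
    with ThreadPoolExecutor(max_workers=users) as pool:
        while time.monotonic() < deadline:
            # One closed-loop batch: every simulated user sends a request,
            # and we wait for all of them before starting the next batch.
            futures = [pool.submit(one_request, i) for i in range(users)]
            latencies.extend(f.result() for f in futures)
    print(f"{users:3d} users: {len(latencies) / STEP_SECONDS:5.1f} RPS, "
          f"median latency {statistics.median(latencies):.1f}s")
```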

Let me know if you’d like any logs or more data. Really appreciate your help :pray:

Hi, I think the limiting factor is the number of executions that can run simultaneously. In a queue-mode environment the default concurrency per worker is 10, and I can only assume that the main instance has a similar (if not lower) limit. I think the easiest way to find out is to read the execution_entity table and check the execution statuses: if you have many jobs with a "new" status and no startedAt time, they are effectively queued. That way you can also check how many have "running" status at any given time.
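A quick way to run that check (a minimal sketch, assuming n8n's Postgres schema with an execution_entity table that has status and "startedAt" columns; the connection details are placeholders):

```python
import psycopg2

# Point this at the database n8n itself uses for execution data
# (connection details are placeholders).
conn = psycopg2.connect(host="localhost", dbname="n8n", user="n8n", password="secret")

with conn, conn.cursor() as cur:
    # How many executions are in each state right now?
    cur.execute("SELECT status, COUNT(*) FROM execution_entity GROUP BY status;")
    for status, count in cur.fetchall():
        print(f"{status}: {count}")

    # Executions created with "new" status and no startedAt are effectively queued.
    cur.execute('SELECT COUNT(*) FROM execution_entity '
                'WHERE status = %s AND "startedAt" IS NULL;', ("new",))
    print("queued (new, no startedAt):", cur.fetchone()[0])
```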

The only thing that I cannot explain is why it works when there is no memory attached …
One test would be to keep the memory attached but set its context window size to 1 instead of the default 5.
Also keep the prompt to a minimum, just to see if it makes a difference.

Reg,
J.

Hi J.,

Yes, that’s exactly what’s puzzling me :thinking: If the bottleneck were only due to the number of concurrent executions, I’d expect the same behavior regardless of whether memory is attached or not. But strangely enough, once I remove the memory, everything flows smoothly again.

I also tried setting the memory size to 1 (instead of the default 5), just to test if it made any difference… but unfortunately, the same slowdown still happens.

Thanks a lot for the insights though — I’ll dig into the execution entity table as you suggested and see what I find there! :bulb:

Cheers,