Describe the issue/error/question
We run n8n in “main” mode. The instance has dozens of scenarios; some are scheduled, some are triggered by webhooks.
If I run a load test against a simple “wait for 5s and make an HTTP request” test scenario, I get good results: many executions run at the same time without problems, with no noticeable CPU or memory load.
However, we have one particular “complicated” scenario, which runs for ~15s and does some HTTP queries, some looping logic, and so on. It’s scheduled to run every minute.
While it runs, it blocks all parallel executions, which can be seen in this load test of the simple webhook scenario:
> hey -z 3m -c 20 https://...
Summary:
Total: 182.5697 secs
Slowest: 18.1360 secs
Fastest: 5.6872 secs
Average: 7.1627 secs
Requests/sec: 2.7880
Response time histogram:
5.687 [1] |
6.932 [428] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
8.177 [20] |■■
9.422 [0] |
10.667 [0] |
11.912 [0] |
13.156 [9] |■
14.401 [2] |
15.646 [18] |■■
16.891 [1] |
18.136 [30] |■■■
Latency distribution:
10% in 5.7676 secs
25% in 5.8591 secs
50% in 5.9584 secs
75% in 6.1272 secs
90% in 13.6303 secs
95% in 17.2409 secs
99% in 17.9711 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0287 secs, 5.6872 secs, 18.1360 secs
DNS-lookup: 0.0002 secs, 0.0000 secs, 0.0092 secs
req write: 0.0000 secs, 0.0000 secs, 0.0043 secs
resp wait: 6.6999 secs, 5.3913 secs, 17.8255 secs
resp read: 0.3636 secs, 0.2820 secs, 0.9104 secs
Status code distribution:
[200] 509 responses
As you can see, we run 20 concurrent webhook requests for 3 minutes.
The majority of them finish just fine in roughly 6-8 seconds. But there are outliers that respond in 13-18 seconds. Those are the executions that happened to coincide with the “complicated” scenario.
I suppose the problem lies in my scenario, and that it somehow takes all of the Node.js process’s resources for itself, blocking any I/O operations. Why is that?
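To illustrate what I suspect is happening, here is a minimal sketch in plain Node.js/TypeScript, not taken from n8n or from my workflow. It assumes the cause is synchronous, CPU-bound work on the main thread, which I have not confirmed. The helper names `blockFor` and `measureTimerLag` are made up for this example.

```typescript
// Hypothetical stand-in for CPU-heavy loop logic: spin synchronously for
// `ms` milliseconds. There is no await inside, so the single-threaded
// event loop cannot run timers or I/O callbacks until this returns.
function blockFor(ms: number): number {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // busy-wait
  }
  return Date.now() - start;
}

// Schedule a timer due in `timerMs`, then block the thread for `blockMs`.
// Resolves with how many milliseconds late the timer actually fired.
function measureTimerLag(timerMs: number, blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled - timerMs), timerMs);
    blockFor(blockMs); // the timer callback has to wait for this to finish
  });
}

// A timer due after 10 ms is held up by 200 ms of synchronous work.
measureTimerLag(10, 200).then((lag) => {
  console.log(`timer fired ${lag} ms late`);
});
```

If something similar happens inside the “complicated” scenario, every other execution in the same process would stall in the same way as the timer above, which would match the 13-18 second outliers.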
Please share the workflow
Here’s our ‘complicated’ scenario: Complicated n8n scenario · GitHub
(I couldn’t paste it here; it’s too long.)
Information on your n8n setup
- n8n version: 0.199
- Database you’re using (default: SQLite): Postgres
- Running n8n with the execution process [own(default), main]: main
- Running n8n via [Docker, npm, n8n.cloud, desktop app]: Docker
We also tried queue mode, but it doesn’t help, since the scenario takes over the whole worker. The bad workaround we have right now is running 3 workers with concurrency=1; that way we guarantee that 3 concurrent executions are always available.
I guess “own” mode would work, but since the problem is specific to this particular scenario, I would like to know why it happens.
Thanks!