Scaling n8n while maintaining API response time performance

Background

For my use case, I need to keep the workflow “scaffolding” overhead low (i.e. the response time of an empty webhook trigger + respond workflow) while scaling to a high volume (potentially hundreds of simultaneous workflow executions).

Based on other forum posts (e.g. API Response Performance) and my own testing of bare-bones webhook workflows individually / in series, the n8n scaffolding appears sufficiently performant - roughly in the ~20–50 ms average response time range.

My challenge is this: how can I maintain this response time while increasing the volume of simultaneous / parallel requests by leveraging n8n scaling functionality / queue mode (or any additional techniques)?

In my current testing / configuration, I am not able to achieve the desired performance, and in fact I notice some counterintuitive behavior. I see significant performance gains when scaling from [1 worker + 1 webhook processor] up to around [10 workers + 10 webhook processors]. But even in that configuration, the mean response time is nowhere near what I see when testing individually / in series (I typically test with 100 simultaneous requests).
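For reference, the kind of parallel test I am running can be sketched with the Python standard library (this is just an illustration of my methodology, not my actual harness - the webhook URL is a placeholder you would swap for your own):

```python
# Minimal load-test sketch: fire N simultaneous GET requests at a webhook
# URL and report the mean and p95 latency. The URL below is a placeholder.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def time_request(url: str) -> float:
    """Return the wall-clock latency of one request, in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000

def run_load_test(url: str, n: int = 100) -> dict:
    """Issue n requests at once and summarize the observed latencies."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        latencies = list(pool.map(time_request, [url] * n))
    return {
        "requests": n,
        "mean_ms": statistics.mean(latencies),
        "p95_ms": statistics.quantiles(latencies, n=20)[-1],
    }

# Example: run_load_test("http://localhost:5678/webhook/test", n=100)
```

Running the same function with requests issued one at a time gives the “in series” numbers I compare against.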

What is even more interesting is that as I scale beyond the ~10X range (I tested all the way up to [50 workers + 50 webhook processors]), performance flatlines or even worsens slightly, with mean response time starting to creep up.

This is counterintuitive to me: if I am testing 100 requests in parallel, I would expect that as I scale up toward 100 workers / webhook processors, performance should approach what I see in series.

I am using AWS ECS Fargate to launch services, using the target number of tasks to control the quantity of worker / webhook instances. (I have also confirmed in the logs that executions are properly distributed across the worker / webhook instances, so they are all being utilized.)

Questions

- Are my requirements feasible for n8n to begin with?
- What potential bottlenecks could be limiting performance, and what could I do to remedy them?
- Any other ideas to meet my performance goals?

Please share your workflow

My testing workflow is simply two nodes: webhook trigger + respond to webhook.

And FYI, here are some results from my testing:

100 requests in series:

100 requests in parallel - 1 worker + 1 webhook processor:

100 requests in parallel - 10 workers + 10 webhook processors:

100 requests in parallel - 50 workers + 50 webhook processors:

Information on your n8n setup

  • n8n version: 1.18.0
  • Database (default: SQLite): Postgres
  • n8n EXECUTIONS_PROCESS setting (default: own, main): queue
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker-image based self hosted setup using AWS ECS Fargate
  • Operating system: Linux
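For completeness, the shape of my queue-mode configuration, per the n8n docs (hostnames are placeholders specific to my setup; each role runs as its own ECS service):

```shell
# Queue-mode environment sketch -- hosts/credentials are placeholders.
export EXECUTIONS_MODE=queue
export QUEUE_BULL_REDIS_HOST=redis.internal      # placeholder Redis host
export QUEUE_BULL_REDIS_PORT=6379
export DB_TYPE=postgresdb
export DB_POSTGRESDB_HOST=postgres.internal      # placeholder Postgres host

# Process roles (one per ECS service):
n8n start      # main instance
n8n worker     # execution worker (scaled via target task count)
n8n webhook    # dedicated webhook processor (scaled via target task count)
```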

@krynble would you be able to advise? Based on your other posts, you seem to be the expert in this domain. I also based my configuration on the directions from your scaling n8n YouTube tutorial.

Hey @matt, thank you for your well-detailed post.

The main reason n8n starts to show performance degradation with multiple webhook + worker processes is that much of our scaling relies on heavy database usage.

In order to avoid bloating Redis (since we use it as a broker only), we rely on reading and writing executions to the database, which has serious performance implications under heavy load.

@netroy made serious improvements that allow n8n to receive around 800 req/s while keeping a roughly constant 40 ms response time, with the caveat that the webhook node must respond immediately, so that we are only adding messages to the queue and not yet processing them.

We are trying to reduce the database reliance and even optimize how it happens, but it’s a complicated topic.

If your use case allows you to receive an HTTP request, store it, and respond immediately, while processing it asynchronously, then maybe @netroy's changes could help you.
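The accept-then-process pattern described above can be sketched outside n8n like this (a toy stdlib-only illustration of the shape of the idea, not n8n code): the HTTP handler enqueues the payload and responds at once, and a background worker drains the queue.

```python
# Toy illustration of "store, respond immediately, process asynchronously".
# NOT n8n code -- just the pattern, using only the Python standard library.
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

jobs: "queue.Queue[bytes]" = queue.Queue()
processed = []  # stand-in for wherever real processing lands

def worker() -> None:
    """Drain the queue in the background; the slow work happens here."""
    while True:
        payload = jobs.get()
        processed.append(payload)        # stand-in for real processing
        jobs.task_done()

class AcceptHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        jobs.put(body)                   # store...
        self.send_response(202)          # ...and respond immediately
        self.end_headers()
        self.wfile.write(b"accepted")

    def log_message(self, *args):        # silence per-request logging
        pass

def serve(port: int = 0) -> ThreadingHTTPServer:
    """Start the worker and the HTTP server; port 0 picks a free port."""
    threading.Thread(target=worker, daemon=True).start()
    server = ThreadingHTTPServer(("127.0.0.1", port), AcceptHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The caller gets a 202 as soon as the payload is queued, so response time stays flat regardless of how long the downstream processing takes.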

@netroy are your changes merged already?

There is an AWS EKS setup you can try @matt

Thanks @krynble for your response.

My actual use case will require more logic between the webhook trigger and the response (calls to other DBs, 3rd-party apps, data manipulation), and will ultimately require fast synchronous responses. I first only wanted to validate that the “scaffolding” itself would not be a bottleneck - which, it appears, it is.

So to confirm: to meet my requirements as best as possible, the alternative AWS EKS setup you linked is a possible improvement? Are there any other levers I have - changes on the database side (it sounds like this read/write performance is the bottleneck, so I may have some opportunity on the AWS RDS Postgres side?) or other configuration changes that might help?

Definitely scaling the database will help you get better results as n8n relies heavily on it, especially the execution_entity and execution_data tables.

Also, you might want to tweak DB_POSTGRESDB_POOL_SIZE, which defaults to 2, but we got interesting results changing it to 4. Worth testing.

Depending on the number of instances you use, you might want to consider pgBouncer for connection pooling as well.
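A minimal pgBouncer sketch for this setup might look like the following (hostnames and pool sizes are placeholders to tune for your load; n8n's `DB_POSTGRESDB_HOST`/`DB_POSTGRESDB_PORT` would then point at pgBouncer instead of Postgres directly):

```ini
; pgbouncer.ini sketch -- hosts, credentials, and sizes are placeholders.
[databases]
n8n = host=postgres.internal port=5432 dbname=n8n

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
```

Transaction pooling keeps the number of real Postgres connections bounded even as worker / webhook instances multiply.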

I hope with those changes you can achieve the results you’re looking for.

Thanks. Scaling the database definitely helps. When I test with AWS’ most performant (and expensive) database, I see significantly improved absolute times, and response times keep improving with more webhook / worker instances up to around ~50X (performance still plateaus there, with no further improvement from ~50X to ~100X). I am still not able to achieve in-series response times, however.

It will require more work from my end to determine if this performance will be sufficient for my use case, what is the minimum db scale/cost to see the benefits, and if that cost is acceptable. But at least I understand what the bottleneck is to achieve these performance goals with n8n in its current form. (FYI, changing DB_POSTGRESDB_POOL_SIZE from 2 => 4 did not seem to show any improvement for me.)

It would be interesting to see some future n8n mode / configuration that better supports this kind of high-volume + fast-response scaling, e.g. also using Redis for reading/writing executions, or some other way to remove the db bottleneck.

FYI, for anyone curious, with AWS db.r6id.32xlarge:

100 requests in parallel - 10 workers + 10 webhook processors:

100 requests in parallel - 50 workers + 50 webhook processors:
