Your question is super interesting. I have on my to-do list a benchmark to really crunch the numbers on how n8n scales, and documentation will come as part of that task.
You got the overall idea, but I’m not sure about ElastiCache. API Gateway is also not mandatory, as a simple Application Load Balancer should be enough.
My recommendation is that you set up ECS with 3 different tasks and a few services:
- Postgres 13+ as a shared database for all your workflows
- A Redis instance (or cluster) that will also be used by all your n8n instances
- 1 task running the n8n default process in queue mode with webhooks disabled. This "main" process runs continuously and gets restarted if necessary. It should not be replicated, i.e. it should be a single instance of n8n
- 1 task running worker processes, which can have multiple instances on the same machine. Each worker is a single process, so running multiple instances helps you make better use of your resources and scale based on CPU / network
- 1 task running webhook processes, which can also have multiple copies on the same host. Just like the workers, these are single processes, so running multiple instances helps you make better use of your resources
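To make the three n8n tasks more concrete, here is a rough sketch of the environment variables and commands each one might use, based on n8n's documented queue-mode settings. Hostnames, credentials, and the encryption key are placeholders you would replace with your own values:

```shell
# Shared settings for all three n8n tasks (values are illustrative placeholders)
export DB_TYPE=postgresdb
export DB_POSTGRESDB_HOST=postgres.internal    # your shared Postgres endpoint
export DB_POSTGRESDB_DATABASE=n8n
export DB_POSTGRESDB_USER=n8n
export DB_POSTGRESDB_PASSWORD=change-me
export EXECUTIONS_MODE=queue                   # enables queue mode backed by Redis
export QUEUE_BULL_REDIS_HOST=redis.internal    # your shared Redis endpoint
export N8N_ENCRYPTION_KEY=shared-secret-key    # must be identical on every instance

# Task 1: the single main instance, with webhooks disabled
N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true n8n start

# Task 2: worker processes (run as many replicas as you need)
n8n worker

# Task 3: webhook processes (run as many replicas as you need)
n8n webhook
```

The key point is that all three tasks share the same database, Redis, and encryption key; only the command (and the webhook toggle on the main instance) differs between them.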
So what happens in the end is that your main n8n instance is responsible for triggering workflows that are not webhook based, like crons, polls, etc. Everything else runs on the workers or webhook processes. For this reason, the main instance cannot and should not be scaled, as this would cause duplication of work.
Your main instance is also your entry point to n8n, allowing you to edit workflows, view executions, etc.
If you have any more questions, feel free to ask. I believe this community post can become a great foundation for future documentation on deployment practices, as your questions will help build that document.