How are you handling burst traffic / sequential execution?

Describe the problem/error/question

Curious how people are handling this. I have workflows that occasionally get a burst of triggers close together, and I want them to queue up and run one at a time instead of in parallel.
What setups have worked well for you? Queue mode with a single worker? Some per-workflow setting? Anything in front of the webhook? Any specific node setup?
Also curious whether the answer differs between n8n Cloud and self-hosted on a VPS. Does one make this easier than the other, or are people solving it the same way regardless of setup?

n8n doesn’t have a per-workflow “run one at a time” toggle. Queue mode is built for scaling (more parallelism), not sequential execution. Common approaches: set N8N_CONCURRENCY_PRODUCTION_LIMIT=1 (instance-wide), or check the /executions-current API at workflow start and exit if the workflow is already running.

Ainoflow Shield is a cleaner option: distributed execution locks that guarantee one-at-a-time execution. One HTTP Request node at the start: 200 = proceed, 409 = already running.
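To make the 200/409 lock idea concrete, here is a minimal in-memory sketch. It is not Ainoflow Shield's API (I haven't seen it documented in this thread); `ExecutionLock`, `acquire` and `release` are hypothetical names, and a real setup would need the lock to live in a shared store rather than in one process.

```python
import threading


class ExecutionLock:
    """Toy per-workflow lock that mimics the 200 / 409 responses described above."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # protects the set of held keys
        self._held: set[str] = set()

    def acquire(self, workflow_key: str) -> int:
        """Return 200 if the caller may proceed, 409 if a run already holds the key."""
        with self._guard:
            if workflow_key in self._held:
                return 409
            self._held.add(workflow_key)
            return 200

    def release(self, workflow_key: str) -> None:
        """Free the key so the next run can proceed."""
        with self._guard:
            self._held.discard(workflow_key)


lock = ExecutionLock()
first = lock.acquire("invoice-sync")   # first run gets the lock
second = lock.acquire("invoice-sync")  # overlapping run is rejected
lock.release("invoice-sync")           # first run finishes
third = lock.acquire("invoice-sync")   # a later run can proceed again
print(first, second, third)
```

One thing the sketch shows: a run that gets 409 is rejected, not queued. If every trigger has to run eventually, you still need a retry or a queue in front of the lock.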

@Noahh decouple the trigger from the work. The webhook inserts the payload into a queue (a Postgres row or a Redis list) and returns 200 instantly. A Schedule Trigger then pops one job at a time, so bursts can’t pile up. The setup is the same on Cloud and self-hosted.

Schedule Trigger (every 10s)
→ Postgres: DELETE FROM jobqueue WHERE id = (SELECT id FROM jobqueue ORDER BY created_at LIMIT 1) RETURNING payload;
→ Execute Workflow (your worker)

Swap in your Postgres credentials and the worker workflow ID.
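If it helps to see the queue pattern outside n8n, here is a rough sketch using Python's built-in sqlite3 as a stand-in for Postgres. The `jobqueue` table and `payload` / `created_at` columns come from the query above; the `id` column type, the helper names (`make_queue`, `enqueue`, `pop_one`) and the JSON encoding are my assumptions. `enqueue` plays the webhook and `pop_one` plays the Schedule Trigger step.

```python
import json
import sqlite3
from typing import Optional


def make_queue() -> sqlite3.Connection:
    """Create an in-memory stand-in for the jobqueue table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobqueue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def enqueue(conn: sqlite3.Connection, payload: dict) -> None:
    """What the webhook does: store the payload and return right away."""
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO jobqueue (payload) VALUES (?)", (json.dumps(payload),)
        )


def pop_one(conn: sqlite3.Connection) -> Optional[dict]:
    """What the scheduled step does: remove and return the oldest job, if any."""
    with conn:
        row = conn.execute(
            # id breaks ties when several rows share the same timestamp
            "SELECT id, payload FROM jobqueue ORDER BY created_at, id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM jobqueue WHERE id = ?", (row[0],))
    return json.loads(row[1])


conn = make_queue()
for n in (1, 2, 3):  # a burst of three triggers
    enqueue(conn, {"n": n})

processed = []
while (job := pop_one(conn)) is not None:  # worker handles one job per pop
    processed.append(job["n"])
print(processed)
```

The sketch splits the pop into a SELECT and a DELETE only to stay portable across SQLite versions. In Postgres the single DELETE ... RETURNING statement above does both in one step.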

Good day @Noahh
I’d start with the native concurrency control before adding locks or extra services. If you are self-hosting, N8N_CONCURRENCY_PRODUCTION_LIMIT=1 is probably the simplest setup because n8n will queue production executions instead of running them in parallel. On n8n Cloud, that environment variable is not available, so I’d handle it by separating the webhook from the actual work and processing the saved requests with a controlled worker flow.

We solved this in production with a PostgreSQL buffer table. Incoming webhooks insert into wa_msg_buffer(chat_id, message_id, content, inserted_at), then a Wait node holds for 10 seconds, then we SELECT all buffered messages for that chat_id, aggregate them into a single string, and DELETE the buffer entries. This batches rapid-fire messages naturally without any queue infrastructure. Works better than Redis for our use case since we already have Postgres running and it survives n8n restarts. The Wait node is the key — it gives the buffer time to collect multiple messages from the same conversation before the AI processes them.
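A rough sketch of that buffer-and-flush step, using Python's built-in sqlite3 in place of Postgres. The table and column names are the ones from the post; the column types, the helper names (`make_buffer`, `buffer_message`, `flush_chat`) and the newline join are assumptions, and the 10-second Wait is left out so the example runs instantly.

```python
import sqlite3


def make_buffer() -> sqlite3.Connection:
    """Create an in-memory stand-in for wa_msg_buffer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wa_msg_buffer ("
        " chat_id TEXT NOT NULL,"
        " message_id TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " inserted_at REAL NOT NULL)"
    )
    return conn


def buffer_message(conn, chat_id: str, message_id: str, content: str, inserted_at: float) -> None:
    """What each incoming webhook does before the Wait node."""
    with conn:
        conn.execute(
            "INSERT INTO wa_msg_buffer VALUES (?, ?, ?, ?)",
            (chat_id, message_id, content, inserted_at),
        )


def flush_chat(conn, chat_id: str) -> str:
    """After the wait: read everything buffered for one chat, merge it, clear it."""
    with conn:
        rows = conn.execute(
            "SELECT content FROM wa_msg_buffer WHERE chat_id = ?"
            " ORDER BY inserted_at, rowid",
            (chat_id,),
        ).fetchall()
        conn.execute("DELETE FROM wa_msg_buffer WHERE chat_id = ?", (chat_id,))
    return "\n".join(content for (content,) in rows)


conn = make_buffer()
buffer_message(conn, "chat-a", "m1", "hi", 0.0)
buffer_message(conn, "chat-a", "m2", "are you there?", 1.5)
buffer_message(conn, "chat-b", "m3", "unrelated chat", 2.0)
buffer_message(conn, "chat-a", "m4", "hello??", 3.0)

merged = flush_chat(conn, "chat-a")  # three messages become one input
again = flush_chat(conn, "chat-a")   # a later execution finds nothing left
print(repr(merged), repr(again))
```

Since every webhook call starts its own execution, presumably only the first one to wake up gets the merged text and the rest see an empty result and should stop there. In Postgres a single DELETE ... RETURNING would do the read and the cleanup in one statement.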

This is a bottleneck, but there are two ways to handle it:

  1. If you are self-hosted, the most robust way to handle it is queue mode with Redis, running a single worker with its concurrency set to 1 (--concurrency=1).
  2. On Cloud, you can use the buffer workflow approach instead.