Workflow Randomly Reprocessing Old Queue Jobs After Worker Restart

Decoure_Ryan · May 20, 2026, 11:21pm

Hi everyone,
I’m running n8n in queue mode with multiple workers, and I’m seeing a strange issue after restarting workers or deployments.
My setup: Webhook → Queue → Worker → Process → Database
Sometimes after a worker restart:
• Old jobs get processed again
• Some jobs appear “stuck” and later retry unexpectedly
• A few executions create duplicate DB writes/API calls
I already use retries and basic error handling, but I think the issue is related to how jobs are acknowledged or recovered after worker crashes.
Example processing logic:
if ($json.status !== "processed") {
  // continue processing
}
I’m trying to understand:
• How n8n queue mode handles unfinished jobs after restart
• Whether jobs are re-queued automatically
• Best way to make workflows safe against duplicate execution after crashes
For people using queue mode in production:
• What’s the recommended pattern for crash recovery and idempotent processing?

Niffzy · May 20, 2026, 11:31pm

Hi there @Decoure_Ryan. What you’re seeing is usually normal behavior in queue mode. If a worker crashes or restarts before a job is fully completed/acknowledged, the queue can mark that job as unfinished and reprocess it later. That’s why you’re seeing old jobs run again.

Assume jobs can run more than once, and make processing idempotent.

For example, before processing:
if ($json.status === "processed") {
  return; // already processed, skip
}

And use DB-level protection such as ON CONFLICT DO NOTHING or unique keys to prevent duplicate inserts.
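To make the DB-level guard concrete, here is a minimal sketch. It assumes a Postgres table called processed_jobs with a unique job_id column, and a client that exposes query(text, values) in the style of node-postgres. The table, column, and function names are hypothetical, and the fake client only stands in for the database so the sketch runs on its own.

```javascript
// Hypothetical table: processed_jobs(job_id TEXT PRIMARY KEY).
// On a conflict nothing is inserted, so RETURNING yields zero rows.
const CLAIM_SQL =
  'INSERT INTO processed_jobs (job_id) VALUES ($1) ' +
  'ON CONFLICT (job_id) DO NOTHING RETURNING job_id';

// Returns true if this call claimed the job, false if it was already claimed.
// `client` is assumed to expose query(text, values) -> Promise<{ rows }>.
async function claimJob(client, jobId) {
  const result = await client.query(CLAIM_SQL, [jobId]);
  return result.rows.length === 1;
}

// In-memory stand-in for the database (illustration only).
function createFakeClient() {
  const seen = new Set();
  return {
    async query(_text, [jobId]) {
      if (seen.has(jobId)) return { rows: [] }; // conflict: nothing inserted
      seen.add(jobId);
      return { rows: [{ job_id: jobId }] };
    },
  };
}

// First delivery claims the job; a redelivery of the same job does not.
async function demo() {
  const client = createFakeClient();
  const first = await claimJob(client, 'job-42');
  const second = await claimJob(client, 'job-42');
  return { first, second };
}
```

The point of doing the claim as a single INSERT is that the database decides who wins, so two workers racing on the same job cannot both see "not processed yet".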

A common production pattern:
• Queue handles retries/recovery
• Database handles deduplication/idempotency
• Workers stay stateless

nguyenthieutoan · May 21, 2026, 2:50pm

Welcome @Decoure_Ryan to our community! I’m Jay and I am an n8n verified creator.

To add to what Niffzy said - the root cause is Bull’s “stalled job” recovery mechanism. When a worker restarts without gracefully completing a job, Bull’s stalled-job check, which runs every QUEUE_WORKER_STALLED_INTERVAL milliseconds (default 30000), marks that job as stalled and re-queues it. You can tune this with QUEUE_WORKER_MAX_STALLED_COUNT=1 to limit how many times a stalled job gets retried, and QUEUE_WORKER_STALLED_INTERVAL to control how often the check runs (confirm the exact variable names against the queue mode environment variable reference for your n8n version). For idempotency at the n8n level, use $getWorkflowStaticData or a DB status check at the very start of the workflow to short-circuit if the execution_id was already processed. Setting a unique constraint on execution_id in your DB is the most reliable safeguard.
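As a rough sketch of the static-data short-circuit: the helper below keeps a capped list of execution IDs in a plain object. The names markIfNew, processedIds, and maxEntries are hypothetical, and the commented usage assumes an n8n Code node. As I understand it, static data is only persisted for production executions and is not a lock shared across workers, so treat this as a cheap early exit and keep the DB unique constraint as the real guard.

```javascript
// Records executionId in a plain object and reports whether it was new.
// `staticData` stands in for what $getWorkflowStaticData('global') returns.
function markIfNew(staticData, executionId, maxEntries = 1000) {
  const seen = staticData.processedIds ?? (staticData.processedIds = []);
  if (seen.includes(executionId)) return false; // already handled: skip
  seen.push(executionId);
  // Cap the list so static data does not grow without bound.
  if (seen.length > maxEntries) seen.splice(0, seen.length - maxEntries);
  return true;
}

// Inside an n8n Code node this might be used roughly as:
//   const staticData = $getWorkflowStaticData('global');
//   if (!markIfNew(staticData, $execution.id)) return [];
//   return $input.all();

// Standalone run with a plain object in place of real static data.
const staticData = {};
const firstRun = markIfNew(staticData, 'exec-1'); // new ID
const replay = markIfNew(staticData, 'exec-1');   // same ID seen again
```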

tamy.santos · May 22, 2026, 6:02pm

Welcome to the n8n community @Decoure_Ryan

Redis acts as a broker and workers execute the jobs, but I wouldn’t assume exactly-once guarantees. After a crash or restart, treat delivery as at-least-once, review N8N_GRACEFUL_SHUTDOWN_TIMEOUT so workers have time to finish in-flight jobs on shutdown, and design your workflow to handle reprocessing.

(I’m not shouting, just emphasizing)
ALWAYS USE A UNIQUE KEY
99% of issues could be solved with that

Decoure_Ryan · May 23, 2026, 1:10pm

Thanks @Niffzy @nguyenthieutoan for the reply

nguyenthieutoan · May 26, 2026, 8:40am

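Great breakdown from @syed_noor. One thing I’d add: Bull also has a lockDuration setting (Bull’s own default is 30s; n8n’s default may differ by version). The worker renews the lock while it runs the job, so if it cannot renew in time (for example, the process crashed or the event loop was blocked by heavy work), the lock expires and the job gets marked stalled even while still running. The stalled interval mentioned earlier only controls how often that check runs, so also make sure lockDuration is set appropriately; n8n exposes it as QUEUE_WORKER_LOCK_DURATION.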

Also worth noting — the Postgres idempotency key approach is the most reliable pattern I’ve seen in production. Combine it with n8n’s “Stop and Error” node after the INSERT check to cleanly exit duplicate runs without polluting your error logs.

Wheel · May 26, 2026, 11:52am

Thanks brudda i was having the same problemo and this helped out have a blessed day!!!

syed_noor · May 26, 2026, 6:48pm

Good addition on the lockDuration distinction. I should have called that out separately. QUEUE_WORKER_STALLED_INTERVAL controls how often the checker runs, but lockDuration controls how long a job’s lock lasts without renewal before the job is considered stalled. Size lockDuration generously relative to the longest stretch a busy worker might go without renewing the lock.
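To illustrate how the two settings interact, here is a deliberately simplified model. It is not Bull’s actual implementation: it assumes a job counts as stalled when the periodic check runs and the job’s lock has gone longer than lockDuration without a renewal. The function names and the 15 s renewal cadence are assumptions made for the example.

```javascript
// Simplified model, not Bull's real code: a job is stalled if, at check time,
// its lock has not been renewed within lockDurationMs.
function isStalled({ lastLockRenewalMs, lockDurationMs, checkTimeMs }) {
  return checkTimeMs - lastLockRenewalMs > lockDurationMs;
}

// Returns the check times (ms since job start) at which the job is flagged.
function stalledCheckTimes({ renewals, lockDurationMs, stalledIntervalMs, untilMs }) {
  const flagged = [];
  for (let t = stalledIntervalMs; t <= untilMs; t += stalledIntervalMs) {
    // Most recent renewal at or before t; the lock is first taken at t = 0.
    const last = Math.max(0, ...renewals.filter((r) => r <= t));
    if (isStalled({ lastLockRenewalMs: last, lockDurationMs, checkTimeMs: t })) {
      flagged.push(t);
    }
  }
  return flagged;
}

// Healthy worker renewing every 15 s during a 120 s job.
const healthy = stalledCheckTimes({
  renewals: [15000, 30000, 45000, 60000, 75000, 90000, 105000, 120000],
  lockDurationMs: 30000,
  stalledIntervalMs: 30000,
  untilMs: 120000,
});

// Worker that dies right after its 15 s renewal and never renews again.
const crashed = stalledCheckTimes({
  renewals: [15000],
  lockDurationMs: 30000,
  stalledIntervalMs: 30000,
  untilMs: 120000,
});
```

In this model the healthy worker is never flagged even though the job runs far longer than the lock duration, while the crashed worker is first flagged at the 60 s check. In a real queue the job would be re-queued at that first flag rather than flagged repeatedly.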

The Stop and Error node tip is solid too. I use that after the idempotency INSERT with the message set to the job_id, so that when you review executions in n8n you can immediately see which ones were legitimate duplicates and which were actual failures. Keeps the execution list clean instead of showing false-positive errors.

For anyone implementing this pattern, I wrote a more detailed breakdown of all six production-readiness dimensions (idempotency is just one of them) here:

nguyenthieutoan · May 27, 2026, 8:20am

The job_id in the Stop and Error message is a smart touch - makes triage much faster when you’re scanning executions. One more thing worth adding on top of this pattern: set continueOnFail on the idempotency check node and route the “already processed” path to a No-op node with a clear name (e.g. “DUPLICATE - skipped”), rather than relying solely on the error path. That keeps the execution graph readable and separates expected skips from actual failures at a glance.
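The branching described here could be sketched as a small decision function, for example in a Code node placed before an IF or Switch node. The input shape ({ inserted } or { error }) and the route names other than “DUPLICATE - skipped” are hypothetical; only the idea of separating expected skips from real failures comes from the posts above.

```javascript
// Maps the outcome of the idempotency check to a named branch.
// `check` is a hypothetical shape: { inserted: boolean } when the check ran,
// or { error: Error } when the check itself failed.
function routeForCheck(check, jobId) {
  if (check.error) {
    // A real failure: this is the path that should end in Stop and Error.
    return {
      route: 'FAILED',
      message: `Idempotency check failed for job ${jobId}: ${check.error.message}`,
    };
  }
  if (!check.inserted) {
    // Expected skip: send to the clearly named No-op branch.
    return { route: 'DUPLICATE - skipped', message: `Duplicate job ${jobId}` };
  }
  return { route: 'PROCESS', message: `Processing job ${jobId}` };
}

const fresh = routeForCheck({ inserted: true }, 'job-42');
const duplicate = routeForCheck({ inserted: false }, 'job-42');
const failed = routeForCheck({ error: new Error('connection refused') }, 'job-42');
```

Carrying the job ID in every message keeps the triage benefit mentioned earlier, whichever branch the execution ends on.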
