Hi All,
After upgrading my self-hosted n8n instance to v2.x (running in queue mode), I’m seeing a noticeably higher execution failure rate, with no material changes to my infrastructure.
I’m not sure why, but some fundamental change in the v2.x queue logic has bumped my execution error rate from roughly 1–2% on v1.x to roughly 5–20% on v2.x.
Can someone from the development team comment on whether this is a known issue and whether there’s an active investigation into resolving it?
Thanks!
Hi @Darien_Kindlund,
Have you read through the breaking changes in the documentation to see whether there are any changes you need to make to keep previously working workflows from breaking in version 2?
Also check the migration tool under Settings to see if it is reporting anything you need to action.
If so, could you share one or two workflow errors here so we can see the type of errors you are now getting?
@Wouter_Nigrini , yes, I’ve read through the breaking changes. None of my workflows use sub-workflows, so that’s not the problem.
I’m about 95% confident, though, that this change is the likely culprit (Remove QUEUE_WORKER_MAX_STALLED_COUNT):
Unfortunately, there is nothing I can do to work around this problem, as it appears to be a fundamental change to how n8n performs its queue retry logic.
Almost ALL of the errors look like this:
```
Error: This execution failed to be processed too many times and will no longer retry. To allow this execution to complete, please break down your workflow or scale up your workers or adjust your worker settings.
    at /usr/local/lib/node_modules/n8n/src/workflow-runner.ts:438:15
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
```
These workflows normally complete in 5 minutes or less, but since upgrading to n8n v2.x, these errors are thrown at random across almost every workflow roughly 20% of the time.
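To illustrate what I believe is happening: in v1, QUEUE_WORKER_MAX_STALLED_COUNT let you raise how many times a job whose lock lapsed would be re-queued before being failed with exactly this error; in v2.x that knob is gone. Here is a minimal sketch of Bull-style stalled-job detection as I understand it (hypothetical helper names, not n8n’s actual code; all timings in ms):

```typescript
// Hedged sketch of Bull-style stalled-job detection (not n8n's actual code).
// A worker holds a lock for `lockDuration` ms and is expected to renew it
// every `lockRenewTime` ms. A separate checker runs every `stalledInterval`
// ms; any active job whose lock has lapsed counts as "stalled". Once a job
// has been seen stalled more than `maxStalledCount` times, it is failed with
// a "processed too many times" style error instead of being re-queued.

interface QueueSettings {
  lockDuration: number;    // cf. QUEUE_WORKER_LOCK_DURATION
  lockRenewTime: number;   // cf. QUEUE_WORKER_LOCK_RENEW_TIME
  stalledInterval: number; // cf. QUEUE_WORKER_STALLED_INTERVAL
  maxStalledCount: number; // the knob removed in v2.x per the changelog
}

// Given the offsets (ms) at which a worker actually managed to renew the
// lock, count how many stalled-checker ticks would find the lock expired.
function countStalledDetections(
  renewals: number[],
  totalRuntime: number,
  s: QueueSettings,
): number {
  let stalled = 0;
  for (let t = s.stalledInterval; t <= totalRuntime; t += s.stalledInterval) {
    // the lock is valid if some renewal landed within the last lockDuration ms
    const held = renewals.some((r) => r <= t && t - r < s.lockDuration);
    if (!held) stalled++;
  }
  return stalled;
}

function wouldFail(stalledSeen: number, s: QueueSettings): boolean {
  return stalledSeen > s.maxStalledCount;
}
```

With my settings (lockDuration 2,100,000; stalledInterval 120,000), a 5-minute job that renews its lock on time is never flagged. But if the worker container is throttled or frozen long enough that no renewal lands within lockDuration, every checker tick counts another stall, and with a low fixed maxStalledCount the execution is failed instead of retried.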
Can you please share your Docker Compose file? (PS: remove any secrets, passwords, etc.)
What concurrency are you using, and how many workers are running?
Hi @Wouter_Nigrini, I use Google Cloud Run to deploy the main node as a service – here’s the corresponding YAML (sensitive info REDACTED):
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: prod-n8n
  namespace: 'REDACTED'
  selfLink: /apis/serving.knative.dev/v1/namespaces/REDACTED/services/prod-n8n
  uid: REDACTED
  resourceVersion: REDACTED
  generation: 88
  creationTimestamp: '2024-10-10T20:25:03.694424Z'
  labels:
    run.googleapis.com/satisfiesPzs: 'true'
    cloud.googleapis.com/location: us-east4
  annotations:
    serving.knative.dev/creator: darien@REDACTED
    serving.knative.dev/lastModifier: [email protected]
    run.googleapis.com/ingress: all
    run.googleapis.com/operation-id: REDACTED
    run.googleapis.com/ingress-status: all
    run.googleapis.com/invoker-iam-disabled: 'true'
    run.googleapis.com/minScale: '1'
    run.googleapis.com/urls: '["https://prod-n8n-REDACTED.us-east4.run.app","https://prod-n8n-REDACTED-uk.a.run.app"]'
spec:
  template:
    metadata:
      labels:
        owner: darien
        purpose: n8n
        team: REDACTED
        client.knative.dev/nonce: REDACTED
        run.googleapis.com/startupProbeType: Custom
      annotations:
        run.googleapis.com/sessionAffinity: 'false'
        run.googleapis.com/vpc-access-egress: private-ranges-only
        autoscaling.knative.dev/minScale: '1'
        n8n.auto.redeploy/start-date: '2026-01-02T18:00:11.497Z'
        run.googleapis.com/execution-environment: gen2
        autoscaling.knative.dev/maxScale: '1'
        run.googleapis.com/network-interfaces: '[{"network":"default","subnetwork":"default","tags":["n8n"]}]'
        run.googleapis.com/cpu-throttling: 'false'
        run.googleapis.com/startup-cpu-boost: 'true'
    spec:
      containerConcurrency: 480
      timeoutSeconds: 900
      serviceAccountName: [email protected]
      containers:
        - name: n8n-1
          image: us-east4-docker.pkg.dev/REDACTED/ghcr/n8n-io/n8n:stable
          ports:
            - name: http1
              containerPort: 5678
          env:
            - name: N8N_VERSION
              value: latest
            - name: N8N_USER_FOLDER
              value: /opt/n8n
            - name: NODE_OPTIONS
              value: --max-old-space-size=3072
            - name: GENERIC_TIMEZONE
              value: America/Los_Angeles
            - name: EXECUTIONS_DATA_MAX_AGE
              value: '168'
            - name: EXECUTIONS_DATA_PRUNE
              value: 'true'
            - name: EXECUTIONS_DATA_PRUNE_MAX_COUNT
              value: '100000'
            - name: EXECUTIONS_MODE
              value: queue
            - name: EXECUTIONS_TIMEOUT
              value: '2700'
            - name: EXECUTIONS_TIMEOUT_MAX
              value: '2700'
            - name: N8N_DEFAULT_BINARY_DATA_MODE
              value: database
            - name: N8N_DISABLE_PRODUCTION_MAIN_PROCESS
              value: 'false'
            - name: N8N_EDITOR_BASE_URL
              value: https://n8n.REDACTED
            - name: N8N_HIRING_BANNER_ENABLED
              value: 'false'
            - name: N8N_HOST
              value: n8n.REDACTED
            - name: N8N_LOG_LEVEL
              value: info
            - name: N8N_PROXY_HOPS
              value: '1'
            - name: N8N_PUSH_BACKEND
              value: websocket
            - name: WEBHOOK_URL
              value: https://n8n.REDACTED
            - name: DB_POSTGRESDB_DATABASE
              value: prod-n8n-REDACTED
            - name: DB_POSTGRESDB_HOST
              value: REDACTED.us-east-1.rds.amazonaws.com
            - name: DB_POSTGRESDB_PORT
              value: '5432'
            - name: DB_POSTGRESDB_SCHEMA
              value: public
            - name: DB_POSTGRESDB_SSL_CA
              value: REDACTED.us-east-1.rds.amazonaws.com
            - name: DB_POSTGRESDB_SSL_REJECT_UNAUTHORIZED
              value: 'false'
            - name: DB_POSTGRESDB_USER
              value: prod-n8n
            - name: DB_TYPE
              value: postgresdb
            - name: N8N_SMTP_HOST
              value: email-smtp.us-east-1.amazonaws.com
            - name: N8N_SMTP_PORT
              value: '465'
            - name: N8N_SMTP_SENDER
              value: noreply-n8n@REDACTED
            - name: N8N_SMTP_USER
              value: REDACTED
            - name: DB_POSTGRESDB_CONNECTION_TIMEOUT
              value: '30000'
            - name: DB_POSTGRESDB_POOL_SIZE
              value: '120'
            - name: N8N_RUNNERS_ENABLED
              value: 'true'
            - name: N8N_RUNNERS_MODE
              value: external
            - name: N8N_RUNNERS_BROKER_PORT
              value: '5679'
            - name: N8N_RUNNERS_BROKER_LISTEN_ADDRESS
              value: 127.0.0.1
            - name: N8N_RUNNERS_TASK_TIMEOUT
              value: '1800'
            - name: N8N_RUNNERS_HEARTBEAT_INTERVAL
              value: '1800'
            - name: N8N_BLOCK_ENV_ACCESS_IN_NODE
              value: 'false'
            - name: N8N_GIT_NODE_DISABLE_BARE_REPOS
              value: 'true'
            - name: NO_COLOR
              value: 'true'
            - name: QUEUE_BULL_REDIS_HOST
              value: 172.REDACTED
            - name: QUEUE_BULL_REDIS_PORT
              value: '6378'
            - name: OFFLOAD_MANUAL_EXECUTIONS_TO_WORKERS
              value: 'true'
            - name: QUEUE_BULL_REDIS_TLS
              value: 'true'
            - name: NODE_EXTRA_CA_CERTS
              value: /opt/n8n-ca/server-ca.pem
            - name: N8N_EVENTBUS_LOGWRITER_LOGBASENAME
              value: audit/n8nEventLog
            - name: QUEUE_WORKER_LOCK_DURATION
              value: '2100000'
            - name: QUEUE_WORKER_LOCK_RENEW_TIME
              value: '60000'
            - name: QUEUE_WORKER_STALLED_INTERVAL
              value: '120000'
            - name: N8N_CONCURRENCY_PRODUCTION_LIMIT
              value: '-1'
            - name: N8N_RESTRICT_FILE_ACCESS_TO
              value: /opt/n8n
            - name: N8N_ENCRYPTION_KEY
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_secretkey
            - name: DB_POSTGRESDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_dbpassword
            - name: N8N_SMTP_PASS
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_ses_password
            - name: N8N_RUNNERS_AUTH_TOKEN
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_runners_auth_token
            - name: QUEUE_BULL_REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_redis_password
          resources:
            limits:
              cpu: 2000m
              memory: 4Gi
          volumeMounts:
            - name: prod-n8n
              mountPath: /opt/n8n
            - name: prod-n8n-runners
              mountPath: /opt/n8n-ca
            - name: in-memory-1
              mountPath: /opt/n8n/.n8n/audit
          livenessProbe:
            initialDelaySeconds: 120
            timeoutSeconds: 20
            periodSeconds: 180
            failureThreshold: 3
            httpGet:
              path: /
              port: 5678
          startupProbe:
            initialDelaySeconds: 60
            timeoutSeconds: 45
            periodSeconds: 60
            failureThreshold: 10
            tcpSocket:
              port: 5678
      volumes:
        - name: prod-n8n
          csi:
            driver: gcsfuse.run.googleapis.com
            volumeAttributes:
              bucketName: prod-n8n
        - name: prod-n8n-runners
          csi:
            driver: gcsfuse.run.googleapis.com
            readOnly: true
            volumeAttributes:
              bucketName: prod-n8n-runners
        - name: in-memory-1
          emptyDir:
            medium: Memory
            sizeLimit: 20M
  traffic:
    - percent: 100
      latestRevision: true
status:
  observedGeneration: 88
  conditions:
    - type: Ready
      status: 'True'
      lastTransitionTime: '2026-01-02T18:02:09.993294Z'
    - type: ConfigurationsReady
      status: 'True'
      lastTransitionTime: '2026-01-02T18:01:04.134686Z'
    - type: RoutesReady
      status: 'True'
      lastTransitionTime: '2026-01-02T18:02:09.960940Z'
  latestReadyRevisionName: prod-n8n-REDACTED
  latestCreatedRevisionName: prod-n8n-REDACTED
  traffic:
    - revisionName: prod-n8n-00088-2rc
      percent: 100
      latestRevision: true
  url: https://prod-n8n-REDACTED-uk.a.run.app
  address:
    url: https://prod-n8n-REDACTED-uk.a.run.app
```
For the worker nodes, here’s the corresponding YAML. I specifically set concurrency to 1 because I have a sidecar container that acts as an external task runner. I usually spin up between 14 and 30 worker-node instances (which worked just fine with n8n v1.x):
```yaml
apiVersion: run.googleapis.com/v1
kind: WorkerPool
metadata:
  name: n8n-workers
  namespace: 'REDACTED'
  selfLink: /apis/run.googleapis.com/v1/namespaces/REDACTED/workerpools/n8n-workers
  uid: REDACTED
  resourceVersion: REDACTED
  generation: 9948
  creationTimestamp: '2025-10-26T01:20:44.603565Z'
  labels:
    run.googleapis.com/satisfiesPzs: 'true'
    cloud.googleapis.com/location: us-east4
  annotations:
    serving.knative.dev/creator: darien@REDACTED
    serving.knative.dev/lastModifier: [email protected]
    run.googleapis.com/launch-stage: BETA
    run.googleapis.com/operation-id: REDACTED
    run.googleapis.com/scalingMode: manual
    run.googleapis.com/manualInstanceCount: '12'
spec:
  template:
    metadata:
      labels:
        client.knative.dev/nonce: REDACTED
      annotations:
        run.googleapis.com/vpc-access-egress: private-ranges-only
        n8n.auto.redeploy/start-date: '2026-01-02T18:15:37.631Z'
        run.googleapis.com/execution-environment: gen2
        run.googleapis.com/network-interfaces: '[{"network":"default","subnetwork":"default","tags":["n8n"]}]'
        run.googleapis.com/container-dependencies: '{"n8n-runners-1":["n8n-workers-1"]}'
    spec:
      serviceAccountName: [email protected]
      containers:
        - name: n8n-workers-1
          image: us-east4-docker.pkg.dev/REDACTED/ghcr/n8n-io/n8n:stable
          args:
            - worker
            - --concurrency=1
          env:
            - name: N8N_VERSION
              value: latest
            - name: N8N_USER_FOLDER
              value: /opt/n8n
            - name: NODE_OPTIONS
              value: --max-old-space-size=6144
            - name: GENERIC_TIMEZONE
              value: America/Los_Angeles
            - name: EXECUTIONS_DATA_MAX_AGE
              value: '168'
            - name: EXECUTIONS_DATA_PRUNE
              value: 'true'
            - name: EXECUTIONS_DATA_PRUNE_MAX_COUNT
              value: '100000'
            - name: EXECUTIONS_MODE
              value: queue
            - name: EXECUTIONS_TIMEOUT
              value: '2700'
            - name: EXECUTIONS_TIMEOUT_MAX
              value: '2700'
            - name: N8N_DEFAULT_BINARY_DATA_MODE
              value: database
            - name: N8N_DISABLE_PRODUCTION_MAIN_PROCESS
              value: 'true'
            - name: N8N_EDITOR_BASE_URL
              value: https://n8n.REDACTED
            - name: N8N_HIRING_BANNER_ENABLED
              value: 'false'
            - name: N8N_LOG_LEVEL
              value: info
            - name: N8N_PROXY_HOPS
              value: '1'
            - name: N8N_PUSH_BACKEND
              value: websocket
            - name: DB_POSTGRESDB_DATABASE
              value: prod-n8n-conf
            - name: DB_POSTGRESDB_HOST
              value: REDACTED.us-east-1.rds.amazonaws.com
            - name: DB_POSTGRESDB_PORT
              value: '5432'
            - name: DB_POSTGRESDB_SCHEMA
              value: public
            - name: DB_POSTGRESDB_SSL_CA
              value: REDACTED.us-east-1.rds.amazonaws.com
            - name: DB_POSTGRESDB_SSL_REJECT_UNAUTHORIZED
              value: 'false'
            - name: DB_POSTGRESDB_USER
              value: prod-n8n
            - name: DB_TYPE
              value: postgresdb
            - name: N8N_SMTP_HOST
              value: email-smtp.us-east-1.amazonaws.com
            - name: N8N_SMTP_PORT
              value: '465'
            - name: N8N_SMTP_SENDER
              value: noreply-n8n@REDACTED
            - name: N8N_SMTP_USER
              value: REDACTED
            - name: DB_POSTGRESDB_CONNECTION_TIMEOUT
              value: '30000'
            - name: DB_POSTGRESDB_POOL_SIZE
              value: '6'
            - name: N8N_RUNNERS_ENABLED
              value: 'true'
            - name: N8N_RUNNERS_MODE
              value: external
            - name: N8N_RUNNERS_BROKER_LISTEN_ADDRESS
              value: localhost
            - name: N8N_RUNNERS_BROKER_PORT
              value: '5679'
            - name: N8N_RUNNERS_BROKER_LISTEN_ADDRESS
              value: 127.0.0.1
            - name: N8N_RUNNERS_MAX_CONCURRENCY
              value: '4'
            - name: N8N_RUNNERS_TASK_TIMEOUT
              value: '1800'
            - name: N8N_RUNNERS_HEARTBEAT_INTERVAL
              value: '1800'
            - name: N8N_BLOCK_ENV_ACCESS_IN_NODE
              value: 'false'
            - name: N8N_GIT_NODE_DISABLE_BARE_REPOS
              value: 'true'
            - name: NO_COLOR
              value: 'true'
            - name: QUEUE_BULL_REDIS_HOST
              value: 172.REDACTED
            - name: QUEUE_BULL_REDIS_PORT
              value: '6378'
            - name: QUEUE_HEALTH_CHECK_ACTIVE
              value: 'true'
            - name: QUEUE_BULL_REDIS_TLS
              value: 'true'
            - name: NODE_EXTRA_CA_CERTS
              value: /opt/n8n-ca/server-ca.pem
            - name: N8N_HOST
              value: n8n.REDACTED
            - name: WEBHOOK_URL
              value: https://n8n.REDACTED
            - name: N8N_EVENTBUS_LOGWRITER_LOGBASENAME
              value: audit/n8nEventLog
            - name: QUEUE_WORKER_LOCK_DURATION
              value: '2100000'
            - name: QUEUE_WORKER_LOCK_RENEW_TIME
              value: '60000'
            - name: QUEUE_WORKER_STALLED_INTERVAL
              value: '120000'
            - name: N8N_RESTRICT_FILE_ACCESS_TO
              value: /opt/n8n
            - name: N8N_ENCRYPTION_KEY
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_secretkey
            - name: DB_POSTGRESDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_dbpassword
            - name: N8N_SMTP_PASS
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_ses_password
            - name: N8N_RUNNERS_AUTH_TOKEN
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_runners_auth_token
            - name: QUEUE_BULL_REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_redis_password
          resources:
            limits:
              cpu: '2'
              memory: 8Gi
          volumeMounts:
            - name: prod-n8n-workers
              mountPath: /opt/n8n
            - name: prod-n8n-runners
              mountPath: /opt/n8n-ca
            - name: in-memory-1
              mountPath: /opt/n8n/.n8n/audit
          livenessProbe:
            initialDelaySeconds: 30
            timeoutSeconds: 10
            periodSeconds: 60
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: 5678
          startupProbe:
            initialDelaySeconds: 20
            timeoutSeconds: 5
            periodSeconds: 30
            failureThreshold: 3
            httpGet:
              path: /healthz/readiness
              port: 5678
        - name: n8n-runners-1
          image: gcr.io/REDACTED/n8n-runners:latest
          env:
            - name: N8N_RUNNERS_TASK_BROKER_URI
              value: http://127.0.0.1:5679
            - name: N8N_RUNNERS_AUTO_SHUTDOWN_TIMEOUT
              value: '15'
            - name: N8N_RUNNERS_LAUNCHER_HEALTH_CHECK_PORT
              value: '5680'
            - name: GENERIC_TIMEZONE
              value: America/Los_Angeles
            - name: N8N_RUNNERS_MAX_CONCURRENCY
              value: '1'
            - name: N8N_RUNNERS_MAX_OLD_SPACE_SIZE
              value: '6144'
            - name: N8N_RUNNERS_CONFIG_PATH
              value: /opt/n8n-runners/n8n-task-runners.json
            - name: N8N_RUNNERS_TASK_TIMEOUT
              value: '1800'
            - name: N8N_BLOCK_ENV_ACCESS_IN_NODE
              value: 'false'
            - name: NO_COLOR
              value: 'true'
            - name: N8N_RUNNERS_HEALTH_CHECK_SERVER_ENABLED
              value: 'true'
            - name: ANTHROPIC_BASE_URL
              value: https://REDACTED
            - name: N8N_RUNNERS_HEALTH_CHECK_SERVER_HOST
              value: 0.0.0.0
            - name: N8N_RESTRICT_FILE_ACCESS_TO
              value: /opt/n8n-runners
            - name: N8N_RUNNERS_AUTH_TOKEN
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_runners_auth_token
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: prod_n8n_anthropic_apikey
          resources:
            limits:
              cpu: '2'
              memory: 8Gi
          volumeMounts:
            - name: prod-n8n-runners
              mountPath: /opt/n8n-runners
          livenessProbe:
            initialDelaySeconds: 15
            timeoutSeconds: 10
            periodSeconds: 300
            failureThreshold: 8
            httpGet:
              path: /healthz
              port: 5680
          startupProbe:
            initialDelaySeconds: 30
            timeoutSeconds: 5
            periodSeconds: 10
            failureThreshold: 12
            httpGet:
              path: /healthz
              port: 5680
      volumes:
        - name: prod-n8n-workers
          csi:
            driver: gcsfuse.run.googleapis.com
            volumeAttributes:
              bucketName: prod-n8n-workers
        - name: prod-n8n-runners
          csi:
            driver: gcsfuse.run.googleapis.com
            readOnly: true
            volumeAttributes:
              bucketName: prod-n8n-runners
        - name: in-memory-1
          emptyDir:
            medium: Memory
            sizeLimit: 20M
  instanceSplits:
    - latestRevision: true
      percent: 100
status:
  observedGeneration: 9948
  conditions:
    - type: Ready
      status: 'True'
      lastTransitionTime: '2026-01-08T21:03:38.789864Z'
  latestReadyRevisionName: REDACTED
  latestCreatedRevisionName: REDACTED
  instanceSplits:
    - revisionName: REDACTED
      percent: 100
```
I was able to reduce the frequency of these errors by changing these values within my system:
| Variable | New value |
| --- | --- |
| QUEUE_WORKER_LOCK_DURATION | 3600000 |
| QUEUE_WORKER_LOCK_RENEW_TIME | 120000 |
| QUEUE_WORKER_STALLED_INTERVAL | 240000 |
However, that doesn’t fully solve the problem – it only reduces how often it occurs. I’d really like an n8n developer to chime in on how this will be resolved long-term.
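For reference, here is the back-of-the-envelope reasoning behind those values (this assumes Bull-style lock/stall semantics; treating detection lag as one extra stalledInterval is my own approximation):

```typescript
// Rough headroom estimate: how long a worker can go completely silent
// (no successful lock renewals) before the stalled checker can flag its job.
// The lock stays valid for lockDuration ms after the last renewal, and the
// checker only looks every stalledInterval ms, so detection can lag by up
// to roughly one interval past lock expiry.
function maxSilentMs(lockDuration: number, stalledInterval: number): number {
  return lockDuration + stalledInterval;
}

// My tuned values tolerate roughly 64 minutes of silence per job:
//   maxSilentMs(3600000, 240000) === 3840000  (64 min)
// whereas a 30s lock with a 30s check interval tolerates only about a minute:
//   maxSilentMs(30000, 30000) === 60000
```

So the tuned values only widen the window a stalled worker has to recover before a missed renewal is noticed; they don’t change how many stalls are tolerated before the execution is failed outright, which is presumably why the errors still occur, just less often.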