Duplicate Executions and Job Stalling Issues

Describe the problem/error/question

I have an n8n workflow that sends messages to contacts. It processes a list of typically 500 to 1000 contacts in a Loop. I constantly run into issues with this Loop, regardless of the data source: I have tested Baserow, Google Sheets, and, more recently, my own CRM system’s database (PostgreSQL).

The triggers start normally, but after a while they begin to duplicate. The container logs show that the execution itself is being duplicated. The n8n interface shows only one execution (the first one), but the logs show the same job starting twice:

2024-07-18T13:20:38.824218974Z Start job: 264636 (Workflow ID: GDfIi966az0wJT9G | Execution: 269138)
2024-07-18T13:50:25.267082467Z Start job: 264636 (Workflow ID: GDfIi966az0wJT9G | Execution: 269138)

2024-07-18T14:07:46.302098819Z Error from queue:
2024-07-18T14:07:46.302138852Z Error from queue:
2024-07-18T14:07:46.306091887Z Error: Missing lock for job 264636 failed
2024-07-18T14:07:46.306137445Z at Object.finishedErrors (/usr/local/lib/node_modules/n8n/node_modules/bull/lib/scripts.js:225:16)
2024-07-18T14:07:46.306143643Z at Job.moveToFailed (/usr/local/lib/node_modules/n8n/node_modules/bull/lib/job.js:342:19)
2024-07-18T14:07:46.306148851Z at processTicksAndRejections (node:internal/process/task_queues:95:5)
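
For anyone checking their own worker logs for the same symptom, here is a minimal sketch that scans log lines for a job that was started more than once. It assumes the "Start job" line format shown above; the function name `find_repeated_starts` is made up for illustration and is not part of n8n.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# 2024-07-18T13:20:38.824218974Z Start job: 264636 (Workflow ID: ... | Execution: 269138)
# Assumption: the log format is exactly the one pasted in this thread.
START_JOB = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z Start job: (?P<job>\d+) "
    r"\(Workflow ID: (?P<workflow>\S+) \| Execution: (?P<execution>\d+)\)"
)


def find_repeated_starts(log_lines):
    """Return {(job, execution): [start times]} for jobs started more than once."""
    starts = defaultdict(list)
    for line in log_lines:
        match = START_JOB.match(line.strip())
        if match:
            key = (match["job"], match["execution"])
            # Fractional seconds are dropped; second precision is enough here.
            starts[key].append(datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S"))
    return {key: times for key, times in starts.items() if len(times) > 1}


if __name__ == "__main__":
    sample = [
        "2024-07-18T13:20:38.824218974Z Start job: 264636 (Workflow ID: GDfIi966az0wJT9G | Execution: 269138)",
        "2024-07-18T13:50:25.267082467Z Start job: 264636 (Workflow ID: GDfIi966az0wJT9G | Execution: 269138)",
    ]
    for (job, execution), times in find_repeated_starts(sample).items():
        gap = times[-1] - times[0]
        print(f"job {job} (execution {execution}) started {len(times)} times, {gap} apart")
```

On the two lines above, this reports job 264636 as started twice, roughly 30 minutes apart.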

I am running n8n in queue mode with Redis, in Docker, with separate services for the editor, webhook, and worker. The worker’s concurrency is set to 40.

What is the error message (if any)?

In a workflow that sends me error messages from executions, I receive the following message:

job stalled more than maxStalledCount

Information on your n8n setup

  • n8n version: 1.50.1
  • Database (default: SQLite): PostgreSQL
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker
  • Operating system: Ubuntu 20.04

I’ve set QUEUE_WORKER_MAX_STALLED_COUNT to 0, and it stopped duplicating the execution. Now the error is:

job stalled more than maxStalledCount
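
A possible reading of why setting QUEUE_WORKER_MAX_STALLED_COUNT to 0 swaps the duplicate execution for this error: in a Bull-style queue, a job whose lock has expired is counted as stalled, and it is put back in the queue unless it has already stalled more often than the allowed maximum. The sketch below is a simplified model of that decision, not Bull's or n8n's actual code. The function names are hypothetical, and a default limit of 1 is an assumption.

```python
def on_stall_detected(stalled_count, max_stalled_count):
    """Simplified model of what happens when a job's lock is found expired.

    stalled_count is how often this job was already found stalled.
    Returns the updated count and the job's fate ("requeued" or "failed").
    """
    stalled_count += 1
    if stalled_count > max_stalled_count:
        # Limit exceeded: the job fails with the "stalled" error.
        return stalled_count, "failed"
    # Within the limit: the job goes back to the queue and starts again,
    # which would show up as a second "Start job" line for the same job.
    return stalled_count, "requeued"


def simulate(stalls, max_stalled_count):
    """List the outcome of each consecutive stall until the job fails."""
    outcomes = []
    count = 0
    for _ in range(stalls):
        count, outcome = on_stall_detected(count, max_stalled_count)
        outcomes.append(outcome)
        if outcome == "failed":
            break
    return outcomes


if __name__ == "__main__":
    print(simulate(2, max_stalled_count=1))  # ['requeued', 'failed'] (assumed default)
    print(simulate(2, max_stalled_count=0))  # ['failed'] (the setting used here)
```

Under this model, a limit of 0 only changes how the stall is reported. It does not address why the worker lost its lock in the first place.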

Hello @pedrofnts

Can you share the workflow? My first guess is that the workflow runs for a very long time, so the main instance starts to think that the worker is dead.

Here’s the workflow. The loop handles ~400 contacts, and the main wait is set to 5 to 10 seconds.
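
As a rough back-of-the-envelope check on how long such an execution stays open, the sketch below multiplies the number of contacts by the wait range. It assumes one wait per contact and ignores the time spent sending messages and querying the database, so real runs would take longer. The function name is made up for illustration.

```python
def wait_time_range_minutes(contacts, min_wait_s=5, max_wait_s=10):
    """Total time spent in the wait step alone, as (low, high) in minutes.

    Assumption: exactly one wait of min_wait_s..max_wait_s seconds per contact.
    """
    return contacts * min_wait_s / 60, contacts * max_wait_s / 60


if __name__ == "__main__":
    for contacts in (400, 500, 1000):
        low, high = wait_time_range_minutes(contacts)
        print(f"{contacts} contacts: {low:.0f} to {high:.0f} minutes of waiting")
```

For ~400 contacts this gives somewhere between half an hour and a little over an hour of waiting alone, and the 500 to 1000 contact lists mentioned in the first post would run longer still.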

This is the error that appears on the execution after some time (initially I only have the stalled message):

Error: Connection is closed.
at EventEmitter.sendCommand (/usr/local/lib/node_modules/n8n/node_modules/ioredis/built/Redis.js:332:28)
at EventEmitter.get (/usr/local/lib/node_modules/n8n/node_modules/ioredis/built/utils/Commander.js:90:25)
at Object.get (/usr/local/lib/node_modules/n8n/dist/services/cache/redis.cache-manager.js:35:42)
at CacheService.get (/usr/local/lib/node_modules/n8n/dist/services/cache/cache.service.js:146:46)
at CredentialsHelper.credentialCanUseExternalSecrets (/usr/local/lib/node_modules/n8n/dist/CredentialsHelper.js:239:48)
at CredentialsHelper.getDecrypted (/usr/local/lib/node_modules/n8n/dist/CredentialsHelper.js:190:42)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at getCredentials (/usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/NodeExecuteFunctions.js:1393:33)
at Object.getCredentials (/usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/NodeExecuteFunctions.js:2272:56)
at Object.router (/usr/local/lib/node_modules/n8n/node_modules/n8n-nodes-base/dist/nodes/Postgres/v2/actions/router.js:36:26)

It would be better to move that functionality into a sub-workflow.
