Dogway
April 11, 2025, 12:47pm
I've had this problem for almost two months, and it only happens in queue mode, whether executions are triggered or run manually. The jobs keep stacking up as queued, so I have to cancel them.
In the pm2 logs I get this:
3|n8n-work | Last session crashed
3|n8n-work | User settings loaded from: /home/dogway/.n8n/config
3|n8n-work | Last session crashed
I'm on v1.88.0, running one webhook instance and one worker instance (for testing purposes) on the same VPS, without Docker, to reduce overhead.
I'm posting here as a last resort; here are my env vars in case they help:
NODE_ENV=production
N8N_PROTOCOL=https
N8N_HOST=n8n.xxxxxxx.es
N8N_PORT=5680
WEBHOOK_URL=https://n8n.xxxxxx.es/
GENERIC_TIMEZONE=Atlantic/Canary
NODEJS_PREFER_IPV4=true
N8N_DISABLE_UI=false
EXTERNAL_FRONTEND_HOOKS_URLS=
N8N_DIAGNOSTICS_ENABLED=false
N8N_DIAGNOSTICS_CONFIG_FRONTEND=
N8N_DIAGNOSTICS_CONFIG_BACKEND=
N8N_VERSION_NOTIFICATIONS_ENABLED=false
N8N_HIRING_BANNER_ENABLED=false
N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true
N8N_TELEMETRY_ENABLED=false
POSTHOG_ENABLED=false
OFFLOAD_MANUAL_EXECUTIONS_TO_WORKERS=true
EXECUTIONS_MODE=queue
EXECUTIONS_TIMEOUT=1800
EXECUTIONS_TIMEOUT_MAX=1800
EXECUTIONS_DATA_PRUNE=true
EXECUTIONS_DATA_MAX_AGE=6
N8N_ENDPOINT_WEBHOOK=webhook
N8N_ENDPOINT_WEBHOOK_TEST=webhook-test
N8N_ENDPOINT_WEBHOOK_WAIT=webhook-waiting
N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true
QUEUE_BULL_PREFIX=n8n-queue-prod
QUEUE_BULL_REDIS_HOST=127.0.0.1
QUEUE_BULL_REDIS_PORT=6379
QUEUE_BULL_REDIS_DB=1
QUEUE_BULL_REDIS_CONCURRENCY=5
QUEUE_BULL_REDIS_DUALSTACK=true
QUEUE_HEALTH_CHECK_ACTIVE=true
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=localhost
DB_POSTGRESDB_PORT=5432
DB_POSTGRESDB_DATABASE=n8n_xx
DB_POSTGRESDB_USER=n8n_xxxxxx
DB_POSTGRESDB_PASSWORD=xxxxxxxx
N8N_PROXY_HOPS=1
N8N_RUNNERS_ENABLED=true
N8N_RUNNERS_BROKER_PORT=5681
NODE_FUNCTION_ALLOW_EXTERNAL=
NODE_FUNCTION_ALLOW_BUILTIN=*
CODE_ENABLE_STDOUT=true
N8N_METRICS=true
N8N_LOG_OUTPUT=console,file
N8N_LOG_LEVEL=warn
For the webhook and worker env files the only differences are
N8N_DISABLE_UI=true
and the N8N_PORT for each; worker.env uses 5678 to work around a bug on the n8n side. That change worked for about two weeks, but then stopped working (maybe after an update?). I'm running Redis in hybrid mode.
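In case it helps, here is a minimal sketch of how such a split can be started under pm2 (not my exact commands; the env-file paths are illustrative, while the worker/webhook subcommands and the --concurrency flag are the standard n8n ones):
# main instance: load its env file, then register with pm2
set -a; source /home/dogway/n8n/main.env; set +a
pm2 start n8n --name n8n-main

# webhook processor: same binary, "webhook" subcommand, its own env file
set -a; source /home/dogway/n8n/webhook.env; set +a
pm2 start n8n --name n8n-webhook -- webhook

# worker: picks jobs up from the Redis queue
set -a; source /home/dogway/n8n/worker.env; set +a
pm2 start n8n --name n8n-worker -- worker --concurrency=5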
I can provide more detailed logs. For instance, in the Postgres execution_entity table the jobs' waitTill column is empty.
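For example, this is the kind of query I run against that table to check the stuck rows (just a convenience query; the status values are the ones I see in the data, and the database name is the one from my envs):
psql -d n8n_xx -c "
  SELECT id, mode, status, \"startedAt\", \"stoppedAt\", \"waitTill\", \"createdAt\"
  FROM execution_entity
  WHERE status IN ('new', 'running')
  ORDER BY \"createdAt\" DESC
  LIMIT 20;"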
Other threads for reference:
https://community.n8n.io/t/workflows-stuck-in-queued-status-after-60-seconds-of-execution-time/82729
https://community.n8n.io/t/my-n8n-worker-node-keeps-crashing-on-railway/56553
Hi,
What is the status of the jobs in execution_entity? Do they remain at status 'new' with a startedAt of null, or do they change to running at some point?
Can you get a simple workflow to work (schedule every X seconds, wait Y seconds)?
Do you have more worker logs, since it mentions it crashed?
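Something like this usually surfaces more detail from the worker (adjust the env file and the pm2 process name to your setup):
# in worker.env (or however the worker gets its env), raise verbosity:
N8N_LOG_LEVEL=debug

# then restart the worker and tail its output:
pm2 restart n8n-worker
pm2 logs n8n-worker --lines 200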
Regards
J.
Dogway
April 12, 2025, 1:16pm
Thanks. Yes, the Redis host must be 127.0.0.1 for it to work.
Here is execution_entity with three example rows:
id | finished | mode | retryOf | retrySuccessId | startedAt | stoppedAt | waitTill | status | workflowId | deletedAt | createdAt
------+----------+---------+---------+----------------+-----------+-----------+----------+--------+------------------+-----------+----------------------------
2427 | f | trigger | | | | | | new | G8YJQGWdq86ptiAd | | 2025-04-12 13:09:00.007+00
2428 | f | manual | | | | | | new | 8gNbFEfBVQ3dVJHu | | 2025-04-12 13:09:09.477+00
2428 | f | manual | | | | 2025-04-12 13:09:36.361+00 | | canceled | 8gNbFEfBVQ3dVJHu | | 2025-04-12 13:09:09.477+00
The first two rows show a trigger and a manual execution while in queued status; the third is the manual execution after being manually stopped.
This time I’m testing without the webhook processor instance, to narrow down failure points.
I’m thinking this might be a configuration problem with Caddy.
This is the relevant portion:
n8n.xxxxxx.xx {
    log {
        output file /var/log/caddy/n8n.log
        format json {
            time_format rfc3339
        }
    }
    reverse_proxy localhost:5678 {
        header_up Host {host}
        header_up X-Forwarded-Proto https
        header_up X-Real-IP {http.request.header.CF-Connecting-IP}
    }
}
The last thing I need to test (again) is going back to port 5678 for the worker, given this bug.
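To double-check which instance is actually bound to which port behind Caddy, a quick sanity check (the /healthz probe assumes the health endpoint is exposed, which it should be with QUEUE_HEALTH_CHECK_ACTIVE=true):
# which processes listen on the two candidate ports
ss -tlnp | grep -E ':5678|:5680'

# does the upstream Caddy proxies to actually answer?
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:5678/healthz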
This is the Redis monitor output, an infinite loop of ZSCORE lookups:
" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "26"
1744463845.028466 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "26"
1744463845.028474 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "26"
1744463845.028479 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "26"
1744463845.034768 [1 127.0.0.1:49800] "evalsha" "74c8631221de82c9ac8c1cb76e574a903fa0228c" "2" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "24"
1744463845.034786 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "24"
1744463845.034791 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "24"
1744463845.034795 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "24"
1744463845.036085 [1 127.0.0.1:49800] "evalsha" "74c8631221de82c9ac8c1cb76e574a903fa0228c" "2" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "22"
1744463845.036099 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "22"
1744463845.036103 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "22"
1744463845.036107 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "22"
1744463845.036149 [1 127.0.0.1:49800] "evalsha" "74c8631221de82c9ac8c1cb76e574a903fa0228c" "2" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "21"
1744463845.036160 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "21"
1744463845.036164 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "21"
1744463845.036167 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "21"
1744463846.731029 [1 127.0.0.1:56652] "select" "1"
1744463846.798867 [1 127.0.0.1:56660] "select" "1"
1744463846.799125 [1 127.0.0.1:56660] "info"
1744463846.799716 [1 127.0.0.1:56670] "select" "1"
1744463846.799918 [1 127.0.0.1:56670] "subscribe" "n8n-queue-prod:jobs:progress"
1744463846.800068 [1 127.0.0.1:56676] "select" "1"
1744463846.800275 [1 127.0.0.1:56676] "subscribe" "n8n.commands"
1744463846.817940 [1 127.0.0.1:56684] "select" "1"
1744463846.818065 [1 127.0.0.1:56684] "client" "setname" "n8n-queue-prod:am9icw=="
1744463846.818122 [1 127.0.0.1:56684] "client" "setname" "n8n-queue-prod:am9icw=="
1744463847.085324 [1 127.0.0.1:49800] "evalsha" "74c8631221de82c9ac8c1cb76e574a903fa0228c" "2" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "18"
1744463847.085359 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "18"
1744463847.085365 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "18"
1744463847.085370 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "18"
1744463849.061089 [1 127.0.0.1:49800] "evalsha" "74c8631221de82c9ac8c1cb76e574a903fa0228c" "2" "n8n-queue-prod:jobs:completed" "n8n-queue-prod:jobs:failed" "19"
1744463849.061124 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:completed" "19"
1744463849.061130 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "19"
1744463849.061134 [1 lua] "ZSCORE" "n8n-queue-prod:jobs:failed" "19"
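To see where the jobs actually sit in the queue I also check the Bull keys directly (key names taken from the monitor output above; wait/active/delayed/completed/failed are the standard Bull states, so treat this as a best guess):
# job counts per state, on DB 1 with the n8n-queue-prod prefix
redis-cli -n 1 LLEN  n8n-queue-prod:jobs:wait
redis-cli -n 1 LLEN  n8n-queue-prod:jobs:active
redis-cli -n 1 ZCARD n8n-queue-prod:jobs:delayed
redis-cli -n 1 ZCARD n8n-queue-prod:jobs:completed
redis-cli -n 1 ZCARD n8n-queue-prod:jobs:failed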
Upon pm2 stop n8n-worker:
3|n8n-work | Last session crashed
3|n8n-work | User settings loaded from: /home/dogway/.n8n/config
3|n8n-work | Last session crashed
3|n8n-work | Failed to shutdown gracefully
3|n8n-work | Error: Failed to shutdown gracefully
3|n8n-work | at ShutdownService.shutdownComponent (/home/dogway/n8n/node_modules/n8n/src/shutdown/shutdown.service.ts:113:29)
3|n8n-work | at /home/dogway/n8n/node_modules/n8n/src/shutdown/shutdown.service.ts:99:52
3|n8n-work | at Array.map (<anonymous>)
3|n8n-work | at ShutdownService.startShutdown (/home/dogway/n8n/node_modules/n8n/src/shutdown/shutdown.service.ts:99:18)
3|n8n-work | at ShutdownService.shutdown (/home/dogway/n8n/node_modules/n8n/src/shutdown/shutdown.service.ts:79:31)
3|n8n-work | at process.<anonymous> (/home/dogway/n8n/node_modules/n8n/src/commands/base-command.ts:279:25)
3|n8n-work | at Object.onceWrapper (node:events:633:26)
3|n8n-work | at process.emit (node:events:518:28)
3|n8n-work | at process.emit (/home/dogway/n8n/node_modules/source-map-support/source-map-support.js:516:21)
3|n8n-work |
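Since the worker also complains it cannot shut down gracefully, one thing I may try is giving it more time to drain before pm2 kills it. I believe this is the relevant n8n setting (an assumption on my side; pm2's own kill_timeout would have to be at least as long):
# worker.env: allow more time for in-flight jobs to finish on shutdown
# (the n8n default is, as far as I know, around 30 seconds)
N8N_GRACEFUL_SHUTDOWN_TIMEOUT=120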
Dogway
April 12, 2025, 7:13pm
Resolved (for the time being): it was a mix of the linked bug and temporary misconfigurations introduced while debugging. I'll report back if the issue comes up again.