What does the metric n8n_queue_job_dequeued_total mean?
I thought it indicated the number of jobs that moved from the Redis queue to the workers.
But now it doesn’t make sense: I also have the metric n8n_workflow_started_total, which shows 55k, while the dequeued metric is a bit over 51k. It doesn’t make sense that started workflows (started executions) would be greater than the dequeued count.
Hey,
Good catch - the difference makes sense actually. Here’s why:
n8n_workflow_started_total counts ALL workflow executions, including:
- Queue mode executions (from Redis)
- Manual executions (triggered directly)
- Webhook triggers that run immediately
- Test executions
n8n_queue_job_dequeued_total only counts jobs that went through the Redis queue.
So if you have 55k started and 51k dequeued, that means around 4k executions ran directly without going through the queue (manual runs, webhooks, etc).
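The gap is just the difference between the two counters; a quick sketch using your rounded numbers (not exact samples):

```python
# Counter values from the two metrics (rounded, as reported above)
started = 55_000   # n8n_workflow_started_total: every execution, any mode
dequeued = 51_000  # n8n_queue_job_dequeued_total: only jobs pulled from Redis

# Executions that never went through the Redis queue
bypassed = started - dequeued
print(bypassed)  # 4000
```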
Are you running in queue mode with workers? What’s your setup like - do you have webhooks or manual triggers that might be bypassing the queue?
Hi, first off, ty for the fast response.
Yes, we’re running queue mode with workers.
We’re setting this flag on the main instance via the env variable OFFLOAD_MANUAL_EXECUTION_TO_WORKERS=true, so the manual jobs should also pass through Redis, no? Because this flag makes the manual executions get offloaded to the workers.
About the webhooks, what you said makes sense.
Can you list all the different execution types that might not pass through my Redis in this setup? (My setup looks like the queue mode setup in the docs.)
Ty for your help, I really appreciate it 🙏🏻
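For reference, the relevant env vars on our main instance look roughly like this (a sketch with anonymized values; the Redis host is a placeholder, the rest are the standard queue-mode settings from the docs plus the offload flag):

```shell
# Queue mode, as in the n8n docs
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis   # placeholder hostname
QUEUE_BULL_REDIS_PORT=6379

# Offload manual executions to the workers too
OFFLOAD_MANUAL_EXECUTION_TO_WORKERS=true

# Expose Prometheus metrics
N8N_METRICS=true
```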
Honestly, it depends on what exactly you’re trying to measure in n8n. The metrics that Prometheus pulls from n8n usually include things like the number of workflows running, how long each execution took, and any errors that occurred. The important thing is understanding that each metric has specific labels that help you filter the data, like workflow_id or status.

If you want to understand a specific metric, copy its name and check its type, whether it’s a counter, a gauge, or a histogram, because each one is read differently. Counters only increase, never decrease; gauges go up and down; and histograms give you a distribution of values.

My advice is to start with basic metrics like n8n_workflow_executions_total and n8n_workflow_execution_duration_seconds to understand your workflows’ performance, then expand to other metrics based on your needs. If you have a question about a specific metric, tell me its name and I’ll explain it in detail.
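For example, reading a counter usually means looking at its rate between scrapes rather than the raw value; a minimal sketch with made-up samples:

```python
# Two made-up samples of a counter, scraped 60 seconds apart
t0, v0 = 0.0, 51_000.0    # e.g. n8n_workflow_started_total at time t0
t1, v1 = 60.0, 51_120.0   # same counter one scrape later

# Counters only go up, so the per-second rate is the delta over the
# interval (roughly what PromQL's rate() computes, minus reset handling)
per_second = (v1 - v0) / (t1 - t0)
print(per_second)  # 2.0
```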
Hi, thank you for the response.
I am familiar with Prometheus and its metric types.
The metrics you mentioned are not exposed by either the workers or the main instance; metrics like n8n_workflow_execution_duration_seconds are missing. Do you have any idea why? I use version 1.116.2.
I have this metric n8n_workflow_success_total, and I have two workflows running: one runs OK, doing an HTTP request to Thanos with PromQL and returning a result, and the other does the same thing but fails at the last node. I run one main and one worker (pods), in queue mode, if it matters.
Now I looked at the metrics on the worker and saw that the started and success workflow values are the same. How does that make sense? Maybe a client error counts as a success, and it only doesn’t count as a success if the problem was in the n8n node run itself?
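For context, this is roughly how I’m comparing the two counters from the worker’s /metrics output; a sketch over a pasted scrape, where the metric names match what the worker exposes but the values here are made up:

```python
# A made-up snippet of a Prometheus scrape (values illustrative only)
sample = """\
n8n_workflow_started_total 12
n8n_workflow_success_total 12
"""

counters = {}
for line in sample.splitlines():
    name, value = line.rsplit(" ", 1)
    counters[name] = float(value)

# The puzzling part: started == success even though one of the two
# workflows ends with a failing node
gap = counters["n8n_workflow_started_total"] - counters["n8n_workflow_success_total"]
print(gap)  # 0.0
```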
Ty for your help, I really appreciate it.
Hi, also, I have the same problem with the metric n8n_scaling_mode_queue_jobs_failed on the main instance, but there the metric value equals 0. Maybe it’s because we’re using queue mode and the metric is only relevant on the main instance when it handles manual triggering?