Track executions across instances

Hi everyone,

I’m running multiple n8n instances on Google Cloud Run using a single license key.

What’s the best way to track usage per deployment/tenant (executions, active workflows, API usage, etc.)?

Hi @rgrzesk
I think the best way is to use the insights:

But if you want data from multiple instances in a single view, then I would recommend this:

I have used it, and it works all the time.
Also, you can just call the n8n API:

GET /api/v1/executions?status=success&limit=250
If you are not on the n8n cloud, the Ud metrics option is the best.

Does this help, @rgrzesk ?

For tracking across multiple Cloud Run instances into one place, the API polling approach won’t scale well - each instance has its own API endpoint and there’s no unified view. A cleaner pattern is to add a dedicated “execution logger” workflow to each instance: it runs on a schedule, hits GET /api/v1/executions with a time window, adds an instance_id tag, and posts the results to a shared Postgres table or Google Sheet. Then you have one central place to query usage across all instances. You can include the Cloud Run service name as the instance identifier via an environment variable injected at deploy time.
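To make the logger idea concrete, here is a rough Python sketch of the same logic outside n8n: fetch one page of executions, attach an instance identifier, and return rows ready for a shared table. The `X-N8N-API-KEY` header and the `data` wrapper follow the n8n public API, and `K_SERVICE` is the variable Cloud Run sets to the service name. The `INSTANCE_ID` override and the output field names are assumptions for illustration, and paging and time-window handling are left out.

```python
import json
import os
import urllib.parse
import urllib.request


def fetch_executions(base_url, api_key, limit=250):
    """Fetch one page of executions from a single instance's public API."""
    query = urllib.parse.urlencode({"limit": limit})
    request = urllib.request.Request(
        f"{base_url}/api/v1/executions?{query}",
        headers={"X-N8N-API-KEY": api_key, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("data", [])


def tag_executions(executions, instance_id):
    """Keep a few fields per execution and attach the instance identifier."""
    return [
        {
            "instance_id": instance_id,
            "execution_id": item.get("id"),
            "workflow_id": item.get("workflowId"),
            "mode": item.get("mode"),
            "started_at": item.get("startedAt"),
            "stopped_at": item.get("stoppedAt"),
        }
        for item in executions
    ]


def current_instance_id():
    """Resolve the identifier to attach to every row."""
    # INSTANCE_ID is a hypothetical override injected at deploy time;
    # K_SERVICE is set by Cloud Run to the service name.
    return os.environ.get("INSTANCE_ID") or os.environ.get("K_SERVICE", "unknown")
```

The rows returned by `tag_executions(fetch_executions(url, key), current_instance_id())` can then be inserted into whatever central store you choose.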

This one sounds really nice, but the problem is that I don’t and won’t have access to every instance. I can orchestrate it, but cannot be a user.
I would need to create such a predefined workflow when deploying, but I guess it’s not possible to do that easily. The only way is to somehow use publicly available data. Is it possible to use the /metrics endpoint? Will I get enough data to work with?

If you do not have API/user access to each instance, I would treat `/metrics` as a partial infrastructure signal, not as a complete usage model.

It can help with questions like “is this instance alive, how busy is it, are executions failing more than usual”, but it usually will not give you clean business attribution like tenant usage, active workflow count per customer, or billable API calls unless you designed the deployment around those labels from the beginning.

For your case, because you can orchestrate the deployment, I would push the tracking boundary into the deploy template:

- give every Cloud Run service a stable `deployment_id` / `tenant_id` label

- enable metrics/log export at deploy time, not after the customer starts using it

- make Cloud Run logs include the same deployment label

- if possible, pre-install one small internal logging workflow during provisioning

- if workflow access is not possible, at least collect instance health, execution totals, failures, latency and restart/error signals centrally

The important part is that the ID must exist outside n8n too. If each instance emits metrics but they arrive without a stable deployment label, you will still end up with a global pile of numbers that is hard to reconcile.

So I would use `/metrics` for operational monitoring, but not depend on it alone for tenant/customer reporting. For usage reporting I would want either a predefined logger workflow, API access, or deployment-level labels that your collector adds before data lands in the central store.
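As a rough illustration of "labels that your collector adds", here is a minimal Python sketch that takes the Prometheus text a `/metrics` endpoint returns and attaches a `deployment_id` label to every sample before storage. The parser is deliberately simplified and the metric names in the usage note are made up, so treat it as a sketch of the idea rather than a replacement for a real Prometheus or OpenTelemetry collector.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+\d+)?$")
LABEL_PAIR = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def add_deployment_label(metrics_text, deployment_id):
    """Return (name, labels, value) samples with a deployment_id label attached."""
    samples = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_PAIR.findall(raw_labels or ""))
        # The collector, not the instance, is the source of this label.
        labels["deployment_id"] = deployment_id
        samples.append((name, labels, float(value)))
    return samples
```

For example, calling it with the text `demo_total{status="success"} 12` and the ID `tenant-a` yields one sample whose labels contain both `status` and `deployment_id`, so the numbers stay attributable after they land in the central store.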

Thanks, that makes sense.
How about using the DB as the source of truth for historical/license usage? Would that make sense, and does it keep the data I need?
For example, querying execution_entity and counting every execution not marked as manual?

Hi @rgrzesk

There is another approach for your setup: OpenTelemetry tracing. Since you control deployment, set these env vars during provisioning:

N8N_OTEL_TRACE_ENABLED=true
N8N_OTEL_TRACE_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-central-collector:4318

Every execution emits a workflow.execute span with the execution mode, status, workflow ID, and the instance’s unique n8n.instance.id. Point all instances to one collector (Jaeger, Grafana Tempo, etc.) and you get per-instance, per-execution tracking with mode filtering. No DB access needed, no pruning risk, standard protocol, and it’s configured entirely at deploy time which fits your constraints perfectly.

For the DB route as a fallback: yes, execution_entity with mode != 'manual' works, but be aware that pruning deletes these records based on EXECUTIONS_DATA_MAX_AGE. Aggregate into your own store on a shorter schedule than the pruning window.
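A minimal sketch of that aggregation in Python, run against an in-memory SQLite stand-in so it can be tried without a real instance. The table here only has the columns the query needs (`mode`, `"startedAt"`), which is an assumption about the schema; check your actual Postgres schema and adjust the date function and quoting before using it there.

```python
import sqlite3

# Count non-manual executions per day.
DAILY_COUNTS_SQL = """
SELECT date("startedAt") AS day, COUNT(*) AS executions
FROM execution_entity
WHERE mode != 'manual'
GROUP BY day
ORDER BY day
"""


def daily_production_counts(connection):
    """Return a list of (day, executions) tuples for non-manual executions."""
    return connection.execute(DAILY_COUNTS_SQL).fetchall()


def demo_connection():
    """Build a tiny stand-in table; not the real n8n schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE execution_entity '
        '(id INTEGER PRIMARY KEY, mode TEXT, "startedAt" TEXT)'
    )
    conn.executemany(
        'INSERT INTO execution_entity (mode, "startedAt") VALUES (?, ?)',
        [
            ("trigger", "2024-05-01 08:00:00"),
            ("webhook", "2024-05-01 09:30:00"),
            ("manual", "2024-05-01 10:00:00"),
            ("trigger", "2024-05-02 08:00:00"),
        ],
    )
    return conn
```

Running `daily_production_counts(demo_connection())` drops the manual row and returns one count per day. Writing those daily rows into your own table is what protects the history from pruning.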

Let me know if it helps :crossed_fingers:

Nice one!
OTel is a good solution for new instances. Are we somehow able to get historical data as well?

OTel only captures from the moment it’s enabled, no retroactive traces.

For historical data on existing instances, two options since you have DB access:

  1. execution_entity table: query mode != 'manual' for production counts. Only as far back as pruning allows.

  2. Insights tables: n8n stores compacted insights data separately from execution_entity, retained for up to 365 days by default (N8N_INSIGHTS_MAX_AGE_DAYS). This survives execution pruning. Check the insights_* tables in your Postgres schema for aggregated historical counts.

For anything older than what’s in the DB, it’s gone.