n8n-trace: 🚀 A self-hosted observability dashboard for n8n workflows

Hi everyone,

I’d like to share a project I recently built for the n8n community:

n8n-trace is a self-hosted observability dashboard designed to provide visibility into workflow performance without exposing sensitive workflow data.

The idea came from a common challenge: in many organizations, business teams need insight into workflow activity (success rates, failures, execution trends), but giving them direct access to n8n can expose payload data, secrets, or internal workflow details.

So I built n8n-trace as a separate analytics and monitoring layer for n8n.

Some of the features:

  • Workflow execution analytics (success / failure rates and duration trends)
  • Instance health monitoring
  • Metrics explorer inspired by Prometheus-style queries
  • Multi-instance support
  • Role-based access control (Admin / Analyst / Viewer)
  • Audit logging
  • Runs as a single Docker container alongside PostgreSQL

Data is ingested through n8n workflows that write to PostgreSQL using a dedicated database user with limited permissions, ensuring only monitoring data is stored while sensitive application data remains protected.

This is also my first real “vibe coding” project, built largely with the help of modern AI tools like GitHub Copilot. It was a great learning experience.

The project is fully open source and I’d love to hear your feedback.
Contributions, ideas, bug reports, and pull requests are very welcome.
And if you find it useful, a ⭐ on GitHub would really help the project reach more people in the community.

Thanks!


Small update:

You can think of n8n-trace as a Grafana-like observability layer for n8n workflows.

The goal is to provide workflow performance visibility without exposing workflow data.


The “business teams need visibility but shouldn’t have raw n8n access” use case is very real. If you’ve ever had to manually answer “did workflow X run successfully this week?” via Slack, you feel this.

A few questions about the architecture:

How are you capturing execution data? Are you polling the n8n API on a schedule, using webhooks from n8n’s built-in execution events, or intercepting at another layer?

How does RBAC work in practice — do you define permissions at the workflow level (this team can see executions for workflows tagged X) or at the instance level?

On the Prometheus metrics side — are you exposing these as a Prometheus exporter that a separate Grafana instance can scrape, or is the metrics explorer built into n8n-trace’s own UI?

I’ve been thinking about similar problems from the opposite direction — I run an n8n instance that handles a lot of business operations and I need a way to spot “this workflow has been failing silently for 3 days” without manually checking the execution log. The execution tracking + failure rate trending is the feature that immediately caught my eye.

Nice project — the security-first framing (visibility without data exposure) is the right way to position this for teams that have compliance or data sensitivity concerns.

Thanks for the thoughtful questions @OMGItsDerek — you’ve described exactly the problem that led me to build this.

Execution data

n8n-trace does not poll the API. Instead, a lightweight n8n workflow collects execution metadata (status, duration, workflow ID, instance ID, etc.) and writes it into the dedicated PostgreSQL database used by n8n-trace. Only operational metadata is stored — no payload or sensitive data leaves n8n.
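To make the "metadata only" idea concrete, here is a minimal sketch of what the collector workflow's final mapping step boils down to. The field names are illustrative, not n8n-trace's actual schema:

```python
# Sketch: shape of the metadata-only record a collector workflow might write.
# Field names are hypothetical, not n8n-trace's real schema.
from datetime import datetime


def to_trace_row(execution: dict, instance_id: str) -> dict:
    """Extract only operational metadata from an n8n execution record.

    Payload fields (node input/output data) are deliberately not copied,
    so nothing sensitive leaves the n8n instance.
    """
    started = execution["startedAt"]
    stopped = execution["stoppedAt"]
    return {
        "instance_id": instance_id,
        "workflow_id": execution["workflowId"],
        "execution_id": execution["id"],
        "status": execution["status"],  # e.g. success / error / waiting
        "duration_ms": int((stopped - started).total_seconds() * 1000),
    }


execution = {
    "id": "1234",
    "workflowId": "wf-42",
    "status": "success",
    "startedAt": datetime(2024, 1, 1, 12, 0, 0),
    "stoppedAt": datetime(2024, 1, 1, 12, 0, 3),
    "data": {"secret": "never copied"},  # payload stays behind in n8n
}
row = to_trace_row(execution, "prod")
print(row["duration_ms"])  # → 3000
```

The INSERT into PostgreSQL then only ever sees these few columns, which is also why a database user with narrow permissions is enough.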

System metrics

System health metrics are handled separately. n8n’s built-in metrics must be enabled, and another workflow collects those metrics (instance health, runtime metrics, etc.) and stores them in the same PostgreSQL database used by n8n-trace. This system-metrics collection can be enabled or disabled through the n8n-trace Docker Compose configuration.
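For anyone unfamiliar with what the metrics collection workflow consumes: n8n's metrics endpoint speaks the Prometheus text exposition format, which is simple to parse. A rough sketch (the metric names below are illustrative examples, not a guaranteed list):

```python
# Sketch: parsing Prometheus text-format output like n8n's metrics endpoint
# produces when built-in metrics are enabled. Metric names are illustrative.

def parse_prometheus_text(body: str) -> dict:
    """Parse simple 'name value' lines from Prometheus exposition text,
    skipping comments (# HELP / # TYPE) and labeled series for brevity."""
    metrics = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        if name and "{" not in name:  # labeled series are out of scope here
            metrics[name] = float(value)
    return metrics


sample = """\
# HELP n8n_process_resident_memory_bytes Resident memory size in bytes.
# TYPE n8n_process_resident_memory_bytes gauge
n8n_process_resident_memory_bytes 183500800
n8n_nodejs_eventloop_lag_seconds 0.0021
"""
parsed = parse_prometheus_text(sample)
print(parsed["n8n_nodejs_eventloop_lag_seconds"])  # → 0.0021
```

The collector workflow stores the parsed values in PostgreSQL alongside the execution metadata, so no separate Prometheus server is needed.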

RBAC

Access is managed through groups. A group can be scoped by:

• instance (e.g. prod)

• specific workflow IDs

• workflow tags

So a team might only see workflows from the prod instance, only workflows tagged finance, or specific workflow IDs.
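To illustrate how those three scope dimensions could combine, here is a hypothetical sketch of the access check (the real n8n-trace data model may differ; an unset dimension is treated as "matches everything"):

```python
# Sketch: evaluating a group scope (instance / workflow IDs / tags).
# Hypothetical data model, for illustration only.

def can_view(group_scope: dict, workflow: dict) -> bool:
    """Return True if a workflow is visible under a group's scope.

    Each scope dimension restricts access only when it is set;
    an unset dimension matches everything.
    """
    if group_scope.get("instance") and workflow["instance"] != group_scope["instance"]:
        return False
    if group_scope.get("workflow_ids") and workflow["id"] not in group_scope["workflow_ids"]:
        return False
    if group_scope.get("tags") and not set(group_scope["tags"]) & set(workflow["tags"]):
        return False
    return True


finance_prod = {"instance": "prod", "tags": ["finance"]}
wf = {"id": "wf-1", "instance": "prod", "tags": ["finance", "billing"]}
print(can_view(finance_prod, wf))  # → True
print(can_view(finance_prod, {"id": "wf-2", "instance": "dev", "tags": ["finance"]}))  # → False
```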

Silent failures

Yes — the dashboard helps surface these situations. Teams can see workflow states like running, waiting, success, and error, as well as trends showing workflows that have been failing repeatedly or haven’t run successfully for some time.

Alerting is not implemented yet, but it’s something I plan to add if there is enough interest.

Appreciate the feedback!

observability for n8n is a real gap — once you’re running multiple production workflows for clients the built-in execution logs just don’t cut it anymore. what exactly does n8n-trace track? just execution status and duration or also node-level metrics? and what db are you using for storage — that matters a lot for people running privacy-sensitive workflows.

@Benjamin_Behrens
n8n-trace tracks both workflow-level and node-level data.
Workflow level:

  • executions over time (throughput trend)
  • status breakdown (success / error)
  • failure rate
  • median and P95 duration

Node level:

  • per-node execution duration (median / P95 / avg)
  • run counts
  • items-out stats
  • slowest nodes (P95) to spot bottlenecks quickly

If n8n metrics are enabled, it also adds instance/runtime telemetry (CPU, memory, event loop, etc.) via the metrics endpoint.
Storage is PostgreSQL (self-hosted). No external analytics API is required, so data stays in your environment (important for privacy-sensitive workflows).
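For context on why the median/P95 pair is so useful for spotting bottlenecks: a single slow run barely moves the median but shows up immediately in P95. A minimal nearest-rank sketch:

```python
# Sketch: median and nearest-rank P95 over execution durations (ms),
# the two duration stats listed above.
import math
from statistics import median


def p95(durations: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    ordered = sorted(durations)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


durations = [120, 130, 125, 140, 135, 128, 132, 127, 131, 900]  # one slow outlier
print(median(durations))  # → 130.5
print(p95(durations))     # → 900
```

The outlier at 900 ms dominates P95 while the median stays near 130 ms, which is exactly the signal you want when hunting slow nodes.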

@MJSY this is exactly the level of detail i was after — node-level P95 per execution is the data point we’ve been missing in client setups. PostgreSQL is the right call for privacy-sensitive workflows, and keeping the n8n workflow as the collection mechanism is clean (no external agent needed).

the “workflow hasn’t run successfully in 3 days” case is where we feel the pain most — alerting would close the loop completely. even a simple webhook trigger on failure rate crossing a threshold would cover 80% of the monitoring we currently do manually. looking forward to seeing that added.

Thanks for the suggestion, that’s a really good point.

I’ll take a look at how alerting could be implemented in a clean way. The “workflow hasn’t run successfully for X time” case and simple threshold-based alerts sound like a good starting point.
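For anyone curious what that starting point could look like, here is a hypothetical sketch of the "no successful run in X hours" check. The webhook call is stood in for by a plain callable, and all names are illustrative, not a committed design:

```python
# Sketch: flag workflows whose last successful run is older than a threshold.
# Hypothetical API; the webhook is represented by a plain callable.
from datetime import datetime, timedelta


def check_stale_workflows(last_success: dict, now: datetime,
                          max_age: timedelta, notify) -> list:
    """Call notify(workflow_id, age) for every workflow whose last
    successful run is older than max_age; return the stale IDs."""
    stale = []
    for workflow_id, ts in last_success.items():
        age = now - ts
        if age > max_age:
            notify(workflow_id, age)
            stale.append(workflow_id)
    return stale


now = datetime(2024, 1, 4, 12, 0)
last_success = {
    "invoice-sync": datetime(2024, 1, 1, 9, 0),   # silent for ~3 days
    "daily-report": datetime(2024, 1, 4, 6, 0),
}
alerts = []
stale = check_stale_workflows(last_success, now, timedelta(hours=24),
                              lambda wf, age: alerts.append(wf))
print(stale)  # → ['invoice-sync']
```

A scheduled job running a check like this against the existing execution table would cover the "failing silently for 3 days" case without any new infrastructure.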

Appreciate the feedback!

That’s great to hear! A simple threshold-based alert like “workflow hasn’t completed successfully in X hours” combined with a webhook notification would cover most production monitoring needs without overcomplicating the architecture. Looking forward to seeing how you approach it — this would make n8n-trace a really complete solution for production setups.
