Feature Request: Native Support for AI Observability Stack (Langfuse, Lunary, Arize, Phoenix, LangSmith, LangWatch, Opik)

The idea is: Add a native, no-code “Analytics & Observability” integration to all AI-related nodes (AI Agent, Chain, Chat Model). Users would connect to leading AI tracing and evaluation platforms by selecting a provider and entering credentials, similar to the implementation in Flowise.

The integration should support:

  • Traces & Spans: Automatic reporting of the entire execution chain.

  • Supported Providers: Langfuse, Lunary, LangSmith, LangWatch, Arize, Phoenix, and Opik.

  • Custom Base URLs: Essential for enterprise clients using self-hosted/on-prem versions of these tools (especially Langfuse and Phoenix).

  • Metadata Mapping: Automatic injection of n8n workflowId, executionId, and nodeName as tags.
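
To make the list above concrete, here is a minimal TypeScript sketch of what the per-node configuration and the metadata mapping could look like. Every name in it (ObservabilityConfig, ExecutionContext, resolveBaseUrl, buildTraceTags, the n8n.* tag prefix) is hypothetical and is not taken from n8n's codebase or from any provider SDK; it only illustrates the shape of the request.

```typescript
// Hypothetical shapes for illustration only; not part of n8n or any provider SDK.
type ObservabilityProvider =
  | "langfuse"
  | "lunary"
  | "langsmith"
  | "langwatch"
  | "arize"
  | "phoenix"
  | "opik";

interface ObservabilityConfig {
  provider: ObservabilityProvider;
  apiKey: string;
  // Optional override for self-hosted / on-prem deployments.
  baseUrl?: string;
}

interface ExecutionContext {
  workflowId: string;
  executionId: string;
  nodeName: string;
}

// Returns the custom base URL without trailing slashes, or undefined
// when the provider's own default endpoint should be used.
function resolveBaseUrl(config: ObservabilityConfig): string | undefined {
  const trimmed = (config.baseUrl ?? "").trim().replace(/\/+$/, "");
  return trimmed.length > 0 ? trimmed : undefined;
}

// Maps n8n execution identifiers to flat string tags (assumed tag format).
function buildTraceTags(ctx: ExecutionContext): string[] {
  return [
    `n8n.workflowId:${ctx.workflowId}`,
    `n8n.executionId:${ctx.executionId}`,
    `n8n.nodeName:${ctx.nodeName}`,
  ];
}
```

With something like this, a self-hosted Langfuse setup would only need provider, apiKey, and baseUrl filled in on the node, and every trace would carry the three n8n identifiers without any user code.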

My use case: As a Principal AI & Digital Solutions Architect, I provide AI strategy and architecture consultancy to enterprise-level companies. Most of our clients want to use n8n for orchestration but are blocked by the “Observability Gap.”

Currently, apart from LangSmith configured via environment variables, monitoring LLM costs, traces, or evaluations requires custom code or complex workarounds. Companies need to be able to choose their own observability stack (e.g., Langfuse for self-hosting or Arize for evaluation) to meet their compliance and security requirements.

I think it would be beneficial to add this because:

  1. Enterprise Readiness: No enterprise deploys AI agents to production without robust monitoring and auditing. This feature moves n8n from “PoC tool” to “Production-Grade Infrastructure.”

  2. Competitive Advantage: Competitors like Flowise and LangFlow already offer native “Analytics” tabs for these providers. Adding this will prevent users from switching platforms for better “Day 2” operations.

  3. Reduced Complexity: It eliminates the need for manual LangChain configuration, making it accessible for low-code users while remaining powerful for architects.

  4. Vendor Flexibility: Clients aren’t locked into one provider; they can switch between Langfuse, Opik, or Phoenix as their needs evolve.

Any resources to support this?

Are you willing to work on this? I can provide architectural guidance, testing, and feedback from a consultant’s perspective to ensure the implementation meets enterprise standards.

+1 from me on this.

One thing I have found in practice is that the highest-friction gap is not just tracing, but alerting on the moments that actually need follow-up: low eval scores, failed checks, or suspicious production traces.

At PlugFlow we ended up handling that with a simple Langfuse → Slack plus webhook pattern so n8n can branch into incident workflows, Linear tickets, or other follow-up automations. It is not the same as first-class native observability support, but it has been a useful stopgap while the deeper integration story is still evolving.

If n8n ships anything here, I would strongly vote for both sides:

  • native tracing / metadata support into the observability backends
  • first-class eventing or alert hooks so users can turn bad scores or important trace events into automations immediately
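
As a rough illustration of the second point, the alerting side could be as small as a function that decides whether a trace event deserves follow-up. This is a sketch under stated assumptions: the TraceEvent shape, the 0..1 score range, and the 0.7 default threshold are all hypothetical, and real payloads differ per backend.

```typescript
// Hypothetical event shape; real webhook payloads differ per observability backend.
interface TraceEvent {
  traceId: string;
  evalScore?: number; // assumed to be normalized to the range 0..1
  failedChecks: string[];
}

type AlertDecision =
  | { alert: false }
  | { alert: true; reasons: string[] };

// Decides whether an event should trigger a follow-up automation.
// minScore is an assumed, user-configurable threshold.
function decideAlert(event: TraceEvent, minScore = 0.7): AlertDecision {
  const reasons: string[] = [];
  if (event.evalScore !== undefined && event.evalScore < minScore) {
    reasons.push(`eval score ${event.evalScore} below ${minScore}`);
  }
  for (const check of event.failedChecks) {
    reasons.push(`failed check: ${check}`);
  }
  return reasons.length > 0 ? { alert: true, reasons } : { alert: false };
}
```

In an n8n workflow, the result of a check like this is what would feed the branch into a Slack message, a Linear ticket, or an incident workflow.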