Capturing n8n flows with Observability & Evaluations platforms like Langfuse or Opik

Hello Friends:

It’s been a minute and nice to be back. :smiling_face: I have a question for the community.

I use the Langfuse and Comet Opik Observability & Evaluations platforms, which require instrumenting your code with their Python SDK decorators, API wrapper classes, and callback facilities.

This pattern is easy to integrate into, say, LlamaIndex or LangChain, but those code-heavy frameworks are often unnatural, even suboptimal, for what is essentially a visual UI build effort (as with n8n), especially in multi-agent cases.

So, how can these platforms be integrated into n8n for observability and evaluations, covering both specific workflow subsets and entire workflows end-to-end?

Thank you! :smiling_face:


It looks like your topic is missing some important information. Could you provide the following, if applicable?

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

This is something I would also be really interested in! Being able to build out our use cases quickly is a great bonus of n8n, but I've been unable to find a way to log traces the way we could if we were working in Python, which is quite an important element for us.


I haven’t tried this myself, but have you experimented with using code nodes, webhooks, or other possible integrations that n8n has? It might get messy, but I’m just curious if you’ve given it a shot.
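One way to try the code-node route: build the platform's ingestion payload inside the workflow and POST it yourself. A sketch of what that could look like for Langfuse, whose public ingestion API accepts a batch of trace/span events with Basic auth (the exact endpoint path and event shapes here are assumptions based on Langfuse's public API docs and should be verified against the current version):

```python
import base64
import json
import uuid
from datetime import datetime, timezone

def build_langfuse_batch(workflow_name, node_outputs):
    """Build a Langfuse-style ingestion payload from n8n node outputs.
    Event types and body fields are assumptions; check the current docs."""
    trace_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()
    events = [{
        "id": str(uuid.uuid4()),
        "type": "trace-create",
        "timestamp": now,
        "body": {"id": trace_id, "name": workflow_name},
    }]
    for node_name, output in node_outputs.items():
        events.append({
            "id": str(uuid.uuid4()),
            "type": "span-create",
            "timestamp": now,
            "body": {"traceId": trace_id, "name": node_name, "output": output},
        })
    return {"batch": events}

def auth_header(public_key, secret_key):
    # Langfuse uses HTTP Basic auth: public key as user, secret key as password.
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return {"Authorization": f"Basic {token}", "Content-Type": "application/json"}

payload = build_langfuse_batch("rag-workflow", {"Retrieve": ["doc1"], "LLM": "answer"})
# In n8n you would POST this payload with an HTTP Request node (or fetch() in a
# Code node) to your Langfuse host's /api/public/ingestion endpoint.
print(json.dumps(payload)[:60])
```

It does get messy, as you say: you have to thread a trace ID through the workflow yourself and add an HTTP call per step you want traced.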

In my opinion, a game-changing feature for any of these UI-based builders would be the ability to fully export the underlying generated code. This would allow the workflows to be run and integrated into other contexts. I think this is a huge missing piece in the ecosystem right now.

I've found that you can extract execution data using the n8n API node, which feels like a short-term fix.
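For anyone wanting to script that extraction outside the workflow: n8n's public REST API exposes executions (authenticated with the `X-N8N-API-KEY` header), and the per-node `runData` can be flattened into span-like records for an observability platform. A rough sketch, assuming a self-hosted instance at `localhost:5678` (URL, key, and the exact `runData` field names should be checked against your n8n version):

```python
import json
import urllib.request

N8N_URL = "http://localhost:5678"   # assumption: self-hosted instance
API_KEY = "..."                     # your n8n public API key

def fetch_executions(limit=10):
    """GET recent executions from n8n's public REST API."""
    req = urllib.request.Request(
        f"{N8N_URL}/api/v1/executions?limit={limit}&includeData=true",
        headers={"X-N8N-API-KEY": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

def execution_to_spans(execution):
    """Flatten one execution's per-node run data into span-like dicts
    that a platform like Langfuse or Opik could ingest."""
    run_data = execution.get("data", {}).get("resultData", {}).get("runData", {})
    spans = []
    for node_name, runs in run_data.items():
        for run in runs:
            spans.append({
                "trace_id": execution["id"],
                "name": node_name,
                "start": run.get("startTime"),
                "duration_ms": run.get("executionTime"),
            })
    return spans

# Example with a stubbed execution (shape follows n8n's runData structure):
sample = {"id": "123", "data": {"resultData": {"runData": {
    "HTTP Request": [{"startTime": 1700000000000, "executionTime": 42}]}}}}
print(execution_to_spans(sample))
```

The obvious downside is that this is after-the-fact polling rather than live tracing, which is why native support would be so much better.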

But I would really like to be able to use something like LangSmith or Arize Phoenix because they can trace the full pipeline and make it easier for us to work with others in identifying where the responses aren’t quite what we would expect.


Thank you. I hope the @n8n people are reading, understanding, and taking this discussion seriously, because implementing a tie-in to the platforms we've mentioned would be compelling for many would-be customers.

I just posted a feature request which you may want to comment on. :smiling_face:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.