New n8n Builders Berlin Session Recording: AI Evaluations 🎥

Evaluations in Agentic Workflows

For anyone who couldn’t make it to n8n Builders Berlin, we’ve got another session recording ready for you. This one is from the Advanced Track, where JP van Oosten (Engineering Manager of the AI team at n8n) walks through how he uses evaluations to make AI workflows more reliable.

He covers common issues like inconsistent LLM outputs, context drift, and edge cases, and also shows how to compare prompts and models in a structured way. There’s a live demo too, where JP sets up evaluations in n8n and explains how he interprets the results.

If you’re building AI workflows or agents and want a clearer way to test, monitor, and improve them, this is a great session to watch. :blush:


Hi, there's a new node that has been approved and will be published in the coming days: JaaSAI, an evaluation tool for AI agents that's very easy to use. Very interesting. jaas-ai.net

I tried out the new n8n evaluation feature and it's a game changer. It appears to be limited to a single workflow in the Community edition. Which plans support unlimited use? I couldn't see anything about it on the paid tiers.