We connected Claude to a self-hosted n8n instance via MCP and used it to co-build a 71-node production workflow. Here's the honest version

:waving_hand: Hey n8n Community!

A friend of mine was building an automation project for a multi-brand retail operation – taking messy supplier files and turning them into Oracle EBS-compatible records. 735 brands, 321,000+ barcodes, seven AI extraction pipelines. He’d connected Claude Opus to his self-hosted n8n instance through MCP and was building workflows by just talking to it. I wanted to see if this actually works, so we ran the remaining project phases together.

:hammer_and_wrench: What the setup looks like

Through MCP, Claude can search workflows, read every node config, and execute workflows on the live n8n instance. But n8n is just one connector – Claude also talks to Google Drive, Chrome, and Slack. So in one conversation, it can read your workflow, pull a reference file from Drive, verify a live Sheet through Chrome, and message your team on Slack.

Model choice matters. Opus consistently knows n8n-specific stuff – Code node vs Function node, Switch vs IF, expression syntax. Earlier models got the shape right but missed the details.

:building_construction: What we built

Over four phases: a 71-node production workflow with webhook trigger, file routing, PDF chunker, seven AI extraction pipelines (running through easybits), merge/validation layers, Google Sheets integration across 10+ tabs, error handling, and dashboard refresh.

The standout moments: Claude analyzed ~940,000 rows across two reference files and built a color-coded duplicate report with five matching categories. Later, it reverse-engineered how the data team manually converts 13-column supplier files into a 47-column EBS format – it identified the mapping logic, handled the fact that different brands use different column names (UPC vs INTERNATIONAL BARCODE vs EAN), and designed a four-step season resolution chain that became production logic. It also found three fields we’d missed.
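To make the column-name problem concrete, here is a minimal sketch in plain JavaScript of the kind that could sit in an n8n Code node. Only the three barcode header names come from the project; the canonical field name `barcode` and the helper names are hypothetical, and the real mapping covered far more columns than this.

```javascript
// Hypothetical alias table: only the three barcode headers are from the project.
const HEADER_ALIASES = {
  barcode: ["UPC", "INTERNATIONAL BARCODE", "EAN"],
};

// Normalize a header for comparison: trim, collapse whitespace, uppercase.
function normalizeHeader(header) {
  return String(header).trim().replace(/\s+/g, " ").toUpperCase();
}

// Return the canonical field for a supplier header, or null if it is unknown.
function canonicalField(header) {
  const wanted = normalizeHeader(header);
  for (const [field, aliases] of Object.entries(HEADER_ALIASES)) {
    if (aliases.includes(wanted)) return field;
  }
  return null;
}

// Rename known columns on one row; unknown columns keep their original name.
function remapRow(row) {
  const out = {};
  for (const [header, value] of Object.entries(row)) {
    out[canonicalField(header) ?? header] = value;
  }
  return out;
}
```

Keeping the aliases in one table means a new brand with yet another header name is a one-line change rather than a new branch in the workflow.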

:bug: Where it saved the most time: debugging

Stale schema. The workflow wrote empty rows but reported “success.” Claude read the node config through MCP and found the Sheets node had cached 31 columns with old names while the live sheet had 45 columns with new names. It mapped every discrepancy and gave a step-by-step fix. Would’ve taken hours manually.
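The comparison itself is simple once you have both column lists. A minimal sketch, assuming you have already pulled the cached column names from the node config and the header row from the live sheet (how you fetch them is outside this sketch, and the column names below are made up):

```javascript
// Compare the column names a node has cached against the sheet's live header row.
function diffColumns(cachedColumns, liveColumns) {
  const cached = new Set(cachedColumns);
  const live = new Set(liveColumns);
  return {
    // Cached names that no longer exist in the live sheet.
    stale: cachedColumns.filter((name) => !live.has(name)),
    // Live columns the node does not know about.
    missing: liveColumns.filter((name) => !cached.has(name)),
  };
}

const report = diffColumns(
  ["Barcode", "Brand", "Season"],               // hypothetical cached schema
  ["Barcode", "Brand Name", "Season", "Color"]  // hypothetical live header row
);
console.log(report);
```

Any non-empty `stale` list is a good first suspect when a write reports success but the rows come out empty.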

Phantom match. Claude’s own fuzzy matching used string.includes(""), which returns true for every string when the search term is empty. So every empty barcode matched against all 321k records. Clean code, invisible bug. Claude found it once we described the symptom.
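The behavior is easy to reproduce. This is a minimal sketch, not the project's actual matching code: `naiveMatch` and `guardedMatch` are hypothetical names, and the guard shown is just one way to close the hole.

```javascript
// Every string contains the empty string, so this is true for any candidate.
console.log("4006381333931".includes("")); // true

// Naive matcher, similar in spirit to the buggy version.
function naiveMatch(candidate, barcode) {
  return candidate.includes(barcode);
}

// Guarded matcher: an empty or whitespace-only barcode never matches.
function guardedMatch(candidate, barcode) {
  const needle = String(barcode ?? "").trim();
  if (needle === "") return false;
  return String(candidate).includes(needle);
}
```

With the guard in place, rows that have no barcode drop out of matching instead of matching everything.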

The one it couldn’t solve. An Apps Script onChange trigger created a race condition during sheet wipes. Claude kept suggesting vanilla JS that doesn’t apply to Apps Script. My friend solved it with a WIPE_IN_PROGRESS flag, taught Claude the pattern, and it applied it correctly going forward. Some domain stuff is still on you – but the learning compounds.
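For anyone curious what a flag like that can look like, here is a rough sketch of the general pattern. It is an assumption about the shape, not my friend's actual code. The functions take a `props` object with `getProperty`, `setProperty`, and `deleteProperty`, which is the interface of the store returned by `PropertiesService.getScriptProperties()` in Apps Script; the in-memory stand-in and the function names are hypothetical and only there so the sketch runs outside Apps Script.

```javascript
const FLAG = "WIPE_IN_PROGRESS";

// Run a wipe with the flag set, and always clear the flag afterwards.
function wipeWithFlag(props, wipeFn) {
  props.setProperty(FLAG, "true");
  try {
    wipeFn();
  } finally {
    props.deleteProperty(FLAG);
  }
}

// Change handler: skip the work while a wipe is running.
function handleChange(props, onRealChange) {
  if (props.getProperty(FLAG) === "true") return false;
  onRealChange();
  return true;
}

// In-memory stand-in for the Apps Script property store.
function makeFakeProps() {
  const data = new Map();
  return {
    getProperty: (key) => (data.has(key) ? data.get(key) : null),
    setProperty: (key, value) => { data.set(key, String(value)); },
    deleteProperty: (key) => { data.delete(key); },
  };
}

const props = makeFakeProps();
const calls = [];
wipeWithFlag(props, () => {
  // Simulates the change handler firing mid-wipe: it is skipped.
  handleChange(props, () => calls.push("during"));
});
handleChange(props, () => calls.push("after"));
console.log(calls); // [ 'after' ]
```

The `finally` block matters: if the wipe throws, the flag is still cleared, so later changes are not silently ignored.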

:bar_chart: The honest numbers

  • Simple workflows (webhook → transform → write): 40-50% work first try
  • Complex multi-step with conditional logic: 15-20% work first try
  • Time: 2-4 hours manually → 30-90 minutes with Claude

The first-try rate is misleading. Even when it doesn’t run immediately, the architecture is right. You’re debugging details, not building from scratch.

:speech_balloon: How prompting actually works

Nobody types “build me a pipeline.” Real prompt:

“Look at the upload workflow. I need a sub-workflow that takes raw supplier files with 13 columns and maps them into our 47-column BSMS_ITEM_UPLOAD format. The January file is the reference.”

The result was ~80% correct on the first pass. The remaining 20% was edge cases. Mental model: treat it like a senior dev on their first week. Technically strong, needs context about your specific setup.

:white_check_mark: Bottom line

This isn’t “AI builds n8n workflows.” It’s Claude operating across the full project – reading live configs, analyzing data, debugging multi-layer issues, writing stakeholder emails, building decks. All through MCP.

Best description: a senior engineer who can read your systems and debug your configs, but needs you as the architect who signs off before anything touches production.

Happy to answer questions about the setup or specific situations we ran into.

Best,
Felix

The MCP integration for co-building is interesting. From my own experiments, Claude is noticeably better at refactoring existing nodes than generating them from scratch. For new nodes it occasionally hallucinates outdated parameter names that don’t match the current n8n schema, so you still need a quick validation pass.

I have a question though: at 71 nodes, how are you handling error tracing? At that scale, pinpointing which node in a long execution log actually triggered a failure gets painful. Curious whether you built custom error bubbling or rely on n8n’s native error workflow node.

Hey @Vivek_Malakar, thanks for the feedback – I completely agree that Claude is much stronger at refactoring, debugging, and improving existing workflows than building them from scratch.

To your question: for workflows of this size, we rely heavily on error workflows using the error trigger node. It’s been really helpful for identifying failures throughout the execution.

Before n8n introduced that feature, we built our own error logging in Google Sheets. That worked reasonably well, but it was only about 95% reliable, so some errors still slipped through.

With the error workflows, it’s now a much more robust and reliable setup.
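As a rough illustration of what tracing looks like on the receiving end, here is a sketch of a formatter for a Code node placed after the Error Trigger. The field names (`execution.lastNodeExecuted`, `execution.error.message`, `workflow.name`) follow the Error Trigger payload as I understand it from the n8n docs, so check them against your version. The sample payload and the function name are made up.

```javascript
// Build a one-line summary from an Error Trigger item.
function summarizeError(item) {
  const execution = item.execution ?? {};
  const workflow = item.workflow ?? {};
  const node = execution.lastNodeExecuted ?? "unknown node";
  const message = execution.error?.message ?? "no error message";
  const name = workflow.name ?? "unknown workflow";
  return `[${name}] failed at "${node}": ${message} (execution ${execution.id ?? "n/a"})`;
}

// Hypothetical sample payload, for illustration only.
const sample = {
  execution: {
    id: "231",
    lastNodeExecuted: "Write to Sheets",
    error: { message: "Column not found" },
  },
  workflow: { id: "1", name: "Supplier Upload" },
};
console.log(summarizeError(sample));
```

The name of the last executed node is what saves you from scrolling through a long execution log to find the failure.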

Great, thanks for the clarification @easybits.

This is a great example of running a huge number of nodes in a single workflow while still handling errors efficiently.
