FlowMind — AI Delivery Intelligence Airtable

Hi n8n community,

I wanted to share a project I’ve been building with n8n around project delivery, sprint risks and decision automation.

The project is called FlowMind Command.

The idea came from a real problem I’ve seen in SaaS and internal projects: teams already have the data, but they still lose too much time understanding what is actually blocking the sprint.

The data is usually somewhere in Airtable, Slack, GitHub, PostgreSQL or another tool. But when a sprint starts slipping, the team often needs more than another dashboard.

They need a clear operational decision.

FlowMind Command is my attempt to move from “metrics” to “action”.

The workflow currently connects Airtable to n8n, normalizes sprint tasks, calculates delivery risks, detects developer overload, stores risk data in PostgreSQL, and sends Slack alerts when action is needed.

I also added a multi-agent AI layer with different roles:

one agent looks at deadlines
one agent looks at review and validation friction
one agent looks at knowledge gaps, dependencies and missing senior support
then a final synthesizer turns everything into a clear recommendation

The goal is not only to say:

“The sprint is at risk.”

But to produce something more useful, like:

“Contact this person today, activate a backup reviewer, reduce the scope of this task, and escalate this blocker before the end of the day.”

I’m also working on a few features to make it more useful for real teams:

inter-sprint memory
sprint health trend
Airtable webhook instead of polling
weekly executive report from PostgreSQL
fallback logic when the AI provider is unavailable

It’s still a work in progress, and I’m improving the workflow step by step. But I think there is something interesting here for small teams, agencies, consultants and PME/SMB teams that don’t need another expensive delivery dashboard, but need a practical decision assistant connected to the tools they already use.

I’d be happy to get feedback from the n8n community, especially around:

how to simplify the workflow for a public template
how to structure the Postgres tables properly
how to make the AI fallback more robust
how to make this easier to install for non-technical users

GitHub: Tchatchoua14
LinkedIn: www.linkedin.com/in/thomas-viny-tchatchoua-b378b7119

Thanks to the n8n community. I’m learning a lot by building this.

The separation of agents by concern — deadlines, review friction, knowledge gaps — rather than one generic “risk analyzer” is a smart design. Each agent can be prompted with domain-specific context, which tends to produce more precise outputs than asking a single agent to reason across all dimensions at once.

The synthesizer pattern on top is also the right call; it avoids each sub-agent needing to know about the others and keeps the routing logic clean.

Curious: are the sub-agents running in parallel (using n8n’s parallel branches) or sequentially? And how are you passing their outputs to the synthesizer — direct JSON merge or something more structured?

1 Like

Thanks for taking the time to read it carefully — and for picking up on exactly the part I was most unsure about when I designed it.

To answer your question directly: yes, the three sub-agents run in parallel using n8n’s branching. The context object is built once upstream (sprint health score, top risks, developer load, remaining sprint days) and each agent receives the same enriched payload. They don’t communicate with each other — each one just reasons from its own angle.

For the merge: I use n8n’s native Merge node (3 inputs) which waits for all branches to complete before passing the combined output to the synthesizer. Each agent returns a structured JSON with a consistent shape — agent name, outputs, confidence_score, critical_decision flag — so the synthesizer knows what it’s working with even if one agent failed and returned a fallback payload.

The reason I went with parallel rather than sequential was partly latency (three Gemini calls in sequence would be slow) but also independence — I didn’t want the Guardian of Deadlines output to bias the Friction Killer prompt before it had reasoned on its own. The synthesizer is the only node that sees everything.

One thing I’m still not fully happy with: when one agent fails and sends a fallback JSON, the synthesizer can technically still run, but the quality of the final decision degrades silently. I log it in ai_data_quality but I haven’t found a clean way to surface that degradation clearly to the PM without being noisy.

If you’ve seen patterns for handling partial multi-agent failures gracefully, I’d genuinely be interested — it’s one of the open problems I haven’t solved well yet.

For partial failures the pattern I’ve found cleanest is to add a data_quality_score to each agent’s output before the Merge - something like 0 for fallback, 0.5 for partial, 1 for clean. Then in the synthesizer prompt, pass that score explicitly so the LLM can weight each agent’s contribution accordingly rather than treating all inputs equally.

For surfacing it to the PM without noise: instead of alerting on every degraded run, I’d set a threshold - only flag when the combined quality score drops below a certain level (e.g. 2 out of 3 agents returned fallback). That way it stays quiet on minor issues but escalates when the decision is genuinely unreliable.

1 Like

That’s exactly the kind of feedback I was hoping for — and honestly, the data_quality_score idea is something I should have thought of earlier.

Right now I’m passing a binary flag (agent_failed: true/false) and the synthesizer has no way to weight partial outputs against clean ones. It just sees three inputs and treats them equally, which is the core of the problem.

The 0 / 0.5 / 1 scale makes much more sense. It gives the LLM actual signal to reason with instead of forcing it to guess whether a fallback payload is worth anything. I’m going to rework the fallback Set nodes to compute that score at the agent level before the Merge — it’s a small change but it changes what the synthesizer can do with the data.

The threshold approach for PM notification is also the right call. The way I had it, any degraded run could theoretically bubble up, which would kill trust in the alerts fast. Capping it at “2 out of 3 agents returned fallback” before escalating is a much cleaner contract between the system and the person reading Slack.

One follow-up question if you don’t mind: when you pass the quality scores to the synthesizer prompt, do you inject them inline in the JSON payload or do you add a separate instruction block at the top of the prompt explaining how to interpret them? I’m wondering whether the LLM actually picks up on a numeric field in a large JSON or whether it needs an explicit framing to use it correctly.

Explicit framing works better in practice. Injecting the score inline in the JSON payload without any context means the LLM has to infer what the number means - and in a large payload it often just treats it as metadata and ignores it. What works reliably is adding a short instruction block at the top of the system prompt like: “Each agent result includes a quality_score (0 = failed, 0.5 = partial, 1 = clean). Weight your synthesis accordingly and flag any result below 0.5 as potentially unreliable.” The framing tells the LLM what to do with the number before it sees the data, which makes a big difference in how consistently it uses it.