[Hiring] Long-term n8n + Postgres build for a 50-state medical consulting practice. Multi-tenant, self-hosted, runs for years

syed_noor · May 16, 2026, 7:32pm

Hi @JEnterprises — this is the kind of n8n project that gets harder, not easier, the longer it runs. A workflow that works on day one and corrodes by month six is the failure mode worth designing against from Phase 0. A few patterns I’d lock in before any tenant onboards:

Multi-tenant isolation, Postgres RLS done right.

One tenants table with a UUID per tenant
Every domain table has tenant_id UUID NOT NULL with FK + index
CREATE POLICY tenant_isolation ON <table> FOR ALL USING (tenant_id = current_setting('app.current_tenant')::uuid)
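n8n connects with a role that has RLS enforced (no BYPASSRLS, and not the table owner unless the table uses FORCE ROW LEVEL SECURITY), and every workflow issues SET LOCAL app.current_tenant = '<uuid>' inside the same transaction as its queries, because SET LOCAL only lasts until that transaction ends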
Per-tenant credential namespacing in n8n (<tenant>_<service>) — no cross-tenant credential reuse, ever
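
A minimal sketch of the isolation setup listed above, written as a Python helper that only builds SQL text. The helper names (`rls_statements`, `tenant_scoped`) and the `%s` placeholder style are assumptions for illustration; the policy text and the `app.current_tenant` setting come from the list above, and `set_config(..., true)` is the parameterizable Postgres equivalent of `SET LOCAL`.

```python
import re
import uuid

SAFE_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def rls_statements(table: str) -> list[str]:
    """Build the DDL that turns on tenant isolation for one table."""
    if not SAFE_IDENT.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} FOR ALL "
        "USING (tenant_id = current_setting('app.current_tenant')::uuid);",
    ]


def tenant_scoped(tenant_id: str, query: str, params: tuple = ()) -> list[tuple[str, tuple]]:
    """Wrap one query in a transaction that sets the tenant first."""
    tenant = str(uuid.UUID(tenant_id))  # raises ValueError on a malformed id
    return [
        ("BEGIN", ()),
        # Third argument true = local to this transaction, like SET LOCAL.
        ("SELECT set_config('app.current_tenant', %s, true)", (tenant,)),
        (query, params),
        ("COMMIT", ()),
    ]
```

The point of returning the whole transaction as one unit is that the tenant setting and the query cannot end up on different connections or in different transactions.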

Checkpoint pattern. Agent state has to persist before the external call, not after. A workflow_checkpoints table with (run_id, workflow_id, tenant_id, step_n, state_json, created_at) gives you crash-resume. When n8n restarts or a node throws, the workflow re-enters at the last checkpoint with state intact. Critical for long-running agents — a Claude or OpenRouter timeout shouldn’t roll back 20 minutes of work.
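
The checkpoint pattern above can be sketched as follows. This is an illustration under stated assumptions: SQLite (in memory) stands in for Postgres so the example runs on its own, the column list is the one from the paragraph above, and `open_store`, `save_checkpoint`, `last_checkpoint`, and `run` are hypothetical names.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE workflow_checkpoints (
               run_id TEXT NOT NULL,
               workflow_id TEXT NOT NULL,
               tenant_id TEXT NOT NULL,
               step_n INTEGER NOT NULL,
               state_json TEXT NOT NULL,
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
               PRIMARY KEY (run_id, step_n)
           )"""
    )
    return conn


def save_checkpoint(conn, run_id, workflow_id, tenant_id, step_n, state):
    with conn:  # commits before the caller makes the external call
        conn.execute(
            "INSERT OR REPLACE INTO workflow_checkpoints "
            "(run_id, workflow_id, tenant_id, step_n, state_json) "
            "VALUES (?, ?, ?, ?, ?)",
            (run_id, workflow_id, tenant_id, step_n, json.dumps(state)),
        )


def last_checkpoint(conn, run_id):
    """Return (step to re-enter at, state going into that step)."""
    row = conn.execute(
        "SELECT step_n, state_json FROM workflow_checkpoints "
        "WHERE run_id = ? ORDER BY step_n DESC LIMIT 1",
        (run_id,),
    ).fetchone()
    return (row[0], json.loads(row[1])) if row else (1, {})


def run(conn, run_id, steps, workflow_id="wf", tenant_id="t1"):
    """Run steps in order, resuming at the last checkpointed step."""
    start, state = last_checkpoint(conn, run_id)
    for n in range(start, len(steps) + 1):
        # Persist the state first, then attempt the step.
        save_checkpoint(conn, run_id, workflow_id, tenant_id, n, state)
        state = steps[n - 1](state)
    return state
```

If a step raises, the row written just before it still holds the state from the earlier steps, so a second call to `run` with the same `run_id` re-enters at the failed step instead of starting over.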

Daily canary tests, per-workflow. A separate canary workflow per production workflow, cron-triggered, that fires known input → asserts known output → writes pass/fail to canary_results. Drift alerts fire when LLM outputs change shape (e.g., the model starts returning JSON in a different schema). Without this, you learn about silent degradation from a client complaint, not from the system.
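
One way to detect the "output changed shape" case described above is to compare the structure of the canary output with the structure of a known-good output, ignoring the values. A small sketch, where `shape_of` and `canary_result` are hypothetical names and the result dict is only one possible layout for a `canary_results` row:

```python
from typing import Any


def shape_of(value: Any) -> Any:
    """Reduce a JSON-like value to its structure: keys and type names."""
    if isinstance(value, dict):
        return {key: shape_of(item) for key, item in value.items()}
    if isinstance(value, list):
        # Only the first element is inspected; lists are assumed homogeneous.
        return [shape_of(value[0])] if value else []
    return type(value).__name__


def canary_result(workflow_id: str, known_good: Any, actual: Any) -> dict:
    """Build a pass/fail record for one canary run."""
    expected_shape = shape_of(known_good)
    actual_shape = shape_of(actual)
    return {
        "workflow_id": workflow_id,
        "passed": expected_shape == actual_shape,
        "expected_shape": expected_shape,
        "actual_shape": actual_shape,
    }
```

Comparing shapes rather than exact text keeps the canary stable when the model rephrases a value but still flags a renamed key or a field that changed type.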

Audit trail as append-only. Every external action (LLM call, API write, document touch) writes to an immutable events table with input_hash + output_hash. Makes the system auditable years later and gives you forensics when something goes wrong on tenant 23.
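
A sketch of the append-only events table with `input_hash` and `output_hash`. Assumptions: SQLite in memory stands in for Postgres, the triggers are one way to reject updates and deletes (in Postgres, revoking UPDATE and DELETE from the workflow role is another option), and `canonical_hash`, `open_events`, and `record_event` are hypothetical names.

```python
import hashlib
import json
import sqlite3


def canonical_hash(payload) -> str:
    """SHA-256 over a key-sorted JSON encoding, so key order does not matter."""
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def open_events() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (
            id INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL,
            action TEXT NOT NULL,
            input_hash TEXT NOT NULL,
            output_hash TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TRIGGER events_no_update BEFORE UPDATE ON events
        BEGIN SELECT RAISE(ABORT, 'events is append-only'); END;
        CREATE TRIGGER events_no_delete BEFORE DELETE ON events
        BEGIN SELECT RAISE(ABORT, 'events is append-only'); END;
        """
    )
    return conn


def record_event(conn, tenant_id, action, input_payload, output_payload) -> int:
    """Append one event row and return its id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO events (tenant_id, action, input_hash, output_hash) "
            "VALUES (?, ?, ?, ?)",
            (tenant_id, action, canonical_hash(input_payload),
             canonical_hash(output_payload)),
        )
    return cur.lastrowid
```

Storing hashes rather than raw payloads keeps the table small and lets a later audit confirm that an archived payload is the one that was actually sent or received.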

Human-in-the-loop, structurally enforced. Approval queues live as DB tables, not Slack messages. An approvals table with (id, tenant_id, action_type, payload_json, status, decided_by, decided_at). Appsmith or Tooljet renders the queue. Workflows that would send external comms write to the queue and stop — auto-send is structurally impossible, not just policy.
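
The approval gate above can be sketched as a small state machine over the `approvals` columns. This is an in-memory illustration, not the actual table; `decide` and `dispatch` are hypothetical names, and the `sent` status is an assumption added to show that a row cannot be dispatched twice.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class Approval:
    id: int
    tenant_id: str
    action_type: str
    payload_json: str
    status: str = "pending"
    decided_by: Optional[str] = None
    decided_at: Optional[datetime] = None


def decide(approval: Approval, reviewer: str, approve: bool) -> None:
    """Record a human decision; only pending rows can be decided."""
    if approval.status != "pending":
        raise ValueError(f"approval {approval.id} is already {approval.status}")
    approval.status = "approved" if approve else "rejected"
    approval.decided_by = reviewer
    approval.decided_at = datetime.now(timezone.utc)


def dispatch(approval: Approval, send: Callable[[str], None]) -> None:
    """Send the payload only if a named reviewer approved it."""
    if approval.status != "approved" or approval.decided_by is None:
        raise PermissionError(f"approval {approval.id} is not approved")
    send(approval.payload_json)
    approval.status = "sent"
```

The workflow that drafts the message only ever creates a pending row; the only code path that reaches `send` goes through `dispatch`, which refuses anything a reviewer has not approved.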

Routing layer for LLMs. Default to cheaper models, escalate to Claude Sonnet only on complexity heuristics, retry with backoff on rate-limit, log token cost per tenant per workflow. Otherwise the bill is the silent killer at scale.
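
A rough sketch of the routing layer described above. Everything concrete here is an assumption for illustration: the model labels are placeholders rather than real API model IDs, the complexity heuristic (prompt length plus a few keywords) is deliberately naive, and `RateLimited` stands in for whatever error the provider client raises on a rate limit.

```python
import time
from typing import Callable

CHEAP_MODEL = "cheap-model"      # placeholder label
STRONG_MODEL = "claude-sonnet"   # placeholder label, not an exact model id


class RateLimited(Exception):
    """Stand-in for a provider's rate-limit error."""


def pick_model(prompt: str, max_cheap_chars: int = 2000) -> str:
    """Escalate on long prompts or on keywords that suggest harder work."""
    markers = ("contract", "regulation", "compare")
    text = prompt.lower()
    if len(prompt) > max_cheap_chars or any(m in text for m in markers):
        return STRONG_MODEL
    return CHEAP_MODEL


def call_with_backoff(call: Callable[[], str], retries: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry on rate limits, doubling the wait each time."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("retries must be at least 1")


def log_cost(ledger: dict, tenant_id: str, workflow_id: str,
             tokens: int, usd_per_1k: float) -> None:
    """Accumulate spend per (tenant, workflow) pair."""
    key = (tenant_id, workflow_id)
    ledger[key] = ledger.get(key, 0.0) + tokens / 1000 * usd_per_1k
```

Passing `sleep` in as a parameter keeps the backoff testable without real waiting; in production the ledger would be a Postgres table rather than a dict.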

Engagement structure I’d propose:

Phase 0 — $2,800 fixed, 2-3 weeks. Self-hosted n8n + Postgres on Hetzner, RLS schema + migrations, credential namespacing, checkpoint + audit tables live, first canary workflow shipped, first tenant onboarded as test. Three milestones — pay per milestone, can cancel after any.
Phase 1 — $2,200 fixed, 2-3 weeks. Operational workflows on top of Phase 0 foundation. Browserbase + OpenRouter wired. Per-workflow canary + audit. Approval queues rendered in Appsmith.
Maintenance retainer — $400/month. Canary monitoring, drift remediation, model upgrades, schema migrations as you add tenants. Cancel any month.

One genuine differentiator. I build n8n workflows with Claude Code + the n8n-mcp server, which means I can demo construction of a multi-tenant workflow live on our discovery call — not a recording, real-time. None of the standard fluff matters as much as you seeing the build mechanic. Happy to set it up.

DM me if you want to scope further. Available within 24 hrs.

Syed Noor

maxim_makselyanov · May 16, 2026, 11:12pm

Hi JEnterprises. I can help with a bounded paid Phase 0/1 slice, but I want to be transparent: I would not claim a verifiable 6+ month unattended n8n + Postgres medical-ops system unless I can actually show you one. If that is a hard filter, no worries.

What I can offer is a reliability-first Phase 0 sprint: self-hosted n8n + Postgres/RLS schema, Drive document index, audit/error tables, daily canary workflow, one approval-gated lead-gen pipeline stub, and handoff docs. I would keep PHI/patient data out of scope and treat scraping/authenticated sessions as separate approvals.

For context rot / agent drift, I would avoid relying on long free-form agent memory: structured checkpoints in Postgres, deterministic workflow state, versioned prompts, eval/canary tests, per-workflow logs, retry isolation, and human approvals before any external sends.

Rough first milestone: 3-5 days for Phase 0 design + runnable skeleton; Phase 1 after that as a separate scoped build once sources and acceptance tests are clear.

One question: do you want candidates to quote the full Phase 0 + Phase 1 now, or would a smaller paid architecture/prototype milestone be acceptable first?

timai11 · May 18, 2026, 9:11pm

Hi — for a multi-tenant platform that should run for years, I’d avoid rushing straight into a huge build.

I’d start with a paid technical discovery: map tenants, roles, data boundaries, Postgres schema, core workflows, audit/error logs, and one production-grade vertical slice in n8n. That gives you a maintainable foundation before adding agents or more states.

I can help with the first architecture/workflow slice and clear handoff docs.

Contact: travisofwork@gmail.com

nxning_ai · May 19, 2026, 3:00am

Hi timai11!

Saw your post about [Hiring] Long-term n8n + Postgres build for a 50-state medical consulting practi.

I specialize in n8n workflow automation and have built end-to-end systems for:

CRM pipelines & lead generation automation
Multi-platform integrations (Google, Salesforce, HubSpot, custom APIs)
Custom AI agents with n8n (OpenAI, Anthropic)
Production error handling and monitoring

Happy to discuss your requirements. Feel free to DM or email!
BTW, I just released some free automation templates on Gumroad if useful: https://nxning.gumroad.com
Cheers!

Suhail_Narot · May 19, 2026, 5:15am

Hi JEnterprises,

This is exactly the kind of build I specialise in — and your phased plan is well thought out.

To answer your questions directly:

Have I run n8n + Postgres unattended for 6+ months?
Yes. I maintain a production Telegram-to-Zoho Invoice pipeline (n8n self-hosted, Postgres backend) that has been running without intervention since early 2025 for a field services business. It handles real invoicing data daily. I also run a Meta DM chatbot (n8n + Webhook + external API) that has been live for months. I can walk you through the architecture on a call.

How do I prevent context rot and agent drift?
Three layers: (1) Structured memory with explicit schema — every agent writes to a typed Postgres table, never free-form text blobs; (2) Daily canary workflows that replay known input→output pairs and alert on deviation; (3) LLM call logging with structured diffs — I compare today’s output against last week’s on the same input to catch prompt drift before it affects real data.

Phase 0 + Phase 1 rough quote and timeline:
Phase 0 (Hetzner VPS, self-hosted n8n, Postgres with RLS, Appsmith/Tooljet, pgvector, OpenRouter): 1.5–2 weeks, $800–1,200 fixed.
Phase 1 (lead-gen pipeline, scoring agent, human-in-loop approvals): 2–3 weeks, $1,000–1,500 fixed.
Total Phase 0+1: 4–5 weeks, $1,800–2,700. Happy to sharpen once I see your full brief.

Cooperation model:
Hybrid — fixed fee per phase, then monthly retainer for maintenance and future phases.

I’m based in South Africa (UTC+2), available full-time remotely, async by default. Happy to sign an NDA immediately and review your full developer brief.

Feel free to DM me here or reach me at narotsuhail@gmail.com.

— Suhail Narot | Fajr AI | fajrai.net

Suhail_Narot · May 19, 2026, 7:42pm

Hi, this is exactly the kind of build I enjoy working on — long-lived, self-hosted, multi-tenant, with real operational stakes. Let me answer your questions directly.

Have you run an n8n + Postgres system unattended for 6+ months?

Yes. I run production n8n deployments self-hosted on VPS infrastructure (Docker + Compose) with Postgres as the system of record for my own automation business. The systems handle daily scheduled workflows, webhook-triggered pipelines, and LLM-routing logic using OpenRouter. They run without babysitting. For client work, I have built n8n + Postgres pipelines for field-service invoicing that have been live and unattended for months. I do not have a public repo to link, but I can walk through architecture and show workflow exports on a call.

How do you prevent context rot and agent drift in long-running agents?

Three practices I treat as non-negotiable:

1. Canary workflows: A dedicated n8n workflow runs daily on a fixed, known-good input set and validates outputs against expected schema and content. If anything drifts, the canary fires before the drift touches real operations.

2. Structured outputs with schema validation: Every LLM call uses a strict JSON schema so the workflow fails fast on malformed responses. Postgres stores both raw LLM response and parsed output for audit.

3. Checkpoint tables: Each agent run writes its state to a Postgres checkpoints table (run_id, stage, payload, status, created_at). If a run fails mid-way I can inspect exactly where it broke, replay from that stage, and audit what the agent saw. This also covers your audit trail requirement.
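
The fail-fast validation in practice 2 above can be sketched with the standard library alone. The field names in `SCHEMA` are hypothetical, and a real build would more likely use a full JSON Schema validator; this only shows the control flow of rejecting a malformed response before it reaches the rest of the workflow.

```python
import json

# Hypothetical output contract for a scoring agent.
SCHEMA = {"candidate_id": str, "score": (int, float), "reasons": list}


class MalformedLLMOutput(ValueError):
    """Raised when a response does not match the expected contract."""


def parse_llm_output(raw: str, schema: dict = SCHEMA) -> dict:
    """Parse and validate one response; raise instead of passing bad data on."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MalformedLLMOutput(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise MalformedLLMOutput("top-level value must be an object")
    missing = [key for key in schema if key not in data]
    extra = [key for key in data if key not in schema]
    wrong = [key for key, kind in schema.items()
             if key in data and not isinstance(data[key], kind)]
    if missing or extra or wrong:
        raise MalformedLLMOutput(
            f"missing={missing} extra={extra} wrong_type={wrong}")
    return data
```

The caller would store `raw` alongside the parsed result for audit, and treat `MalformedLLMOutput` as a reason to retry or route to review rather than continue.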

Rough quote and timeline for Phase 0 and Phase 1:

Phase 0 (VPS, n8n self-hosted, Postgres with RLS multi-tenant schema, Appsmith/Tooljet dashboard skeleton, Drive document index, OpenRouter routing, pgvector): roughly 2 weeks, $1,200 to $1,800.

Phase 1 (NP/PA lead-gen pipeline — daily scrape of public sources, scoring agent, human-in-the-loop button-click approvals, never auto-send): roughly 2 to 3 additional weeks, $1,500 to $2,200.

These are estimates before seeing your full developer brief. Happy to sharpen them once I read it.

Cooperation model:

Hybrid. Fixed fee per phase for the build work, plus a monthly retainer for monitoring, maintenance, canary reviews and minor updates. This works well on long-term builds.

I am based in South Africa, available async across US time zones. Happy to receive your full brief and do a short call. You can DM me or reach me at narotsuhail@gmail.com.

timai11 · May 20, 2026, 6:03am

Hi — your requirements are unusually clear, especially the “no PHI, operations/compliance only” boundary and the insistence on Postgres, isolation, audit logs, canaries and human approval.

I’ll be transparent: I would not claim a public 6+ month n8n+Postgres medical ops system I can show. What I can offer is a smaller paid technical pre-check or Phase 0 slice where the deliverable is concrete and low-risk: schema/RLS outline, workflow isolation pattern, credential/logging model, canary test design, and one working n8n → Postgres → approval/dashboard path.

For context rot/agent drift, I’d avoid relying on chat history. I’d use structured checkpoints per workflow, source-linked retrieval, deterministic validation rules, canary records, versioned prompts, and an audit table that stores inputs/outputs/decision reasons.

For Phase 0+1, I’d first quote a discovery/architecture slice, then a fixed build once the brief is reviewed.

Contact: travisofwork@gmail.com

maxim_makselyanov · May 22, 2026, 5:19am

Hi JEnterprises. I would treat this as a reliability platform first, and an agent project second.

I cannot honestly point to a public 6+ month n8n + Postgres medical-ops deployment under my name. Most comparable work I have done is private backend/API, CRM/admin, workflow automation, and long-running operations tooling. So I would make fit testable with a paid first milestone instead of asking you to trust claims.

For context rot and agent drift, I would keep agents stateless where possible: Postgres is the source of truth, every run has a ledger row, prompts/models are versioned, source evidence is stored, outputs are schema-validated, low-confidence items go to manual review, and a daily canary checks source availability, output shape, and alert delivery. Nothing should auto-send or update standing-order material without an approval step.

My proposed first slice: one non-sensitive public-source monitor → Postgres schema with tenant/state boundary → n8n workflow with retry/error ledger → manual approval queue → canary result → short runbook. No PHI, no patient data, no legal/regulatory advice, and no production credentials in the public scoping stage.

Roughly, a serious Phase 0 plus one Phase 1 vertical slice sounds like a 3-5 week build depending on dashboard depth and source complexity. I would start with a fixed paid architecture/proof milestone, then quote the larger build after reviewing your developer brief. For maintenance, I prefer hybrid: fixed scopes for new modules plus a monthly care block for canary failures, source changes, and small fixes.

If that evaluation style fits, I can review the brief and propose the smallest paid milestone that proves the system will keep running.

Adliebe · May 22, 2026, 7:23am

Hi JEnterprises — if this is still open, I can take the first paid architecture pass and then the Phase 1 vertical slice.

I would put the reliability layer in before any candidate workflow runs: Postgres/RLS tenant model, workflow-run ledger, source snapshots, approval queue, prompt/model registry, idempotency keys, dead-letter queues, Drive document index, and daily canaries for source freshness, schema-valid output, retrieval quality, scoring drift, and alert delivery.

For context rot, I would not keep durable state inside the agent. Postgres owns state, pgvector is retrieval memory, every run has an input snapshot/output contract/review state, and external-facing actions stay behind explicit approval.

I will not invent a public 6-month medical-ops deployment link if the relevant work is private. The clean proof path is a paid audit-friendly Phase 0 pass with schema, workflow contracts, canary definition, runbook, and acceptance gates another developer can review.

From the public post alone, I would quote:

Phase 0 architecture pass: $2,500 fixed, 5-7 business days.
Phase 0 foundation build: usually $8,500-$14,000 depending on dashboard and Drive-index depth.
Phase 1 vertical slice: usually $9,500-$16,000 depending on source count and approval workflow detail.
Ongoing maintenance: hybrid fixed-scope phases plus monthly reliability retainer.

No PHI, no patient data, no medical/legal advice, no auto-send. If you share the developer brief and source list, I can turn this into a concrete Phase 0/1 scope with acceptance gates.

oimrqs_ops · May 22, 2026, 9:33am

For a long-term n8n + Postgres build, I’d define Phase 0 as the operating contract: event schema, retry rules, approval ownership, and the handoff point for non-technical staff. That keeps the platform from becoming a pile of one-off automations. If you already know the first workflow to ship, that one use case will tell you whether this is a platform build or a single pilot.

Adliebe · May 22, 2026, 5:34pm

Hi JEnterprises,

This is the right shape for a phased build. I would not start by trying to wire every agent at once. I would start with Phase 0 as an operations kernel:

Postgres schema + RLS boundaries for physician, state, document, workflow, approval, and audit entities
self-hosted n8n deployment wired to Postgres, with backup/restore proof
Drive document index and canonical document lookup
execution/audit tables for every workflow run, source, approval, and error
canary checks for data freshness and output drift
one Phase 1 lead-gen path with human approval and no auto-send behavior

I will not pretend I have a public six-month n8n/Postgres case study I can link. What I can offer is a paid Phase 0/1 slice that proves the architecture before you commit to the full platform: 5 business days, $1,500-$3,500 depending on access and how much of Phase 1 you want included. Deliverables would be schema, deployment notes, workflow skeletons, test data, failure/retry rules, and a handoff another developer could pick up.

For long-term maintenance I prefer a hybrid model: fixed milestones for new capability, plus a small monthly retainer for monitoring, backups, workflow drift checks, and emergency fixes.

If useful, DM me the developer brief and I will respond with a concrete Phase 0 acceptance checklist and estimate.

Ayman50 · May 23, 2026, 7:15pm

Hi JEnterprises,

I want to start differently from most replies here: I’m not just a developer who builds for healthcare. I work inside a hospital 3 days a week as a Healthcare Quality & Safety professional. I’ve lived the exact fragmentation you described: Notion trackers, scattered Google Sheets, one-off scripts, compliance data that never connects. I build the automation layer to fix it.

Have I run an n8n + Postgres system unattended?

Yes. I maintain automated compliance reporting workflows using structured Postgres-compatible data pipelines that run without manual intervention, generating department KPIs, flagging overdue compliance items, and producing executive summaries on schedule.

Portfolio reference:

For n8n: I use Postgres as the system of record with structured logging and status checkpoints at each node. If a workflow fails at step 4, I know exactly why and at what point, and the next run picks up cleanly.

How do I prevent context rot and agent drift?

Stateless execution + Postgres as single source of truth:

Agent state is never held in memory; every execution reads from and writes to Postgres
Each run creates an immutable audit row: timestamp, input, output, status
session_id keyed to the record being processed, so workflow restarts pick up exactly where they left off
LLM nodes: prompt + response checkpointed to Postgres immediately after execution, not at workflow end
Canary workflows on CRON: known input → expected output → Telegram/email alert on drift

Rough quote for Phase 0 + Phase 1

Phase 0 — Platform Foundation (3–4 weeks): $2,000–2,800

Hetzner VPS, self-hosted n8n + Postgres with RLS
Multi-tenant schema with state-level isolation
Credential vault + API key management
Audit trail, error logging, OpenRouter integration
pgvector memory layer, Appsmith/Tooljet dashboard

Phase 1 — Regulatory Watch + Lead Pipeline (2–3 weeks):
$1,200–1,800

Daily 50-state NP/PA rule change scraper with deduplication
Human-in-the-loop approval before any outbound action
Welcome packet + standing orders automation from Google Drive

Total Phase 0 + 1: $3,200–4,600

Maintenance billing preference

Hybrid model:

Monthly retainer for monitoring + canary oversight + minor updates ($300–500/month)
Per-engagement for new phases or major feature adds

What makes me different from other replies here:

Most developers will build exactly what you spec. I’ll also tell you when the spec needs to change, because I’ve seen how medical operations actually break down in the field, not just in architecture diagrams.

Portfolio: Ayman50 · GitHub
Contra: Ayman Ramadan's Work | Contra

Happy to review your PDF spec via DM and give a more precise quote before any commitment.

-- Ayman Ramadan
Healthcare Quality & Safety | n8n + AI Automation

ronny_workflows · May 23, 2026, 8:27pm

Hi, I would treat this as an operations/compliance platform rather than a normal n8n build.

Because you said no PHI/patient data, the first paid slice can stay safe and concrete:

Entity map: states, regulations, candidates, collaborations, documents, packets, standing orders, approvals.
Postgres as system of record, with n8n as orchestration rather than the data owner.
Human approval gates for regulatory changes and candidate outreach.
Google Drive document lookup/versioning rules.
Audit log for every automation decision and sent packet.

The risk I would solve first is schema + approval design. If that is wrong, every later workflow becomes expensive to change.

I can do a small paid diagnostic first from the owner brief and a redacted phase-0/phase-1 outline. I have a short no-PHI Postgres/n8n diagnostic map ready and can send it by DM if useful.

XiaoLuxtl · May 24, 2026, 12:47am

Hi — I’ll answer your four questions directly.

1. Live system

I run a self-hosted multi-tenant n8n + PostgreSQL system in production on Oracle Cloud (Docker, ARM64). It handles WhatsApp webhooks, RAG pipelines, and per-tenant row-level data isolation. It’s been running unattended for months.

Case study + live demo: Production-Grade Multi-Tenant RAG WhatsApp Bot on Oracle Free Tier

2. Context rot and agent drift

I address this with persistent Redis-backed memory (session keyed by entity ID), daily canary executions that compare output against a known-good baseline, and structured audit logs in PostgreSQL. Each agent workflow is isolated — a failure in one cannot propagate to others.

3. Phase 0 + Phase 1 quote

$800–$1,200 USD fixed. Timeline: 14–18 days from VPS access.

Includes: n8n on Hetzner, Postgres with row-level security, pgvector, OpenRouter integration, Drive document index, Appsmith/Tooljet dashboard scaffold, and the NP/PA lead-gen pipeline with human-in-the-loop approval.

4. Engagement model

Hybrid. Fixed fee per phase delivery + monthly retainer for maintenance, monitoring, and canary oversight. This aligns my incentives with system stability, not billable hours.

Happy to review your developer brief privately. DM me anytime.

Flowmatic_Works · May 24, 2026, 7:25am

Hi JEnterprises,

Your brief is exceptional — it’s clear this needs to run like infrastructure, not a hobby project.

Self-hosted n8n + Postgres running 6+ months unattended? Yes. I run Flowmatic’s own ops on a self-hosted n8n instance backed by Postgres with row-level security — email triage, lead scoring, and reporting agents running 8+ months without intervention. Happy to share a Loom walkthrough.

How I prevent context rot:

Stateless design: agents checkpoint structured state to Postgres, not in-memory context
Nightly canary workflows compare output against golden datasets, alerting via Telegram on drift
Strict JSON schemas enforced at every LLM boundary with retry + fallback

Rough quote for Phase 0 + 1:

Phase 0 (VPS, n8n, Postgres RLS, Appsmith, pgvector, OpenRouter, Drive index): €1,800–€2,400
Phase 1 (Lead-gen pipeline, scoring agent, human-in-the-loop approvals): €1,200–€1,800
Timeline: Phase 0 → 2 weeks, Phase 1 → 2 additional weeks

Preferred model: Hybrid, with a low monthly retainer (~€300–500/mo) for monitoring + canary tests and per-engagement fees as new phases open.

Happy to review your developer brief privately — just DM me.

Best,

Oktay Kilic — Flowmatic Works

syed_noor · May 24, 2026, 9:13am

Hi JEnterprises,
I run noorflows — productized n8n infrastructure consulting. Your spec is one of the most architecturally precise briefs I have seen on this board, so I will match that precision instead of sending a generic pitch.

Answering your four questions directly:

Have I run n8n + Postgres unattended for 6+ months?

I will be honest: I do not have a single public link to a 6+ month deployment I can point you to. What I can show you is the production-readiness framework I use on every engagement, which I published on this forum: The 6-dimension production-readiness checklist I've been using on every n8n workflow review. It covers the exact patterns your spec requires: idempotency, retry logic, dead-letter queues, audit trails, monitoring, and secrets management. I built this from the failure modes I have seen in real client environments.

I also maintain my own self-hosted n8n + Postgres infrastructure and have just this week taken on two community members’ production migrations (MySQL->Postgres stabilization and Hostinger->Docker queue mode), so I can provide those as references once delivered.

How do I prevent context rot and agent drift in long-running agents?

Three layers:

Stateless agents, stateful database. Agents do not hold memory between runs. Every fact they need is queried from Postgres at invocation time, and every output is written back with a timestamp. There is no in-memory state that can degrade — the agent is rebuilt from the database on every execution. pgvector embeddings get a created_at column and a staleness threshold; anything older than N days is re-embedded before use.
Daily canary test. A dedicated workflow runs a known-good input through each agent every 24 hours and compares the output against a baseline. Deviation beyond a threshold triggers an alert to you, not a silent failure. This is a separate workflow so a failure in production does not suppress the canary.
Output checks on critical paths. For compliance digests and contract review, the agent’s output includes a structured confidence field. If the LLM returns low confidence or a malformed structure, the workflow halts at the human-approval step instead of proceeding. This prevents drift from becoming action.
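
Two of the checks in the layers above are simple enough to sketch: the staleness threshold on embeddings and the confidence gate before action. The function names, the `(id, created_at)` row layout, and the 0.8 default are assumptions for illustration; the source only says "older than N days" and "low confidence".

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def stale_ids(rows: Iterable[tuple[str, datetime]], max_age_days: int,
              now: Optional[datetime] = None) -> list[str]:
    """Return ids of embeddings whose created_at is past the threshold."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [row_id for row_id, created_at in rows if created_at < cutoff]


def next_step(output: dict, min_confidence: float = 0.8) -> str:
    """Route to human review unless a numeric confidence clears the bar."""
    confidence = output.get("confidence")
    is_number = (isinstance(confidence, (int, float))
                 and not isinstance(confidence, bool))
    if not is_number or confidence < min_confidence:
        return "human_review"
    return "proceed"
```

A missing or non-numeric confidence is treated the same as a low one, so a malformed response cannot skip review by omitting the field.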

Rough quote and timeline for Phase 0 and Phase 1:

Phase 0 (Platform foundation): $3,500 USD, 2-3 weeks. Hetzner VPS provisioned, Docker Compose stack (n8n + Postgres + Redis queue mode), row-level security schema, Appsmith dashboard scaffold, Drive document index, OpenRouter integration, pgvector memory store, daily canary workflow, backup + monitoring baseline. Deliverable: a running platform you can SSH into, with documentation sufficient for developer handoff.
Phase 1 (NP/PA lead-gen pipeline): $2,500 USD, 2 weeks after Phase 0 validated. Source scrapers, scoring agent with configurable weights, intake analyzer, human-in-the-loop approval via Appsmith button (never auto-send), Postgres audit trail on every candidate state change.
Combined Phase 0+1: $5,500 USD, 4-5 weeks total.

These are fixed-price per phase with milestone-based payment — you pay Phase 1 only after Phase 0 is validated and running to your satisfaction. If Phase 0 does not meet your spec, you owe nothing further.

Engagement model for ongoing work:

Hybrid. Fixed-price per phase for builds (Phase 2, 3, 4 scoped individually), plus a monthly retainer ($500-$800/mo) for monitoring, canary test maintenance, drift remediation, and ad-hoc operational support. The retainer is optional — some clients prefer per-engagement only, and that works too. My recommendation for a system this critical is the retainer, because the value is in catching problems before they become incidents, not fixing them after.

What I would add to your spec that is not there yet:

One thing I noticed missing from your phase descriptions is ownership and handoff readiness — specifically: who owns each workflow after go-live, what constitutes a successful run beyond a green execution, which actions are dangerous on retry (e.g., duplicate welcome packets), and where the replay procedure lives. I would build this into Phase 0 as a lightweight ops runbook per workflow, so when you onboard a second physician the operational knowledge transfers with the system, not just the code.
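
For the "dangerous on retry" case above (for example, duplicate welcome packets), one common guard is an idempotency key. A minimal sketch, with hypothetical names (`idempotency_key`, `OnceOnly`) and an in-memory set standing in for a Postgres table with a unique constraint on the key:

```python
import hashlib
import json
from typing import Callable


def idempotency_key(tenant_id: str, action_type: str, payload: dict) -> str:
    """Derive a stable key from what the action is and who it is for."""
    blob = json.dumps([tenant_id, action_type, payload],
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class OnceOnly:
    """Run each keyed action at most once per instance."""

    def __init__(self) -> None:
        self._done: set[str] = set()

    def run(self, key: str, action: Callable[[], None]) -> bool:
        """Return True if the action ran, False if the key was seen before."""
        if key in self._done:
            return False
        action()
        # Recorded only after success, so a failed attempt can be retried.
        self._done.add(key)
        return True
```

In a real deployment the key would be inserted under a unique constraint before or together with the side effect, since an in-process set does not survive a restart.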

About me: I work async, communicate via email, and document everything I build. Based in Pakistan, available in US Eastern overlap hours. Profile and service catalog at https://noorflows.com.
Happy to review the full developer brief and master plan privately. DM me here or email: noor@noorflows.com.

Noor

Rillesnille1 · May 25, 2026, 4:40am

Hi,

Your post caught my eye because I’m running almost exactly this stack right now.

The short version: I have a self-hosted n8n + Postgres system on Hetzner that’s been running unattended since late 2024. It currently has 300+ active workflows, multi-tenant isolation with row-level security (163 RLS policies), Redis queue mode with 3 workers, pgvector for semantic memory, and daily watchdog canary tests that alert me via Telegram if anything drifts. It’s the backend for a SaaS platform I built for the restaurant industry.

On your specific questions:

Context rot and agent drift - this is something I’ve spent a lot of time on. The approach that’s worked for me: checkpoint agent state to Postgres at every decision point, run daily canary tests that replay a reference scenario and diff the output against a known-good baseline, and keep episodic memory (session context) separate from semantic memory (document retrieval via pgvector). I also pin model versions and log exact model + temperature per output, which matters a lot for anything compliance-sensitive. Your spec already anticipates most of this, which tells me you’ve thought it through seriously.

Phase 0 + 1 estimate: I’d say 2-3 weeks for Phase 0 (I already run this exact infrastructure, so it’s more about configuring your specific schema and dashboard than figuring out the architecture), and 2-3 weeks for Phase 1. I’ve built lead scraping + scoring + human-approval pipelines before - processed 2000+ businesses per municipality with AI classification and button-click approvals. So 4-6 weeks total for Phase 0+1, and I can start right away.

Engagement model: For a build like this, hybrid has worked best for me - fixed price for Phase 0+1 delivery so you can evaluate the work cleanly, then a monthly retainer for maintenance and future phases. Happy to discuss what makes sense for your situation.

A few things that might matter to you:

I set up RLS from day one on my own platform. Retrofitting it later was painful (learned that the hard way), so I’m glad you’re starting there.
Credentials go in an encrypted vault with approval workflow, not in environment variables or config files.
Every workflow is isolated - a failure in one never contaminates another. Full error logging, retry logic, and audit trail on everything.
I write documentation with the assumption that someone else needs to take over. It’s a habit from running a production system where I need things to survive without me.

I’d love to see your full developer brief. Feel free to DM me and I’ll share more details about my setup as well.

Flowghost_24 · May 25, 2026, 6:20pm

Hi JEnterprises, Manchit here. This fits my background well.

I’ve spent 17+ years around healthcare systems, SQL/ETL, API validation, UAT, production support, and workflow automation. Your note about no PHI, audit trails, row-level separation, canary checks, and long-running stability is exactly the right way to frame this.

I won’t claim a 6+ month n8n + Postgres reference I can publicly show. What I can bring is the production discipline you’re asking for: rollback plans, error logging, approval gates, source checks, and documentation another developer can inherit.

For Phase 0 and 1, I’d start with the database/workflow boundaries, failure logging, approval gates, and the first lead-gen pipeline slice. Happy to quote a paid first milestone above $500 once I understand the current stack and expected sources.

timai11 · May 28, 2026, 9:00am

Hi JEnterprises,

I like that you separated this from PHI/patient data and framed it as operations/compliance/contract workflow. I would not try to build the whole platform in one jump.

The safest first paid slice would be Phase 0 plus one tiny Phase 1 path:

self-hosted n8n + Postgres foundation with environment variables, backups and basic run logging
simple schema for states, candidates, sources, scores and approval status
one public-source lead intake workflow
one human approval step before anything is sent or promoted
clear test data, rollback notes, and a short operator handoff

For multi-tenant readiness I would keep tenant/entity IDs in the data model from day one, avoid mixing secrets with workflow logic, and make every automation leave an auditable trail. For the compliance side, I would keep AI as a drafting/scoring assistant with human review, not as the final decision maker.

I can help with a fixed-scope Phase 0 review/build checklist or a small working pilot before you commit to the larger roadmap.

Best,
Tim

sravan27 · May 29, 2026, 7:42pm

Hi @JEnterprises — this is the kind of build I actually want to be doing: a system of record meant to run for years, not a throwaway Zap. A few specifics so you can see I read the brief, not just “I do n8n”:

Postgres as the system of record (off Notion): model the core entities (collaborations, candidates, states, documents, standing orders) properly, with migrations and an audit trail — in a medical/compliance context every state change has to be traceable.
State-by-state regulatory monitoring (NP/PA collaboration, IV hydration, med spa, GLP-1, aesthetics): a scheduled pipeline that pulls each state’s source, diffs against the last known version, and surfaces only real changes for review (no alert fatigue), with the rule logic versioned so you can see why something fired.
Candidate sourcing + human-in-the-loop: ingest from boards/groups → dedupe → score → queue for your one-click approve/reject. Nothing auto-sends without your gate.
Welcome packets / standing orders: templated, consistent generation triggered when a collaboration goes active, with the artifacts stored and logged.
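
The "diff against the last known version, surface only real changes" step in the regulatory-monitoring item above can be sketched like this. The function names are hypothetical, and treating whitespace-only differences as noise is an assumption about what counts as a real change.

```python
import difflib
import hashlib
import re
from typing import Optional


def normalize(text: str) -> str:
    """Collapse whitespace so reflowed pages do not look like rule changes."""
    return re.sub(r"\s+", " ", text).strip()


def fingerprint(text: str) -> str:
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def detect_change(previous: str, current: str) -> Optional[str]:
    """Return a unified diff for review, or None when nothing real changed."""
    if fingerprint(previous) == fingerprint(current):
        return None
    diff = difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="last_known", tofile="fetched", lineterm="",
    )
    return "\n".join(diff)
```

Storing the fingerprint per state and source makes the daily check a cheap comparison, and the diff text is what would land in the review queue.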

On reliability (what matters for a years-long medical system): I obsess over correctness and tests. Recent proof — this month I shipped reduced-repro + tested fixes that were merged into production databases (Rocicorp’s Zero and InstantDB) plus an approved app PR to Omi. I build in a sandbox with synthetic data, hand off documented + version-controlled JSON, and you hold prod credentials.

Suggestion: start with Phase 1 = Postgres system-of-record + one regulatory-monitoring pipeline for a single state, fixed scope, so you can judge the foundation before committing to the multi-year build. Send the phases PDF when you can and I’ll come back with a concrete plan + timeline.

— Sravan (sravan272001@gmail.com)
