[Hiring] Long-term n8n + Postgres build for a 50-state medical consulting practice. Multi-tenant, self-hosted, runs for years

Hi all,

I’m a physician who owns a small medical consulting practice licensed across all 50 states. I’m hiring an n8n developer (or small team) to build the foundation of an agentic operations platform that I expect to run for years, not weeks. I had AI help me write the mockup just so things were explained clearly. I also had it create a longer PDF outlining all phases, but I cannot upload it at this time as I am a new user.

About the work, in plain terms

I currently run my business out of a fragmented stack: Notion trackers, Google Sheets, Zapier flows, a few one-off Node.js scripts. I want a single self-hosted system that:

  • Replaces Notion as the system of record (Postgres)

  • Catches state-by-state regulatory changes affecting NP/PA collaboration, IV hydration, med spa, GLP-1, and aesthetics

  • Finds new collaboration candidates from job boards and groups, with a human-in-the-loop approval step

  • Sends welcome packets and standing orders consistently when a new collaboration goes active, pulling live documents from Google Drive

  • Scales to a second physician without doubling my operational load

No PHI, no patient data. This is operations, compliance, and contract data only.

Phased build

  • Phase 0, Platform foundation: Self-hosted n8n on a Hetzner VPS, Postgres as system of record, multi-tenant from day one with row-level security, low-code dashboard (Appsmith or Tooljet, I’m open), Drive document index, OpenRouter for LLM routing, Browserbase for authenticated web sessions, pgvector for memory.

  • Phase 1, NP/PA lead-gen pipeline: Daily scrape of a few public sources, scoring agent, intake analyzer, button-click approvals (never auto-send).

  • Phase 2, Email-only digests: Weekly compliance digest, 503B pharmacy monitor, legislative monitor, FMV refresh, fee-schedule companion, portfolio digest.

  • Phase 3, Dashboard workflows: Intake analyzer, contract review, standing-order updater, onboarding-document generator v2, capacity tracker, transcript-to-candidate pipeline, welcome-packet dispatcher (Drive-canonical, button-click only).

  • Phase 4, Deeper integrations: Job-board monitor v2, ToDoist bridge, clause library, controlled-substance authority lookup.

I want to evaluate fit on Phase 0 and 1 first, then continue with whoever delivers cleanly.

Hard requirements

  • Vanilla n8n, self-hosted (not n8n Cloud, not OpenClaw)

  • Postgres as the system of record, with row-level security for multi-tenant isolation

  • Persistent memory and checkpointing. Agents must not degrade after weeks or months of running

  • Daily canary test that alerts me if quality drifts

  • Full error logging, retry logic, audit trail

  • Documentation good enough that another developer could take over

  • All credentials stored securely, all API keys owned and controlled by me

  • Each automation isolated. A failure or false-positive in one workflow does not contaminate the others

What I’d like in your reply

  1. Have you run an n8n + Postgres system unattended for 6+ months? If yes, point me to one (link, repo, video walkthrough, anything verifiable).

  2. How do you prevent context rot and agent drift in long-running agents?

  3. A rough quote and timeline for Phase 0 and Phase 1.

  4. For ongoing maintenance and future phase expansion, do you prefer a monthly retainer, per-engagement fees as new work comes up, or a hybrid? Open to hearing what’s worked for you on past long-term builds.

I have a full developer brief and an owner-level master plan ready to share privately with anyone who looks like a fit. Reply here or DM me; I’ll respond to everyone within a few days.

Thanks


Hi JEnterprises, welcome to the community :waving_hand:.

Very solid architecture direction especially the emphasis on workflow isolation, Postgres as the system of record, long-running reliability, and human-in-the-loop approvals instead of uncontrolled agents.

I mainly work on orchestration-heavy automation systems using n8n, APIs, scraping pipelines, AI tooling, and operational workflows, so Phase 0 and 1 are very aligned with the kind of systems I enjoy building.

A few things I’d be curious about:

• are you already decided on Appsmith vs Tooljet for the internal dashboard layer?

• for memory persistence, are you leaning more toward pgvector retrieval or structured checkpoint snapshots per workflow?

A few related builds:

https://www.upwork.com/freelancers/~0122761e4734295f4b?p=2038586338272239616

https://www.upwork.com/freelancers/~0122761e4734295f4b?p=2039118619839795200

Happy to discuss further and review the developer brief.

Email: folafoluwaolaneye@gmail.com

Call: Discovery Call | Folafoluwa Olaneye | Cal.com

Hi JEnterprises — I would treat this as a reliability/platform build first, not an “agent workflow” build.

For Phase 0, the spine I’d want before any lead-gen agent runs is: Postgres schema/RLS with tenant boundaries, workflow-run ledger, per-source audit tables, idempotency keys, checkpoint snapshots, dead-letter/manual-review queues, and a daily canary pack that tests source availability, Drive doc freshness, LLM output schema, pgvector retrieval and alert delivery.
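The run-ledger/idempotency piece of that spine is small but load-bearing. A minimal sketch, using sqlite3 as a stand-in for Postgres and with hypothetical table and workflow names:

```python
import sqlite3

# Sketch of a workflow-run ledger with idempotency keys; sqlite3 stands in
# for Postgres here, and all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE workflow_runs (
        idempotency_key TEXT PRIMARY KEY,  -- e.g. workflow_id + source record id
        workflow_id     TEXT NOT NULL,
        status          TEXT NOT NULL,     -- 'running' | 'succeeded' | 'dead_letter'
        started_at      TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def claim_run(workflow_id: str, source_record_id: str) -> bool:
    """Claim a unit of work; returns False if this exact run already happened."""
    key = f"{workflow_id}:{source_record_id}"
    try:
        conn.execute(
            "INSERT INTO workflow_runs (idempotency_key, workflow_id, status) "
            "VALUES (?, ?, 'running')",
            (key, workflow_id),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:  # key already present: retry or duplicate trigger
        return False

first = claim_run("np_pa_scraper", "candidate-123")
second = claim_run("np_pa_scraper", "candidate-123")  # duplicate trigger is a no-op
```

The primary-key constraint is what makes retries safe: a re-fired trigger hits the same key and is rejected instead of producing a second outbound action.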

Then Phase 1 should be one NP/PA candidate pipeline only: collect public-source candidates, score into structured rows, attach evidence, and require button approval before any outbound packet/digest. That proves the handoff model without letting one false-positive contaminate the platform.

On the 6+ month proof question, I would not pretend I can publicly link a private medical-ops deployment if it is not public. The safer evaluation is a paid Phase 0/1 slice with artifacts another developer can audit: schema, workflow contracts, tests, runbooks and failure/retry behavior.

Rough timeline from the post alone: Phase 0 + one Phase 1 vertical slice is usually a 2-4 week async build, depending on source count and dashboard depth. I would quote fixed-scope after seeing the brief/source list; no PHI, no medical/legal advice, no auto-send.

This is close enough to work I already know how to structure, and I’d rather start with a paid Phase 0/1 scope than give you a fragile “agent demo.” Reply here or use the email in my profile if you want the exact scope/timeline against your brief.


Hey @JEnterprises — this is the kind of project I actually like seeing here because you’re thinking about infrastructure longevity, not just flashy AI demos.

I’m an n8n Level 1 & 2 certified developer, and I’ve built self-hosted automation systems using n8n + Postgres + external agent frameworks for long-running operational workflows.

A few things I specifically align with in your requirements:

  • Postgres as source of truth (not Notion/Sheets)

  • isolated workflows with failure containment

  • human-in-the-loop approvals

  • execution persistence + retry logic

  • auditability and logging

  • modular agents instead of giant “god workflows.”

  • self-hosted architecture with ownership of credentials and infra

For long-running agent reliability, the biggest thing is preventing context pollution and uncontrolled memory growth. The approach I usually take is:

  • durable state in Postgres instead of agent-only memory

  • strict workflow boundaries

  • checkpoint-based execution

  • scoped memory windows

  • periodic summarization/pruning

  • canary workflows that validate expected outputs daily

  • explicit fallback logic when confidence thresholds fail
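The scoped-memory and pruning items above can be sketched in a few lines. This is an illustrative Python sketch, where `summarize()` is a placeholder for an LLM summarization call:

```python
# Keep only the last WINDOW turns verbatim and fold older turns into a
# summary, so an agent's context can never grow without bound.
WINDOW = 3

def summarize(turns):
    # Stand-in for an LLM summarization call.
    return f"[summary of {len(turns)} earlier turns]"

def prune(memory):
    """memory = {'summary': str | None, 'turns': list[str]}"""
    if len(memory["turns"]) <= WINDOW:
        return memory
    old, recent = memory["turns"][:-WINDOW], memory["turns"][-WINDOW:]
    return {"summary": summarize(old), "turns": recent}

mem = {"summary": None, "turns": [f"turn {i}" for i in range(10)]}
mem = prune(mem)
# mem now carries 3 verbatim turns plus a compact summary of the other 7
```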

For Phase 0 specifically, I’d likely structure this around:

  • self-hosted n8n on Hetzner

  • Postgres + pgvector

  • queue mode for execution stability

  • structured logging + error alerting

  • Appsmith or ToolJet admin layer

  • Google Drive indexing workflows

  • OpenRouter abstraction layer

  • Browserbase for authenticated automation

  • tenant-aware schema/RLS strategy from day one

I’d want to review your full brief before giving a precise quote, but based on the scope described:

Phase 0: ~2–3 weeks
Phase 1: additional ~2–4 weeks, depending on scraping complexity and scoring logic.

For ongoing work, I usually prefer a hybrid model:

  • fixed scope for major builds

  • lightweight monthly retainer for monitoring, maintenance, and incremental improvements

Portfolio/projects:
:globe_with_meridians: Portfolio
:brain: Project demos and walkthroughs
:briefcase: LinkedIn

Happy to review the full developer brief privately if you think the fit is there.

Hi,

Stack. Self-hosted n8n and Supabase installers, open-source on my GitHub. The Supabase installer includes a hardening script that lets n8n connect to Supabase Postgres via host.docker.internal (same server) or whitelisted external IP (different servers), through iptables DOCKER-USER rules. SSL auto-renewal with nginx reload hook, log rotation, Kong configured with 5-minute timeouts for long-running AI workflows. Edge Functions ship with auth + Postgres-based rate limiting.

Multi-tenant. Implemented as additional tables (tenants, tenant_members) in the same instance and same schema, with RLS policies joining through tenant_members. Single Supabase instance, application-level tenant isolation. I’ve built per-user RLS; this is the same pattern with an extra membership layer.
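A minimal sketch of that membership-layer RLS pattern, with illustrative table names and assuming the application passes identity through a per-session setting:

```sql
-- Illustrative tenant-isolation schema: domain tables carry tenant_id,
-- and policies join through tenant_members.
CREATE TABLE tenants        (id uuid PRIMARY KEY);
CREATE TABLE tenant_members (tenant_id uuid REFERENCES tenants(id),
                             user_id   uuid,
                             PRIMARY KEY (tenant_id, user_id));

CREATE TABLE candidates (
    id        uuid PRIMARY KEY,
    tenant_id uuid NOT NULL REFERENCES tenants(id),
    name      text
);

ALTER TABLE candidates ENABLE ROW LEVEL SECURITY;

-- Rows are visible only to members of the row's tenant.
-- current_setting('app.user_id') is a placeholder for however the app
-- passes identity, e.g. a session GUC set on each connection.
CREATE POLICY tenant_isolation ON candidates
    USING (EXISTS (
        SELECT 1 FROM tenant_members m
        WHERE m.tenant_id = candidates.tenant_id
          AND m.user_id   = current_setting('app.user_id')::uuid
    ));
```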

Lead-gen. I’m building a lead-gen pipeline scraping LinkedIn, GitHub, Skool, Fiverr and the n8n forum — funnels per source, contact coverage, geo, dashboard with run history. Triggered on demand. All backend logic is implemented in n8n. Screenshot below. Separately, I run a job-search pipeline using Apify that scrapes 20 sources daily — currently aimed at finding job postings for myself, but the same architecture (scrapers + LLM scoring + structured output) carries over to finding candidates instead. Currently Python — for your build I’d port the orchestration into n8n using Claude Code with the n8n MCP.

Headless / authenticated scraping. Worked with CRW self-hosted (drop-in Firecrawl-compatible REST, simple HTTP Request node from n8n). Open to Browserbase if you prefer managed; happy to discuss tradeoffs.

Dashboard. Prefer building it with Claude Code for full control. WeWeb is also in my regular stack if you want a low-code admin layer instead.

Documentation. Generated and maintained alongside the code with Claude Code — keeps it in sync rather than drifting.

Long-running reliability. For drift in long-running agents I’d lean on durable state in Postgres rather than agent memory, scoped context windows, structured logging of every run, and a daily canary workflow that runs a small fixed test set through scoring/intake and alerts on schema or score deltas.

Quote / timeline. Happy to take Phase 0 + a vertical slice of Phase 1 (one NP/PA candidate pipeline end-to-end with button approval) as a fixed-scope paid pilot — rough estimate 3 weeks once I see the brief and source list. Phases 1 full + 2 in a follow-up sprint.

Engagement model. Hybrid — fixed-scope for major builds, light monthly retainer between sprints for monitoring, source-format breakage fixes when sites change markup, and small additions.

Honest fit. Strong on: self-hosted infra, Supabase/Postgres + RLS, scraping pipelines, n8n workflows, dashboards, documentation. Less prior experience: formal eval/canary harness for LLM drift, multi-tenant at production scale. Both buildable; not pretending I’ve shipped them in this exact shape before.

Happy to review the full brief.

Igor

Hi JEnterprises,

This is exactly the kind of long-term system I want to be building. The brief is well-structured and the phased approach is the right call.

1. Six-month unattended n8n + Postgres system
Honest answer: my production deployments are under client NDA and I don’t have a public repo I can point you to. What I can offer is a technical walkthrough on a call of how I architect for long-running stability — error isolation per workflow, retry logic with exponential backoff, daily canary pings, and Postgres as the single source of truth. I’ve worked with Supabase in production — which is Postgres with row-level security natively — so the multi-tenant isolation requirement in Phase 0 is familiar ground, not something I’d be figuring out on your dime. MongoDB experience on top of that means I’m comfortable with schema design and persistent data layers more broadly.

2. Preventing context rot and agent drift
Agents never carry state internally. Everything gets written to Postgres after each run — last seen records, scoring history, decision logs. On the next run the agent reads from the DB, not from its own memory. Drift gets caught at the DB layer before you notice it in production. Daily canary test implemented as a synthetic record running through the full pipeline, output checked against expected schema, any deviation triggers an alert.

3. Recruiting relevance
Worth mentioning — I’ve already built a suite of 13 AI agents specifically for recruiting operations: lead sourcing, candidate scoring, intake analysis, outreach sequencing, and pipeline tracking. The NP/PA collaboration candidate pipeline you described in Phase 1 maps directly onto architecture I’ve already proven out. That’s not a build from scratch — it’s an adaptation.

4. Engagement model
Hybrid: fixed price per phase for new builds, small monthly retainer for monitoring, canary maintenance, and priority response when something breaks.

Happy to sign an NDA and review the full brief. DM whenever ready.

– Aryan

pmediaaryan@gmail.com

Hello @JEnterprises, welcome to the n8n community. I have experience with n8n and would love to collaborate with you on this. You can schedule a call Here, and you can check out my Upwork profile Here for my past work and certifications.

Hey :waving_hand:,

I’m Milan, and I have 8 years of experience in Business Automation and AI, including 2 years at Apify working on enterprise-level browser automation.

Currently specializing in n8n, but also proficient in Python & Javascript.

Find out more about my work here:

If you think I might be a match, please:

Book a call here with me

Or reach out at hello@smoothwork.ai

Looking forward to hearing from you!

This is the kind of system where the foundation matters more than the first few workflows. I would not treat Phase 0 as a quick n8n install. The real work is designing Postgres correctly, isolating workflows, building the audit trail, putting guardrails around the agents, and making sure the system can still be understood six months from now by someone who did not build it.

I’m a strong fit for this because my work is mostly in healthcare-adjacent operations platforms, AI workflow systems, compliance-heavy dashboards, and long-running SaaS infrastructure. I have built systems where role-based access, auditability, secure document handling, human review, and workflow reliability were not optional. Examples include healthcare client-management platforms, compliance tracking systems, AI document intelligence tools, and automated operations workflows tied to live business rules.

To answer your direct question honestly: I have built and maintained long-running Node/Postgres/AWS/AI automation systems and document/compliance platforms, but not that exact n8n + Postgres combination unattended for 6+ months. If that proof point is a hard filter, I want to be upfront about it.

That said, I do understand how this should be built.

For context rot and agent drift, I would not rely on long prompt chains or vague “memory.” I would structure it more like this:

  • Postgres remains the source of truth

  • pgvector stores retrieval memory, but not uncontrolled decision history

  • every agent run writes inputs, outputs, confidence, source links, and decision state

  • recurring canary tests check known expected behavior against real workflows

  • prompts are versioned, not edited invisibly

  • scoring logic is separated from sending or approval logic

  • every workflow has its own logs, retry handling, and failure boundary

  • human approval is required before anything external-facing goes out

  • alerts fire when scraping quality, scoring confidence, or digest output drifts
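The prompt-versioning bullet above can be made concrete with a small sketch (hypothetical names; the hash ties each run in the audit trail to the exact prompt text that was live):

```python
import hashlib

# Prompts are registered by version, and every run logs the hash of the
# prompt it used, so wording changes are visible in the audit trail.
PROMPTS = {}   # version -> text
RUN_LOG = []   # (run_id, prompt_version, prompt_hash)

def register_prompt(version: str, text: str) -> str:
    PROMPTS[version] = text
    return hashlib.sha256(text.encode()).hexdigest()[:12]

def log_run(run_id: int, version: str):
    digest = hashlib.sha256(PROMPTS[version].encode()).hexdigest()[:12]
    RUN_LOG.append((run_id, version, digest))

register_prompt("v1", "Score this NP/PA candidate from 0-100 ...")
log_run(1, "v1")
register_prompt("v2", "Score this NP/PA candidate from 0-100, penalize missing license ...")
log_run(2, "v2")

# A drift investigation can now group scores by prompt hash instead of
# guessing which wording was live at the time.
```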

For Phase 0, I would expect roughly 3 to 4 weeks depending on how much of the dashboard and Drive index needs to be production-ready versus foundational. Budget range: $8,500 to $14,000.

For Phase 1, I would expect another 3 to 5 weeks depending on the number of sources, scraping complexity, scoring rules, and approval workflow detail. Budget range: $9,500 to $16,000.

A realistic combined range for Phase 0 and Phase 1 would be $18,000 to $30,000. I would tighten that after reviewing the developer brief and confirming source count, dashboard expectations, and the exact approval flow.

For maintenance, I would recommend a hybrid model. Fixed-scope pricing for each build phase, then a monthly support retainer for monitoring, small fixes, workflow adjustments, prompt/version updates, and infrastructure upkeep. For a system like this, pure per-engagement support usually breaks down because small failures need to be handled before they become operational problems.

My suggested first step would be a paid technical discovery and architecture pass. I would review your brief, map the Phase 0 and Phase 1 workflows, identify the data model, define the failure boundaries, and give you a firm implementation quote. That way you are not funding vague automation work. You are funding a system that has a real operating model behind it.

Hello! This sounds like a project where structural integrity is just as important as the automation itself. Moving from a fragmented stack to a unified, self-hosted platform is exactly what’s needed to scale without your operational load doubling.

Below is a direct response to your requirements and the four specific questions you asked.

Direct Answers to Your Questions

1. Have you run an n8n + Postgres system unattended for 6+ months? Yes. I have operated self-hosted n8n on Linux VPS environments using Postgres as the system of record. My setups utilize systemd for process management and automated daily backups with point-in-time recovery. I am deeply familiar with the stability requirements of long-running agent systems and ensure that workflows are strictly isolated so a failure in one never contaminates others.

2. How do you prevent context rot and agent drift? I follow a “Postgres-first” architecture where no state lives only in RAM. My approach rests on four pillars:

  • Checkpointing: Every agent run writes inputs, outputs, and intermediate states to a structured audit table in Postgres before acting.

  • Structured Output Validation: Every LLM call must return a defined schema; if it fails, the workflow halts and logs immediately.

  • Daily Canary Workflows: An isolated workflow runs a known test case every morning and compares it to a baseline, alerting you if quality drifts.

  • Model Routing: Using OpenRouter allows us to swap underlying models without changing the workflow logic if a specific model begins to degrade.
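The validation and canary pillars above can be sketched together. This is an illustrative Python sketch with a made-up schema, baseline, and drift threshold:

```python
# A daily canary run sends a fixed test case through the pipeline, validates
# the (stubbed) LLM output against a required schema, and flags drift against
# a stored baseline score. All names and thresholds are illustrative.
REQUIRED_FIELDS = {"candidate_id": str, "score": float, "evidence": list}
BASELINE_SCORE = 82.0
DRIFT_TOLERANCE = 5.0

def validate(output: dict):
    """Halt (raise) immediately if the LLM output breaks the schema."""
    for field, typ in REQUIRED_FIELDS.items():
        if not isinstance(output.get(field), typ):
            raise ValueError(f"schema violation: {field!r} missing or wrong type")

def canary_check(output: dict) -> list:
    """Return a list of alert messages; empty means the canary passed."""
    alerts = []
    try:
        validate(output)
    except ValueError as e:
        return [str(e)]
    if abs(output["score"] - BASELINE_SCORE) > DRIFT_TOLERANCE:
        alerts.append(f"score drift: {output['score']} vs baseline {BASELINE_SCORE}")
    return alerts

ok    = canary_check({"candidate_id": "canary-1", "score": 80.0, "evidence": ["..."]})
drift = canary_check({"candidate_id": "canary-1", "score": 60.0, "evidence": ["..."]})
```

In production the alert list would feed whatever notification channel is wired up, rather than being returned to a caller.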

3. Rough Quote and Timeline for Phase 0 & Phase 1

  • Phase 0 (Platform Foundation): 3 to 4 weeks | $2,500 – $3,500.

  • Phase 1 (Lead-Gen Pipeline): 3 weeks | $1,500 – $2,500.

  • Total Project Scope: 6 to 7 weeks | $4,000 – $6,000.

  • Note: I offer a $300 - $500 paid discovery milestone where I set up the VPS and n8n/Postgres instance so you can evaluate my work before committing to the full scope.

4. Preferred Engagement Model I recommend a hybrid model: fixed-fee agreements for each phase (predictable costs) combined with a light monthly retainer of $300 to $500 for canary monitoring, minor fixes, and system health.


Meeting Your Hard Requirements

  • Multi-Tenancy: I will build the Postgres schema with Row-Level Security (RLS) from day one, ensuring data isolation for future physicians.

  • Documentation: My baseline standard is documentation clear enough for another developer to take over the system without a walkthrough.

  • Control: You will own every API key and credential; nothing is stored on my infrastructure.

I have attached a detailed document to this reply for you to view, which outlines the full technical architecture, phase-by-phase deliverables, and audit trail protocols.

I’d love to review your developer brief whenever you’re ready to share it.

Hamza
itsameerhamza203@gmail.com

AXONA - Agentic Operations Platform Project Proposal.pdf (111.9 KB)

Hi!

Your brief is refreshing. It’s rare to find a client who prioritizes a self-hosted Postgres-first architecture with Row-Level Security (RLS) and canary testing. As an automation architect specializing in n8n and self-hosted infrastructure (Docker/Ubuntu/Hetzner), I am a perfect fit for this long-term build.

To answer your specific questions:

  • Long-running systems: I have built and managed several n8n + Postgres systems that have been running unattended for over a year. One example is an automated debt-tracking and notification system for a fintech client. It handles daily logic, database updates, and multi-channel alerts (WhatsApp/Email) based on strict date triggers. While I cannot share private client data, I can provide a video walkthrough of the architecture and logic during a call.

  • Preventing Context Rot: I tackle this by using stateless execution paired with vector memory (pgvector). Instead of letting an agent “drift,” every run starts with a fresh retrieval of the current “source of truth” from Postgres/Drive. I also implement versioned prompts and LLM-as-a-judge nodes to validate the quality of outputs against your “canonical” documentation.

  • Phase 0 & 1 Estimate:

    • Phase 0 (Foundation): $1,500 – $2,000. (Setup of VPS, Dockerized n8n, Postgres RLS, Appsmith/Tooljet boilerplate). Timeline: 1 week.

    • Phase 1 (Lead-gen): $1,200 – $1,800. (Scraping logic with Browserbase, scoring agents, and human-in-the-loop dashboard). Timeline: 1-2 weeks.

  • Cooperation Model: For a system this critical, I recommend a Hybrid model. Fixed-price for the initial phases (0-4), followed by a monthly retainer for maintenance, monitoring of canary tests, and small adjustments.

Why work with me?

I don’t just “connect nodes.” I build systems. My background in SQL (PostgreSQL/Supabase), JavaScript, and Python allows me to write custom logic inside n8n that standard nodes can’t handle. I have extensive experience building “Content Factories” and lead-routing engines for regulated industries like Real Estate, where “human-in-the-loop” and data integrity are non-negotiable.

Tech Stack alignment:

  • Expertise in Docker/Hetzner deployments.

  • Proficient in Postgres RLS and pgvector.

  • Deep experience with OpenRouter and Browserbase.

I am ready to review your developer brief and master plan. I’m looking for a partnership where I can deliver a clean, documented, and bulletproof foundation for your practice.

Portfolio: https://mikedevai.netlify.app/

Telegram: @hely_chatbots
WA: +375293761570

Best regards,

Mikhail (Mike) Rogal

Hi JEnterprises

I get your concept: you want an onboarding platform that greets your new collaborators and gives you a state-by-state report, with an HR department inside the dashboard for finding leads. This is more than simple. We’ll just make a lead pipeline with a human in the loop, combine APIs, and send custom logic-based emails for new recruiters. Seems like an everyday task, man!

We can have you on an exploration call and can clear any doubts you have!

Best Regards

Hi there, I’ve sent you a DM. Your focus on workflow isolation, long-term reliability, memory architecture, and operational durability really resonated with us. We’ve been building similar long running n8n systems and would love to discuss this further.