Getting data in and out of legacy systems without an API

I’ve been working with a healthcare client who’s stuck on legacy software with no API. We spent weeks trying browser automation, and it was a nightmare. Has anyone else run into this? I’d love to hear how you got data in and out of a system like this without an API.

Hi @Gabe_Mamallo Welcome to the community!

If the legacy system has a database, you could put it online and work with it directly, although I’d guess that isn’t possible here. For now, the best approach is probably to export the legacy system’s data, for example as CSV, save it in Google Sheets, and use the Google Cloud console to access those sheets. That only works if the system allows exports; otherwise there isn’t much you can do.

I worked as a browser automation engineer for a few years. Sure, it’s not easy, but there’s always a way. It is a specialized skill, though.

Usually you’re better off reverse engineering the API requests the browser sends than trying to script each mouse and keyboard event.
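
To make that concrete, here is a minimal sketch of replaying a request captured from the browser’s Network tab. Everything specific in it is an assumption for illustration: the host `legacy.example.internal`, the `/records/search` path, the `SESSIONID` cookie name, and the JSON payload shape. Copy the real values from what your browser actually sends.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the host seen in the Network tab.
BASE_URL = "https://legacy.example.internal"


def build_search_request(session_cookie: str, patient_id: str) -> urllib.request.Request:
    """Recreate the POST that the web UI sends when a user runs a search."""
    # Hypothetical payload shape; mirror the captured request body exactly.
    payload = json.dumps({"patientId": patient_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/records/search",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Hypothetical cookie name; reuse whatever the login flow sets.
            "Cookie": f"SESSIONID={session_cookie}",
        },
    )


def fetch_records(session_cookie: str, patient_id: str):
    """Send the replayed request and decode the JSON response."""
    request = build_search_request(session_cookie, patient_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to compare what you build against the captured request before sending anything to the real system.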

If you’d consider outsourcing, message me or find me on [email protected]. I can send some example browser automations I’ve done before.

I definitely feel for you regarding the nightmare that is browser automation. It’s honestly the last thing anyone should rely on, unless it’s both vital and legal to do so. However, there’s a much deeper issue here. Relying on UI scraping for medical data doesn’t just create stability problems; it points to a fundamental misunderstanding of system architecture. It’s a massive legal and ethical liability that won’t pass a HIPAA audit. When people’s lives and privacy are at stake, you need a secure, system-level data pipeline, not a brittle workaround. It’s worth rethinking the entire strategy from the ground up before the risks become too high.

Great topic — legacy system integration is something I’ve dealt with a lot. Here are the practical approaches that actually work in n8n:

1. Direct Database Connection (most common for healthcare legacy systems)

Many legacy systems (even old ones) have a SQL database underneath. n8n has built-in nodes for MySQL, PostgreSQL, MSSQL, and SQLite. If you can get read/write access to the DB directly, you can bypass the UI entirely.

  • Use the Postgres/MySQL node with a read replica if you don’t want to touch production data directly
  • Combine with a Schedule Trigger to poll for new records
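
The polling idea can be sketched outside n8n as a "high-water mark" query: remember the largest ID you have processed and only ask for rows above it. This is a rough illustration, not the real schema. The `appointments` table and its columns are hypothetical, SQLite stands in for whatever database the legacy system uses, and it assumes the ID column only ever increases.

```python
import sqlite3


def fetch_new_rows(conn: sqlite3.Connection, last_seen_id: int):
    """Return rows added since the previous poll, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, patient_ref, created_at FROM appointments "
        "WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    # If nothing new arrived, keep the old mark so the next poll starts there.
    new_mark = rows[-1][0] if rows else last_seen_id
    return rows, new_mark


# Demo against an in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE appointments ("
    "id INTEGER PRIMARY KEY, patient_ref TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO appointments (patient_ref, created_at) VALUES (?, ?)",
    [("P-001", "2024-01-01"), ("P-002", "2024-01-02")],
)

first_batch, mark = fetch_new_rows(conn, 0)      # first poll sees both rows
conn.execute(
    "INSERT INTO appointments (patient_ref, created_at) VALUES (?, ?)",
    ("P-003", "2024-01-03"),
)
second_batch, mark = fetch_new_rows(conn, mark)  # second poll sees only the new row
```

In n8n the same pattern would be a Schedule Trigger followed by a database node running the `WHERE id > ...` query, with the mark stored between runs.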

2. File-Based Integration (CSV/flat file exchange)

If the system exports to CSV/XML/flat files:

  • Watch for file changes using n8n’s local file trigger or polling a shared folder
  • Use the Read/Write File node + Spreadsheet File node to parse and process
  • Write output files to a network share the legacy system reads from

For n8n + FTP/SFTP: use the FTP node if files are on a remote server.
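
If you end up parsing exports in a script instead of the Spreadsheet File node, the cleanup step might look something like this. It is only a sketch: the `record_id` and `status` column names are made up, and real exports from older systems often need extra handling for encodings and odd delimiters.

```python
import csv
import io


def parse_export(text: str) -> list:
    """Parse a CSV export into clean dicts, skipping rows without a record ID."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        # Legacy exports often pad headers and values with stray whitespace.
        # Keys of None hold overflow cells from rows with too many columns.
        cleaned = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # "record_id" is a hypothetical required column for this example.
        if not cleaned.get("record_id"):
            continue
        records.append(cleaned)
    return records


sample = "record_id, status\n 17 , open \n,closed\n"
parsed = parse_export(sample)
```

Dropping rows with no identifier early keeps blank trailer lines and totals rows from leaking into whatever consumes the data next.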

3. Email-based integration

Many older healthcare systems send reports/notifications via email. Trigger on IMAP Email in n8n, parse the structured content (often PDFs or CSV attachments), and process from there.
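
For the attachment case, pulling CSV files out of a raw message can be done with Python’s standard `email` package. The sketch below assumes the report arrives as a regular attachment with a `.csv` filename and UTF-8 content; the demo message is built locally purely for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def csv_attachments(raw_message: bytes) -> list:
    """Return (filename, text) pairs for every .csv attachment in a message."""
    message = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in message.iter_attachments():
        name = part.get_filename() or ""
        if name.lower().endswith(".csv"):
            # decode=True undoes the transfer encoding (for example base64).
            payload = part.get_payload(decode=True)
            # Assumes UTF-8; older systems may use a different encoding.
            found.append((name, payload.decode("utf-8")))
    return found


# Build a demo message that mimics a nightly report email.
demo = EmailMessage()
demo["Subject"] = "Nightly report"
demo.set_content("See attached.")
demo.add_attachment(
    b"id,total\n1,42\n",
    maintype="application",
    subtype="octet-stream",
    filename="report.csv",
)

attachments = csv_attachments(demo.as_bytes())
```

PDF attachments are a different problem and usually need a dedicated extraction step, so this only covers the structured case.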

4. Browser automation via n8n + external tool

If all else fails and it’s truly UI-only: connect n8n to Puppeteer/Playwright via an HTTP Request to a small local service, or use the community node n8n-nodes-browserless which wraps a Browserless.io container. This is the most brittle but sometimes the only option.

5. Middleware/adapter approach

If the system has any kind of data export scheduled (even print-to-file), build a lightweight Python/Node.js script that reads those files and exposes a simple REST endpoint — then call it from n8n as a webhook/HTTP Request.
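
A small adapter along those lines could look like the sketch below, using only the Python standard library. The `./exports` folder, the `/latest` path, and port 8080 are all assumptions for the example. It binds to localhost only, and anything carrying patient data would need proper authentication and transport security on top of this.

```python
import csv
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Hypothetical folder where the legacy system drops its scheduled exports.
EXPORT_DIR = Path("./exports")


def load_latest_export(export_dir: Path) -> list:
    """Read the most recently modified CSV in the folder as a list of dicts."""
    files = sorted(export_dir.glob("*.csv"), key=lambda path: path.stat().st_mtime)
    if not files:
        return []
    with files[-1].open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


class ExportHandler(BaseHTTPRequestHandler):
    """Serve the latest export as JSON on GET /latest (hypothetical path)."""

    def do_GET(self):
        if self.path != "/latest":
            self.send_error(404)
            return
        body = json.dumps(load_latest_export(EXPORT_DIR)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the adapter on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ExportHandler).serve_forever()
```

Calling `serve()` starts the adapter, and an n8n HTTP Request node pointed at `http://127.0.0.1:8080/latest` would then receive the rows as JSON.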

For healthcare specifically: always check if the system supports HL7 FHIR or even old HL7 v2 messaging — many legacy EHRs do, even if they don’t call it an “API”.
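
If the system does emit HL7 v2, the messages are pipe-delimited text and easy to inspect. Here is a deliberately naive sketch that splits a message into segments and fields. The sample message is invented, and the code ignores escape sequences, repetitions, and custom delimiters, so a dedicated HL7 library is the safer choice for anything beyond exploration.

```python
def parse_hl7_v2(message: str) -> dict:
    """Group the segments of an HL7 v2 message by segment name.

    Segments are separated by carriage returns and fields by "|".
    """
    segments = {}
    # Normalize line endings, since files on disk often use \n or \r\n.
    normalized = message.replace("\r\n", "\r").replace("\n", "\r")
    for line in normalized.split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


# Invented sample message for illustration only.
sample = (
    "MSH|^~\\&|LEGACY|CLINIC|N8N|HUB|202401011200||ADT^A01|MSG0001|P|2.3\r"
    "PID|1||12345^^^CLINIC||DOE^JANE\r"
)

parsed = parse_hl7_v2(sample)
pid = parsed["PID"][0]
patient_id = pid[3].split("^")[0]          # PID-3: patient identifier
family, given = pid[5].split("^")[:2]      # PID-5: patient name components
# In MSH the field separator itself counts as MSH-1, so MSH-9 is index 8.
message_type = parsed["MSH"][0][8]
```

Even this much is enough to check whether a "print-to-file" or interface log from the legacy system is actually HL7 under the hood.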

What specific legacy system are you dealing with? That would help narrow down the best approach.

Hi @Gabe_Mamallo, welcome to the n8n community :tada:
Before jumping into RPA, which honestly tends to become a maintenance nightmare, it would help to clarify a few things: Do you have direct database access? Does the system support structured exports like CSV, XML, or JSON? Is there any standard like HL7 or FHIR available? And is it on-premise or SaaS? If you can share a bit more about what kind of technical access you actually have, without getting into any sensitive data, I can probably point you toward something more specific.