Looking for help with n8n SQLite timeouts / stability (happy to jump on a paid call)

Hey everyone :waving_hand:

I’m running a self-hosted n8n setup (Docker + Traefik) on SQLite and hitting recurring database timeouts / instability, especially when multiple workflows execute around the same time.

The problem

  • n8n works fine most of the time

  • but under load (parallel executions), we regularly see:

    • Database connection timed out

    • 503 – Database is not ready

  • CPU/RAM are not maxed out, so this doesn’t seem to be a pure resource issue

What we tried

  • Stayed on SQLite (no Postgres yet)

  • Tried to improve stability by adding SQLite-related env vars:

    • execution data pruning

    • reducing stored execution data

  • After applying env changes + docker compose down / up:

    • n8n became completely inaccessible

    • only returned {"code":503,"message":"Database is not ready!"}

    • logs showed repeated DB connection timeouts

  • Even reverting env vars didn’t help at first
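For context, the pruning-related settings in our compose file were along these lines (a sketch with illustrative values, not the exact config we ran; double-check variable names and defaults against the n8n environment variable docs for your version):

```yaml
# docker-compose.yml (excerpt) - n8n service environment; values are examples
services:
  n8n:
    image: n8nio/n8n
    environment:
      - EXECUTIONS_DATA_PRUNE=true            # enable automatic pruning of old executions
      - EXECUTIONS_DATA_MAX_AGE=168           # keep execution data for 7 days (hours)
      - EXECUTIONS_DATA_SAVE_ON_SUCCESS=none  # don't store data for successful runs
      - EXECUTIONS_DATA_SAVE_ON_ERROR=all     # keep failed runs for debugging
      - DB_SQLITE_VACUUM_ON_STARTUP=true      # compact the DB file on boot (can slow startup)
```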

What I’m looking for

  • Someone experienced with n8n internals / SQLite behavior under concurrency

  • Help understanding:

    • why SQLite gets into this broken state so easily

    • whether there’s a safe way to tune SQLite for moderate concurrency

    • or at what point Postgres becomes unavoidable

  • I’m very happy to jump on a call (and pay for your time) to properly fix this and set things up cleanly

If you’ve dealt with this before or know the right approach, I’d really appreciate your help :folded_hands:
Feel free to reply here or DM me.

Hi @Luca2

To be honest, I stay away from SQLite in n8n environments.
Migrating to Postgres later, once you actually need it, is too much of a hassle, and it is easy to just start with Postgres from day one. Docker makes that simple; some tuning may be needed eventually, but a default Postgres install is fine to get going.

That said, you can probably stabilize SQLite for your instance. But at some point you will need Postgres anyway, for example when you start using n8n's queue mode to scale, so in my opinion there is no real point in patching SQLite.
My best advice would be to migrate over to Postgres.
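A default install really is enough to start. Roughly like this (a sketch; the service names and credentials are placeholders, and you would still need to export/import your workflows and credentials to move existing data over):

```yaml
# docker-compose.yml (excerpt) - n8n backed by Postgres; credentials are placeholders
services:
  postgres:
    image: postgres:16
    environment:
      - POSTGRES_USER=n8n
      - POSTGRES_PASSWORD=change-me
      - POSTGRES_DB=n8n
    volumes:
      - pg_data:/var/lib/postgresql/data
  n8n:
    image: n8nio/n8n
    depends_on:
      - postgres
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres   # service name above
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=change-me
volumes:
  pg_data:
```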

Hey Luca,

SQLite is fast but it’s just a file — one write at a time. When you run parallel workflows, they queue up waiting for the lock. Hit the timeout limit and you get exactly what you’re seeing.

You can tune it (WAL mode, busy_timeout settings) to handle moderate load better, but there’s a ceiling. If you’re regularly running concurrent workflows, Postgres is the real fix — it’s designed for this.

Happy to jump on a call and either tune what you have or help migrate. DM me.

— Ivan