Workspace offline (503) & Connection lost during workflow executions

Describe the problem/error/question

Hi everyone,
I am running multiple scraping and data-processing workflows in n8n Cloud and I keep hitting a recurring problem.
My workflows start normally, but after a few seconds or minutes I frequently get:

  • “Connection lost” in the editor
  • and after refreshing the page: Workspace offline (503)
    The workspace then seems to restart automatically, interrupting all running workflows.

Current setup

  • Running on n8n Cloud
  • Workflows process rows from Google Sheets
  • Loops with batch size = 1
  • Multiple HTTP requests per row
  • OpenAI API calls
  • Updates back into Google Sheets

Observations

  • Workflows run correctly in principle - they have processed thousands of rows so far
  • Problem appears under heavier load
  • Happens across different workflows
  • Executions often stop unexpectedly
  • Workspace becomes temporarily unavailable

Questions

  1. Is my workspace crashing due to resource limits?
  2. Are there execution or memory limits in n8n Cloud that cause automatic restarts?
  3. What is the recommended way to run long-running scraping workflows reliably?
  4. Is this related to concurrency or memory usage?

Goal

I want workflows to run reliably until completion without workspace restarts or interruptions.
Any guidance or best practices would be greatly appreciated.
Thanks!

What is the error message (if any)?

Please share your workflow

Hi

I can see you raised a support ticket for this. You are currently with the AI - reply to its message saying that you would like to speak to a human, and you will get through to one of our engineers.

I’d also recommend checking out this article:

Hey @Ayman,

503 errors during workflow execution usually point to resource constraints or connection issues. Here’s what I’d check:

If self-hosted:

  • Monitor your server’s RAM/CPU during workflow runs - n8n can be memory-hungry with large datasets
  • Check your reverse proxy timeout settings (nginx/traefik) - long-running workflows might hit the default timeout
  • Look at docker logs or your n8n logs for more specific errors
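On the reverse-proxy point: nginx's default proxy timeout is 60 seconds, which long-running executions can easily exceed. A minimal sketch of raising it (the port and timeout values are illustrative placeholders, not taken from your config):

```nginx
# Raise proxy timeouts so long-running executions are not cut off.
# 3600s (1h) is illustrative - pick a value that fits your workflows.
location / {
    proxy_pass http://localhost:5678;   # n8n's default port
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
}
```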

If n8n Cloud:

  • Large data processing can hit execution limits
  • Try breaking your workflow into smaller chunks using sub-workflows
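The chunking idea above can be sketched in plain JavaScript, e.g. inside a Code node before handing batches to a sub-workflow. This is a minimal sketch - the function name and chunk size are my own, not an n8n API:

```javascript
// Split an array of rows into fixed-size chunks so each
// sub-workflow execution only handles a small batch.
// `chunkSize` is an assumption - tune it to your memory limits.
function chunkRows(rows, chunkSize) {
  const chunks = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    chunks.push(rows.slice(i, i + chunkSize));
  }
  return chunks;
}

// Example: 5 rows in chunks of 2 -> [[r1, r2], [r3, r4], [r5]]
const batches = chunkRows(["r1", "r2", "r3", "r4", "r5"], 2);
```

Each chunk can then be passed to an Execute Workflow node, so memory used by one batch is released before the next starts.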

Quick fixes to try:

  1. Increase N8N_PAYLOAD_SIZE_MAX if processing large files
  2. Add error handling nodes to capture where exactly it fails
  3. Enable saveExecutionProgress in settings to see partial results
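On the error-handling point, one pattern is retrying flaky HTTP or OpenAI calls with exponential backoff instead of letting a single failure stop the run. A minimal sketch in plain JavaScript (the function name, retry count, and delays are my own assumptions, not an n8n API):

```javascript
// Retry an async operation with exponential backoff.
// `maxRetries` and `baseDelayMs` are illustrative defaults.
async function withRetry(fn, maxRetries = 3, baseDelayMs = 500) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Wait 500 ms, 1000 ms, 2000 ms, ... before the next try.
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

In n8n itself the HTTP Request node's built-in "Retry On Fail" option covers the simple cases; a sketch like this is only needed when you want custom backoff logic in a Code node.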

Can you share more about your setup (cloud/self-hosted) and what the workflow is processing when it fails?