PSA: v2.7.x HAS BUGS ON GCP - DO NOT UPGRADE

I just tried upgrading from v2.6.4 to 2.7.4 and got bit by this issue:

Highly recommend NOT UPGRADING until this is resolved. FYI.


hello @Darien_Kindlund

I’ve updated to 2.7.4 and have no issues.

Check that you have the ENV QUEUE_HEALTH_CHECK_ACTIVE=true as the /healthz endpoint is disabled by default on self-hosted instances
Monitoring | n8n Docs

we have 2.7.4 with QUEUE_HEALTH_CHECK_ACTIVE=true and are getting 404 on healthz.

setup via gcloud

@barn4k , so this issue is on a self-hosted single instance of v2.7.4 not running in queue mode. (I run multiple different n8n deployments in GCP as Google Cloud Run services.) I’m really confused as to why QUEUE_HEALTH_CHECK_ACTIVE is even needed if the n8n instance is not in queue mode.

Regardless, I’ll check if this resolved the issue and report back.

@barn4k

Google Cloud Run (and several other Google Cloud services like App Engine) reserves the /healthz path for internal infrastructure use.
If you attempt to use /healthz as an endpoint in your application, requests to that path will often be intercepted by the Google Front End (GFE) or the Cloud Run control plane, resulting in a 404 Not Found error before the request ever reaches your container.
Why does this happen?
Google Cloud follows a pattern common in the Kubernetes ecosystem where paths ending in z (e.g., /healthz, /livez, /readyz) are reserved for the system’s own health probes.
Even though you can define these paths in your code, Cloud Run’s routing layer assumes they belong to the underlying infrastructure. Consequently:

  • Requests are “short-circuited”: The infrastructure tries to handle the request itself.
  • Container logs are empty: You won’t see these requests in your application logs because they never hit your server.

Reserved Paths to Avoid
To ensure your traffic reaches your container, you should avoid the following reserved patterns in Cloud Run:

| Path Pattern | Example | Reason |
|---|---|---|
| Ends in z | /healthz, /readyz, /livez | Reserved for internal system health probes. |
| Starts with /_ah/ | /_ah/health, /_ah/start | Reserved for App Engine/Serverless internal hooks. |
| Specific paths | /eventlog | Reserved for internal logging/telemetry. |

The Fix: How to Implement Health Checks
If you need to implement health checks for your Cloud Run service (using Startup or Liveness probes), follow these steps:

  • Rename your endpoint: Change your application code to listen on a non-reserved path, such as /health, /status, or /ready.
  • Update Cloud Run Probes: Configure your service to use this new path for its health checks.
    • Startup Probe: Used to determine when the container is ready to receive traffic.
    • Liveness Probe: Used to determine if the container should be restarted.
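
As a rough illustration of the rename step, here is a minimal Node.js sketch that answers health probes on `/health` instead of `/healthz`. The function name `createHealthHandler`, the default path, and the JSON body are assumptions made for illustration; none of this is n8n or Cloud Run code.

```javascript
const http = require('node:http');

// Hypothetical helper: builds a request handler that answers health probes
// on a configurable, non-reserved path (default "/health").
function createHealthHandler(healthPath = '/health') {
    return function handler(req, res) {
        // Compare only the path, ignoring any query string
        const { pathname } = new URL(req.url, 'http://localhost');
        if (req.method === 'GET' && pathname === healthPath) {
            res.writeHead(200, { 'Content-Type': 'application/json' });
            res.end(JSON.stringify({ status: 'ok' }));
            return;
        }
        res.writeHead(404, { 'Content-Type': 'text/plain' });
        res.end('Not Found');
    };
}

// Usage sketch: Cloud Run passes the listening port in the PORT env var.
// http.createServer(createHealthHandler()).listen(process.env.PORT || 8080);
```

The Cloud Run startup or liveness probe would then point at the same path that the handler serves.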

[!TIP]
Probing Tip: If you are using a framework like Streamlit, which defaults to /healthz, you may need to explicitly change its configuration or use a version (like Streamlit 1.14.1+) that has addressed this GCP-specific conflict.

Even if I re-route /health → /healthz via some sort of custom YAML setting in Google Cloud Run, I’d still have to monkey patch the front end code to use /health instead. It would be VERY USEFUL to have an ENV variable that lets me rename the /healthz endpoint in n8n.

@barn4k , I suspect the reason you have no issues is because you did not deploy self hosted n8n inside a Google Cloud Run service, as this appears to be a GCR specific limitation.

v2.7.4 makes it impossible to run self hosted n8n on GCR

It seems to be an issue with the GCR configuration, rather than n8n.

Have you tried configuring the health check for /healthz endpoint of the container?
Configure container health checks for services | Cloud Run | Google Cloud Documentation
as it’s the main endpoint for n8n’s monitoring

@barn4k , I have specified health checks for the container in Google Cloud Run (see below). The health checks from inside the GCR service are working perfectly.

The problem is that after upgrading to v2.7.4, the browser is now performing external health checks to /healthz which are now getting blocked by Google Cloud Run and preventing users from saving any workflows.

@barn4k , when I look at the network activity in my browser, I’m seeing this:

And I’m assuming that because these external /healthz checks fail, that causes this button to go *offline*:

This is the culprit line:

Which was added in this PR:

In short, this feature was never properly tested on Google Cloud Run deployments of self-hosted n8n.

So to sum up: in v2.7.0, a new “pause autosave” feature was added, which broke n8n self-hosted deployments running on Google Cloud Run.
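
To make the failure mode concrete, here is a hypothetical sketch of a client-side connectivity check that polls `/healthz`. It is not n8n's actual source; the function name `isBackendReachable` and the injected `fetchImpl` parameter are invented for illustration. The point is that, to a check like this, a 404 generated by the platform looks the same as the backend being down.

```javascript
// Hypothetical connectivity check; NOT n8n's implementation.
// fetchImpl is injected so the sketch can also run outside a browser.
async function isBackendReachable(fetchImpl, healthUrl = '/healthz') {
    try {
        const response = await fetchImpl(healthUrl);
        // Any non-2xx status (including a platform-generated 404) counts as "offline"
        return response.ok;
    } catch (err) {
        // Network errors also count as "offline"
        return false;
    }
}
```

In a browser this would be called as `isBackendReachable(window.fetch.bind(window))`.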

As a temporary workaround, in case you are forced to upgrade to v2.7.x or higher before the underlying issue is fixed, you can deploy the following Tampermonkey script to effectively “fake” the /healthz responses back to the browser.

Note: You’ll need to change the @include lines to match the domain names of your websites. This example version matches all domains that have n8n in the subdomain field.

// ==UserScript==
// @name         n8n Health Check Bypass for Cloud Run
// @namespace    http://tampermonkey.net/
// @version      1.0
// @description  Bypass n8n /healthz polling to prevent false "connection lost" errors on Google Cloud Run
// @author       Darien Kindlund
// @include      /^https?:\/\/n8n\..+\..+\/.*/
// @include      /^https?:\/\/.+\.n8n\..+\/.*/
// @grant        none
// @run-at       document-start
// ==/UserScript==

/**
 * This script intercepts browser fetch() calls to /healthz and returns a fake
 * successful response to prevent n8n v2.7.4+ from showing "connection lost"
 * errors when deployed on Google Cloud Run (which blocks public /healthz access).
 *
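 * Matches domains like:
 * - n8n.example.com
 * - n8n.company.io
 * - subdomain.n8n.example.com
 *
 * Note: hyphenated hosts such as n8n-prod.example.org are NOT matched by the
 * @include patterns above; add a pattern for them if needed.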
 */

(function() {
    'use strict';

    console.log('[n8n Health Check Bypass] Script loaded');

    // Store original fetch function
    const originalFetch = window.fetch;

    // Override window.fetch
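    window.fetch = function(...args) {
        // fetch() accepts a string, a URL object, or a Request; normalize to a string
        const input = args[0];
        const url = input instanceof Request ? input.url : String(input);

        // Check if this is a /healthz request
        if (url.includes('/healthz')) {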
            console.log('[n8n Health Check Bypass] Intercepted /healthz request - returning fake success');

            // Return a fake successful response without making the actual network request
            return Promise.resolve(new Response(
                JSON.stringify({ status: 'ok' }),
                {
                    status: 200,
                    statusText: 'OK',
                    headers: new Headers({
                        'Content-Type': 'application/json'
                    })
                }
            ));
        }

        // Pass through all other requests to the original fetch
        return originalFetch.apply(this, args);
    };

    console.log('[n8n Health Check Bypass] fetch() interceptor installed successfully');
})();
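
The substring check in the script above also matches unrelated URLs that merely contain `/healthz`, for example in a query string. A stricter variant, sketched below, parses the URL and compares only the path. The helper name `isHealthzRequest` and the `base` parameter are assumptions; inside the userscript, `base` would be `window.location.href`.

```javascript
// Hypothetical helper: returns true only when the request path ends in "/healthz".
// Accepts the same first-argument shapes as fetch(): string, URL, or Request-like object.
function isHealthzRequest(input, base = 'http://localhost/') {
    let raw;
    if (typeof input === 'string') {
        raw = input;
    } else if (input instanceof URL) {
        raw = input.href;
    } else if (input && typeof input.url === 'string') {
        raw = input.url; // Request objects expose their target as .url
    } else {
        return false;
    }
    try {
        // Resolve relative URLs against base, then look at the path only
        const { pathname } = new URL(raw, base);
        return pathname.endsWith('/healthz');
    } catch (err) {
        return false; // unparseable URL: leave it to the real fetch
    }
}
```

Using `endsWith` rather than strict equality keeps the check working when n8n is served under a path prefix.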

Where do you install this workaround in Cloud Run?

@Chris_Bradley , the workaround is client side, not server side. You install it in your browser.

Yes, I see the same behavior once I’ve blocked the access to the healthz endpoint.


Apparently other platforms also mangle /healthz:

1. Azure Container Apps (ACA) & Azure Functions

Azure doesn’t “hard-block” /healthz at a global firewall level like Google does, but it has a “reserved by convention” behavior:

  • The Conflict: If you enable Container Health Probes in ACA, Azure’s infrastructure (specifically the Envoy-based ingress) may hijack the path if you haven’t explicitly mapped it to your container.

  • Result: You might see a 503 Service Unavailable or a 404 generated by the Azure Load Balancer rather than your code.

2. Managed Kubernetes (EKS, AKS, GKE, DigitalOcean)

If you are running a containerized app on a managed Kubernetes service, the Ingress Controller (like NGINX, Traefik, or Istio) often has default settings that “steal” specific paths:

  • Paths: /healthz, /livez, /readyz.

  • The “Similar Fashion” Block: Many Ingress controllers are configured to respond to these paths directly to prove the load balancer is working, without ever forwarding the request to your application.

  • The Symptom: Your app logs show zero traffic for these paths, but your monitoring tool shows they are returning 200 OK (because the load balancer is answering for you).

3. Fly.io (The “Internal-Only” Block)

Fly.io doesn’t block /healthz for external users, but it treats it uniquely for internal networking:

  • Fly-Proxy: Their proxy can be configured to use /healthz for its own health checks. If you misconfigure the fly.toml to use a path that your app doesn’t actually serve, the Fly Proxy will kill the connection before it reaches your machine.

  • Difference: This is a “fail-closed” block rather than a “system-reserved” block.
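
One way to check whether a platform is answering `/healthz` on your behalf is to have the application stamp its own responses with a distinctive header and then look for that header from the outside. The sketch below assumes a hypothetical header name, `x-served-by-app`, which your application would have to set itself; `classifyHealthResponse` and its return labels are likewise invented for illustration.

```javascript
// Hypothetical diagnostic: classify who answered a health request.
// Assumes the application adds the (invented) header "x-served-by-app" to its own responses.
function classifyHealthResponse(status, headers, marker = 'x-served-by-app') {
    // HTTP header names are case-insensitive, so compare in lower case
    const names = Object.keys(headers).map((name) => name.toLowerCase());
    if (names.includes(marker.toLowerCase())) {
        return 'application';
    }
    // No marker: something in front of the app probably produced the response
    return status >= 200 && status < 300 ? 'intermediary-ok' : 'intermediary-error';
}
```

With a `fetch()` response this could be called as `classifyHealthResponse(response.status, Object.fromEntries(response.headers))`.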