Redis error: read only replica

Hey guys,

I’m facing a problem with my Redis: from time to time it restarts, and then the n8n executions fail with this read only replica error. What can I do to stop this from happening in production?

Do I need to change something in my redis stack? Or is there some n8n variable that I need to add?

I’m using Ubuntu 20.04 and Docker Swarm.

Stack Redis:

version: "3.7"

services:
  redis:
    image: redis:latest
    networks:
      - MyNetWork
    ports:
      - 6379:6379
    volumes:
      - redis_data:/data
    deploy:
      placement:
        constraints: [node.role == manager]
    command: ["redis-server", "--appendonly", "yes"]

volumes:
  redis_data:
    external: true
    name: redis_data

networks:
  MyNetWork:
    external: true
    name: MyNetWork

Stack N8N:

version: "3.7"

services:
  n8n_editor:
    image: n8nio/n8n:1.91.0
    command: start
    networks:
      - MyNetWork
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_DATABASE=MyDB
      - DB_POSTGRESDB_HOST=MyHostDB
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_USER=MyUserDB
      - DB_POSTGRESDB_PASSWORD=MyPasswordDB
      - N8N_ENCRYPTION_KEY=MyN8NENCRYPTIONKEY
      - N8N_HOST=myexample.com
      - N8N_EDITOR_BASE_URL=https://myexample.com/
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://webhook.myexample.com/
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=MyHostRedis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=2
      - NODE_FUNCTION_ALLOW_EXTERNAL=moment,lodash,moment-with-locales,node-fetch
      - EXECUTIONS_DATA_PRUNE=true
      - EXECUTIONS_DATA_MAX_AGE=720
      - EXECUTIONS_DATA_PRUNE_MAX_COUNT=0
      - GENERIC_TIMEZONE=America/Sao_Paulo
      - TZ=America/Sao_Paulo
      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true
    volumes:
      - /root/n8n-data/backup:/data/backup
      - /usr/share/fonts/truetype:/usr/share/fonts/truetype/host/
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      labels:
        - traefik.enable=true
        - traefik.http.routers.n8n_editor.rule=Host(`myexample.com`)
        - traefik.http.routers.n8n_editor.entrypoints=websecure
        - traefik.http.routers.n8n_editor.priority=1
        - traefik.http.routers.n8n_editor.tls.certresolver=letsencryptresolver
        - traefik.http.routers.n8n_editor.service=n8n_editor
        - traefik.http.services.n8n_editor.loadbalancer.server.port=5678
        - traefik.http.services.n8n_editor.loadbalancer.passHostHeader=1

  n8n_webhook:
    image: n8nio/n8n:1.91.0
    command: webhook
    networks:
      - MyNetWork
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_DATABASE=MyDB
      - DB_POSTGRESDB_HOST=MyHostDB
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_USER=MyUserDB
      - DB_POSTGRESDB_PASSWORD=MyPasswordDB
      - N8N_ENCRYPTION_KEY=MyN8NENCRYPTIONKEY
      - N8N_HOST=myexample.com
      - N8N_EDITOR_BASE_URL=https://myexample.com/
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://webhook.myexample.com/
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=MyHostRedis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=2
      - NODE_FUNCTION_ALLOW_EXTERNAL=moment,lodash,moment-with-locales,node-fetch
      - EXECUTIONS_DATA_PRUNE=true
      - EXECUTIONS_DATA_MAX_AGE=720
      - EXECUTIONS_DATA_PRUNE_MAX_COUNT=0
      - GENERIC_TIMEZONE=America/Sao_Paulo
      - TZ=America/Sao_Paulo
      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true
    volumes:
      - /root/n8n-data/backup:/data/backup
      - /usr/share/fonts/truetype:/usr/share/fonts/truetype/host/
    deploy:
      mode: replicated
      replicas: 5
      placement:
        constraints:
          - node.role == manager
      labels:
        - traefik.enable=true
        - traefik.http.routers.n8n_webhook.rule=(Host(`whk.myexample.com`))
        - traefik.http.routers.n8n_webhook.entrypoints=websecure
        - traefik.http.routers.n8n_webhook.priority=1
        - traefik.http.routers.n8n_webhook.tls.certresolver=letsencryptresolver
        - traefik.http.routers.n8n_webhook.service=n8n_webhook
        - traefik.http.services.n8n_webhook.loadbalancer.server.port=5678
        - traefik.http.services.n8n_webhook.loadbalancer.passHostHeader=1

  n8n_worker:
    image: n8nio/n8n:1.91.0
    command: worker --concurrency=20
    networks:
      - MyNetWork
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_DATABASE=MyDB
      - DB_POSTGRESDB_HOST=MyHostDB
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_USER=MyUserDB
      - DB_POSTGRESDB_PASSWORD=MyPasswordDB
      - N8N_ENCRYPTION_KEY=MyN8NENCRYPTIONKEY
      - N8N_HOST=myexample.com
      - N8N_EDITOR_BASE_URL=https://myexample.com/
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://webhook.myexample.com/
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=MyHostRedis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=2
      - NODE_FUNCTION_ALLOW_EXTERNAL=moment,lodash,moment-with-locales,node-fetch
      - EXECUTIONS_DATA_PRUNE=true
      - EXECUTIONS_DATA_MAX_AGE=720
      - EXECUTIONS_DATA_PRUNE_MAX_COUNT=0
      - GENERIC_TIMEZONE=America/Sao_Paulo
      - TZ=America/Sao_Paulo
      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true
    volumes:
      - /root/n8n-data/backup:/data/backup
      - /usr/share/fonts/truetype:/usr/share/fonts/truetype/host/
    deploy:
      mode: replicated
      replicas: 2
      placement:
        constraints:
          - node.role == manager
networks:
  MyNetWork:
    name: MyNetWork
    external: true

Having the same issue here.

I dug into every conversation here that mentions “READONLY”, and it always seemed to come down to a Docker Compose setting. I made sure my depends_on entries are set up as needed, so that is not the case here.

The n8n instance works, but then around 40-43 hours later I start getting this error:
READONLY You can’t write against a read only replica


I checked, and Docker shows all my containers as running and active. All the errors come from the worker container, and they are all the same READONLY error.

Redis responds to PING with PONG, so this is not a case of Redis being down, as in https://community.n8n.io/t/error-with-worker-readonly-you-cant-write-against-a-read-only-replica/26679/2

If I reboot, it fixes itself. If I run docker compose stop and then docker compose up -d, it also fixes itself.
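
A PONG reply only shows that Redis is reachable; it does not show whether the instance is still a master or has been switched to a replica. That is reported by INFO replication. A minimal sketch, assuming the service is named redis as in the compose file further down; redis_role is a hypothetical helper name, not part of Redis:

```shell
# Hypothetical helper: reads `INFO replication` text on stdin and prints the role.
# Redis terminates INFO lines with CRLF, so carriage returns are stripped first.
redis_role() {
  tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}
```

Usage, with -T so that no TTY is allocated inside the pipe: `docker compose exec -T redis redis-cli INFO replication | redis_role`. A healthy standalone instance prints master; an instance that rejects writes with READONLY prints slave, and the same INFO output then also lists the master_host it is replicating from.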

Any help here?

Docker Compose file:

  n8n:
    image: n8nio/n8n
    restart: always
    ports:
      - "5678:5678"
    links:
      - postgres
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=password
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=user
      - N8N_BASIC_AUTH_PASSWORD=password
      - WEBHOOK_URL=https://n8n.domain.com/
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_ENCRYPTION_KEY=key
      - GENERIC_TIMEZONE=location
      - N8N_REINSTALL_MISSING_PACKAGES=true
      - N8N_SECURE_COOKIE=false
      - NODE_FUNCTION_ALLOW_EXTERNAL=moment
    volumes:
      - /root/n8n-compose/local-files:/files
    depends_on:
      - redis
      - postgres

  postgres:
    image: postgres
    restart: always
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: password
      POSTGRES_DB: n8n

  caddy:
    image: caddy
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
    depends_on:
      - n8n

  redis:
    image: redis
    restart: always
    ports:
      - "6379:6379"

  n8n-worker-1:
    ports:
      - "5679:5679"
    image: docker.n8n.io/n8nio/n8n
    command: worker
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=password
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_ENCRYPTION_KEY=key
      - GENERIC_TIMEZONE=location
      - NODE_FUNCTION_ALLOW_EXTERNAL=moment
    volumes:
      - /root/n8n-compose/local-files:/files
    depends_on:
      - n8n

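I was able to find the source. An IP address owned by Tencent (China) is hijacking the Redis instance and trying to replicate the data, which turns the instance into a read-only replica, and then the entire n8n server freezes. This is possible because my compose file publishes port 6379 on all interfaces and Redis has no password set.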

I used the following command to get timestamped, ordered logs:

docker-compose logs -t --no-color | sort -u -k 3 > logs.txt

Then I scrolled for a few minutes until the first read only error, and went up a bit from it to see what happened just before.
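
The scrolling step can be shortened by letting grep jump to the first READONLY line and print the lines leading up to it. A minimal sketch, assuming the logs.txt file produced above; first_readonly_context is a hypothetical helper name and the default of 40 context lines is arbitrary:

```shell
# Hypothetical helper: print the first READONLY line in a log file,
# preceded by up to N lines of context (default 40).
first_readonly_context() {
  file=$1
  before=${2:-40}
  # -m 1 stops after the first match; -B prints the lines before it.
  grep -m 1 -B "$before" 'READONLY' "$file"
}
```

Usage: `first_readonly_context logs.txt 60 | less`. Redis normally logs the role change shortly before the first error, so lines from the redis container mentioning REPLICAOF (SLAVEOF on older versions) are worth looking for in that context; treat the exact wording as an assumption and check it against your own logs.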


It took me days to figure out, but now it makes sense why the issue seemed so random and followed no pattern.

Fix implemented:
I changed the redis service to the following, which binds the published port to localhost so that remote hosts can no longer connect:

  redis:
    image: redis
    restart: always
    ports:
      - "127.0.0.1:6379:6379"
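
To confirm the change took effect after recreating the container, the published address can be checked. A minimal sketch, assuming the service is named redis as above; is_loopback_binding is a hypothetical helper name, not part of Docker or Redis:

```shell
# Hypothetical helper: succeeds only if a published address such as
# "127.0.0.1:6379" is bound to a loopback interface.
is_loopback_binding() {
  case "$1" in
    127.*:*|"[::1]":*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: `is_loopback_binding "$(docker compose port redis 6379)" && echo "loopback only" || echo "reachable from outside"`. Since n8n and the worker reach Redis over the compose network by service name, the ports entry could also be dropped entirely if nothing on the host needs it. If an instance is already stuck as a replica, `docker compose exec redis redis-cli REPLICAOF NO ONE` (SLAVEOF NO ONE before Redis 5) switches it back to a master without a restart.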