Describe the issue/error/question
Hello everyone,
I am running n8n in a docker compose setup on an EC2 instance (Amazon Linux 2 - ECS Optimized).
I got it working in normal (non-queue) mode. As the first step of scaling, I switched the database engine to Postgres, which went fine: my endpoints were accessible from the internet and everything worked correctly.
I then followed the rest of the guide to finish the scaling setup, but now I get a “Cannot GET” error whenever I try to access any production webhook, and I have no idea how to proceed.
I noticed some Postgres and Redis errors and warnings being logged after spinning Docker up. I have since resolved them, but that didn’t help with this particular issue. One Redis warning that still appears is:
redis_1 | 1:C 20 Sep 2022 08:56:56.477 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1 | 1:C 20 Sep 2022 08:56:56.477 # Redis version=7.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1 | 1:C 20 Sep 2022 08:56:56.477 # Configuration loaded
redis_1 | 1:M 20 Sep 2022 08:56:56.478 * monotonic clock: POSIX clock_gettime
redis_1 | 1:M 20 Sep 2022 08:56:56.482 * Running mode=standalone, port=6379.
redis_1 | 1:M 20 Sep 2022 08:56:56.483 # Server initialized
redis_1 | 1:M 20 Sep 2022 08:56:56.483 # WARNING Your system is configured to use the 'xen' clocksource which might lead to degraded performance. Check the result of the [slow-clocksource] system check: run 'redis-server --check-system' to check if the system's clocksource isn't degrading performance.
redis_1 | 1:M 20 Sep 2022 08:56:56.484 * Ready to accept connections
Currently my editor is accessible from the internet and test webhooks also work, which suggests the issue lies in either Redis or routing/load balancing (Traefik). Also, from my understanding, “Cannot GET” is a Node.js (Express) error meaning that no route is registered in the application for that HTTP method and path.
Error and config details below.
Thanks for looking into this!
Michal
What is the error message (if any)?
Cannot GET /path_to_webhook
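The comparison between test and production webhooks can be repeated from the command line. This is only a minimal sketch, assuming `curl` is installed and that the response comes from n8n's Express-based server; `is_route_missing` and `check_webhook` are hypothetical helper names, and the URLs in the comments are placeholders.

```shell
#!/bin/sh
# Succeeds when the body on stdin looks like Express's default
# "no matching route" response, e.g. "Cannot GET /webhook/abc".
is_route_missing() {
  grep -Eq 'Cannot (GET|POST|PUT|PATCH|DELETE) /'
}

# Hypothetical helper: fetch a URL and report whether any route answered it.
check_webhook() {
  body=$(curl -s "$1") || { echo "request failed: $1"; return 1; }
  if printf '%s' "$body" | is_route_missing; then
    echo "no route: $1"
  else
    echo "route answered: $1"
  fi
}

# Example (placeholders, not run here):
#   check_webhook "https://n8n.example.com/webhook-test/path_to_webhook"
#   check_webhook "https://n8n.example.com/webhook/path_to_webhook"
```

If the test URL reports a route and the production URL does not, the request is reaching an n8n process that has not registered the production webhook.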
When running docker logs on the individual containers, I see some odd behavior:
- webhook listener repeats these messages:
2022-09-16T20:48:30.996Z [Rudder] debug: in flush
2022-09-16T20:48:30.999Z [Rudder] debug: batch size is 1
2022-09-16T20:48:51.513Z [Rudder] debug: in flush
2022-09-16T20:48:51.513Z [Rudder] debug: cancelling existing timer...
2022-09-16T20:48:51.513Z [Rudder] debug: queue is empty, nothing to flush
2022-09-17T02:48:30.995Z [Rudder] debug: no existing flush timer, creating new one
- worker process repeats these errors:
Starting n8n worker...
2022-09-16T21:16:19.302Z | debug | No codex available for: N8nTrainingCustomerDatastore.node.js "{ file: 'LoadNodesAndCredentials.js', function: 'addCodex' }"
2022-09-16T21:16:19.307Z | debug | No codex available for: N8nTrainingCustomerMessenger.node.js "{ file: 'LoadNodesAndCredentials.js', function: 'addCodex' }"
n8n worker is now ready
 * Version: 0.192.0
 * Concurrency: 10
2022-09-16T21:16:21.410Z | error | Error from queue: "{\n command: {\n name: 'evalsha',\n args: [\n 'ff9c18634832b0b4115a19b4de5f4788a7cfbd4e',\n '7',\n 'bull:jobs:stalled',\n 'bull:jobs:wait',\n 'bull:jobs:active',\n 'bull:jobs:failed',\n 'bull:jobs:stalled-check',\n 'bull:jobs:meta-paused',\n 'bull:jobs:paused',\n '1',\n 'bull:jobs:',\n '1663362981409',\n '30000'\n ]\n },\n file: 'worker.js'\n}"
2022-09-16T21:16:21.412Z | error | Error from queue: "{\n command: {\n name: 'evalsha',\n args: [\n 'ff9c18634832b0b4115a19b4de5f4788a7cfbd4e',\n '7',\n 'bull:jobs:stalled',\n 'bull:jobs:wait',\n 'bull:jobs:active',\n 'bull:jobs:failed',\n 'bull:jobs:stalled-check',\n 'bull:jobs:meta-paused',\n 'bull:jobs:paused',\n '1',\n 'bull:jobs:',\n '1663362981409',\n '30000'\n ]\n },\n file: 'worker.js'\n}"
/usr/local/lib/node_modules/n8n/node_modules/redis-parser/lib/parser.js:179
    return new ReplyError(string)
    ^
ReplyError: READONLY You can't write against a read only replica. script: ff9c18634832b0b4115a19b4de5f4788a7cfbd4e, on @user_script:30.
    at parseError (/usr/local/lib/node_modules/n8n/node_modules/redis-parser/lib/parser.js:179:12)
    at parseType (/usr/local/lib/node_modules/n8n/node_modules/redis-parser/lib/parser.js:302:14) {
  command: {
    name: 'evalsha',
    args: [
      'ff9c18634832b0b4115a19b4de5f4788a7cfbd4e',
      '7',
      'bull:jobs:stalled',
      'bull:jobs:wait',
      'bull:jobs:active',
      'bull:jobs:failed',
      'bull:jobs:stalled-check',
      'bull:jobs:meta-paused',
      'bull:jobs:paused',
      '1',
      'bull:jobs:',
      '1663362981409',
      '30000'
    ]
  }
}
- traefik:
time="2022-09-16T14:48:05Z" level=info msg="Configuration loaded from flags."
time="2022-09-17T18:17:50Z" level=error msg="Error while Hello: EOF"
- redis:
1:S 20 Sep 2022 08:53:42.873 * Connecting to MASTER 178.20.47.79:8886
1:S 20 Sep 2022 08:53:42.873 * MASTER <-> REPLICA sync started
1:S 20 Sep 2022 08:53:42.929 * Non blocking connect for SYNC fired the event.
1:S 20 Sep 2022 08:53:42.984 # Failed to read response from the server: Invalid argument
1:S 20 Sep 2022 08:53:42.984 # Master did not respond to command during SYNC handshake
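The worker reports READONLY and these Redis lines carry the `1:S` (replica) prefix, so it may be worth checking which role Redis is actually running in. A minimal sketch, assuming the Compose v1 CLI (suggested by the `redis_1` container name) and that `redis-cli` is available inside the container; `redis_role` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of the "role" field from `redis-cli INFO replication`
# output on stdin. Redis terminates lines with CRLF, so strip the CRs first.
redis_role() {
  tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Example (not run here), against the compose service named "redis":
#   docker-compose exec -T redis redis-cli INFO replication | redis_role
```

The helper prints `master` or `slave`. Anything other than `master` would be consistent with the READONLY error above.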
Information on your n8n setup
- **n8n version:** 0.192.0
- **Database you’re using:** Postgres
- **Running n8n via:** Docker
I took the whole config from here and altered it to match the rest of my setup, then tried the fixes I found in these threads:
https://community.n8n.io/t/webhook-scaling-issues/10404
https://community.n8n.io/t/problems-with-the-n8n-queue-configuration/13697/6
Docker compose below
version: '3.8'
volumes:
  db_storage:
  n8n_storage:
services:
  traefik:
    image: "traefik"
    restart: always
    command:
      - "--api=true"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.mytlschallenge.acme.tlschallenge=true"
      - "--certificatesresolvers.mytlschallenge.acme.email=${SSL_EMAIL}"
      - "--certificatesresolvers.mytlschallenge.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ${DATA_FOLDER}/letsencrypt:/letsencrypt
      - /var/run/docker.sock:/var/run/docker.sock:ro
  postgres:
    image: postgres:11
    restart: always
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_NON_ROOT_USER=${POSTGRES_NON_ROOT_USER}
      - POSTGRES_NON_ROOT_PASSWORD=${POSTGRES_NON_ROOT_PASSWORD}
    volumes:
      - db_storage:/var/lib/postgresql/data
      - ./init-data.sh:/docker-entrypoint-initdb.d/init-data.sh
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -h localhost -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 5s
      timeout: 5s
      retries: 10
  redis:
    image: redis:7.0-alpine
    restart: always
    # CHANGE PASSWORD
    # command: redis-server --requirepass ${REDIS_PASSWORD}
    # below is to fix a redis memory issue
    sysctls:
      net.core.somaxconn: 1024
    volumes:
      - ~/redis.conf:/home/redis/redis.conf
    ports:
      - 6379:6379
    environment:
      - REDIS_REPLICATION_MODE=master
    command: redis-server "../home/redis/redis.conf"
  n8n:
    image: n8nio/n8n
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_NON_ROOT_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_NON_ROOT_PASSWORD}
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_BASIC_AUTH_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_BASIC_AUTH_PASSWORD}
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - N8N_ENDPOINT_WEBHOOK=webhook
      - N8N_ENDPOINT_WEBHOOK_TEST=webhook-test
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true
    ports:
      - 5678:5678
    labels:
      - traefik.enable=true
      - traefik.http.routers.n8n.rule=Host(`${SUBDOMAIN}.${DOMAIN_NAME}`)
      - traefik.http.routers.n8n.tls=true
      - traefik.http.routers.n8n.entrypoints=web,websecure
      - traefik.http.routers.n8n.tls.certresolver=mytlschallenge
      - traefik.http.middlewares.n8n.headers.SSLRedirect=true
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=true
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=true
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=true
      - traefik.http.middlewares.n8n.headers.SSLHost=${DOMAIN_NAME}
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=true
      - traefik.http.middlewares.n8n.headers.STSPreload=true
      - traefik.http.middlewares.n8n-redirectregex.redirectregex.regex=/webhook/(.*)
      - traefik.http.middlewares.n8n-redirectregex.redirectregex.replacement=:5679/webhook/$$1
    depends_on:
      - redis
      - postgres
    volumes:
      - ~/.n8n:/home/node/.n8n
    # Wait 5 seconds to start n8n to make sure that PostgreSQL is ready
    # when n8n tries to connect to it
    command: /bin/sh -c "sleep 5; n8n start"
  n8n-queue:
    image: n8nio/n8n
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_NON_ROOT_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_NON_ROOT_PASSWORD}
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_BASIC_AUTH_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_BASIC_AUTH_PASSWORD}
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - NODE_FUNCTION_ALLOW_BUILTIN=*
      - NODE_FUNCTION_ALLOW_EXTERNAL=*
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - N8N_PORT=5680
      - N8N_LOG_LEVEL=debug # error, warning, info, verbose, debug
      - N8N_PROTOCOL=https
    cpus: 1
    ports:
      - 5680:5678
    depends_on:
      - postgres
      - redis
      - n8n
    volumes:
      - ~/.n8n:/home/node/.n8n
    command: /bin/sh -c "n8n worker"
  n8n-wh:
    image: n8nio/n8n
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_NON_ROOT_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_NON_ROOT_PASSWORD}
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_BASIC_AUTH_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_BASIC_AUTH_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - QUEUE_BULL_REDIS_PORT=6379
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - NODE_FUNCTION_ALLOW_BUILTIN=*
      - NODE_FUNCTION_ALLOW_EXTERNAL=*
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - N8N_LOG_LEVEL=debug # error, warning, info, verbose, debug
      - N8N_PROTOCOL=https
      - N8N_PORT=5679
      - N8N_ENDPOINT_WEBHOOK=webhook
      - N8N_ENDPOINT_WEBHOOK_TEST=webhook-test
    cpus: 1
    labels:
      - traefik.enable=true
      - traefik.http.middlewares.n8n.headers.SSLRedirect=true
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=true
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=true
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=true
      - traefik.http.middlewares.n8n.headers.SSLHost=${DOMAIN_NAME}
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=true
      - traefik.http.middlewares.n8n.headers.STSPreload=true
      - traefik.http.middlewares.n8n-redirectregex.redirectregex.regex=/webhook/(.*)
      - traefik.http.middlewares.n8n-redirectregex.redirectregex.replacement=:5679/webhook/$$1
    ports:
      - 5679:5678
    networks:
      - default
    depends_on:
      - postgres
      - redis
      - n8n
      - n8n-queue
    volumes:
      - ~/.n8n:/home/node/.n8n
    command: /bin/sh -c "n8n webhook"
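Since the Traefik labels are spread over two services, it may help to list which router names the file actually defines. A minimal sketch, assuming the file is saved as `docker-compose.yml`; `list_traefik_routers` is a hypothetical helper name. It only reports router names and does not map them to services.

```shell
#!/bin/sh
# Print the distinct Traefik router names found in label lines on stdin,
# i.e. the <name> part of "traefik.http.routers.<name>.<option>".
list_traefik_routers() {
  sed -n 's/.*traefik\.http\.routers\.\([^.]*\)\..*/\1/p' | sort -u
}

# Example (not run here):
#   list_traefik_routers < docker-compose.yml
```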
Setup also includes
- .env - contains only credentials, keys and domain names, so there is no point posting it here.
- init-data.sh - left as found on GitHub
- redis.conf - left as found on GitHub