Is it necessary to mount the n8n user config folder as a volume in queue mode + Postgres DB?

Hi!

We have a Docker Swarm with n8n running in queue mode. We followed the Docker Compose file provided in the official docs and combined it with the queue mode config + Redis + Postgres DB:

version: '3.7'

x-common-env: &common-env
  NODE_ENV: production
  N8N_ENCRYPTION_KEY: ${N8N__ENCRYPTION_KEY}
  EXECUTIONS_MODE: queue
  QUEUE_BULL_REDIS_HOST: redis
  DB_TYPE: postgresdb
  # … (redacted for brevity)

services:
  n8n-main:
    image: n8nio/n8n:1.35.0
    environment:
      <<: *common-env
    volumes:
      - n8n_data:/home/node/.n8n
      # The following server folder allows us to download a CSV file (for instance) from one worker node and process it in a different worker.
      # It also allows us to share files between deployments and simplifies access to the folder without entering the Docker container for debugging purposes.
      - /home/${our_user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 1
      # … (redacted for brevity)
    healthcheck:
      # … (redacted for brevity)

  n8n-worker:
    image: n8nio/n8n:1.35.0
    command: worker
    environment:
      <<: *common-env
      # Specific configuration for workers
      QUEUE_HEALTH_CHECK_ACTIVE: "true"
      QUEUE_HEALTH_CHECK_PORT: 5678
    volumes:
      # Same purpose as with the n8n-main
      - /home/${our_user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: ${N8N__WORKER_INSTANCES}
      # … (redacted for brevity)
    healthcheck:
      # … (redacted for brevity)

  redis:
    image: "redis:7.2.3"
    deploy:
      replicas: 1
      # … (redacted for brevity)
    healthcheck:
      # … (redacted for brevity)

volumes:
  n8n_data:
    external: true

However, we have some doubts:

  1. Is it really necessary to declare the n8n_data volume, mapped to the user config directory /home/node/.n8n, in order to persist that information on the main node, given that:
    • We are setting DB_TYPE: postgresdb so that all execution and workflow data is stored in an external DB instead of a local SQLite (a sketch of the typical Postgres-related variables is shown right after this list)
    • We are configuring the encryption key through the N8N_ENCRYPTION_KEY environment variable
    • We are using the bind volume mapped to the server folder /home/${our_user}/n8n_workflow_executions-shared_files to share binary files between executions and nodes, instead of relying on the binaryData directory inside the .n8n config folder
  2. If it is necessary, is it also necessary for the worker nodes? If so, should the volume be shared as in this other official example?
  3. Why does it need to be declared as externally managed?
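
For context on the first point, here is a minimal sketch of the Postgres-related variables such a queue-mode setup typically carries in the shared environment block (host, database, user, and password values are placeholders for illustration, not our redacted config):

x-common-env: &common-env
  # … (same variables as above, redacted)
  DB_TYPE: postgresdb
  DB_POSTGRESDB_HOST: postgres                  # placeholder hostname
  DB_POSTGRESDB_PORT: 5432
  DB_POSTGRESDB_DATABASE: n8n                   # placeholder database name
  DB_POSTGRESDB_USER: ${POSTGRES_USER}          # placeholder credentials
  DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}  # placeholder credentials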

Thanks!

Information on your n8n setup

  • n8n version: 1.35.0
  • Database: PostgreSQL
  • n8n EXECUTIONS_MODE setting: queue
  • Running n8n via: Docker Swarm
  • Operating system: Ubuntu

Hey @JavierCane,

The .n8n folder also keeps the logs, so I would keep it persisted where possible so you can keep them between updates, just in case anything goes wrong. It doesn't need to be externally managed; our guide is just a guide and you are free to change it as you want.

Thanks for the clarification Jon!

So in that case, is there any problem sharing the data volume between different instances of n8n?

I mean, in the example, the main and worker nodes are sharing the same data volume, and taking a look at the contents of the .n8n directory we have:

binaryData/ 
config
crash.journal
git/ // Only in the main node
n8nEventLog.log // Only in the main node
n8nEventLog-worker.log // Only in worker nodes
nodes/
ssh/ // Only in the main node

Non-conflicting files and folders:

  • binaryData folder: No collision, because we deal with these kinds of files through the n8n_workflow_executions-shared_files bind volume
  • config file: Same file in all nodes (config based on environment variables such as N8N_ENCRYPTION_KEY)
  • git/, n8nEventLog.log, and ssh/: No collision, since they only appear in the main node

Possible collisions:

  • crash.journal: Shared file between main and worker nodes
  • n8nEventLog-worker.log: Shared file between worker nodes

At first glance, it might be tricky to decide whether we should allocate a different data volume for each node, considering that the worker nodes are defined by the number of Docker Swarm replicas :thinking:

Hey @JavierCane,

Personally I wouldn't share the same home folder between them all, as you wouldn't want log files to be overwritten, but this is not something I have ever tried… apart from the nodes folder, I do share that one.

I guess if log files don't really matter you could run without it. I did some digging and we don't really enforce it for workers.

Hi @Jon!

Yes, completely agree on that. If n8n is going to overwrite the contents of crash.journal and, in the case of the worker nodes, also n8nEventLog-worker.log, I would also prefer to avoid sharing the home folder between n8n nodes :angel:

At this point, the original question has been answered (thank you Jon!), so I do not expect any additional help or customer support from your side (though it is obviously appreciated if you feel open to it). However, for the sake of completeness and just in case it helps anyone else, here are my thoughts on using n8n with Docker Swarm regarding the volume mapping issue :grimacing:

Ideally we would have 1 volume for each node. The problem is that we are defining the number of n8n worker nodes as Docker Swarm replicas. From the initial example:

services:
  n8n-worker:
    image: n8nio/n8n:1.39.1
    command: worker
    volumes:
      - /home/${user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 3

It would be perfect to be able to declare the volume names dynamically in Docker. It would be something like:

volumes:
  # {{.Task.Name}} would follow "${stack}_${service}.${replica}" naming convention as in "docker ps", so it would be: "n8n_n8n-main.1", "n8n_n8n-worker.1", "n8n_n8n-worker.2"…
  n8n_data-{{.Task.Name}}: 

services:
  n8n-main:
    image: n8nio/n8n:1.39.1
    ports:
      - "80:5678"
    volumes:
      - n8n_data-{{.Task.Name}}:/home/node/.n8n
      - /home/${user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 1

  n8n-worker:
    image: n8nio/n8n:1.39.1
    command: worker
    volumes:
      - n8n_data-{{.Task.Name}}:/home/node/.n8n
      - /home/${user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 3

The problem is that this is not possible (but dreaming about it was free :blush:).

So I think we only have 2 paths:

  • a) Manually declare the 4 volumes (1 for the main node and 1 for each worker node), and also declare the 3 worker nodes as separate Docker Swarm services instead of using the replicas parameter (sketched below)

  • b) Diving a little into the n8n docs, I have discovered that you can specify the user folder location, so I guess we could also go for sharing a single n8n_data volume between all the nodes, but specifying a different location for each node (also sketched below).

    The problem in this case would in fact be the very same as before: we can specify a folder for the main node that is different from the worker ones because they are 2 different services, but we cannot dynamically set the N8N_USER_FOLDER environment variable value for the worker nodes based on the Docker Swarm replica number :sweat:
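
To make these two options more concrete, here is a minimal sketch of each one (service, volume, and folder names are illustrative, not taken from our actual stack):

# Option a) one explicit Swarm service and one named volume per worker
services:
  n8n-worker-1:
    image: n8nio/n8n:1.39.1
    command: worker
    volumes:
      - n8n_data-worker-1:/home/node/.n8n
  n8n-worker-2:
    image: n8nio/n8n:1.39.1
    command: worker
    volumes:
      - n8n_data-worker-2:/home/node/.n8n
  # … n8n-worker-3 declared the same way

volumes:
  n8n_data-worker-1:
  n8n_data-worker-2:
  n8n_data-worker-3:

# Option b) a single shared volume with a different N8N_USER_FOLDER per service
# (all worker replicas would still share /data/worker, which is exactly the limitation described above)
services:
  n8n-main:
    environment:
      N8N_USER_FOLDER: /data/main
    volumes:
      - n8n_data:/data
  n8n-worker:
    command: worker
    environment:
      N8N_USER_FOLDER: /data/worker
    volumes:
      - n8n_data:/data
    deploy:
      replicas: 3

volumes:
  n8n_data: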

Any thoughts?

Challenge completed! :tada:

I was wrong assuming that you cannot dynamically declare volumes in Docker Swarm. The problem was how I was trying to do so (and assuming that the sources I found claiming it was not possible were accurate).

Final solution, just in case it helps anyone else or you want to include it in the docs :blush::

Considerations regarding template variables:

  • '{{.Service.Name}}_{{.Task.Slot}}' produces volume names like:
    • n8n_n8n-main_1
    • n8n_n8n-worker_1
    • n8n_n8n-worker_2
  • If you use {{.Task.Name}}, it will include the unique identifier of the task, so you would generate a new volume every time a container restarts (for instance, while updating the n8n version)

Complete example:

version: '3.7'

volumes:
  n8n_data:
    name: '{{.Service.Name}}_{{.Task.Slot}}'
  redis_data:

x-common-env: &common-env
  NODE_ENV: production
  N8N_LOG_LEVEL: warn
  EXECUTIONS_MODE: queue
  QUEUE_BULL_REDIS_HOST: redis
  DB_TYPE: postgresdb
  # … (redacted for brevity)

services:
  n8n-main:
    image: n8nio/n8n:1.40.0
    ports:
      - "80:5678"
    environment:
      <<: *common-env
    volumes:
      - n8n_data:/home/node/.n8n
      - /home/${user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 1
      update_config:
        order: stop-first
        failure_action: rollback
      restart_policy:
        delay: 5s
        max_attempts: 5
        window: 2m
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://127.0.0.1:5678/healthz || exit 1
      start_period: 30s
      interval: 30s
      timeout: 5s
      retries: 5

  n8n-worker:
    image: n8nio/n8n:1.40.0
    command: worker
    environment:
      <<: *common-env
      # Specific configuration for workers
      QUEUE_HEALTH_CHECK_ACTIVE: "true"
      QUEUE_HEALTH_CHECK_PORT: 5678
    volumes:
      - n8n_data:/home/node/.n8n
      - /home/${user}/n8n_workflow_executions-shared_files:/home/node/n8n_workflow_executions-shared_files
    deploy:
      replicas: 3
      update_config:
        order: start-first
        failure_action: rollback
      restart_policy:
        delay: 5s
        max_attempts: 5
        window: 2m
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://127.0.0.1:5678/healthz || exit 1
      start_period: 30s
      interval: 30s
      timeout: 5s
      retries: 5

  redis:
    image: "redis:7.2.4"
    volumes:
      - redis_data:/data
    deploy:
      replicas: 1
      update_config:
        order: stop-first
        failure_action: rollback
      restart_policy:
        delay: 5s
        max_attempts: 5
        window: 2m
    healthcheck:
      test: [ "CMD", "redis-cli", "ping" ]
      interval: 30s
      timeout: 3s
      retries: 5

Hope it helps anyone else :grimacing:

Thanks for the clarifications Jon!
