What would be a good deployment story?

I’d like to comment on how the lack of a deployment story could be rectified.

Starting with some observations:

  • the ability to specify some configuration items via env vars is a good one
  • this ability could be extended to support at-runtime configuration items that affect the running of the nodes; I’d rather have the ability to do "endpoint": {{ env.RUNTIME_ENV === 'prod' ? 'https:// ...' : 'http://localhost:3000' }} (see the sketch after this list)
  • secondly, having an abstraction for a configuration bucket, which is configured via the above methods (env vars and pass-through env vars) and also files, would be great; something similar to a generic key-value store with string/number/JSON object values would in fact be very neat
  • the ability to join on two input nodes would ensure I can get an up-to-date value from a configuration store, in case the previous point is not implemented (right now it would trigger multiple runs)
  • the ability to run the software in a distributed fashion; maybe https://github.com/rqlite/rqlite together with a shared mutex that is equivalent to the leader, who should perform the side-effecting calls, would be a good idea
  • don’t store config and the database next to each other; have two folders instead, so they can be individually mounted into the running container in k8s, and allow for customization of the configuration location with an env var so I can place it at e.g. /etc/n8n/config whilst data lives in /var/lib/n8n/data
  • provide kustomization manifests to support k8s out of the box; happy to help here, I have them
  • provide a /metrics Prometheus endpoint
  • provide both a liveness endpoint (currently missing) and a healthcheck endpoint (/healthz exists but is undocumented)
  • specify how n8n acts during timeouts and long-running nodes; can it be killed? How parallel are nodes?
  • per-execution ids are missing; these must be available to the nodes in order to provide some basic idempotence guarantees
  • it would be great to have a correlation id that is set up at the start of the workflow run
  • it would be great if you also made this a full-on OpenTelemetry trace so I can inspect the workflows as they are running
  • it would be great to be able to ship logs as JSON, and even better, as JSON lines (jsonl) over TCP, UDP or ZeroMQ
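To make the at-runtime configuration bullet more concrete, here is a rough TypeScript sketch of what such resolution could look like (the URL and names are made up for illustration; this is not n8n syntax):

```ts
// A sketch of the at-runtime idea from the second bullet -- not n8n syntax;
// the URL and variable names are made up for illustration.
type ConfigValue = string | ((env: NodeJS.ProcessEnv) => string);

const config: Record<string, ConfigValue> = {
  endpoint: (env) =>
    env.RUNTIME_ENV === 'prod'
      ? 'https://api.example.com' // hypothetical production URL
      : 'http://localhost:3000',
};

function resolve(key: string): string {
  const value = config[key];
  if (value === undefined) throw new Error(`unknown config key: ${key}`);
  return typeof value === 'function' ? value(process.env) : value;
}

console.log(resolve('endpoint')); // http://localhost:3000 unless RUNTIME_ENV=prod
```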

/haf

Welcome to the community @haf!

Thanks a lot. This is very helpful! Here are some comments:

  • the ability to specify some configuration items via env vars is a good one

Yes, that should already be possible.

  • this ability could be extended to support at-runtime configuration items that affect the running of the nodes; I’d rather have the ability to do "endpoint": {{ env.RUNTIME_ENV === 'prod' ? 'https:// ...' : 'http://localhost:3000' }}
  • secondly, having an abstraction for a configuration bucket, which is configured via the above methods (env vars and pass-through env vars) and also files, would be great; something similar to a generic key-value store with string/number/JSON object values would in fact be very neat

Agree, that would be interesting, and something like that is planned.
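To illustrate, a minimal sketch of what such a configuration bucket could look like (a hypothetical API, not something n8n ships; env vars win over a JSON file, which wins over the default):

```ts
// Hypothetical sketch of a configuration bucket -- not an existing n8n API.
import { readFileSync } from 'node:fs';

type ConfigScalar = string | number | Record<string, unknown>;

class ConfigBucket {
  private fileValues: Record<string, ConfigScalar>;

  constructor(filePath: string) {
    this.fileValues = JSON.parse(readFileSync(filePath, 'utf8'));
  }

  get(key: string, fallback?: ConfigScalar): ConfigScalar | undefined {
    return process.env[key] ?? this.fileValues[key] ?? fallback;
  }
}

const bucket = new ConfigBucket('/etc/n8n/config/bucket.json'); // illustrative path
console.log(bucket.get('DB_HOST', 'localhost'));
```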

  • the ability to join on two input nodes would ensure I can get an up-to-date value from a configuration store, in case the previous point is not implemented (right now it would trigger multiple runs)

Sadly, I do not understand.

  • the ability to run the software in a distributed fashion; maybe https://github.com/rqlite/rqlite together with a shared mutex that is equivalent to the leader, who should perform the side-effecting calls, would be a good idea

Agree, we are already working on scaling n8n.

  • don’t store config and the database next to each other; have two folders instead, so they can be individually mounted into the running container in k8s, and allow for customization of the configuration location with an env var so I can place it at e.g. /etc/n8n/config whilst data lives in /var/lib/n8n/data

They are in two different files so they can be mounted separately.

  • provide kustomization manifests to support k8s out of the box; happy to help here, I have them
  • provide a /metrics Prometheus endpoint
  • provide both a liveness endpoint (currently missing) and a healthcheck endpoint (/healthz exists but is undocumented)

Agree, would be great!
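For illustration, such endpoints could look roughly like this in an Express app with prom-client (a sketch of the requested endpoints, not n8n's actual implementation):

```ts
// Sketch of /metrics, /healthz and a liveness endpoint with Express and
// prom-client -- an illustration, not n8n code.
import express from 'express';
import client from 'prom-client';

const app = express();
client.collectDefaultMetrics(); // process CPU, memory, event loop lag, ...

app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});

app.get('/healthz', (_req, res) => res.status(200).send('ok'));  // readiness
app.get('/livez', (_req, res) => res.status(200).send('alive')); // liveness

app.listen(3000);
```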

  • specify how n8n acts during timeouts and long-running nodes; can it be killed? How parallel are nodes?

Agree. The documentation gets improved on a daily basis, but this will take time.
Some basic information:

  • What happens when something like an HTTP request times out depends on the settings on the node: it will either retry, continue, or fail the workflow.
  • Nodes can theoretically run as long as they want (though n8n was not really designed for that), but if n8n gets killed, the whole workflow's data is currently lost. As soon as n8n receives a signal to shut down, it checks whether something is still running. If there are active workflows, it waits up to 30 seconds for them to stop; if they do not stop in time, they get killed.
  • That depends on the node, but currently most do not process data in parallel. A notable exception is the HTTP Request node, which makes its requests in parallel.
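In pseudo-code, the shutdown sequence looks roughly like this (illustrative TypeScript, not the actual implementation):

```ts
// Rough sketch of the shutdown sequence described above (illustrative only).
const activeWorkflows = new Set<string>(); // ids of workflows still running

process.on('SIGTERM', async () => {
  const deadline = Date.now() + 30_000; // wait up to 30 seconds
  while (activeWorkflows.size > 0 && Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, 500)); // poll
  }
  // Whatever is still running past the deadline dies with the process.
  process.exit(0);
});
```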
  • per-execution ids are missing; these must be available to the nodes in order to provide some basic idempotence guarantees
  • it would be great to have a correlation id that is set up at the start of the workflow run

Agree. This is planned and will come together with a future change that saves the initial data on workflow start.
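For illustration, the shape could be something like this (hypothetical, not the actual n8n data model):

```ts
// Hypothetical shape of per-execution and correlation ids.
import { randomUUID } from 'node:crypto';

interface ExecutionContext {
  executionId: string;   // unique per run; usable as an idempotence key
  correlationId: string; // set once at workflow start, passed to every node
}

function startWorkflow(): ExecutionContext {
  return { executionId: randomUUID(), correlationId: randomUUID() };
}

const ctx = startWorkflow();
console.log(ctx); // each node would receive ctx and could dedupe side effects
```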

  • it would be great if you also made this a full-on OpenTelemetry trace so I can inspect the workflows as they are running
  • it would be great to be able to ship logs as JSON, and even better, as JSON lines (jsonl) over TCP, UDP or ZeroMQ

Agree, that would be great, and it is planned.
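As an illustration of the log-shipping idea, a minimal JSON-lines-over-UDP sketch (host, port and field names are made up):

```ts
// Sketch of shipping logs as JSON lines over UDP -- illustrative only.
import dgram from 'node:dgram';

const socket = dgram.createSocket('udp4');

function shipLog(entry: Record<string, unknown>): void {
  const line = JSON.stringify({ ts: new Date().toISOString(), ...entry }) + '\n';
  socket.send(line, 5140, 'logs.example.internal'); // hypothetical collector
}

shipLog({ level: 'info', msg: 'workflow started', workflowId: 'abc123' });
```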

So in short, most of the things you mention are planned, and much more. The problem is just that we are still a small team, so it will obviously take time. But we do our best.

Welcome to the community @haf!

Thank you Jan! :slight_smile:

They are in two different files so they can be mounted separately.

Yes, this is true, but it’s more to write and configure, so it would be better to have two separate folders, as all other server software does. Work with the flow, not against it.
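For illustration, something like this (the env var names here are hypothetical; n8n does not read them today):

```ts
// Illustrative sketch only -- these env var names do not exist in n8n.
const configDir = process.env.N8N_CONFIG_FOLDER ?? '/etc/n8n/config';
const dataDir = process.env.N8N_DATA_FOLDER ?? '/var/lib/n8n/data';
// Each directory can then be mounted as its own volume in Kubernetes.
console.log({ configDir, dataDir });
```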

the ability to join on two input nodes
Sadly, I do not understand.

I.e. a join, as in fork-join: gating execution of the node on both vertices going into it having values, rather than executing the node any time one of the vertices has a value. This would mean I could connect the trigger to a lookup function (e.g. get me the customer id for this event) and pass along both the event and the customer id to the downstream node.
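A minimal sketch of that gating logic (plain TypeScript, not n8n's actual node model):

```ts
// Sketch of fork-join gating -- illustrative only.
type Input<T> = { value?: T };

function tryRun<A, B>(
  left: Input<A>,
  right: Input<B>,
  node: (a: A, b: B) => void,
): void {
  // Gate: only fire once *both* vertices carry a value.
  if (left.value !== undefined && right.value !== undefined) {
    node(left.value, right.value);
  }
}

// e.g. left = the trigger event, right = the looked-up customer id; the
// downstream node then receives both together instead of running twice.
```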

Agree, we are already working on scaling n8n.

Yay! I’m primarily interested in failover, though, as I don’t imagine there will be any scalability problems with a single node for the type of processing that n8n does.

So in short, most of the things you mention are planned, and much more. The problem is just that we are still a small team, so it will obviously take time. But we do our best.

Great to hear; but it would be even greater with a public roadmap that people in the OSS community can orient around when planning their own development. That way, if I need a feature in n8n in the future (we’re currently just evaluating it), I can implement it for you.

I also discovered that

[screenshot]

So it would seem that DATA_FOLDER is not honoured.

… despite

[screenshot]

So DATA_FOLDER is not honoured.

Yes, that is correct. That variable does not get used by n8n. What made you think it would be?

What is honored is N8N_USER_FOLDER as documented here: