`Execute Workflow` node causes 3 seconds overhead in "regular" execution process

Hiya all! First post here. Let me start off by giving props to everyone working on n8n. We’re actively using it for automation and general quality-of-life improvements in our day-to-day work. Sometimes we run into small issues that we can either work around or solve another way. Now, however, we’ve hit an issue I can’t find a reason for, so I thought it might be a valid bug to report.

Describe the issue/error/question

Summary:

The Execute workflow node seems to cause a lot of overhead, even if not actively used.

More information:

When setting up a workflow with the following nodes:

  • Webhook trigger (connected to:)
  • Respond to Webhook

Separate node (doing nothing, not connected to anything):

  • Execute Workflow

If I set the webhook trigger to respond immediately, I see response times of 62 ms on simple GET requests.

If I set the hook to respond via the Respond to Webhook node, the response times immediately shoot up to 2.10 seconds. This is still manageable, though not optimal.

However, if I have an Execute Workflow node anywhere in my flow, not even connected or configured with a valid workflow ID, the response times shoot up to around 6 seconds. If I disable the Execute Workflow node, the response times drop again.
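For reference, these numbers were measured externally. A sketch of how such a measurement can be taken with curl’s built-in timers (the local Python server below is only a stand-in so the snippet runs as-is; in practice, point the URL at your own webhook):

```shell
# Stand-in endpoint so this snippet is runnable as-is; replace
# WEBHOOK_URL with the real n8n webhook URL in practice.
python3 -m http.server 8123 --bind 127.0.0.1 >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1
WEBHOOK_URL="http://127.0.0.1:8123/"

# -w exposes curl's per-request timers; time_total is the full
# end-to-end latency the caller (e.g. Slack) actually experiences.
curl -s -o /dev/null -w 'total: %{time_total}s\n' "$WEBHOOK_URL"

kill "$SERVER_PID"
```

Unlike the wall-clock `time` command, `-w '%{time_total}'` measures only the request itself, so it isn’t skewed by shell startup.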

Context:

I am looking to respond to Slack interactive hooks, which require (for example, when opening a modal) that you perform your actions within 3 seconds. This is impossible on our n8n instance due to these response times. Even the 2.10-second response times come dangerously close to the 3-second limit.

Invalid / tried solutions:

  • Set the execution process to main instead of own
    Valid, but impractical in production due to the large number of flows and the risk of crashing the full process (while a lot of people depend on it).
  • Don’t use the Execute Workflow node
    Since Slack only allows one callback URL for interactivity, we’d have to build ALL interactive elements into one flow, which would become far too large to keep organized.
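For anyone finding this later: the main/own switch mentioned above is a single environment variable (a sketch; check the docs for your n8n version, as this setting changed in later releases):

```shell
# Run every execution in the main n8n process instead of forking a
# child process per execution ("own", the default at the time).
# Faster to start, but one crashing workflow can take the whole
# instance down with it.
export EXECUTIONS_PROCESS=main
```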

If you need more information, please let me know. I’m hoping we can get to a cause / solution here regarding the execute workflow node overhead, though any tips to get the general response times down are welcome as well :smiley:

What is the error message (if any)?

N/A

Please share the workflow

Share the output returned by the last node

N/A

Information on your n8n setup

  • n8n version: 0.178.2
  • Database you’re using (default: SQLite): Postgres 13.7
  • Running n8n with the execution process [own(default), main]: own
  • Running n8n via [Docker, npm, n8n.cloud, desktop app]: cloud kubernetes

Hey @Djennez,

Welcome to the community :raised_hands:

I suspect not using main is going to be a big part of this, as n8n has to do more work in the background to start up the task. Do you have n8n running in queue mode for scaling, or is it just the one instance? What is the database connection/speed like? That also has some potential for causing a slowdown.

Are these times from the UI by pressing the execute button, or are the workflows enabled/active and you are testing that way?

Hey @Jon, thanks for taking the time to look into this!

Are these times from the UI by pressing the execute button, or are the workflows enabled/active and you are testing that way?

No, these are times reported by the request tool (in my case Postman). While I can’t access our instance from where I am now, I’m pretty sure the times reported in the execution overview are all around 0.200 seconds (even when the request times are more than 1 second).

Your other questions I’ll have to investigate once I have access to our instance at the office. Just wanted to quickly reply to the question above. I’ll report back on those.

A bit more information:

With the Execute Workflow node enabled, but not configured or connected:

  • The HTTP request takes an average of 6 seconds from GET to answer.
  • The execution overview says the flow finished within 0.12 seconds.

That’s quite the difference :sweat_smile: Especially since the reply is near-instantaneous if I let the webhook trigger reply immediately instead of via the Respond to Webhook node.

What is the database connection/speed like? That also has some potential for causing a slowdown.

Any suggestions on how to check this? If I create a flow with a simple Postgres node to get some data from a table, the flow takes 0.05 seconds to execute, according to the execution overview.

Do you have n8n running in queue mode as well for scaling or is it just the one instance going?

Just one instance.

That sounds like the DB is fairly quick. This could be down to the speed at which the OS allows the new process to be created. Have you tried setting up a test instance using the latest 184 release to see if that is quicker?

Have you tried setting up a test instance using the latest 184 release to see if that is quicker?

Not yet; we’ll probably update production to the latest version soon. Though I could not find any changelog entries between our version and the latest that would suggest a fix for the 3 to 4 seconds of overhead that a single unused node introduces. Will report back on that later.

Sometimes things get fixed by another change. I do still need to test this one myself, though, to see if I hit the same issue.

When you say unused node… Do you mean the node is not attached to anything and is just on the canvas?

Just done a quick test and I get the same result when using Own mode.

Own Mode
With Node Disabled

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 0% cpu 2.355 total

With Node Enabled

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 0% cpu 5.651 total

With Node Removed

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 0% cpu 2.327 total

Main Mode
With Node Disabled

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 14% cpu 0.150 total

With Node Enabled

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 11% cpu 0.212 total

With Node Removed

> time curl -X 'GET' https://n8n/webhook/comm/15354
{"status":"OK"}
curl -X 'GET' https://n8n/webhook/comm/15354  0.01s user 0.01s system 14% cpu 0.118 total

It looks like in my environment there is maybe a 2-second start time for each process. So when I have an Execute Workflow node in a workflow, that is going to take 2 seconds for the initial workflow, then another 2 for the second workflow, plus a little bit of time to dig it out of the database as I am using SQLite.

This roughly matches what we document on Main vs Own here: Execution modes and processes - n8n Documentation, although we have it down as 1 second. I suspect that is very dependent on hardware, and n8n was a lot lighter back then as well, with less to load.

I think the best solution for this, if you are worried about the instance crashing, would be to set up n8n in queue mode. You can then have your webhook workers running in main mode and normal workers in own mode. As an instance can restart fairly quickly, and you would have the ability to run more than one, you can have the webhook workers auto-restart and there should be minimal (if any) downtime as it has been scaled out.
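The queue-mode setup Jon describes looks roughly like this (host names are placeholders; a sketch based on the n8n scaling docs, not a complete configuration):

```shell
# Shared settings for every instance (placeholder Redis host/port).
export EXECUTIONS_MODE=queue
export QUEUE_BULL_REDIS_HOST=redis.internal
export QUEUE_BULL_REDIS_PORT=6379

# Each of these runs as its own process/container:
n8n            # main instance: editor UI, schedules, enqueues executions
n8n webhook    # dedicated webhook processor(s) for fast responses
n8n worker     # one or more workers pulling executions off the queue
```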

I will add this to our internal tracker as something to look at in the future, though, as there are possibly things that could be done to improve this.


Thanks @Jon for the additional investigation! I do think it is time we started looking into queueing and scaling our instance, as we have enough active workflows to warrant it.

Since we now know this is an issue that we have to work around (and as you’ve already moved it to an internal tracker), I guess this topic is resolved :slight_smile:

PS: I just noticed that the title of this topic was completely unrelated to the problem. I think it was somehow an autofilled value from when I wanted to create a topic a few months ago (but didn’t). So I changed it.


Just FYI: if n8n crashes, it normally does so because it runs out of memory. “main” mode generally needs less memory (as there is just one process) and will also use less memory because it executes workflows faster, which reduces the number of workflows running in parallel. That means a much lower chance of n8n crashing.
I would only use “own” mode if you do very CPU-intensive things, have a lot of RAM, and do not care about speed.
