Should a workflow that calls a child workflow wait for the child to finish whatever it needs to do before continuing, e.g. when the child workflow is waiting on a webhook (Resume on Webhook Call)?
I’m trying to build a common messaging module, ‘Send Message’. The messaging module asks the end user to click on a webhook link to respond to a question, but I’m finding that the calling module finishes even though the webhook in ‘Send Message’ is still waiting.
I know that I’ve had issues with Slack ‘messages’ causing the webhook to complete, but in this case I can see the ‘webhook’ is still waiting for a response.
What is the error message (if any)?
None
Please share the workflow
I’ll try to create a standalone version of the issue, but there are too many checks in the message module to paste it in full… so it’s just screenshots for now.
@Jon thanks, I was just thinking about adding a fork at the point of the webhook in the sub-workflow and putting in a ‘wait’ node to see what that would do to the calling workflow.
I also tried it with a simple wait in the sub-workflow, de-coupling it from the ‘wait for webhook’, and it does the same.
I can see the sub workflow still waits but the parent carries on…
I have managed to reproduce the same issue. It looks like it happens whenever a wait needs to go into the background, so any wait over a certain amount of time also triggers it.
I assume this is because it goes into some state table to be picked up later but we should probably think of a better way to handle things like this.
@Jon - Thanks for reproducing this… sorry for the ‘notification’ of the like and unlike… my fingers aren’t what they used to be and it asked me to wait before I liked the post again lol.
Are you aware of a workaround for this issue, or will it be marked as a bug? Ideally I’m trying to have re-usable sub-workflows so that I don’t have to copy the ‘bulk’ of the flow into the main workflows, even for flows that will re-execute themselves.
So I have asked internally, and it looks like the behaviour you are seeing is partially expected for anything that may take over 70 seconds to complete. This is because we offload it to the database, as mentioned earlier.
It looks like we do have a dev ticket open (PAY-80) to change how this works but it currently has a fairly low priority. The only thing I can think of to work around this would be to not use the sub workflow and have it in the same one or… Maybe use a database to store the states and put the parent into a loop until the sub workflow updates the state to say it has finished.
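The state-store workaround suggested above could look roughly like the sketch below. It is only an illustration in plain Python, not n8n code: the `child_state` table, the `run_id` key, the status values and the function names are all hypothetical, and SQLite stands in for whatever database you would actually use. The child writes its status at the start and at the end, and the parent loops until it sees "finished" or gives up.

```python
import sqlite3
import time


def init_store(conn):
    # Hypothetical table: one row per child execution.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS child_state "
        "(run_id TEXT PRIMARY KEY, status TEXT)"
    )
    conn.commit()


def mark(conn, run_id, status):
    # The child would call this when it starts waiting ("waiting")
    # and again once the webhook has been answered ("finished").
    conn.execute(
        "INSERT OR REPLACE INTO child_state (run_id, status) VALUES (?, ?)",
        (run_id, status),
    )
    conn.commit()


def wait_for_child(conn, run_id, poll_seconds=1.0, timeout_seconds=60.0):
    # Parent-side loop: poll the store until the child reports
    # "finished", or return False once the timeout has passed.
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        row = conn.execute(
            "SELECT status FROM child_state WHERE run_id = ?", (run_id,)
        ).fetchone()
        if row is not None and row[0] == "finished":
            return True
        time.sleep(poll_seconds)
    return False
```

In a workflow, the loop body would presumably map to a database read, an IF check and a short wait, with the timeout guarding against a child that never reports back.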
@Jon - it must be a lot less than 70 seconds, maybe 70 ms? The parent regains control fairly quickly when the child is using a wait node. So I thought I’d try to understand the time limit by putting in a Unix sleep node, which did delay it.
So I thought this might be a workaround in the subworkflow by doing a split/fork - unix execute command instead of wait node, as the parent does appear to wait for the timeout to complete:
sleep $(expr 60 \* {{wait-time}})
but it seems the workflow doesn’t actually fork… the execute command node sat waiting during the sleep and then the webhook executed after.
Thanks for taking the time to understand this issue.
This was almost my sub-workflow workaround, but alas no: it waits for the Unix command to complete before setting up the webhook node.
I’ll have to re-think ( using state checking as you suggested) / wait for a future fix
I put a wait of 1 minute in the sub-workflow and it did wait as expected. A wait for a webhook hands control back instantly, just like a wait node that is waiting for 70+ seconds. I am not sure why it is 70 seconds for a normal wait, though.
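To make the behaviour described so far easier to follow, here is a toy model in Python. It is not n8n's implementation: the 70-second cutoff is simply the figure quoted in this thread, and `run_wait` and its return values are made up for illustration. Short waits block in the process, so the caller waits too; long waits and webhook waits are parked, so the caller gets control back straight away.

```python
import time

# Assumption: cutoff taken from the figure quoted in this thread;
# the real value inside n8n may differ.
OFFLOAD_THRESHOLD_SECONDS = 70


def run_wait(wait_seconds, resume_on_webhook=False, sleep=time.sleep):
    """Return how a wait is handled in this toy model."""
    if resume_on_webhook or wait_seconds >= OFFLOAD_THRESHOLD_SECONDS:
        # The state would be saved to the database and the execution
        # parked, so the calling workflow regains control immediately.
        return "offloaded"
    # Short waits simply block in-process, so the caller waits as well.
    sleep(wait_seconds)
    return "in_process"
```

With this model, the 1-minute wait is handled in-process, while a 70+ second wait or a wait for a webhook is offloaded, which matches what the parent workflow shows.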
For future debugging / reproduction of this issue:
Simplified parent workflow: just modify the ‘StartNode’ depending on which path you want the ‘sleep’ to take, e.g. webHook, waitNode, unixSleep.
The startTime and endTime will be visible in the last ‘CheckResponse’ node of the parent.
“waitNode” and “webHook” show fast completion of execution in the parent. But if you enable saving of manual executions in the child workflow, you can see that both types are still waiting.
“unixSleep” shows delayed execution and waits as expected.
All types use the ‘notificationTimeout’ in the ‘Set’ node.
Simplified Parent workflow
Simplified Child workflow that will execute based upon ‘waitType’
There is another issue. If the mode is changed to EXECUTIONS_PROCESS=main, and there is a ‘wait webhook’ node in the sub-process, the parent process will directly throw an error. The solution I thought of is to use the ‘wait’ node in combination with triggering the sub-process through http.
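One reading of that suggestion is a callback pattern: the parent triggers the child over HTTP without waiting for it, then sits in its own wait until the child calls back. The sketch below only illustrates that control flow in plain Python. Threads and a callback stand in for the HTTP request and the resume webhook, and `child`, `parent`, `resume` and the payload are hypothetical names and values.

```python
import threading
import time


def child(work_seconds, resume):
    # Stand-in for the child workflow: it does its own waiting, then
    # calls the parent's resume callback (a webhook call in practice).
    time.sleep(work_seconds)
    resume({"answer": "yes"})


def parent(work_seconds=0.05, timeout_seconds=2.0):
    resumed = threading.Event()
    result = {}

    def resume(payload):
        # Stand-in for the parent's resume webhook being called.
        result.update(payload)
        resumed.set()

    # Fire-and-forget trigger (an HTTP request in practice).
    threading.Thread(
        target=child, args=(work_seconds, resume), daemon=True
    ).start()

    # The parent's own wait: block until resumed or until the timeout.
    if not resumed.wait(timeout_seconds):
        return None
    return result
```

The point of the design is that the wait lives in the parent, so nothing in the child needs to be parked while the parent is expected to block on it.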
I just had a read of Execution modes and processes - n8n Documentation
and I’m thinking about the implications… I wonder if the ‘throw’ in main mode happens because of the TTL that @jon indicated, since for main mode the docs say that a single process can result in a bottleneck and that one crashed execution causes all others to fail.
I’m just speculating of course, as my docker container is using the default of ‘own’ - I actually can’t think how my workflows would operate in ‘main’ mode, maybe the queuing would be interesting to see.