In an n8n workflow with multiple stages and nodes, how can I ensure that after an unexpected server crash, the workflow execution automatically resumes from the exact node where the failure happened, instead of starting the execution from the very first node?
Note: I am using my local server/Docker, not the cloud.
My honest answer: you can't guarantee automatic resume from the exact failed node after a server crash with default settings in a self-hosted instance. I don't think n8n is designed as a fully resilient, transactional workflow engine like Apache Airflow with checkpointing.
Hi @mohamedelnady-406, I'm not sure how n8n for enterprise works for your use case, however I don't think it is a core feature of n8n as a platform. I feel your requirement is more of an architectural design requirement to cater for volatile situations. Have a read up on event-driven architecture and pub/sub.
Essentially you would put a message queue (MQ) in place to keep track of the things which need to be processed, and then your workflows would pick tasks from the queue and process them either concurrently or serially. This way, when the system goes down for whatever reason, it can essentially continue where it left off.
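To make the queue idea concrete, here is a minimal sketch in Python of "acknowledge only after processing". Everything in it is hypothetical and for illustration only: `AckQueue`, `run_worker` and the `order-N` task names are not n8n or message broker APIs, and the crash is simulated with an exception.

```python
from collections import deque


class AckQueue:
    """Toy in-memory queue with explicit acks (hypothetical, not a real broker API)."""

    def __init__(self, items):
        self._pending = deque(items)
        self._in_flight = {}  # delivery tag -> task handed out but not yet acked
        self._next_tag = 0

    def receive(self):
        """Hand out the next task, or None when the queue is empty."""
        if not self._pending:
            return None
        self._next_tag += 1
        task = self._pending.popleft()
        self._in_flight[self._next_tag] = task
        return self._next_tag, task

    def ack(self, tag):
        """Mark a task as fully processed so it is never delivered again."""
        self._in_flight.pop(tag)

    def recover(self):
        """After a crash, put every unacked task back at the front of the queue."""
        for tag in sorted(self._in_flight, reverse=True):
            self._pending.appendleft(self._in_flight.pop(tag))


def run_worker(queue, handler, done):
    """Process tasks one by one, acking each only after the handler succeeds."""
    while (message := queue.receive()) is not None:
        tag, task = message
        handler(task)  # may raise: the task then stays unacked
        queue.ack(tag)
        done.append(task)


def demo():
    queue = AckQueue(["order-1", "order-2", "order-3"])
    crashed_once = set()

    def flaky_handler(task):
        # Simulate a server crash the first time "order-2" is processed.
        if task == "order-2" and task not in crashed_once:
            crashed_once.add(task)
            raise RuntimeError("simulated server crash")

    done = []
    try:
        run_worker(queue, flaky_handler, done)
    except RuntimeError:
        queue.recover()  # what a broker does when a consumer disappears
    run_worker(queue, flaky_handler, done)  # the "restarted" worker
    return done
```

Calling `demo()` shows that the task in flight during the crash is delivered again rather than lost. Note that it is reprocessed from its start, which is at-least-once delivery, not a resume from the middle of the task.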
Hi @Wouter_Nigrini, that's really insightful, but I am thinking of a scenario.
When an event has been consumed from the MQ and the n8n server fails while processing that event, how do I know the last point of execution, so that after recovery I can continue from the point where it failed rather than from the very beginning of the workflow?
In other words, the main focus is how to let the n8n server recover and re-execute from the point it failed at.
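One way to picture "the last point of execution" is a checkpoint record stored per event outside the process that might crash. The sketch below is plain Python and only illustrates the idea: `run_with_checkpoints`, the `crash_before` parameter, the step names and the JSON file layout are all assumptions made for this example, not something n8n provides.

```python
import json
import os
import tempfile


def run_with_checkpoints(event_id, steps, data, store_dir, crash_before=None):
    """Run steps in order, saving progress after each one (hypothetical helper)."""
    path = os.path.join(store_dir, f"{event_id}.json")

    # Resume from the saved position if a checkpoint for this event exists.
    next_step = 0
    if os.path.exists(path):
        with open(path) as fh:
            saved = json.load(fh)
        next_step, data = saved["next_step"], saved["data"]

    for index in range(next_step, len(steps)):
        if crash_before == index:
            raise RuntimeError("simulated server crash")  # test hook only
        data = steps[index](data)
        # Write to a temp file and rename, so a crash never leaves a half-written file.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump({"next_step": index + 1, "data": data}, fh)
        os.replace(tmp_path, path)
    return data


def demo_checkpoint():
    calls = []  # records which steps actually ran, across both attempts

    def make_step(name):
        def step(data):
            calls.append(name)
            return data + [name]
        return step

    steps = [make_step(name) for name in ("fetch", "transform", "send")]
    with tempfile.TemporaryDirectory() as store_dir:
        try:
            run_with_checkpoints("event-42", steps, [], store_dir, crash_before=2)
        except RuntimeError:
            pass  # the "server" went down before the third step
        result = run_with_checkpoints("event-42", steps, [], store_dir)
    return result, calls
```

In `demo_checkpoint()` the second attempt skips the two steps that already completed and runs only the remaining one. A crash between finishing a step and writing its checkpoint would still re-run that step, so each step should be safe to repeat.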