Automatic Retry of Workflows

Hi!

Problem:
Some of our connected systems / APIs are sometimes not reachable for a couple of minutes. Hence I have to restart the failed executions by hand as soon as possible.

Idea / Solution Proposal:
It would be cool to have some kind of “Retry Policy” on a workflow basis to define per Workflow …

  • if it should be retried automatically or not and
  • how often and with what delay it should be retried
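
A minimal sketch of what such a per-workflow policy could look like, written as plain JavaScript. Everything here is hypothetical: `retryPolicy`, its field names, and `nextRetryAt` are made up for illustration and are not an existing n8n setting or API.

```javascript
// Hypothetical per-workflow retry policy (not an existing n8n option).
const retryPolicy = {
  enabled: true,    // should failed executions be retried automatically?
  maxRetries: 3,    // how often to retry
  delayMinutes: 5,  // delay between retries
};

// Returns the time of the next retry as a Date, or null if no retry is due.
// `retriesDone` is the number of retries already performed,
// `failedAt` is the Date at which the last attempt failed.
function nextRetryAt(policy, retriesDone, failedAt) {
  if (!policy.enabled || retriesDone >= policy.maxRetries) {
    return null;
  }
  return new Date(failedAt.getTime() + policy.delayMinutes * 60 * 1000);
}
```

With this policy, a first failure would be retried five minutes later, and after the third retry `nextRetryAt` returns `null`, so the execution stays failed.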

Unfortunately there is no CLI command to retry an execution, so I cannot build it myself.

Is this a valid idea? Does anybody have the same or a similar issue?

Best regards from Hamburg
Benjamin

This is a very relevant use-case for us as well. Would love such a feature!

I can see how this would be very helpful. You have my vote.


I have only just realised that there is a “Retry on Fail” in the nodes. Do you think it is possible to use this so that the API is requested again three times at five-minute intervals?

Edit: OK, the maximum Wait Time is 5 seconds :confused:

Maybe use an IF node (if failed), then a Function node that waits 5 minutes, then loop back to the original request? Just an idea.

Good idea, but how do you build the “wait for 5 minutes” function? With a sleep function?

{
  "nodes": [
    {
      "parameters": {
        "functionCode": "function sleep(milliseconds) {\n  return new Promise(\n    resolve => setTimeout(resolve, milliseconds)\n  );\n}\n\nawait sleep(30000);\n\nreturn items;"
      },
      "name": "Await 30 Seconds1",
      "type": "n8n-nodes-base.function",
      "position": [
        -280,
        1230
      ],
      "typeVersion": 1
    }
  ],
  "connections": {}
}
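
For readability, here is the same idea as the `functionCode` string above, unescaped and wrapped in a function so it also runs outside n8n. The wrapper name `waitThenPass` and its `items` parameter are assumptions for this sketch; inside the Function node the body runs directly and `items` is already available, as the snippet above shows.

```javascript
// Resolves after the given number of milliseconds.
function sleep(milliseconds) {
  return new Promise(resolve => setTimeout(resolve, milliseconds));
}

// Hypothetical wrapper: waits, then passes the items through unchanged.
// The default of 5 minutes matches the interval asked about in this thread.
async function waitThenPass(items, milliseconds = 5 * 60 * 1000) {
  await sleep(milliseconds);
  return items;
}
```

In the node itself, changing `30000` to `5 * 60 * 1000` should give a five-minute wait, provided the execution is allowed to run that long.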

I think n8n should get to a point where certain process elements run on a meta-level. If I need companion nodes for each API node for …

  1. data validation
  2. data preparation
  3. error handling
  4. retry in case of error

then my workflow increases very quickly. A retry should not be part of the operational process, because you would copy an awful lot of nodes and the process would become very confusing.


While it’s possible at the moment with IF + Function nodes for shorter periods, I totally agree that flexible retrying should be more intuitive to set up.

In the shorter term, would improving the “Retry on fail” node functionality be a solution to these pain points, @BenW? It would likely be simpler at first to improve node-level retrying before assessing workflow-level strategies.

A wait node capable of extended pauses is in the works, and those core changes will make it easier to add extended waiting in other contexts. With long waiting, we could make the “Retry on fail” functionality configurable, perhaps with parameters for the number of retries and the retry interval (in seconds, minutes, etc.). It would also be interesting to explore functionality like exponential backoff (retry 1 min later, then 2 min, then 4 min, etc.) to help in rate-limiting scenarios.
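
To make the backoff idea concrete, here is a rough sketch in plain JavaScript. It is not how n8n implements or plans to implement retries; `backoffDelayMs`, `retryWithBackoff`, and their parameters are hypothetical names, and the one-hour cap is an added assumption.

```javascript
// Delay before retry number `attempt` (1-based):
// baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 60 * 1000, maxMs = 60 * 60 * 1000) {
  return Math.min(baseMs * 2 ** (attempt - 1), maxMs);
}

// Default sleep helper based on setTimeout.
const defaultSleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Runs `task` (a function returning a Promise) and retries it on failure,
// waiting longer after each failed attempt. Rethrows the last error once
// `maxRetries` retries have been used up.
async function retryWithBackoff(
  task,
  { maxRetries = 3, baseMs = 60 * 1000, sleep = defaultSleep } = {}
) {
  for (let retriesDone = 0; ; retriesDone++) {
    try {
      return await task();
    } catch (error) {
      if (retriesDone >= maxRetries) throw error;
      await sleep(backoffDelayMs(retriesDone + 1, baseMs));
    }
  }
}
```

With the defaults, a failing request would be retried after 1, 2, and 4 minutes before the error is passed on. The cap keeps the delay from growing without bound if `maxRetries` is set high.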

For me, the existing retry mechanism has surprisingly solved most of my workflow errors. At the moment I only have to restart very few workflows by hand.
