Automatic Retry of Workflows

Hi!

Problem:
Some of our connected systems and APIs are occasionally unreachable for a couple of minutes, so I have to restart the failed executions by hand as soon as possible.

Idea / Solution Proposal:
It would be cool to have some kind of “Retry Policy” that defines, per workflow …

  • if it should be retried automatically or not and
  • how often and with what delay it should be retried

Unfortunately there is no CLI command to retry an execution, so I cannot build it myself.

Is this a valid idea? Does anybody have the same or a similar issue?

Best regards from Hamburg
Benjamin

This is a very relevant use-case for us as well. Would love such a feature!

I see how this can be very helpful. You have my vote.

2 Likes

I have only just realised that there is a “Retry on Fail” in the nodes. Do you think it is possible to use this so that the API is requested again three times at five-minute intervals?

Edit: OK, the maximum Wait Time is 5 seconds.

Maybe use an IF node (if failed), then a Function node that waits 5 minutes, then loop back to the original request? Just an idea.

Good idea, but how do you build the “wait for 5 minutes” function? With a sleep function?

{
  "nodes": [
    {
      "parameters": {
        "functionCode": "function sleep(milliseconds) {\n  return new Promise(\n    resolve => setTimeout(resolve, milliseconds)\n  );\n}\n\nawait sleep(30000);\n\nreturn items;"
      },
      "name": "Await 30 Seconds1",
      "type": "n8n-nodes-base.function",
      "position": [
        -280,
        1230
      ],
      "typeVersion": 1
    }
  ],
  "connections": {}
}
1 Like
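
As a rough illustration of the IF + wait + loop-back idea from the posts above, the same logic in plain JavaScript might look like the sketch below. `retryWithDelay`, `maxAttempts` and `delayMs` are made-up names for this example, not n8n functions or parameters, and `request` stands for whatever call is failing. Keep in mind that sleeping inside a Function node keeps the execution running for the whole wait.

```javascript
// Sketch only: retryWithDelay and its parameters are hypothetical names,
// not part of n8n.
function sleep(milliseconds) {
  return new Promise(resolve => setTimeout(resolve, milliseconds));
}

// Calls `request` up to `maxAttempts` times, waiting `delayMs` between tries.
async function retryWithDelay(request, maxAttempts, delayMs) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Return as soon as one attempt succeeds.
      return await request(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt follows.
      if (attempt < maxAttempts) {
        await sleep(delayMs);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

For up to three attempts at five-minute intervals, this would be called as `retryWithDelay(request, 3, 5 * 60 * 1000)`.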

I think n8n should get to a point where certain process elements run on a meta-level. If I need companion nodes for each API node for …

  1. data validation
  2. data preparation
  3. error handling
  4. retry in case of error

then my workflow grows very quickly. A retry should not be part of the operational process, because you would have to copy an awful lot of nodes and the process would become very confusing.

5 Likes

While it’s possible at the moment with IF + Function nodes for shorter periods, I totally agree that flexible retrying should be more intuitive to set up.

In the shorter term, would improving the “Retry on fail” node functionality be a solution to these pain points @BenW ? It would likely be simpler at first to improve node-level retrying before assessing workflow-level strategies.

A wait node capable of extended pauses is in the works, and those core changes will make it easier to add extended waiting in other contexts. With long waiting, we could make the “Retry on fail” functionality configurable, perhaps with parameters for the number of retries and the retry interval (in seconds, minutes, etc.). It would also be interesting to explore functionality like exponential backoff (retry 1 min later, then 2 min, then 4 min, etc.) to help in rate-limiting scenarios.
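
To make the exponential backoff idea concrete, here is a minimal sketch in plain JavaScript. The helper name `backoffDelayMs`, its defaults (1-minute base, factor 2) and the 60-minute cap are assumptions for illustration only, not existing or planned n8n settings.

```javascript
// Hypothetical helper: delay before retry number `retry` (1-based).
// The delay grows by `factor` on each retry and is capped at `maxMs`.
function backoffDelayMs(retry, baseMs = 60000, factor = 2, maxMs = 3600000) {
  const delay = baseMs * Math.pow(factor, retry - 1);
  return Math.min(delay, maxMs);
}

// Delays for the first four retries, converted to minutes.
const delaysInMinutes = [1, 2, 3, 4].map(retry => backoffDelayMs(retry) / 60000);
console.log(delaysInMinutes); // [ 1, 2, 4, 8 ]
```

A fixed interval is the special case `factor = 1`, so one setting could cover both fixed and exponential waits.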

2 Likes

For me, the existing retry mechanism has surprisingly solved most of my workflow errors. At the moment I only have to restart very few workflows by hand.

2 Likes

There is another fitting thread for this:

To help with exponential backoff on nodes, here are some interesting docs

Exponential backoff should be useful for avoiding trouble with Google Sheets and other external APIs whose throttling blocks you when you retry.

It would be nice if you could choose between fixed values and exponential backoff on this screen

image

Oddly enough I raised a request internally a while back to allow an expression in the wait time which would allow this without needing another package.

1 Like

Excellent!!! Please keep us posted on this topic as it evolves.

Jumping in here to say that being able to retry a workflow at flexible intervals automatically would be super useful. It would be nice to not have to select it on every node—I like the way the workflow-level replay works. 5 seconds is not long enough, especially for Google Sheets.

3 Likes

Same here. I work with lots of external APIs that are sometimes not accessible for a couple of minutes.

1 Like

I’ve been thinking about this problem a lot, and decided I needed a way to store all webhook requests and replay them as needed. This thought eventually led me to the discovery of hookdeck.com

So my plan moving forward is to send all webhooks into hookdeck.com which then forwards them to n8n

It would be amazing if this extra step wasn’t needed though…

1 Like

I use Hookdeck for some of my own workflows where I need the webhooks to always work.