Health-Check / Circuit-Breaker Workflow

The idea is:

Allow a Health-Check workflow to be attached to any other workflow in the workflow settings, the same way an Error workflow is attached.

  • The Health-Check workflow would contain nodes that make lightweight, “everything good?” requests to downstream services, or lightweight checks for other resources, and return an overall healthy / not-healthy status.
  • The workflow to which the Health-Check workflow is attached would either refuse new executions or accept and queue them (i.e. immediately pause/wait) while health-checks are failing. Executions already in a wait state would also not resume until health-checks are “passing” again.
  • For workflows that can be activated or deactivated, the attached Health-Check workflow would be active only while the workflow itself is active.
  • For workflows that cannot be activated, the attached Health-Check workflow would run on every execution “up front” before attempting to execute any workflow steps.
  • Notifications, or other secondary actions (like a restart call to a process-manager service), could be built into the Health-Check workflow.
  • The interval between health-checks, and how long workflow execution is suspended, would also be specified in the settings of the workflow to which the Health-Check workflow is attached.
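
To make the gating behaviour in the list above more concrete, here is a rough sketch of how I picture it. Everything in it is an assumption for illustration only: `HealthCheckSettings`, `HealthGate`, and all field names are hypothetical and do not exist in n8n, and I am assuming that a failed check suspends executions for a fixed period while a passing check lifts the suspension immediately.

```typescript
// Hypothetical settings on the parent workflow (not an existing n8n API).
interface HealthCheckSettings {
  healthCheckWorkflowId: string;   // the attached Health-Check workflow
  checkIntervalMs: number;         // how often a scheduler would run the check
  suspendMs: number;               // how long executions stay suspended after a failed check
  onUnhealthy: "reject" | "queue"; // prevent executions, or accept-and-queue them
}

type GateDecision = "run" | "reject" | "queue";

class HealthGate {
  private readonly settings: HealthCheckSettings;
  private suspendedUntil = 0; // timestamp in ms; 0 means "not suspended"

  constructor(settings: HealthCheckSettings) {
    this.settings = settings;
  }

  // Record the overall healthy / not-healthy result of one Health-Check run.
  recordCheck(healthy: boolean, now: number): void {
    this.suspendedUntil = healthy ? 0 : now + this.settings.suspendMs;
  }

  // Called before a new execution starts and before a waiting execution resumes.
  decide(now: number): GateDecision {
    if (now >= this.suspendedUntil) return "run";
    return this.settings.onUnhealthy;
  }
}

const gate = new HealthGate({
  healthCheckWorkflowId: "hc-1",
  checkIntervalMs: 30_000,
  suspendMs: 60_000,
  onUnhealthy: "queue",
});
gate.recordCheck(false, 1_000);
console.log(gate.decide(2_000));  // prints "queue"
gate.recordCheck(true, 31_000);
console.log(gate.decide(31_500)); // prints "run"
```

Letting the suspension expire on its own means an execution is eventually allowed through even if no further check result arrives. Whether that is the right default, or whether the gate should stay closed until a check explicitly passes, is one of the design questions I would want feedback on.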

My use case:

Workflows sometimes fail and stop, after partial completion, because a downstream service or resource required in a later step is temporarily unavailable.

I think it would be beneficial to add this because:

Long-running workflows, and workflows with “hard to reverse” or “hard to repeat” (non-idempotent) steps, would benefit from a separate background process that checks the availability/health of every downstream service or resource dependency and prevents the workflow from executing, or resuming from a wait, until those dependencies are available/healthy again.
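
As a sketch of what that background check could compute, the snippet below runs one lightweight probe per dependency and reduces the results to a single healthy / not-healthy status. The `Probe` shape, `runProbe`, and `checkDependencies` are names I made up for illustration, and treating a timeout or a thrown error as "not healthy" is my assumption, not existing n8n behaviour.

```typescript
// One lightweight "everything good?" probe of a dependency (hypothetical shape).
interface Probe {
  name: string;
  check: () => Promise<boolean>;
}

interface HealthReport {
  healthy: boolean;
  failed: string[]; // names of the dependencies that did not pass
}

// Resolves to false if the probe throws, rejects, or does not answer within timeoutMs.
async function runProbe(probe: Probe, timeoutMs: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });
  try {
    return await Promise.race([probe.check(), timeout]);
  } catch {
    return false;
  } finally {
    clearTimeout(timer);
  }
}

// Runs all probes in parallel; the overall status is healthy only if every probe passes.
async function checkDependencies(probes: Probe[], timeoutMs: number): Promise<HealthReport> {
  const results = await Promise.all(probes.map((p) => runProbe(p, timeoutMs)));
  const failed = probes.filter((_, i) => !results[i]).map((p) => p.name);
  return { healthy: failed.length === 0, failed };
}
```

Returning the names of the failed dependencies alongside the overall status would give the Health-Check workflow something specific to put into a notification.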

Any resources to support this?

Are you willing to work on this?

Could help with testing. Could possibly help with refining the design/approach. Could possibly help with development (less likely).