[HELP NEEDED] Webhooks Randomly Stop - Require Workflow Toggle to Resume (Not Sustainable)

Okay, so nice implementation on the Cloudflare and Upstash side, but I'm wondering what's causing this too. I haven't faced it myself. You could set up some alerts in Grafana: N8n + Grafana Full Node.js Metrics Dashboard (JSON Example Included!)

For example, alert if you notice idleness. I'm sure you can implement other ways of checking too, but hopefully the dashboard helps.
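If you want a quick idleness check outside Grafana, here is a rough Python sketch of the idea. It assumes you record a timestamp for each webhook you receive somewhere (for example in your Upstash store); the function names `is_idle` and `find_idle_gaps` and the 2-hour threshold are made up for illustration and are not part of n8n.

```python
from datetime import datetime, timedelta, timezone


def is_idle(last_event, now, max_gap):
    """True if no webhook has arrived for longer than max_gap."""
    return now - last_event > max_gap


def find_idle_gaps(timestamps, max_gap):
    """Return (start, end) pairs where consecutive webhooks are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


if __name__ == "__main__":
    # Hypothetical threshold: tune it to your real traffic pattern.
    threshold = timedelta(hours=2)
    last_seen = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
    now = datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
    if is_idle(last_seen, now, threshold):
        print("No webhooks received recently, check the workflow")
```

With around 200 calls a day you would expect a webhook every few minutes on average, so a gap of a couple of hours during busy periods is probably worth an alert.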

I'm wondering about your setup. I see you mentioned SQLite as the DB; it could be a bottleneck in the system, so yes, I would recommend switching to Postgres. My next few questions:

Do you have webhook nodes and worker nodes, or just a single setup at the moment?

If your system is being overloaded, it could bug out webhook processing. It is possible to separate the main node from the webhook load by sending that traffic to dedicated webhook nodes.

Webhook link above

If you still see issues after sending the traffic to the webhook nodes, that would suggest it's not just a bottleneck on a single instance. But to be honest, 200 webhook calls a day suggests the error is elsewhere. Do you see any errors in the logs around the time it stops? That would help us dig deeper, as we may see stack traces or some other error that points to the cause.
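To pull out the log lines around the time it stops, something like this Python sketch could help. It assumes each log line starts with an ISO 8601 timestamp followed by a space, which may not match your actual log format, so adjust `parse_ts` accordingly. The names `parse_ts` and `lines_near` and the 5-minute window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Parse a leading ISO 8601 timestamp, or return None if the line has none."""
    token = line.split(" ", 1)[0].replace("Z", "+00:00")
    try:
        return datetime.fromisoformat(token)
    except ValueError:
        return None


def lines_near(lines, stop_time, window=timedelta(minutes=5)):
    """Keep lines whose timestamp is within `window` of `stop_time`.

    stop_time must be timezone-aware if the log timestamps are.
    Lines without a parseable timestamp (e.g. stack trace frames) are skipped.
    """
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None and abs(ts - stop_time) <= window:
            hits.append(line)
    return hits


if __name__ == "__main__":
    # Hypothetical stop time and sample lines, for illustration only.
    stop = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    sample = [
        "2024-05-01T11:58:00.000Z error example message",
        "2024-05-01T10:00:00.000Z info example message",
    ]
    for hit in lines_near(sample, stop):
        print(hit)
```

Note that this skips continuation lines such as stack trace frames, so once you find a hit it is worth opening the log at that point and reading the lines that follow.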

You could also try enabling debug logs, for example by setting the N8N_LOG_LEVEL environment variable to debug.

I don’t see this as a common issue on the forum; it could also be a network-side issue with the host. But hopefully the above helps you dig deeper into the issue.

Hope this helps,