I have n8n webhook reliability issues where webhooks randomly stop being processed and require manual workflow toggling to resume. This happens repeatedly and is not manageable for a production environment.
Important Context: I implemented a webhook forwarding architecture (Cloudflare + Upstash) specifically to try to solve n8n’s webhook reliability issues, but the core problem persists.
Why This Setup: Originally tried direct webhooks (Zoom → n8n) but n8n kept losing webhooks. Implemented the forwarder as a reliability buffer, but n8n is still the weak link.
Key Issue: The problem is specifically with n8n’s webhook processing - it stops consuming from the queue and requires workflow toggling to resume.
Theories on Root Cause (n8n-Specific)
1. n8n Webhook/Polling Engine Issues
Does n8n stop polling the Upstash Redis queue after a certain time or under load?
say, if you notice idleness. I’m sure you can implement other ways of checking too, but I hope the dashboard helps.
I’m wondering about your setup. I see you mentioned SQLite as the DB; it could be a bottleneck in the system, and yes, I would recommend switching to Postgres. My next few questions:
Do you have webhook nodes and worker nodes, or just a single-instance setup at the moment?
If your system is being overloaded, it could break webhook processing. It is possible to separate the main node from the webhook load.
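One way to check for idleness could be sketched as below. This is only an illustration: `is_idle`, `parse_timestamp`, and the two-hour threshold are hypothetical names and values, and it assumes you can obtain the start time of the most recent execution as an ISO 8601 string that includes a timezone (for example from the executions list).

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_idle(last_execution_at, now, max_idle=timedelta(hours=2)):
    """Return True if no execution has started within max_idle.

    last_execution_at: ISO 8601 string with timezone, or None if the
    workflow has never run (assumed input, not an n8n API guarantee).
    now: timezone-aware datetime to compare against.
    """
    if last_execution_at is None:
        return True
    return now - parse_timestamp(last_execution_at) > max_idle


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_idle("2024-05-01T08:00:00.000Z", now))
```

With a steady stream of webhook calls during the day, a gap of a couple of hours would be a reasonable signal to alert on; the right threshold depends on your traffic pattern.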
Webhook link above
If you still see issues after sending the traffic to the webhook nodes, that would suggest it’s not just a bottleneck on a single instance. To be honest, though, 200 webhook calls a day suggests the error is elsewhere. Do you see any errors in the logs around the time it stops? That would help us dig deeper, as we may see stack traces or other errors.
You could also try enabling debug logging.
I don’t see this as a common issue on the forum; it could also be a network-side issue with the host. But hopefully the above helps you dig deeper into the issue.
Hey @Declup, this line made me pause. Can you expand on the use case here? Do multiple workflows share the exact same webhook path, i.e. do you expect multiple workflows to trigger from just one request?
If so, that might be the source of the issue, as webhook paths must be unique per workflow; otherwise only the last activated workflow will trigger. This was enforced with a fix in 1.91.0 here.
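To check for shared paths, a rough sketch like the following could scan exported workflow JSON. It assumes the export format where each workflow has a `nodes` list, Webhook nodes have type `n8n-nodes-base.webhook`, and the path and method live under `parameters.path` and `parameters.httpMethod` (defaulting to GET). The function name `find_duplicate_webhook_paths` is hypothetical.

```python
from collections import defaultdict

# Assumed node type string for the Webhook trigger in exported workflows.
WEBHOOK_NODE_TYPE = "n8n-nodes-base.webhook"


def find_duplicate_webhook_paths(workflows):
    """Map (method, path) to workflow names when more than one workflow uses it.

    workflows: list of dicts loaded from exported workflow JSON.
    Workflows explicitly marked inactive are skipped.
    """
    seen = defaultdict(list)
    for wf in workflows:
        if wf.get("active", True) is False:
            continue
        for node in wf.get("nodes", []):
            if node.get("type") != WEBHOOK_NODE_TYPE:
                continue
            params = node.get("parameters", {})
            # Normalize so "/zoom/" and "zoom" compare equal.
            path = str(params.get("path", "")).strip("/")
            method = str(params.get("httpMethod", "GET")).upper()
            seen[(method, path)].append(wf.get("name", "<unnamed>"))
    return {key: names for key, names in seen.items() if len(set(names)) > 1}
```

Any key in the returned dict is a method and path pair claimed by more than one workflow, which is worth resolving before looking further.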
Hello my friend, the same is happening to me: it suddenly stops after working, without any error logging (see “Webhook triggers not reliable after n8n restart”).
Try to deactivate and activate the workflow. Does it work then?
@AI_Blueprint can you please share more about your n8n setup? Are you self-hosting or using n8n cloud? What n8n version? Are you talking about a webhook-based trigger for an app or the n8n webhook trigger?
Have you tried setting up a workflow specifically for errors? And are you sure the server has enough memory and CPU to handle all your automations? Are you sure the workflow doesn’t end up in some strange loop?
I am having the same issue, did you ever find a solution?
Self hosted, I have a webhook trigger that fires on a Google Chat script. It works for about a day. I’ve noticed sometimes it responds with workflow started (but the workflow does not run) and sometimes it responds with 404 workflow not active.
If I toggle the workflow inactive and back to active it works fine (for another day or so).
@Derrick_GTC it’s always worth looking deeper into the n8n logs. You can also enable debug logging and test the workflow on a separate, isolated instance if that helps. I would check the logs around the time it fails to see whether it’s an infra issue or a workflow/connection issue. It could be timing out; I would also add some waits to make sure you’re not calling APIs too much. Do you see any errors when looking at past executions in the GUI?
I am also having the same issue. Did you find a solution? I also tried creating a cron job that calls a Supabase Edge Function, which makes a request to the n8n API to deactivate it.
But still, after a day or more, the webhooks stop working unless the workflow is manually deactivated and activated. And since the workflow stops receiving any requests, it doesn’t generate any logs whatsoever.
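For anyone wanting to reproduce that scheduled toggle, here is a minimal sketch. It assumes the job deactivates and then reactivates the workflow through the n8n public API endpoints `POST /api/v1/workflows/{id}/deactivate` and `POST /api/v1/workflows/{id}/activate`, authenticated with the `X-N8N-API-KEY` header. The function names, base URL, and pause length are hypothetical, and this is a workaround sketch rather than a fix for the underlying problem.

```python
import time
import urllib.request


def build_toggle_requests(base_url, workflow_id, api_key):
    """Build the deactivate and activate requests, in that order."""
    requests = []
    for action in ("deactivate", "activate"):
        url = f"{base_url.rstrip('/')}/api/v1/workflows/{workflow_id}/{action}"
        requests.append(
            urllib.request.Request(
                url,
                data=b"",  # empty body; both endpoints are plain POSTs
                method="POST",
                headers={"X-N8N-API-KEY": api_key, "Accept": "application/json"},
            )
        )
    return requests


def toggle_workflow(base_url, workflow_id, api_key, pause_seconds=2.0):
    """Send deactivate then activate; return the HTTP status codes.

    urllib raises HTTPError on 4xx/5xx, so a failure stops the sequence.
    """
    statuses = []
    for req in build_toggle_requests(base_url, workflow_id, api_key):
        with urllib.request.urlopen(req, timeout=30) as resp:
            statuses.append(resp.status)
        time.sleep(pause_seconds)  # give n8n a moment between the two calls
    return statuses
```

Calling `toggle_workflow("https://n8n.example.com", "123", "my-api-key")` from a scheduler would mirror the manual toggle; the URL, workflow ID, and key here are placeholders.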