Webhook workflow with FAQ and AI (Ollama) paths. FAQ path returns response instantly and correctly. AI path completes successfully (all nodes green including Respond to Webhook) but curl gets no response back — just hangs until timeout.
Chain is: Answer — AI Service → Restore Session ID → Save Session Memory → IF — AI Escalate → Format — AI Answer → Build Response → Log — Outcome → Respond to Webhook → Log — Influx
Webhook node is set to “Using Respond to Webhook Node”. Respond to Webhook shows green tick and correct output in execution view. curl never receives the response.
FAQ path chain is: FAQ Lookup → IF FAQ Hit → Build Response → Log — Outcome → Respond to Webhook → Log — Influx — this works perfectly.
Only difference is the AI path takes 30-40 seconds due to Ollama. Could the webhook response context be timing out before Respond to Webhook fires on long-running paths?
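For context, the request is basically this (host, path, and payload here are simplified placeholders, not the real values):

```bash
# AI-path request: all nodes go green in n8n, but curl never gets the body back
curl -sS --max-time 120 \
  -X POST "https://<n8n-host>/webhook/support-bot" \
  -H "Content-Type: application/json" \
  -d '{"question": "something the FAQ lookup does not cover"}'
```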
Hi @andrew2016, I won't say straight away that this is a cloud limitation, but yes: n8n Cloud has a webhook timeout of around 100 seconds, and if nothing has responded by then the connection is closed with a 524. To get around it, put your Respond to Webhook node right after the Webhook node so there is no delay, and let the rest of the flow continue running asynchronously.
The green tick on Respond to Webhook means n8n did fire the response, so the problem is likely in front of n8n. If you're self-hosted behind nginx or traefik, check your proxy_read_timeout (the nginx default is 60s, which is tight when Ollama takes 30-40s). Adding proxy_read_timeout 120s; sorted it for me in a similar setup.
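For reference, a minimal sketch of the relevant nginx server block, assuming a standard reverse proxy in front of n8n on port 5678 (hostnames and ports are placeholders, adjust to your setup):

```nginx
# Reverse proxy for n8n; the timeout lines are the actual fix here
server {
    listen 80;
    server_name n8n.example.com;

    location / {
        proxy_pass http://127.0.0.1:5678;
        proxy_set_header Host $host;
        # Keep the connection open long enough for the 30-40s Ollama path
        proxy_read_timeout 120s;
        proxy_send_timeout 120s;
    }
}
```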
Adding to what @Benjamin_Behrens said — if you’re using Ollama, there’s another common culprit: model cold-start time.
By default, Ollama unloads models from GPU memory after 5 minutes of inactivity. The first request after unload triggers a full model reload (can take 10-30s depending on model size), and this pushes your total response time past the proxy timeout.
Fix: Set OLLAMA_KEEP_ALIVE=-1 in your Ollama environment to keep models loaded indefinitely:
```bash
# If running Ollama via systemd, open an override file:
sudo systemctl edit ollama

# Add:
[Service]
Environment="OLLAMA_KEEP_ALIVE=-1"
```

Then restart: `sudo systemctl restart ollama`
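To double-check the override is active, something along these lines works (the exact output of `ollama ps` varies by version):

```bash
# Confirm the env var made it into the unit
systemctl show ollama --property=Environment

# After sending one request, the model should stay listed here
# instead of being unloaded after the default 5 minutes
ollama ps
```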
Also worth checking:
- `num_ctx` parameter — if your prompt + context is large and `num_ctx` is set high (e.g. 8192+), generation takes significantly longer. Try reducing it to 2048 for FAQ-type responses (see the quick test below).
- Model choice — for webhook-triggered workflows where latency matters, smaller quantized models (e.g. `qwen2.5:7b-instruct-q4_K_M`) respond much faster than larger ones.
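If you want to see what `num_ctx` does to latency outside of n8n, a quick timing test against Ollama's generate endpoint (model and prompt are just examples) looks like this:

```bash
# Time a single generation with a reduced context window
time curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:7b-instruct-q4_K_M",
  "prompt": "Reply with one short sentence: what are your support hours?",
  "stream": false,
  "options": { "num_ctx": 2048 }
}'
```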
I’ve been building n8n + Ollama workflow templates and webhook timeout is consistently the #1 issue people hit with local AI setups.
@andrew2016 The fix for production is to respond immediately and keep the heavy AI processing in the background. This pattern closes the HTTP connection very fast and lets your workflow do its job.
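A rough sketch of that pattern from the caller's side, assuming you add a hypothetical `callback_url` field and deliver the AI answer via an HTTP Request node at the end of the branch (none of these names exist in the current workflow):

```bash
# 1. Client posts the question plus a callback URL and gets an instant ack,
#    because Respond to Webhook now sits right after the Webhook node
curl -s -X POST "https://<n8n-host>/webhook/support-bot" \
  -H "Content-Type: application/json" \
  -d '{"question": "...", "callback_url": "https://client.example.com/hooks/answer"}'
# -> {"status": "accepted"}

# 2. 30-40s later, the workflow's final HTTP Request node POSTs the finished
#    answer to callback_url, so the original connection never has to stay open.
```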