With response streaming enabled, if the client disconnects (browser refresh) while the AI Agent is still streaming, the execution is aborted and shows { “isArtificialRecoveredEventItem”: true } with “Execution stopped at this node.” I’d like the execution to finish server-side even if the client disconnects, while keeping streaming on.
Minimal reproduction (vanilla n8n, no custom nodes)
1. Chat Trigger (or Webhook with Response Mode: Streaming).
2. AI Agent node with enableStreaming: true, connected to:
   - any Chat Model (e.g. OpenAI Chat Model),
   - any one tool that takes a few seconds (e.g. an HTTP Request tool hitting a slow endpoint, or anything else that keeps the agent busy long enough to refresh).
3. Open the built-in chat, send a message that makes the agent call the tool.
4. Refresh the page immediately, while it’s still streaming.
5. Open the execution → it’s stopped, AI Agent output = { “isArtificialRecoveredEventItem”: true }.
Environment
n8n: Cloud, version 2.27.4
AI Agent node v3.1, Webhook v2.1 (responseMode: streaming)
Expected vs actual
Expected: client losing the stream shouldn’t abort the run; the agent (and any tool side-effects) should complete server-side.
Actual: the run is aborted the moment the streaming connection drops → recovered/artificial item.
What I’ve tried
Disabling enableStreaming on the AI Agent avoids it — but then there’s no streaming (disabling streaming on either node falls back to request-response).
I know the immediate-response / two-webhook async pattern exists, but that drops live streaming.
Question
Is there a supported way to decouple execution lifetime from the streaming HTTP connection — i.e. keep streaming enabled but let the execution complete (not abort) when the client disconnects mid-stream? Any setting, or recommended pattern?
@sawsew467 no built-in knob for any of the three. Streaming binds the execution to the live connection, so a disconnect tears it down (that recovered item is n8n’s generic “killed before finishing” marker, not a real OOM), and there’s nothing documented to decouple it or re-attach. The fix is to not run the must-survive work in the streamed execution: kick the agent + memory write into a separate non-streaming execution that finishes server-side, and let the client re-read history on reload. Also pull side-effecting tools like send email out of the stream so they can’t fire on a turn that never persists.
That isArtificialRecoveredEventItem marker is a sign the execution crashed rather than being cleanly cancelled. When the browser disconnects, the TCP socket closes, and the next res.write() call inside n8n’s streaming chunk sender throws an EPIPE error. That propagates up through the LangChain callback chain into the workflow engine, which crashes the execution mid-flight. The recovery service then fills in that artificial placeholder for any node that started but never persisted its output.
So the SSE connection and server-side execution are architecturally coupled in n8n’s current streaming implementation. There’s no config flag to decouple them.
Your actual options:
1. Disable streaming on the AI Agent node: you already tried this, but it is worth naming clearly. The execution runs to completion server-side with zero disconnect risk, and the full response returns when the workflow finishes. The UX tradeoff is real but the execution is reliable.
2. Respond immediately, then poll: Use a Webhook node set to “Respond Immediately.” It returns an executionId straight away, the workflow runs fully in the background with no HTTP response attached (no SSE connection to drop), and the client polls GET /api/v1/executions/{id} for the result. Slow tools will always complete. You lose streaming UX but gain deterministic execution. On Cloud you’re within the 5-minute execution timeout as long as your HTTP Request isn’t absurdly slow.
3. Queue mode workers: self-hosted only, not available on Cloud. Even then, it’s unclear whether streaming write failures on the main process fully decouple from worker execution state.
For your setup specifically, option 2 is probably the better path if you need tools to always complete. The polling UX is less elegant than SSE but a lot more predictable than hoping the client stays connected.
If streaming with disconnect-tolerance is critical, this would need a change in how n8n handles SSE write errors, specifically not propagating them back into the execution engine.
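To make that last point concrete, here is a toy sketch of a chunk sender that detaches on a write failure instead of raising into the caller. This is not n8n’s code; `DetachableSink`, `run_agent`, and `flaky_write` are hypothetical names used only to show the idea of keeping a broken pipe from aborting the producer.

```python
from typing import Callable, Iterable, List


class DetachableSink:
    """Forwards chunks to a writer; stops forwarding after the first failed write."""

    def __init__(self, write: Callable[[str], None]) -> None:
        self._write = write
        self.attached = True

    def send(self, chunk: str) -> None:
        if not self.attached:
            return
        try:
            self._write(chunk)
        except (BrokenPipeError, ConnectionResetError):
            # The client is gone: drop the stream, but do not raise into the producer.
            self.attached = False


def run_agent(chunks: Iterable[str], sink: DetachableSink) -> str:
    """Stand-in for the agent loop: produces every chunk regardless of the sink."""
    collected: List[str] = []
    for chunk in chunks:
        collected.append(chunk)
        sink.send(chunk)
    return "".join(collected)


# Simulated client that disconnects after receiving two chunks.
received: List[str] = []


def flaky_write(chunk: str) -> None:
    if len(received) >= 2:
        raise BrokenPipeError("client went away")
    received.append(chunk)


sink = DetachableSink(flaky_write)
final = run_agent(["Hel", "lo ", "wor", "ld"], sink)
```

In this toy version the producer still ends up with the full output even though the client only saw the first two chunks.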
Both replies above are right that n8n’s built-in agent streaming binds the work to the live request, so a disconnect tears the execution down, and there is no flag to decouple them today. But you can still get streaming UX and guaranteed server-side completion at the same time. The trick is to stop streaming through n8n’s request and move the stream onto a channel the client subscribes to independently.
A shape that works:
Treat the turn as a durable background job. Take PurveshGandhi’s option 2 as the base: the webhook responds immediately with a turn id, the agent runs to completion server-side with no SSE attached, and the memory write happens no matter what. That alone makes the must-survive work disconnect-proof.
Get streaming back by writing chunks to an external channel, not the HTTP response. As the turn produces output, write it to something the browser can subscribe to on its own: a Supabase realtime row, Redis pub/sub, Ably, or even a partial_response column the client polls a couple times a second. The browser reads from that channel, never from n8n’s SSE. Now a disconnect only drops the read side, the execution keeps running, and on reload the client re-subscribes and catches up from the stored partial. That is the have-both answer: the work is bound to the job, the stream is bound to the store.
If token-level smoothness really matters, do the model streaming in a thin streamer outside n8n (a small edge function that streams model tokens to the client and posts the final transcript back to n8n for the durable memory write and any tools). n8n stays the system of record, the edge function is just the pipe.
One more, in the spirit of achamm’s point about side-effecting tools: keep every side effect (send email, write to CRM, anything that costs money) gated behind the committed turn and keyed by the turn id, so a retried or half-finished turn can’t fire it twice. Streaming should never be what decides whether a side effect happened.
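A minimal sketch of that gating, assuming an in-memory store for illustration (`SideEffectGate` and its method names are hypothetical; in practice the sets would be a database table with a unique constraint on turn id and effect name):

```python
from typing import Callable, Set, Tuple


class SideEffectGate:
    """Runs a side effect at most once per (turn_id, effect), and only after commit."""

    def __init__(self) -> None:
        self._committed: Set[str] = set()
        self._fired: Set[Tuple[str, str]] = set()

    def commit(self, turn_id: str) -> None:
        """Mark the turn as durably persisted."""
        self._committed.add(turn_id)

    def fire(self, turn_id: str, effect: str, action: Callable[[], None]) -> bool:
        key = (turn_id, effect)
        if turn_id not in self._committed or key in self._fired:
            return False
        # Record before running: a failed action is not retried (at-most-once).
        self._fired.add(key)
        action()
        return True


sent = []
gate = SideEffectGate()
before_commit = gate.fire("turn-1", "email", lambda: sent.append("welcome"))
gate.commit("turn-1")
first_try = gate.fire("turn-1", "email", lambda: sent.append("welcome"))
retry = gate.fire("turn-1", "email", lambda: sent.append("welcome"))
```

Recording the key before running the action is a deliberate choice here: it trades a possible missed send for never sending twice. Flip the order if a duplicate is cheaper than a miss.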
So you do not actually have to choose. Bind the execution to a durable job, bind the stream to an external channel, and the client disconnect stops mattering.
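Roughly, the "stream bound to the store" part can be sketched like this. It is an in-memory stand-in for whatever channel you pick (Supabase row, Redis, a partial_response column); `TurnStore` and its methods are hypothetical names, and a real store would live outside the process.

```python
import threading
from typing import Dict, List, Tuple


class TurnStore:
    """Append-only chunk log per turn; readers catch up from any offset."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._chunks: Dict[str, List[str]] = {}
        self._done: Dict[str, bool] = {}

    def append(self, turn_id: str, chunk: str) -> None:
        with self._lock:
            self._chunks.setdefault(turn_id, []).append(chunk)

    def finish(self, turn_id: str) -> None:
        with self._lock:
            self._done[turn_id] = True

    def read_from(self, turn_id: str, offset: int) -> Tuple[List[str], int, bool]:
        """Return (new chunks since offset, next offset, whether the turn is done)."""
        with self._lock:
            chunks = self._chunks.get(turn_id, [])
            return chunks[offset:], len(chunks), self._done.get(turn_id, False)


store = TurnStore()

# The job writes chunks; the client reads some, then disconnects.
store.append("turn-1", "Hel")
store.append("turn-1", "lo")
first, offset, done = store.read_from("turn-1", 0)

# The job keeps going while nobody is listening.
store.append("turn-1", " world")
store.finish("turn-1")

# A client that kept its offset resumes; a reloaded client re-reads from 0.
rest, _, done_after = store.read_from("turn-1", offset)
full_after_reload = "".join(store.read_from("turn-1", 0)[0])
```

The writer never knows or cares whether a reader is attached, which is the whole point of the split.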
Hi
I think I understand the issue — this is a pretty common edge case with n8n AI Agent + streaming when the client connection drops (browser refresh / reconnect).
What’s happening here is basically that the stream lifecycle is tied too closely to the execution context, so when the frontend disconnects, the backend execution either gets interrupted or rehydrated incorrectly.
I would treat this as two lifecycles that are currently coupled:
client streaming lifecycle: the browser/client connection wants partial tokens/events now.
server execution lifecycle: the agent run may need to finish even if the client disconnects.
If those are bound to the same HTTP connection, a disconnect can become an execution-cancel signal. For long-running agents, I would usually decouple them:
request starts a job and returns a job_id;
server-side workflow continues independently;
stream endpoint only subscribes to job events;
if the stream disconnects, the job remains running;
client can reconnect with job_id and fetch current status/events/result;
terminal states are success, failed, cancelled_by_user, timed_out.
The important distinction is “client disappeared” vs “user intentionally cancelled.” Those should be separate states.
If n8n’s current streaming path ties disconnect to abort, the safer pattern is to put the long-running agent behind a durable job layer and treat streaming as an observer, not the owner of execution.
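As a rough sketch of "streaming as an observer, not the owner", assuming a hypothetical in-process `Job` class (names and structure are illustrative only, not an n8n API): a disconnect removes an observer and nothing else, while only an explicit user cancel changes the job state.

```python
from enum import Enum
from typing import Callable, List


class JobState(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED_BY_USER = "cancelled_by_user"
    TIMED_OUT = "timed_out"


class Job:
    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.state = JobState.RUNNING
        self.events: List[str] = []  # durable log a reconnecting client can fetch
        self._observers: List[Callable[[str], None]] = []

    def subscribe(self, observer: Callable[[str], None]) -> None:
        self._observers.append(observer)

    def client_disconnected(self, observer: Callable[[str], None]) -> None:
        # Losing an observer never changes the job state.
        if observer in self._observers:
            self._observers.remove(observer)

    def cancel_by_user(self) -> None:
        # Only an explicit cancel stops a running job.
        if self.state is JobState.RUNNING:
            self.state = JobState.CANCELLED_BY_USER

    def emit(self, event: str) -> None:
        if self.state is not JobState.RUNNING:
            return
        self.events.append(event)
        for observer in list(self._observers):
            observer(event)

    def complete(self) -> None:
        if self.state is JobState.RUNNING:
            self.state = JobState.SUCCESS


job = Job("job-1")
seen: List[str] = []
observer = seen.append
job.subscribe(observer)
job.emit("tool_started")
job.client_disconnected(observer)  # browser refresh
job.emit("tool_finished")          # the job keeps running
job.complete()
```

After the refresh the stream has only seen the first event, but the job log holds both and the job still ends in success.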
One non-sensitive question: do you need the browser to receive every intermediate token, or is it enough to stream status/events and fetch the final result when the job completes?
This is a current n8n architectural constraint: when streaming is enabled, the execution lifecycle is tied to the push connection, so a disconnect signals an abort. There’s no built-in setting to decouple them right now.
The closest working pattern in n8n today: use a regular Webhook (no streaming) as the entry point, immediately return a job_id to the client using the “Respond to Webhook” node, then fire the actual agent work as a sub-workflow using “Execute Workflow” set to “Run in Background” mode. The client polls a second GET webhook with the job_id to fetch the result once it’s written to a database or static data store. You lose real-time token streaming, but the agent execution completes regardless of client state - which sounds like what you actually need here.