Separate internal agent reasoning/tool calls from streamed response output
Subcategory: node (AI Agent)

The idea is:

Add event type separation to AI Agent streaming output. When streaming is enabled, the output should distinguish between:

  • “response” - Actual AI text meant for users
  • “tool_call” - Internal tool invocation logs
  • “reasoning” - Agent thinking/planning steps

This could be exposed via metadata.eventType in the streamed JSON objects, or as a node option: “Stream only final response.”
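A minimal sketch of what such typed chunks could look like on the consumer side. The field names (`metadata.eventType`, `content`) are assumptions mirroring the proposal above, not an existing n8n API:

```typescript
// Hypothetical typed streaming chunk; names follow the proposal above,
// not a shipped n8n schema.
type AgentEventType = "response" | "tool_call" | "reasoning";

interface AgentStreamChunk {
  content: string;
  metadata: { eventType: AgentEventType };
}

// What a "Stream only final response" option would effectively do:
// keep only user-facing text and drop internal events.
function finalResponseOnly(chunks: AgentStreamChunk[]): string {
  return chunks
    .filter((c) => c.metadata.eventType === "response")
    .map((c) => c.content)
    .join("");
}
```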

My use case:

I’m building a SaaS with an AI chatbot that uses multiple tools (vector database search, internet search, API calls, etc.). When tools are called, internal LangChain logs stream directly to users:

Calling fallback_search with input: {"input":"user query here"}

This raw text appears in the chat mixed with actual responses, completely breaking the user experience. I’m forced to build fragile regex filters in edge functions to strip these logs - which break whenever the format changes slightly.
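To illustrate how fragile that workaround is, here is a sketch of the kind of filter involved. The regex is an assumption based on the log line quoted above, not code from my actual edge function:

```typescript
// Fragile filter: matches the observed log format exactly.
// Any variation in wording or punctuation defeats it.
const TOOL_LOG = /^Calling \w+ with input: \{.*\}$/;

function stripToolLogs(streamedText: string): string {
  return streamedText
    .split("\n")
    .filter((line) => !TOOL_LOG.test(line))
    .join("\n");
}
```

If an upstream change renames the prefix (say, "Invoking" instead of "Calling"), the log leaks straight through to users again, which is exactly why string matching is the wrong layer for this.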

I think it would be beneficial to add this because:

  • Production-ready streaming - Any chat application needs clean output without internal debug logs
  • Event-driven handling - Developers can show tool-specific loading states, log calls separately, or hide them entirely
  • API parity - OpenAI, Anthropic, and Gemini all separate tool calls from text in their streaming APIs
  • Eliminates workarounds - No more regex filtering in middleware that breaks unpredictably
  • Better debugging - Developers can still access tool calls and reasoning when needed, without polluting user-facing output

Any resources to support this?

  • OpenAI Streaming API: Tool calls sent as delta.tool_calls separate from delta.content
  • Anthropic API: Uses content_block with distinct type: "text" vs type: "tool_use"
  • Google Gemini: Separates functionCall objects from text parts
  • LangChain itself has callbacks that distinguish these events - n8n just needs to expose them
  • Vercel AI SDK: Provides separate streams for text and tool calls
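For comparison, here is roughly how a consumer handles the OpenAI-style separation, where text arrives in `delta.content` and tool invocations in `delta.tool_calls`. The `Delta` interface below is a simplified sketch of that shape, not the full SDK type:

```typescript
// Simplified shape of a Chat Completions streaming delta:
// text and tool calls arrive in separate fields, so no parsing
// of free-form log strings is ever needed.
interface Delta {
  content?: string;
  tool_calls?: { function?: { name?: string; arguments?: string } }[];
}

function routeDelta(
  delta: Delta,
  out: { text: string[]; tools: string[] }
): void {
  if (delta.content) out.text.push(delta.content); // user-facing text
  for (const call of delta.tool_calls ?? []) {
    if (call.function?.name) out.tools.push(call.function.name); // internal
  }
}
```

This is the same separation the feature request asks n8n to expose.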

Are you willing to work on this?

Yes, I’m actively building with n8n AI Agents in production and happy to test any implementation. This is a blocker for anyone shipping real chat applications with tool-using agents. I can provide detailed feedback.

I’m building conversational agents with HubSpot and other tools, and the current streaming behavior creates a frustrating UX issue: internal reasoning and tool calls leak into the user-facing chat.

Right now, my workaround is parsing the stream manually and filtering out anything that looks like a tool invocation or thinking step. It’s fragile and breaks whenever the output format changes slightly.

What would help is a structured event stream with typed chunks. For example, each chunk could have a type field like “user_response” for the actual answer to display, “tool_call” and “tool_result” for internal tool invocations, and “reasoning” for chain-of-thought steps.

This way, the frontend could display only user_response chunks to the end user, log tool_call and tool_result for debugging purposes, and optionally show reasoning in a “thinking” UI similar to what Claude does.
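A sketch of that frontend dispatch, assuming the typed chunks suggested above (the type names `user_response`, `tool_call`, `tool_result`, and `reasoning` follow this post's proposal, not a shipped API):

```typescript
// Hypothetical chunk types, per the proposal in this post.
type ChunkType = "user_response" | "tool_call" | "tool_result" | "reasoning";

function dispatchChunk(
  chunk: { type: ChunkType; content: string },
  ui: { show: (s: string) => void; think: (s: string) => void },
  log: (s: string) => void
): void {
  switch (chunk.type) {
    case "user_response":
      ui.show(chunk.content); // only this reaches the end user
      break;
    case "tool_call":
    case "tool_result":
      log(chunk.content); // internal: keep for debugging
      break;
    case "reasoning":
      ui.think(chunk.content); // optional "thinking" panel
      break;
  }
}
```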

My use case: I’m using the AI Agent as a backend for a client-facing chatbot. Users should never see raw tool calls or internal chain-of-thought, only the final polished response.

Would love to see this natively supported rather than hacking around it.
Hey, thanks for breaking this down — really helped me see the issue more clearly.

I’m thinking the move is to handle the AI logic in code (direct API calls, proper chunk filtering) and just use n8n for orchestration. Best of both worlds that way.