Feature Request: Add MCP return control to avoid returning binary/base64 in workflow execution results

Describe the problem/error/question

I am using the new Instance-level MCP integration in n8n together with OpenAI’s Developer Mode (Tools / MCP host).

  • When I call text-only workflows through MCP, everything works perfectly.

  • The problem appears when I call a workflow that includes an HTTP Request node calling Google Gemini’s Nano Banana image model, which returns base64-encoded image data (an abridged example of that response is shown below).
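
For context, the Gemini response looks roughly like this (abridged and illustrative; exact field names depend on the endpoint and model, but the image comes back inline as a very long base64 string):

{
  "candidates": [
    {
      "content": {
        "parts": [
          { "inlineData": { "mimeType": "image/png", "data": "<very long base64 string>" } }
        ]
      }
    }
  ]
}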

Inside my n8n workflow, I:

  1. Receive the base64 image result from Gemini,

  2. Process it (convert/upload it to an image hosting service),

  3. And ensure that the final node’s output is only a small JSON object with an image URL, for example (a minimal sketch of this last step follows the example):

{
  "image_url": "https://…/generated.png",
  "prompt": "some christmas card prompt"
}
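
Roughly, that last step is a Code node along these lines (simplified sketch; the `link` field name from the upload-service response is just for illustration):

// Final Code node, mode "Run Once for All Items"
const item = $input.first();

return [
  {
    json: {
      image_url: item.json.link, // illustrative field name from the upload response
      prompt: item.json.prompt,
    },
    // no `binary` key, so the last node itself carries no base64 payload
  },
];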

From the workflow designer’s perspective, there should be no large binary data left in the final output.

However, when this workflow is executed via Instance-level MCP from OpenAI Developer Mode, I observe:

  • The OpenAI MCP tool call response becomes extremely slow,

  • The conversation context in OpenAI is effectively “blown up”,

  • The LLM quickly hits context limits and starts hallucinating in its answers.

By contrast, the same MCP setup with text-only workflows does not show this behavior.

Because I don’t have direct visibility into the exact JSON that n8n returns to the MCP host, I can’t definitively prove it, but:

Based on the behavior, my strong hypothesis is that the MCP execution response currently includes the full workflow run data, including intermediate nodes with base64/binary content (e.g. the raw Gemini image result), even though the last node returns only a small JSON object.

For workflows that involve image generation or any binary payload, this behavior makes the MCP integration practically unusable with LLMs, because the host (OpenAI) tries to load all of that internal data into the model’s context window.
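
To make the hypothesis concrete: an MCP tool result is essentially a content array of text parts, so the difference would look roughly like this (purely illustrative, not the actual shape of n8n’s response):

// What I would expect the tool result to carry: only the last node's small JSON
{
  "content": [
    { "type": "text", "text": "{\"image_url\":\"https://…/generated.png\",\"prompt\":\"…\"}" }
  ]
}

// What it seems to carry instead: serialized run data from every node,
// including the raw base64 image from the Gemini HTTP Request node
{
  "content": [
    { "type": "text", "text": "{ \"runData\": { \"HTTP Request\": [ { /* megabytes of base64 */ } ], … } }" }
  ]
}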

What is the error message (if any)?

There is no explicit error from n8n.

On the OpenAI side, the symptoms are:

  • Tool responses that appear too large,

  • Context being exhausted very quickly after a single tool call,

  • In some cases, behavior consistent with truncated responses or the model silently dropping parts of the tool output.

So this is not a runtime crash in n8n, but rather a payload-size / context-limit issue on the LLM host side, very likely caused by the size of the MCP response.

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

  • n8n version: 1.121.3
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): main
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker (Hostinger)
  • Operating system: Ubuntu 24

Just a short clarification for the request:

When using Instance-level MCP, workflows that include image-generation or other binary steps (like Gemini Nano Banana) cause the MCP response to include the entire binary payload from intermediate nodes — even if the final workflow output is only a small JSON value (such as an image URL).

This makes the response extremely large and overflows the MCP host’s context window (e.g., OpenAI’s).

So the suggestion is:

Binary data generated inside the workflow should not be returned through MCP unless explicitly requested.

A clean mode like “return only final node output” or “strip binary from runData before returning” would solve this.
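
As a rough sketch of what the second option could mean, assuming run data shaped as node name → runs → data.main → items with optional json/binary properties (illustrative only, not n8n’s actual implementation):

// Remove binary payloads from run data before it is serialized
// into the MCP tool response; only small JSON values remain.
function stripBinaryFromRunData(runData) {
  for (const nodeName of Object.keys(runData)) {
    for (const run of runData[nodeName]) {
      for (const output of run.data?.main ?? []) {
        for (const item of output ?? []) {
          if (item.binary) {
            delete item.binary; // drop the base64 payload itself
            item.json = { ...item.json, _binaryStripped: true }; // optional marker
          }
        }
      }
    }
  }
  return runData;
}

The “return only final node output” option would be even simpler: serialize just the items of the last executed node and ignore the rest of the run data.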

If my understanding is wrong, don’t hesitate to correct me.

Thanks 🙏