Models don't see images from tools

Expectation:

The AI gets an image via a tool and processes it.

Problem:

No matter what I use, whether a valid MCP tool that returns type: image with base64 data or an HTTP Request tool, the AI doesn't see the image. In the MCP case the model thinks it received a big text input (the base64 string); in the HTTP case the AI just tries to describe the image from its file name.

Note:

When I use the chat file input it works, and if I use an HTTP call → AI Agent it also works.
Automatically Passthrough Binary Images is enabled.
I'm probably missing something, or don't understand the concept of n8n tools.

Small example:

```json
{
  "nodes": [
    {
      "parameters": { "options": { "responseMode": "lastNode" } },
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "typeVersion": 1.4,
      "position": [-480, 0],
      "id": "38f83655-0b3b-4f3c-a5e2-045f5b90d969",
      "name": "When chat message received",
      "webhookId": "6f52bc6f-425b-4db5-85e0-ba663a58e162"
    },
    {
      "parameters": { "model": "qwen3-vl:8b-instruct-q8_0", "options": {} },
      "type": "@n8n/n8n-nodes-langchain.lmChatOllama",
      "typeVersion": 1,
      "position": [192, 208],
      "id": "b7a1161b-3cf4-4bb9-b3f1-fb9922fd9f85",
      "name": "Ollama Chat Model",
      "credentials": { "ollamaApi": { "id": "xeemINQw4q4AcFoa", "name": "Ollama account" } }
    },
    {
      "parameters": {
        "promptType": "define",
        "text": "={{ $json.chatInput }}",
        "options": { "systemMessage": "You are a helpful assistant.\nYour task is describe screenshot." }
      },
      "type": "@n8n/n8n-nodes-langchain.agent",
      "typeVersion": 3.1,
      "position": [208, 0],
      "id": "61d457bf-c26f-4edf-a3ff-550b867b2f44",
      "name": "AI Agent"
    },
    {
      "parameters": {
        "url": "https://t3.ftcdn.net/jpg/06/16/17/72/360_F_616177256_CcgxcLB0b3hUrlgNCK00yZk5w7knoylQ.jpg",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequestTool",
      "typeVersion": 4.4,
      "position": [432, 208],
      "id": "84295353-e1e9-4089-9a83-e79560a6d235",
      "name": "HTTP Request"
    }
  ],
  "connections": {
    "When chat message received": { "main": [[{ "node": "AI Agent", "type": "main", "index": 0 }]] },
    "Ollama Chat Model": { "ai_languageModel": [[{ "node": "AI Agent", "type": "ai_languageModel", "index": 0 }]] },
    "HTTP Request": { "ai_tool": [[{ "node": "AI Agent", "type": "ai_tool", "index": 0 }]] }
  },
  "pinData": {},
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "993516b7db18233e8eeed92b17b97af6d64d1b58a1e7d5eb3ba1cc252972ea9a"
  }
}
```

Hi @solsay, welcome to the community!

The situation you are facing is a current limitation of n8n's AI Agent tools: tools only return JSON/text to the agent, not binary data, so the model never "sees" an actual image when it comes from a tool.

A possible workaround is to connect an HTTP Request node before the AI Agent so the file is downloaded onto the agent's main input, and then enable Automatically Passthrough Binary Images in the AI Agent. With that setup the agent is able to see the image.
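A minimal sketch of that rewiring, based on your exported workflow (node names are illustrative; the key change is that "HTTP Request" now feeds the agent's main input instead of being attached as an ai_tool):

```json
{
  "connections": {
    "When chat message received": {
      "main": [[{ "node": "HTTP Request", "type": "main", "index": 0 }]]
    },
    "HTTP Request": {
      "main": [[{ "node": "AI Agent", "type": "main", "index": 0 }]]
    },
    "Ollama Chat Model": {
      "ai_languageModel": [[{ "node": "AI Agent", "type": "ai_languageModel", "index": 0 }]]
    }
  }
}
```

The HTTP Request node downloads the image as binary data, and the agent picks it up from the incoming item once the binary-passthrough option is on.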


Yeah, this is just how the tool system works right now: tools return text/JSON to the agent, and the model tries to interpret that as if it were the actual content, which obviously doesn't work for images. The passthrough binary option only applies when the image is already in the input data flowing into the agent node, not when a tool fetches it during execution.

What you need to do is flip your approach: have the HTTP Request node fetch the image before the agent runs and pass it in as binary data on the main input, rather than having the agent call a tool to get it. If you need the agent to decide which image to fetch based on the conversation, that's trickier; you'd basically need a two-step flow where a first agent outputs the URL it wants, and then a later step actually fetches and analyzes it. But for most use cases, just fetching the image upfront and letting the agent describe what it sees works fine.
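The two-step variant could be wired roughly like this (a sketch, not a tested workflow; the node names are placeholders, and the `url` expression assumes the first agent returns a bare URL in its `output` field):

```json
{
  "nodes": [
    {
      "parameters": { "url": "={{ $json.output }}", "options": {} },
      "type": "n8n-nodes-base.httpRequest",
      "name": "Fetch Image"
    }
  ],
  "connections": {
    "URL Picker Agent": { "main": [[{ "node": "Fetch Image", "type": "main", "index": 0 }]] },
    "Fetch Image": { "main": [[{ "node": "Describe Agent", "type": "main", "index": 0 }]] }
  }
}
```

"URL Picker Agent" decides which image to get and emits the URL as text, "Fetch Image" downloads it as binary, and "Describe Agent" (with binary passthrough enabled) does the actual vision work.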
