Hey @Duarte_Palha, a few things that might help:
1. Ollama can’t read PDF binary data directly
The most common issue here is sending the raw PDF straight to Ollama. You need to extract the text first. If you're using n8n v1.30+, there's an Extract from File node that handles PDF text extraction natively. Add it between the HTTP Request node that downloads the PDF and the Ollama node.
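For context, this is roughly the shape of what Ollama ends up receiving once the text is extracted: a JSON payload with a plain-text prompt, never binary. The model name and prompt below are placeholders:

```bash
# What the workflow ultimately sends to Ollama: JSON with a plain-text
# prompt. Raw PDF bytes in this payload won't work.
# "llama3" is a placeholder -- use whatever model you've pulled.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize this resume: <extracted text goes here>",
  "stream": false
}'
```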
2. Workflow structure that works
```
Webhook (receives PDF URL via Insomnia)
  → HTTP Request (downloads PDF, set Response Format to "File")
  → Extract from File (extracts text from PDF binary)
  → Basic LLM Chain + Ollama Chat Model (summarizes)
  → Respond to Webhook (returns result)
```
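To drive this without Insomnia, here's a minimal test call sketch. The webhook path summarize-resume and the pdfUrl field are assumptions; match them to your Webhook node and whatever expression you use downstream:

```bash
# Test call to the workflow's webhook while it's listening in the editor.
# The path "summarize-resume" and the "pdfUrl" field are assumptions.
curl -X POST http://localhost:5678/webhook-test/summarize-resume \
  -H "Content-Type: application/json" \
  -d '{"pdfUrl": "https://example.com/resume.pdf"}'
```

(Use /webhook/ instead of /webhook-test/ once the workflow is activated.)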
3. Use the built-in Ollama nodes instead of raw HTTP
Instead of calling http://localhost:11434/api/generate manually, use the Ollama Chat Model node (under AI > Language Models). It handles streaming, timeouts, and response parsing automatically. Connect it as a sub-node to a Basic LLM Chain node.
For the prompt in the Basic LLM Chain:
```
Summarize this resume. List key skills, years of experience, and education:

{{ $json.text }}
```
4. Docker networking
If n8n runs in Docker and Ollama runs on the host, make sure you started the n8n container with --add-host=host.docker.internal:host-gateway, and set the Base URL in your Ollama credentials to http://host.docker.internal:11434 instead of localhost.
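A minimal sketch of that container start, based on the standard n8n Docker command (volume name and port are the defaults; adjust to your setup). Note that Ollama on the host also has to listen on more than 127.0.0.1, or connections from the container will be refused:

```bash
# Start n8n with host.docker.internal mapped to the Docker host.
docker run -it --rm \
  --add-host=host.docker.internal:host-gateway \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n

# On the host, make Ollama listen on all interfaces so the container
# can actually reach it:
#   OLLAMA_HOST=0.0.0.0 ollama serve
```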
5. Timeout settings
Ollama on CPU can be slow for the first request (model loading). If you get timeout errors, go to the Ollama Chat Model node settings and increase the timeout to 120 seconds.
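You can also pre-load the model before triggering the workflow, so the first run doesn't pay the loading cost. Sending a generate request with no prompt just loads the model into memory (model name is a placeholder):

```bash
# A request without a prompt loads the model but generates nothing.
# Replace "llama3" with the model you're using.
curl http://localhost:11434/api/generate -d '{"model": "llama3"}'
```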
What specific error are you seeing? The screenshot shows “The service refused the connection”, which usually means either Ollama isn’t running or the URL/port is wrong from inside Docker.
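If it is the connection error, a quick reachability check from inside the container helps narrow it down (the container name n8n is an assumption):

```bash
# Run from the host; "n8n" is assumed to be your container's name.
docker exec -it n8n wget -qO- http://host.docker.internal:11434/
# A healthy Ollama replies with "Ollama is running".
```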