The idea is:
Support the new thinking mode option from Ollama in the AI (LLM) node: let users enable or disable thinking, and retrieve only the final response from the model, without the intermediate reasoning steps that are included in the output when thinking is enabled.
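For context, this is roughly what the option looks like against Ollama's API today. A minimal sketch, assuming a local Ollama instance on the default port and a thinking-capable model (the name `qwen3` is just an example):

```typescript
// Minimal sketch of the Ollama option this request is about, assuming a
// local Ollama instance and a thinking-capable model ("qwen3" is only an
// example name). Runs on Node 18+ (global fetch).
async function askWithoutThinking(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "qwen3",
      messages: [{ role: "user", content: prompt }],
      think: false, // the toggle this request asks the n8n node to expose
      stream: false,
    }),
  });
  const data = await res.json();
  // With think: false, message.content holds only the final answer.
  // With think: true, the reasoning arrives separately in message.thinking.
  return data.message.content;
}
```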
My use case:
I’m using n8n to create AI-driven workflows. With Ollama’s `thinking: true`, the output includes both the reasoning and the final result, but for most workflows I only need the final, clean answer. Currently, I have to manually parse or trim the output to get just the response (see the sketch below). Native support for thinking mode would make this much more efficient and reliable.
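To illustrate the workaround: a sketch of the stripping step I run today (e.g. in a Code node), assuming the model interleaves its reasoning as `<think>…</think>` blocks in the text output. This is exactly the step native support would remove:

```typescript
// Sketch of the manual cleanup currently needed, assuming the model
// interleaves reasoning as <think>...</think> blocks in its text output.
function stripThinking(raw: string): string {
  return raw.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
}

// stripThinking("<think>chain of thought…</think>The answer is 42.")
// => "The answer is 42."
```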
I think it would be beneficial to add this because:
- It simplifies data processing: no need to strip reasoning from the response.
- It produces cleaner outputs for external services (email, SMS, databases, APIs).
- It reduces memory/context bloat in conversational agents or history tracking.
- It improves performance in production workflows by reducing parsing overhead.
Any resources to support this?
- Ollama blog post about the feature: [Thinking · Ollama Blog](https://ollama.com/blog/thinking)
- Official documentation: [ollama/docs/api.md at main · ollama/ollama · GitHub](https://github.com/ollama/ollama/blob/main/docs/api.md)
Are you willing to work on this?
Unfortunately, only for testing.