Support new thinking mode in Ollama node — allow enabling and filtering thinking

The idea is:

Support the new thinking mode option from Ollama in the AI (LLM) node. This would allow users to retrieve only the final response from the model, without the intermediate reasoning steps that are usually included when thinking is enabled, and to enable or disable thinking entirely.
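For reference, Ollama exposes this through a `think` flag on its chat endpoint. A minimal sketch of the request body the node would need to send (the model name and prompt are placeholders, not part of the original request):

```python
import json

def build_chat_payload(model: str, prompt: str, think: bool) -> dict:
    """Build a request body for POST http://localhost:11434/api/chat.

    The "think" flag comes from Ollama's API; False asks the model
    for the final answer only, without the reasoning steps.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
        "stream": False,
    }

payload = build_chat_payload("qwen3", "What is 2 + 2?", think=False)
print(json.dumps(payload, indent=2))
```

Having the node set this flag per the toggle described above would remove the need for any post-processing.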

My use case:

I’m using n8n to create AI-driven workflows. With Ollama’s thinking: true, the output includes both reasoning and the final result. For most workflows, I only need the final, clean answer.

Currently, I have to manually parse or trim the output to get just the response. Having native support for thinking mode would make this much more efficient and reliable.
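The manual trimming I mean looks something like this (assuming the model wraps its reasoning in `<think>…</think>` tags, as qwen3-style models do; the sample string is made up):

```python
import re

# Remove any <think>...</think> blocks a reasoning model prepends to
# its answer, leaving only the final response.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_thinking(text: str) -> str:
    return THINK_RE.sub("", text).strip()

raw = "<think>The user wants a sum. 2 + 2 = 4.</think>The answer is 4."
print(strip_thinking(raw))  # -> The answer is 4.
```

This works, but it is fragile (tag names vary between models), which is why native support would be more reliable.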

I think it would be beneficial to add this because:

  • It simplifies data processing: no need to strip reasoning from the response.
  • It produces cleaner outputs for external services (email, SMS, databases, APIs).
  • It reduces memory/context bloat in conversational agents or history tracking.
  • It improves performance in production workflows by reducing parsing overhead.

Any resources to support this?

Are you willing to work on this?

Unfortunately, only for testing.

Closing this topic.
Implemented and works fine.
Thanks to the devs.

EDIT:
Nope, it only works with gpt-oss, my bad.
Still need this feature.


Is there anything I need to do to use this feature with gpt-oss? Is thinking suppressed by default?

I can confirm work is still needed since qwen3 pollutes the output with lots of thinking.
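For what it's worth, when thinking is enabled, Ollama can return the reasoning in a separate `thinking` field on the message rather than inline, so a node could drop it without any regex. A sketch against a mocked response (the response shape follows Ollama's chat API as I understand it; all values here are made up for illustration):

```python
# With "think": true, the reasoning lands in message["thinking"] and
# the clean answer in message["content"]. This dict is mocked.
mock_response = {
    "model": "qwen3",
    "message": {
        "role": "assistant",
        "thinking": "The user asked for a sum. 2 + 2 = 4.",
        "content": "The answer is 4.",
    },
    "done": True,
}

def final_answer(response: dict) -> str:
    """Return only the model's final answer, ignoring any reasoning."""
    return response["message"]["content"]

print(final_answer(mock_response))  # -> The answer is 4.
```

So the node would just need to expose which field(s) to pass downstream.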

+1

I don't see any way to activate non-thinking mode in the self-hosted version.


Hopefully this feature can be added soon.