Issues with Ollama when adding it as the chat model for the n8n AI Agent

Hi,

I am trying to create an AI Agent with Ollama Mistral 7B as the Chat Model. The problem I am facing is that responses do not come in consistently. Either there is too much delay, more than 5 minutes for a simple “Hi” message, or the node fails and the entire path turns into a red line. Sometimes a “credentials not found” error comes up. Can I get some steps to resolve this? Are there any hardware requirements for running Ollama?

Have you tested Ollama on its own locally, and how was the speed? (For example: ollama run mistral:7b)
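
If the CLI responds quickly, it is also worth testing the HTTP API, since that is what n8n actually calls. A minimal check, assuming Ollama is on its default port 11434:

curl http://localhost:11434/api/generate -d '{"model": "mistral:7b", "prompt": "Hi", "stream": false}'

If this takes minutes while the CLI is fast, the problem is between n8n and Ollama rather than the model itself.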

Hi,

I would assume it’s running on your CPU instead of your GPU. As Coldrush suggested, can you open CMD and run:

ollama run mistral:7b

then write a message and check the response time.

This command will also show you how the model is split between CPU and GPU:

ollama ps

Here is an example from my machine: 22% CPU, 78% GPU

NAME         ID            SIZE   PROCESSOR        UNTIL
gemma3:27b   a418f5838eaf  22 GB  22%/78% CPU/GPU  Stopping…
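
Also watch the UNTIL column: by default Ollama unloads a model after about 5 minutes of inactivity, so the first request afterwards has to reload it from disk, which can look like a multi-minute hang on a simple “Hi”. Assuming you can set environment variables for the Ollama service, you can keep the model loaded longer, for example:

set OLLAMA_KEEP_ALIVE=30m

(or export OLLAMA_KEEP_ALIVE=30m on Linux/macOS), then restart Ollama so it picks the variable up.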

Hi, thanks for the reply. It’s very quick: the response starts within about 1 second and then streams word by word. I only see the problem when connecting to the n8n AI Agent node. Any further recommendations to improve this?
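
One common cause when the CLI is fast but the n8n node hangs or shows credential errors: if n8n runs in Docker, localhost inside the container is not your host machine, so the Ollama credential’s Base URL has to point at the host instead. A typical value, assuming Docker Desktop’s host alias:

http://host.docker.internal:11434

You may also need to start Ollama with OLLAMA_HOST=0.0.0.0 so it accepts connections from outside localhost.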

Hi, I see the text load word by word. It takes about 1.5 minutes for the complete message, but it does stream one word at a time. Here is my ollama ps output:

NAME         ID            SIZE    PROCESSOR        UNTIL
mistral:7b   f974a74358d6  8.3 GB  20%/80% CPU/GPU  4 minutes from now

It’s still using 80% GPU.
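
A 20%/80% CPU/GPU split means part of the model is offloaded to system RAM, which slows generation noticeably. Assuming an NVIDIA GPU, you can check how much VRAM is free with:

nvidia-smi

If VRAM is nearly full, closing other GPU applications or using a smaller quantization of the model may let Ollama fit it entirely on the GPU.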