Ollama provided 'keep_alive" parameter in API call, while N8N Ollama node doesn’t support that. As a user of Continue with the ollama LLM serving backend, I frequently experience long delays in responses in my workflow as ollama unloads the model and weights after 5 minutes by default. Ollama recently added support for the keep_alive parameter in requests which can prevent unloading or make the model in-memory persistence configurable. Please add support for configuring the keep_alive parameter and adding it to inference requests sent to the ollama backend. The parameter has been added to ollama 0.1.23 through the merged pull request here: add keep_alive to generate/chat/embedding api endpoints by pdevine · Pull Request #2146 · ollama/ollama · GitHub (edited)
Dear team, any updates on this issue?
@WTeeth I’ve had issues with this in my own environment. The `keep_alive` option can now be configured to longer than 5 minutes, but the LLM nodes seem to give up waiting for a response after 5 minutes regardless of this setting. If the model is still working after 5 minutes, I just get an error in the workflow, even though the model is still “alive”.
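In case it helps with debugging: as far as I understand, `keep_alive` only controls how long Ollama keeps the model weights in memory, not how long the HTTP client waits for a response. Those are separate knobs, so a slow generation can still hit a client-side timeout even when the model stays loaded. A sketch of what I mean (the 5-minute cutoff inside the n8n node is my assumption, not something I've confirmed in its source):

```ts
// Hypothetical sketch: keep_alive governs model persistence on the
// Ollama side, while the client's own timeout (here via AbortController)
// decides how long we wait for the response. If the node's internal
// timeout is ~5 minutes, a longer keep_alive won't prevent the error.
async function generateWithTimeout(prompt: string, timeoutMs = 600_000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch("http://localhost:11434/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama2",      // placeholder model name
        prompt,
        stream: false,
        keep_alive: "30m",    // model persistence, independent of timeoutMs
      }),
      signal: controller.signal,
    });
    return await res.json();
  } finally {
    clearTimeout(timer);
  }
}
```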