VRAM concerns when using AI Agents and loading models for different workflows and workflow paths

Recently I encountered issues where my AI Agent Nodes stopped because I was out of VRAM.
My workflow processes email attachments through two paths: one path for Excel files and the other for images. These paths use two different models.
If an email had only one type of attachment, everything was fine.
The problem arose when an email contained both types: processing the first type (Excel, in this case) worked, but when an image attachment from the same email came through, the workflow ran out of VRAM trying to allocate memory for the second model.
I applied some brute force by executing code to kill the Ollama process after each attachment, which cleared all VRAM.
This “works” but obviously this isn’t any sort of robust solution.
I also read that if I made each of these paths a separate sub-workflow, it would naturally take care of it. It did not :frowning: (hence my brute-force approach)
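For reference, here is roughly what my brute-force cleanup looks like, plus a gentler variant I am considering. The `curl` call assumes a reasonably recent Ollama that supports the `keep_alive` request parameter, and `"llama3"` is a placeholder for whichever model the path just used:

```shell
# Brute force (what I do now): kill the Ollama process so all VRAM is freed.
# Works, but the server has to restart before the next model load.
pkill ollama

# Gentler sketch (assumption: your Ollama version supports keep_alive):
# send a request with keep_alive set to 0 to unload one model
# without taking down the whole server.
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "keep_alive": 0}'
```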
Now I am wondering what happens with other workflows, and with workflows that are triggered by external events. Does n8n handle this under the hood? I assume these resources get deallocated after a workflow finishes, but the question still lingers in my mind: what happens when different workflows compete for the same resources?
I’m hoping for some sort of definitive answer.
Sorry for the rambling question! :slight_smile:

Information on your n8n setup

  • n8n version: self hosted

Hey @msterra !

n8n itself doesn’t manage GPU / VRAM for you… it only manages its own CPU/RAM and workflow memory usage, not VRAM. I think you misunderstood…

n8n releases its own process memory after an execution or sub‑workflow finishes, especially when you split work into sub‑workflows that “do the heavy work” and only return a small result. This is a recommended pattern for reducing RAM usage and avoiding out‑of‑memory crashes.

In n8n you can increase the Node.js old‑space size with NODE_OPTIONS=--max-old-space-size=2048
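As a config sketch (the 2048 MB value is just the figure mentioned above; tune it to your host's RAM):

```shell
# Plain (non-Docker) self-hosted install: raise the Node.js heap
# limit before starting n8n.
export NODE_OPTIONS=--max-old-space-size=2048
n8n start

# Docker Compose equivalent (excerpt, under the n8n service):
#   environment:
#     - NODE_OPTIONS=--max-old-space-size=2048
```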

Cheers!

which part did i misunderstand?

This part.

n8n has nothing to do with your AI agent’s model runtime that is running locally.

If you want to know about parallelism and concurrency in n8n execution, that is a different story.
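If overlapping executions are what worry you, one knob worth knowing about is n8n's concurrency limit. A sketch, assuming your n8n version supports this environment variable:

```shell
# Cap how many production executions run at once, so two workflows
# don't both try to load a model into VRAM at the same time.
# (Assumption: this env var exists in your n8n version; check the docs.)
export N8N_CONCURRENCY_PRODUCTION_LIMIT=1
```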

:slight_smile: