How to best handle large amounts of data?

Hi there,

I want to build workflows where custom presentations are generated on a per-lead/contact basis, which accumulates large amounts of binary data quite quickly.
I currently self-host n8n with the following settings:

# NODE_FUNCTION_ALLOW_BUILTIN="crypto" just shows how built-in JavaScript packages would be added;
# NODE_FUNCTION_ALLOW_EXTERNAL is needed to allow external npm packages (left empty here).
sudo docker run -d --restart unless-stopped -it \
  --name n8n \
  -p 5678:5678 \
  -e N8N_HOST="myn8n.your-domain.com" \
  -e WEBHOOK_TUNNEL_URL="https://myn8n.your-domain.com/" \
  -e WEBHOOK_URL="https://myn8n.your-domain.com/" \
  -e N8N_ENABLE_RAW_EXECUTION="true" \
  -e NODE_FUNCTION_ALLOW_BUILTIN="crypto" \
  -e NODE_FUNCTION_ALLOW_EXTERNAL="" \
  -e N8N_PUSH_BACKEND=websocket \
  -v /home/mygoogleaccount/.n8n:/home/node/.n8n \
  n8nio/n8n

My question is: what would be the best way to set up my instance (which has limited RAM and storage) so that it does not store the data in the filesystem (N8N_DEFAULT_BINARY_DATA_MODE), but also only loads each binary document into RAM for the time it is needed within a flow, immediately replacing it with the next one, so the instance does not crash instantly?
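
For context, this is how I understand N8N_DEFAULT_BINARY_DATA_MODE would be added to my current command (just a sketch, I haven't tested it); as far as I know, "default" keeps binary data in execution memory, while "filesystem" writes it to disk under the .n8n data directory:

# Sketch only: same command as above, with the binary data mode set explicitly.
sudo docker run -d --restart unless-stopped -it \
  --name n8n \
  -p 5678:5678 \
  -e N8N_HOST="myn8n.your-domain.com" \
  -e WEBHOOK_TUNNEL_URL="https://myn8n.your-domain.com/" \
  -e WEBHOOK_URL="https://myn8n.your-domain.com/" \
  -e N8N_ENABLE_RAW_EXECUTION="true" \
  -e NODE_FUNCTION_ALLOW_BUILTIN="crypto" \
  -e NODE_FUNCTION_ALLOW_EXTERNAL="" \
  -e N8N_PUSH_BACKEND=websocket \
  -e N8N_DEFAULT_BINARY_DATA_MODE="filesystem" \
  -v /home/mygoogleaccount/.n8n:/home/node/.n8n \
  n8nio/n8n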

Information on your n8n setup

  • n8n version: 1.93
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): own
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Google Cloud Project
  • Operating system: Windows 10

If the issue is that your memory isn’t big enough to hold all of the binary data for all of the presentations at once, you might be able to limit that by generating each presentation in its own sub-workflow (so that when that sub-workflow execution ends, the memory is released).

Another strategy that might work is to create a workflow that reads, processes, and removes a single item from a list, and then re-triggers itself until the list is empty. To get things started, initiate it periodically with a Schedule Trigger. The current state of the list could be kept externally in a database, or it could be modified and passed along each time the Execute Sub-Workflow node is called (see the sketch below). Be sure to add, and toggle off, the Wait for Sub-Workflow Completion option on the Execute Sub-Workflow node, or you may end up with everything still in memory anyway. (I haven’t tested this to see if it actually limits memory usage; it’s just an idea to try.)
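
As a rough sketch of the "modify the list and pass it along" variant, a Code node near the start of the workflow could pop one item off the remaining list before the rest of the flow runs. Field names like remaining and current are just placeholders, and I haven't run this:

// Code node ("Run Once for All Items"), untested sketch:
// take the next lead off the list and forward both pieces.
const incoming = $input.first().json;              // data passed in by the trigger or parent call
const remaining = [...(incoming.remaining ?? [])]; // copy so the input item isn't mutated
const current = remaining.shift();                 // the single lead to process in this run

if (!current) {
  return []; // list is empty: no items go downstream, so the loop stops
}

// `current` feeds the presentation-generation nodes; `remaining` is what gets
// handed to the Execute Sub-Workflow node that re-triggers this same workflow.
return [{ json: { current, remaining } }];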
