How to handle large data/files in n8n?

Describe the issue/error/question

How to handle incoming/processing/outgoing of large data/files (~ 40MB)?

For example, take a workflow like this (HTTP Request, then Function, then Spreadsheet File), where:

  1. HTTP Request receives a JSON array containing around 30k objects
  2. Function node modifies each of the 30k objects
  3. Spreadsheet File node converts the incoming JSON to a CSV
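
To make step 2 concrete, here is a minimal sketch of what such a Function node body could look like, wrapped in a plain function so it runs outside n8n. It assumes the usual item shape of `{ json: {...} }`; the field names (`firstName`, `lastName`, `fullName`) are hypothetical and not taken from the actual workflow.

```javascript
// Sketch only: mirrors a Function node that modifies every incoming item.
// Field names are hypothetical placeholders for illustration.
function transformItems(items) {
  for (const item of items) {
    // Mutate each item in place so this function does not build
    // a second 30k-element array of its own.
    item.json.fullName = `${item.json.firstName} ${item.json.lastName}`;
    delete item.json.firstName;
    delete item.json.lastName;
  }
  return items;
}

// Tiny stand-in for the ~30k objects returned by the HTTP Request node.
const sample = [
  { json: { firstName: 'Ada', lastName: 'Lovelace' } },
  { json: { firstName: 'Alan', lastName: 'Turing' } },
];
const result = transformItems(sample);
console.log(result.length, result[0].json.fullName);
```

Whether n8n itself copies the items before or after the node runs is outside the scope of this sketch; it only shows the per-item transformation.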

Note:

  1. The docker container has been allocated 3GB of RAM.

Issue faced:
n8n consumes a huge amount of memory (~2GB) and 100% CPU, and subsequently crashes, every time.

What can I configure in n8n or its environment so that it can handle such workflows without issues?

Information on your n8n setup

  • n8n version: 0.167
  • Database you’re using (default: SQLite): default
  • Running n8n with the execution process [own(default), main]: main
  • Running n8n via [Docker, npm, n8n.cloud, desktop app]: docker

Hi @shrey-42

You could increase the RAM and other resources of the environment, but that isn’t the most efficient route.

The best way, in my opinion, is to create a sub-workflow to handle the processing. Call this sub-workflow from your main workflow in smaller batches, using the Split In Batches node.

Hope this helps.

edit: The image below shows what I am talking about. Hope the idea is clear.

edit: added a Set node with "Keep Only Set" enabled to clear the data. Otherwise the sub-workflow returns its data and the main workflow still has to hold everything that was processed.
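
The batching idea can be sketched in plain JavaScript. This is not n8n code: `runSubWorkflow` is a hypothetical stand-in for the Execute Workflow call, and the batch size of 1000 is an arbitrary assumption. The point it illustrates is that the sub-workflow hands back only a tiny summary (the effect of the Set node with "Keep Only Set"), so the caller never accumulates the processed data.

```javascript
// Yield consecutive slices of `items`, each at most `batchSize` long.
function* toBatches(items, batchSize) {
  for (let i = 0; i < items.length; i += batchSize) {
    yield items.slice(i, i + batchSize);
  }
}

// Hypothetical stand-in for the sub-workflow: it does the heavy work,
// then returns only a small summary instead of the processed items.
function runSubWorkflow(batch) {
  const processed = batch.map((item) => ({ ...item, processed: true }));
  // ... the sub-workflow would write `processed` to its destination here ...
  return { count: processed.length };
}

// Main workflow: loop over batches and keep only the summaries.
function runMainWorkflow(items, batchSize) {
  let total = 0;
  let calls = 0;
  for (const batch of toBatches(items, batchSize)) {
    total += runSubWorkflow(batch).count;
    calls += 1;
  }
  return { total, calls };
}

const items = Array.from({ length: 30000 }, (_, id) => ({ id }));
const summary = runMainWorkflow(items, 1000);
console.log(summary); // { total: 30000, calls: 30 }
```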


Hi @BramKn, thanks a lot for your input.
I already use this technique wherever possible.

But I’m currently trying to find a solution where a single workflow can handle large data.
It’s not very efficient, and often not feasible, to break incoming/outgoing data or files into small chunks and then reassemble them elsewhere.

I imagine that just using the SplitInBatches node will relieve memory pressure, since the function then processes only one batch at a time. Imagine having 100 function calls going on simultaneously; of course that stretches your memory!

Welcome to the community @Martin_Neumann!

Because n8n always keeps all the data of a workflow in memory, the SplitInBatches node alone will not reduce memory use. The processing really has to be split up into separate workflows for this to work.
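
A toy model may help show why batching inside one workflow does not shrink the footprint. The sketch below is an assumption-laden simplification, not n8n's actual internals: it pretends the execution stores every node's output in a single `runData` object until the run ends, and then counts how many item entries are still referenced.

```javascript
// Toy model only (assumption): every node's output stays referenced
// in `runData` until the whole execution has finished.
function runInSingleWorkflow(items, batchSize, fn) {
  const runData = {
    'HTTP Request': [items],
    'SplitInBatches': [],
    'Function': [],
  };
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    runData['SplitInBatches'].push(batch); // each batch is kept
    runData['Function'].push(batch.map(fn)); // each result is kept too
  }
  return runData;
}

// Count the item entries still referenced across all nodes and runs.
function retainedItemCount(runData) {
  let count = 0;
  for (const runs of Object.values(runData)) {
    for (const run of runs) count += run.length;
  }
  return count;
}

const input = Array.from({ length: 30000 }, (_, id) => ({ id }));
const runData = runInSingleWorkflow(input, 1000, (item) => ({ ...item, done: true }));
console.log(retainedItemCount(runData)); // 90000
```

In this model the batches only change when the work happens, not how much ends up being held: 30k input entries, 30k batch entries and 30k result entries are all still referenced at the end.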

If you work with large binary data, you can also try setting the environment variable N8N_DEFAULT_BINARY_DATA_MODE=filesystem. n8n will then save binary data on disk rather than in memory and the DB, and so will require much less RAM.

I missed mentioning this earlier:

I’m already using
N8N_DEFAULT_BINARY_DATA_MODE=filesystem

Is there anything else that I can try?

Nothing really, except what was mentioned above.

Something else to consider, however, is not running it manually via the UI. If you do that, n8n always has to send all the data to the browser. That causes two problems:

  1. It may crash your browser, as it was not built to handle such large amounts of data.
  2. n8n always has to make a copy of the data to send to the browser, which temporarily increases the required memory considerably.

So I would suggest adding a Webhook node, activating the workflow, and then calling the production URL. It will then use less memory and may no longer crash.
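
Calling the production URL of an activated workflow from a script, instead of clicking Execute in the editor, could look roughly like this. The sketch assumes the workflow starts with a Webhook trigger and that n8n uses its default URL prefixes (`/webhook/` for production, `/webhook-test/` for test runs). The base URL and the path `large-export` are hypothetical, and the global `fetch` requires Node.js 18 or later.

```javascript
// Hypothetical values: adjust to your own instance and Webhook node path.
const BASE_URL = 'http://localhost:5678';
const WEBHOOK_PATH = 'large-export';

// Build a webhook URL, assuming n8n's default endpoint prefixes.
function webhookUrl(baseUrl, path, { test = false } = {}) {
  const prefix = test ? 'webhook-test' : 'webhook';
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const cleanPath = path.replace(/^\/+/, ''); // drop leading slashes
  return `${base}/${prefix}/${cleanPath}`;
}

// Trigger the workflow without an editor session attached,
// so n8n has no execution data to push to a browser.
async function triggerWorkflow(url) {
  const response = await fetch(url, { method: 'GET' });
  if (!response.ok) {
    throw new Error(`n8n responded with HTTP ${response.status}`);
  }
  return response.status;
}

console.log(webhookUrl(BASE_URL, WEBHOOK_PATH));
// To actually fire the request: triggerWorkflow(webhookUrl(BASE_URL, WEBHOOK_PATH));
```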

Tried this now. Still crashes.

Logs
14T16:44:37.457Z | verbose  | Execution for workflow My workflow 26 was assigned id 9786 {"executionId":"9786","file":"WorkflowRunner.js","function":"runMainProcess"}

2022-03-14T16:44:37.459Z | verbose  | Execution for workflow My workflow 26 was assigned id 9786 {"executionId":"9786","file":"WorkflowRunner.js","function":"runMainProcess"}

2022-03-14T16:44:37.480Z | debug    | Execution ID 9786 will run executing all nodes. {"executionId":"9786","file":"WorkflowRunner.js","function":"runMainProcess"}

2022-03-14T16:44:37.483Z | verbose  | Workflow execution started {"workflowId":"35","file":"WorkflowExecute.js","function":"processRunExecutionData"}

2022-03-14T16:44:37.490Z | debug    | Executing hook (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"workflowExecuteBefore"}

2022-03-14T16:44:37.491Z | debug    | Send data of type "executionStarted" to editor-UI {"dataType":"executionStarted","sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:37.494Z | debug    | Start processing node "Start" {"node":"Start","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:37.495Z | debug    | Executing hook on node "Start" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteBefore"}

2022-03-14T16:44:37.496Z | debug    | Send data of type "nodeExecuteBefore" to editor-UI {"dataType":"nodeExecuteBefore","sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:37.497Z | debug    | Running node "Start" started {"node":"Start","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:37.499Z | debug    | Running node "Start" finished successfully {"node":"Start","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:37.501Z | debug    | Executing hook on node "Start" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteAfter"}

2022-03-14T16:44:37.502Z | debug    | Send data of type "nodeExecuteAfter" to editor-UI {"dataType":"nodeExecuteAfter","sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:37.504Z | debug    | Start processing node "HTTP Request" {"node":"HTTP Request","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:37.507Z | debug    | Executing hook on node "HTTP Request" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteBefore"}

2022-03-14T16:44:37.508Z | debug    | Send data of type "nodeExecuteBefore" to editor-UI {"dataType":"nodeExecuteBefore","sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:37.509Z | debug    | Running node "HTTP Request" started {"node":"HTTP Request","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:37.514Z | debug    | Send data of type "sendConsoleMessage" to editor-UI {"dataType":"sendConsoleMessage","sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:37.516Z | debug    | Proxying request to axios {"file":"NodeExecuteFunctions.js","function":"proxyRequestToAxios"}

2022-03-14T16:44:38.326Z | debug    | Remove editor-UI session {"sessionId":"knbo08wjijc","file":"Push.js"}

2022-03-14T16:44:40.612Z | debug    | Add editor-UI session {"sessionId":"d6k1ac210v","file":"Push.js","function":"add"}

2022-03-14T16:44:44.475Z | debug    | Remove editor-UI session {"sessionId":"d6k1ac210v","file":"Push.js"}

2022-03-14T16:44:48.529Z | debug    | Running node "HTTP Request" finished successfully {"node":"HTTP Request","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:48.530Z | debug    | Executing hook on node "HTTP Request" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteAfter"}

2022-03-14T16:44:48.531Z | error    | The session "knbo08wjijc" is not registred. {"sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:48.532Z | debug    | Start processing node "Function" {"node":"Function","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:48.533Z | debug    | Executing hook on node "Function" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteBefore"}

2022-03-14T16:44:48.533Z | error    | The session "knbo08wjijc" is not registred. {"sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:48.534Z | debug    | Running node "Function" started {"node":"Function","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:49.847Z | debug    | Running node "Function" finished successfully {"node":"Function","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:49.848Z | debug    | Executing hook on node "Function" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteAfter"}

2022-03-14T16:44:49.849Z | error    | The session "knbo08wjijc" is not registred. {"sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:49.850Z | debug    | Start processing node "Spreadsheet File" {"node":"Spreadsheet File","workflowId":"35","file":"WorkflowExecute.js"}

2022-03-14T16:44:49.851Z | debug    | Executing hook on node "Spreadsheet File" (hookFunctionsPush) {"executionId":"9786","sessionId":"knbo08wjijc","workflowId":"35","file":"WorkflowExecuteAdditionalData.js","function":"nodeExecuteBefore"}

2022-03-14T16:44:49.851Z | error    | The session "knbo08wjijc" is not registred. {"sessionId":"knbo08wjijc","file":"Push.js","function":"send"}

2022-03-14T16:44:49.852Z | debug    | Running node "Spreadsheet File" started {"node":"Spreadsheet File","workflowId":"35","file":"WorkflowExecute.js"}

Ok, then either sub-workflows or more RAM.