Does anyone know how I can save a list of files to external storage in n8n in bulk? In my workflow, I have 1500 files to save, and currently, I’m saving them one by one, which is taking over 5 hours.
According to the Bunny API docs, you can’t upload multiple files in a single request.
You can, however, upload them in parallel, but you will eventually hit 429 (throttling) errors because of their API rate limits, so you have to implement retry behavior for that error.
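For the retry part, the HTTP Request node has a “Retry On Fail” option in its settings that covers simple cases. If you want exponential backoff on 429 specifically, the idea in plain code is roughly the sketch below (the storage hostname, zone name and AccessKey are placeholder assumptions, not values from this thread; check Bunny’s docs for your regional endpoint):

```js
// Sketch only: upload one file to Bunny Storage, retrying on HTTP 429 with backoff.
// The host, storage zone and AccessKey below are placeholders.
const STORAGE_HOST = 'https://storage.bunnycdn.com';
const STORAGE_ZONE = 'my-zone';                        // assumption: your storage zone name
const ACCESS_KEY = process.env.BUNNY_ACCESS_KEY || ''; // assumption: key kept in an env var

async function uploadWithRetry(path, body, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(`${STORAGE_HOST}/${STORAGE_ZONE}/${path}`, {
      method: 'PUT',
      headers: { AccessKey: ACCESS_KEY, 'Content-Type': 'application/octet-stream' },
      body,
    });
    if (res.ok) return;                                // uploaded successfully
    if (res.status !== 429) {
      throw new Error(`Upload failed: ${res.status} ${res.statusText}`);
    }
    // Throttled: wait with exponential backoff before trying again
    await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
  }
  throw new Error(`Still throttled after ${maxRetries} retries: ${path}`);
}
```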
The general logic would be to use a sub-workflow to upload the files and, in the parent workflow, disable the Execute Workflow node’s option “Wait for sub-workflow completion”.
@barn4k Thanks a lot for the feedback. Do you think there is a way for n8n to FTP everything to Bunny at once? How does n8n handle FTP, and won’t it run into memory issues with a large number of files?
There is no way to upload everything at once, as Bunny doesn’t support that; it’s not an n8n limitation.
About the memory: use a sub-workflow. Once each execution of the sub-WF completes, its memory is released.
However, you shouldn’t run all 1500 files in parallel, or memory will be exhausted very quickly. Instead, pass the items in batches with the Loop node and add a Wait node to pause for some time between loop iterations.
Your main WF should contain only the Loop node and the nodes needed to get the URLs (not the URL content itself), so it can pass each URL to the sub WF.
The sub WF should download the URL content, pass it to the Bunny API, then exit.
So the main WF will look like this:
and the sub WF will look like this:
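In n8n terms the sub WF is just an HTTP Request node that downloads the file plus another HTTP Request node that PUTs it to Bunny Storage. In plain code, the per-item logic amounts to something like this (url and fileName are hypothetical fields passed from the main WF; uploadWithRetry is the sketch from earlier in the thread):

```js
// Sketch of what the sub-workflow does per item: fetch the source URL,
// then push the bytes to Bunny (reusing uploadWithRetry from the earlier sketch).
// "url" and "fileName" are hypothetical fields supplied by the main WF.
async function mirrorToBunny(url, fileName) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Download failed: ${res.status} ${url}`);
  const body = Buffer.from(await res.arrayBuffer()); // only this one file is held in memory
  await uploadWithRetry(fileName, body);
}
```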
That’s not optimal, however, as the next iteration will be executed regardless of any throttling.
The best approach is to use a message broker (like RabbitMQ) to store the URLs. Then you can configure the RabbitMQ trigger to process no more than X messages at once, so there will never be more than X executions running at a time.
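In n8n you would publish with the RabbitMQ node and consume with the RabbitMQ Trigger, which lets you limit how many messages are processed in parallel. Just to illustrate the producer side, a rough standalone sketch with amqplib could look like this (the broker URL and queue name are assumptions):

```js
// Sketch only: push the URLs into a RabbitMQ queue so the trigger-based
// workflow can drain it at a controlled pace.
const amqp = require('amqplib');

async function enqueueUrls(urls) {
  const conn = await amqp.connect('amqp://localhost');           // assumption: broker URL
  const channel = await conn.createConfirmChannel();
  await channel.assertQueue('bunny-uploads', { durable: true }); // assumption: queue name
  for (const url of urls) {
    channel.sendToQueue('bunny-uploads', Buffer.from(JSON.stringify({ url })), {
      persistent: true,                                          // survive a broker restart
    });
  }
  await channel.waitForConfirms(); // make sure nothing is lost before closing
  await conn.close();
}
```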
@barn4k
Thank you for the explanations. In your suggested workflow, wouldn’t it end up being the same? It would still send files to Bunny one by one, and considering I have 1500 files, it would take around 5 hours to save everything.
My idea was to take these 1500 files, split them into batches of about 50 files, and create “branches” that would run “replicated sub-workflows” dedicated to saving each batch file by file. This way, I would have 30 sub-flows, each saving around 50 files (even if they save one at a time).
The BIG DIFFERENCE here is that I would have 30 sub-workflows running in parallel! Do you see any downside to this approach, aside from managing Bunny’s API rate limits? In terms of n8n, would this approach create significant performance issues?
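Roughly, I was picturing a Code node like this to build the batches before fanning out to the Execute Workflow node (assuming each incoming item carries a url field; the field names are just examples):

```js
// n8n Code node ("Run Once for All Items") sketch: group the incoming items
// into chunks of 50 so each outgoing item represents one batch for a sub-workflow.
const BATCH_SIZE = 50;
const items = $input.all();
const batches = [];

for (let i = 0; i < items.length; i += BATCH_SIZE) {
  batches.push({
    json: {
      batchNumber: batches.length + 1,
      urls: items.slice(i, i + BATCH_SIZE).map((item) => item.json.url),
    },
  });
}

return batches; // 1500 items -> 30 batch items, each handed to one sub-workflow run
```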