Writing large (csv) Files using Append

Describe the problem/error/question

I found the “write files to disk” node with the APPEND option in its settings :wink:
I am trying to handle larger CSV files by extracting data from SQL, and after updating it I need to create a number of large product feeds.
Going the straightforward route obviously will not work when operating with 50k items and writing the respective files.

So I was hoping that someone here can point me to a safe and reliable way to achieve this?
I thought the APPEND option was the best way, but it does not append anything in my case.

So maybe you have some ideas?

BR
Seb

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

It appears that the loop is not doing anything in your workflow. The way it is set up right now, you can safely remove the loop node and nothing will change.

Regarding the Write Files to Disk node - it expects a binary input. I don’t believe your SQL node returns a binary, so that would be why nothing is written to the file.

As for “the best way” to do it - I’d start with “a way” and only then move to “the way”.

You could start by saving the result of the query to a file right from the MySQL node.

See the docs about this. If your MySQL is running in a container, you want to mount a folder that is accessible both from MySQL and from n8n. This will allow you to populate a file without having to pipe your data through n8n.
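To make the shared-folder idea concrete, here is a hypothetical config sketch (container names, paths and credentials are invented; note that MySQL only allows `INTO OUTFILE` under the directory set by `secure_file_priv`, which defaults to `/var/lib/mysql-files` in the official image):

```shell
# One host folder mounted into both containers (illustrative names/paths)
docker run -d --name mysql -v /srv/shared:/var/lib/mysql-files mysql:8
docker run -d --name n8n   -v /srv/shared:/data n8nio/n8n

# Dump the query result straight to CSV from MySQL itself
docker exec mysql mysql -u root -p"$MYSQL_ROOT_PASSWORD" mydb -e "
  SELECT sku, price
  FROM products
  INTO OUTFILE '/var/lib/mysql-files/feed.csv'
  FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
  LINES TERMINATED BY '\n';"

# n8n now sees the same file at /data/feed.csv
```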

Okay - let me apologize for the crappy input, and thank you for the attempt to help.

Here is the updated flow:

As you can see, the file SHOULD grow bigger and bigger, because I loop and append-write the file.
But the file does not grow:

Maybe that makes more sense now.
BR
Seb

Ok, this is what happens in the above workflow:

  • SQL: Reads up to 100 records from MySQL DB (returns 1 item for each record)
  • Edit Fields (for each item): manually composes an object with 9 fields
  • file: creates a binary file from input data
  • Loop Over Items: loops over each input binary
  • Write Files to Disk: writes input binary to disk

There are a few things that you should consider.

  1. The file node will wait for all inputs and only create a single binary, so looping after that node isn’t going to achieve anything.

The workflow could be simplified to:

  2. If you are going to split the whole database into batches, you will need to include the headers only in the first batch.

At this point it could just be easier to go with what I proposed earlier - store the file to CSV right from MySQL itself. This is going to be more efficient, faster, and not much more difficult to set up.

Dear @jabbson,
I am not sure I understand your proposal, but the issue is that I need to loop / batch / iterate the SQL because the result is >40k items.
I cannot handle this many records, transform them and write them to a file - this simply does not work if it is all happening in-memory.
Hence I need to find a way to work with smaller batches by using LIMIT and OFFSET in the SQL to operate with e.g. 1000 items, transform them and APPEND them to a pre-written file.
This is what I am trying to achieve. But I am not sure the “Write Files” node does this?! Or at least I would like to see proof of that.
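For what it's worth, the LIMIT/OFFSET + append idea can be sketched in plain Python (using SQLite in place of MySQL, with an invented `products` table and a toy transform, just to illustrate the header-once/append-rest pattern):

```python
import csv
import sqlite3

def export_in_batches(conn, out_path, batch_size=1000):
    """Page through `products` with LIMIT/OFFSET and append each
    transformed batch to one CSV file (header only on the first batch)."""
    offset = 0
    first_batch = True
    while True:
        rows = conn.execute(
            "SELECT sku, price FROM products ORDER BY sku LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        # "w" truncates once, "a" appends afterwards -> header written exactly once
        with open(out_path, "w" if first_batch else "a", newline="") as f:
            writer = csv.writer(f)
            if first_batch:
                writer.writerow(["sku", "price_eur"])
            for sku, price in rows:
                writer.writerow([sku, f"{price:.2f}"])  # toy "transform" step
        first_batch = False
        offset += batch_size

# demo with an in-memory DB standing in for MySQL
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(f"SKU-{i:05d}", i * 1.5) for i in range(2500)],
)
conn.commit()
export_in_batches(conn, "feed.csv", batch_size=1000)
```

One caveat with this pattern: LIMIT/OFFSET pagination needs a stable ORDER BY, otherwise batches can overlap or skip rows between queries.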

Cheers
Seb

And yet I would still encourage you to go with storing the records to CSV using the native SQL tools instead of trying to split/batch the records in n8n. This will allow you to write records to CSV faster and way more efficiently.

Hi @jabbson, I am still not sure what you mean.

Are you saying to run the initial SQL query unbatched INTO a CSV file, like this: INTO OUTFILE ‘/tmp/employee_data_1.txt’?

But I need to transform the data inside the file and then write it into another file.
IF I only needed to DUMP the SQL to CSV, then your approach MAY work. But as I need to work with the items (hence n8n), I need a different approach.

But maybe i misunderstood and am overlooking something.
BR
Seb

No, unless you can do the transformations in SQL, I guess you are right - that would need to be batched reading. What about the resulting file? Does it have to be a CSV? I think it is possible (if you confirm that CSV is the desired output, I will try to create it), but append-writing to a Google Sheet would be much easier, for instance.

Hi Jabbson,

thank you so much for your engagement and willingness to help.

Basically, I have stored “real time” product data inside an SQL DB and keep it updated very frequently.

This flow extracts all products from the DB, transforms the data and generates FEEDS to suit the specific requirements of different markets (Google Products, affiliates, marketplaces) - in the end I need approx. 6 updated feeds.

As we are looking at a large number of products and the biggest feed will be >100 MB in file size, it is super-shaky to do in one straight flow.

Hence I need to find a more stable way by iterating. The INPUT iteration is what I am working on right now, but there is no clean way for OUTPUT iteration.

My current workaround, which is not that bad, is to create a TABLE in my SQL for each feed and iteratively write items into that table. Once finished, I can try your SQL > FILE command (or maybe I should try this now :wink:)

So that is the status. What do you say?

  • any thoughts on the input loop? (struggling a bit)
  • how would that SQL > FILE approach work? Have you done that?
  • any idea how to get the “write binary” node working in Append mode? (that would be the alternative approach)

Again - so many thanks to you

The general recommendation on loops is: avoid them. Instead of loops you can use sub-workflows, which in my opinion not only work better, but also make your workflows easier to read by segregating the execution into self-sufficient modules.

For the SQL-to-file approach, you need to mount the same volume into the SQL container and into the other containers where you need that file, like n8n. Then, after writing the results into a file on that mounted volume, you can read that file elsewhere.

As for how to append binary data to an existing CSV file - you will need to work around two things: headers, which you need appended only on the very first iteration, and new lines, which you will need to add between chunks. I will try to report and fix the latter this week.
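Those two workarounds can be sketched in a few lines of Python (the file name, header and chunk contents here are made up for illustration): write the header only when the file is new or empty, and insert a newline whenever the existing content does not already end with one:

```python
import os

def append_chunk(path, chunk: bytes, header: bytes = b"sku,price\n") -> None:
    """Append a binary CSV chunk; write the header only when the file
    is new/empty, and add a newline if the file doesn't end with one."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "ab") as f:
        if is_new:
            f.write(header)
        elif chunk:
            # peek at the last byte of the existing file
            with open(path, "rb") as r:
                r.seek(-1, os.SEEK_END)
                if r.read(1) != b"\n":
                    f.write(b"\n")
        f.write(chunk)

append_chunk("feed_append.csv", b"SKU-1,9.99")   # creates file with header
append_chunk("feed_append.csv", b"SKU-2,19.99")  # separating newline added
```

The same checks could live in an n8n Code node that prepares each batch's binary before the Write Files to Disk node appends it.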

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.