How to split a spreadsheet into multiple file and send them by email

mitchum · January 20, 2022, 5:43pm

Describe the issue/error/question

Hi, I’m pretty new to n8n and thanks for this awesome tools.
I got a new “workflow” i want to achieve and a little bit stuck.

my goal is to “split” this spreadsheet into disctinct file per user and send it to email to dedicated email

like in that case there will be 2 files generated, one for user A, one for user B and sent to their respective emails.

the part to load the binary file and generate the spreadsheet is not an issue but i’m a little bit lost on how to “split” content per key (here user)

I was thinking about the “item list” node but i can’t make anything from it (like my field user is not array)

If anyone get an idea

Thanks

MutedJam · January 21, 2022, 9:57am

Hi @mitchum, welcome to the community

This is indeed a tricky one. Are there only two possible emails? Or is there an infinite number of emails?

If there’s only two emails, I’d simply recommend using the IF node to check whether the email field equals [email protected] (in that case all rows for this email can be further processed after the true output of the IF node, and rows for [email protected] fo to the false output).

If there can be more emails, the scenario would be a bit harder to implement in n8n. The only approach I can imagine would be to transform your input data into a single item for each email address using customized code through the Function node. Then split up the workflow in batches of 1 using the Split In Batches node.

Finally, for each individual email spread out all matching items again, create the spreadsheet, send out the email and loop until all emails are processed.

For example like this:

Example Workflow

mitchum · January 21, 2022, 12:06pm

Thanks that was it ! it’s working as expected on small dataset but get a memory issue on large dataset (40K row on the main xsl file)

So will see if i can “split” the xls file before but the JS part was good.
Thanks for the help

MutedJam · January 21, 2022, 12:26pm

Ah, this can be quite memory hungry indeed. The Function node is tough on memory, but also the binary files generated inside the loop can become large from the looks of it.

Are you self-hosting n8n? If so, you could try setting the environment variable N8N_DEFAULT_BINARY_DATA_MODE=filesystem on the latest version. This will mean binary data is held on the filesystem rather than in memory (this is still being tested at this stage, so not documented yet) and might already improve the situation for you.

mitchum · January 21, 2022, 12:38pm

I will try that for sure as i got the self hosted app but also the “desktop” one.
I make small try on desktop before mooving to server

will try this one on server with this setting see what happen.

Else I have an other “idea” in mind is to add something link retrieve the file content > create sql table > insert data > get disctinct email > batch of query for each email (generate file / send email)

maybe overkill but can be an option too

MutedJam · January 21, 2022, 12:44pm

Unfortunately this variable won’t have any effect on the n8n desktop version yet. This is because the desktop version is only being built every few weeks based on the available n8n version at that time. Currently, the desktop app uses n8n version 0.153.0 while the latest version available for self-hosting and n8n.cloud is 0.159.1. The binary data handling improvements have started with version 0.156.0.

So if you do have a SQL database available, this road is definitely worth a shot

mitchum · January 21, 2022, 2:01pm

So i’m on the good way with the SQL process i think
I made the start like Read binary file / import into DB / select distinct MAIL but now i’m stuck in the “batch” mode

i’m not sure how to retrieve the batch value to “fill” my query (and replace the XXXX here by the retrieved emails)

select * from prixmarches where MAIL = ‘XXXXXXX’;

(i try to look at the doc, but didn’t find any sample / expalaination, i might be wrong…)

MutedJam · January 21, 2022, 2:12pm

This could be achieved using an expression. To add one, first make sure to run your workflow up until the SplitInBatches1 workflow (this is just to generate some example data making working with expressions easier).

You can then add the actual expression by clicking the cogwheel next to the query field and then Add Expression:

In the expression editor, instead of the XXXXX pick the field you want to use through the browser on the left side (the data from the screenshot is just some dummy data I use, your fields will of course look different):

Also, to save memory you might want to split the workflow part up until the insert and the select part into two separate workflows running after each other.

mitchum · January 21, 2022, 2:59pm

awesome !
i’m almost there

Got my expression working (green)
Capture d’écran 2022-01-21 à 15.55.18

but something goes wrong after
Capture d’écran 2022-01-21 à 15.54.38

For the memory part for now with SQL it’s not an issue the whole process (almost green everywhere as you can see)
take about 1 or 2 minutes maximum on my laptop

mitchum · January 21, 2022, 5:59pm

So now i got the email but not the attachment

I used here also the expression

Capture d’écran 2022-01-21 à 18.57.23

seems to be good but i can’t find the attachments in my email. Maybe related to the SMTP used ?

MutedJam · January 21, 2022, 6:30pm

Sounds like we’re getting there. The attachments field wouldn’t require the filename but the name of the binary property. In my example that’d be simply data:

It’s the value you’d set in the previous Spreadsheet File node generating the file:

mitchum · January 21, 2022, 6:47pm

you’re the man !!!
got my email with attachments
Capture d’écran 2022-01-21 à 19.43.57

it’s solve a lot of pain on my side to manualy cut / send by email each subset
your’re awesome

MutedJam · January 21, 2022, 6:55pm

Awesome, glad to hear this worked out, thanks so much for confirming and happy automating

mitchum · January 21, 2022, 7:01pm

Just have to fine tune cause right now i got a bunch of email but every time the same content, it’s seem the bach get the same indice (email) in my case and not use the next one.
But i’m almost there !

Edit : I use the wrong indice in my 2nd sql query, with the right indice all goods