Extract from CSV is skipping final rows

Describe the problem/error/question

Let’s say I get this file via HTTP GET: https://drive.google.com/uc?export=download&id=17guqAW0E0KPndDmzj849qV-A8zYpxEx0 (response: file, output: data). The file has 106 rows and 7 columns, but only 6 columns have headers; the 7th column has data in only a few rows. (See the CSV to understand the setup: it contains deliberately malformed data, the goal being to identify that malformed data in later steps.)
Next node: Extract from CSV > it returns only 93 items (so 94 rows including the headers).
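To confirm how many raw lines the download actually contains before the extract step, a Code node can decode the binary and count newlines. A minimal sketch, assuming the binary property is named data, that the file is held in memory, and that the getBinaryDataBuffer helper is available in your n8n version:

// n8n Code node ("Run Once for All Items"), placed right after the HTTP Request node.
// Assumes the downloaded file is stored under the binary property name "data".
const buffer = await this.helpers.getBinaryDataBuffer(0, 'data');
const text = buffer.toString('utf8');

// Count non-empty lines; a trailing newline would otherwise add a phantom row.
const lines = text.split(/\r?\n/).filter(line => line.trim() !== '');

return [{ json: { rawLineCount: lines.length } }];

If rawLineCount reports 106 while Extract from CSV emits 93 items, the rows are being dropped by the parser rather than lost in the download.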

What is the error message (if any)?

None

Please share your workflow

Share the output returned by the last node

Expected output: the full CSV, thus 106 rows (or 105 without counting the headers).

Information on your n8n setup

  • n8n version: 1.80.0
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): no idea
  • Running n8n via (Docker, npm, n8n cloud, desktop app): npm
  • Operating system: macOS Sonoma 14.6.1

My guess would be that it’s the comments at the end of the rows that are causing the problem. For example:

102,Mouse,,15.20,125,3  <-- Missing product_name
103,Keyboard,Electronics,,75,1  <-- Missing Price

I know you’re deliberately working with dodgy CSV, but it would be interesting to see if it works without the comments.

Looks like the problem is with lines 95/96, where an additional field is added at the end. If you remove the additional field it gets the whole list; it seems the node doesn’t like it when the number of columns differs from the header row.

You should remove the commas that appear after AAA and AA on lines 95-96 and test again.
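For context, a trailing comma after the last value makes the parser see one more field than the header declares. The column names are inferred from the sample rows above and the row values are made up for illustration; only the trailing comma after AAA matters:

id,product_name,category,price,stock,rating     <-- 6 header fields
95,Widget,Electronics,19.99,10,AAA,             <-- 7 fields because of the trailing comma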

Removing the comments did fix it, so it worked without them.
Nevertheless, the idea here is to have malformed data, and such a comment could be part of it.

That is part of the intended malformed data. Also, this did not have any impact.

Oddly enough, I think the comments had nothing to do with it. I re-created the initial CSV and updated the file (products.csv on Google Drive).
The content looks exactly the same, and now it’s picking up all the rows properly.


You have to add an additional “,” at the end of the header row and of any row that does not have malformed data in the new CSV file, though. That kind of defeats the purpose, no?
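To spell that out: padding the header with a trailing comma declares an empty-named 7th column, so a trailing comment field no longer exceeds the header width; but every clean row then needs the same padding. The row values below are illustrative:

id,product_name,category,price,stock,rating,    <-- trailing comma = 7th, unnamed column
101,Laptop,Electronics,999.00,5,4,              <-- clean rows also need the extra comma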

That’s good!
I think if your goal is to sanitize CSV that may break a parser somewhere along the line, then feeding it into a parser is a good way to find out what may get broken.
If you want this to be part of a pipeline that identifies the malformed bits, though, you will probably need to do something more manual and complex with a Code node, along these lines:
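As a starting point for that Code node, here is a minimal sketch that flags rows whose field count differs from the header, or that contain empty fields. It splits naively on commas, so it assumes no quoted commas in the data; the binary property name data and the getBinaryDataBuffer helper are the same assumptions as in the sketch above:

// n8n Code node ("Run Once for All Items"): flag malformed CSV rows.
// Naive split on ","; fine for a simple test file, not for quoted CSV.
const buffer = await this.helpers.getBinaryDataBuffer(0, 'data');
const lines = buffer.toString('utf8').split(/\r?\n/).filter(l => l.trim() !== '');

const header = lines[0].split(',');
const problems = [];

lines.slice(1).forEach((line, i) => {
  const fields = line.split(',');
  const rowNumber = i + 2; // 1-based, counting the header as row 1

  if (fields.length !== header.length) {
    problems.push({ rowNumber, line, reason: `expected ${header.length} fields, got ${fields.length}` });
  } else if (fields.some(f => f.trim() === '')) {
    problems.push({ rowNumber, line, reason: 'one or more empty fields' });
  }
});

return problems.map(p => ({ json: p }));

Because this works on the raw text, it will report lines 95/96 with the extra trailing field as well as the rows with missing values, regardless of whether Extract from CSV would have dropped them.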

Honestly, I didn’t add anything. The initial data was in a Google Sheet; I duplicated the sheet once to remove the comments (as suggested by @nickv), downloaded it as .csv, tried it in the workflow, and the extract node read all the rows properly.
Then I tried to re-download the initial sheet as .csv, and it just worked as well.

Here is the full final workflow. We can close this topic.
