Extract from CSV is skipping final rows

Describe the problem/error/question

Let’s say I get this file via HTTP GET: https://drive.google.com/uc?export=download&id=17guqAW0E0KPndDmzj849qV-A8zYpxEx0 (response: file, output: data). The file has 106 rows and 7 columns, but only 6 columns have headers; the 7th column has data in only a few rows. (See the CSV to understand the setup: it contains deliberately malformed data, the goal being to identify that malformed data in later steps.)
Next node: Extract from CSV > it returns only 93 items (so 94 rows including the headers).
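To confirm how many raw lines the download actually contains before the extract step, a Code node can decode the binary and count newlines. A minimal sketch, assuming the binary property is named data, that the file is held in memory, and that the getBinaryDataBuffer helper is available in your n8n version:

// n8n Code node ("Run Once for All Items"), placed right after the HTTP Request node.
// Assumes the downloaded file is stored under the binary property name "data".
const buffer = await this.helpers.getBinaryDataBuffer(0, 'data');
const text = buffer.toString('utf8');

// Count non-empty lines; a trailing newline would otherwise add a phantom row.
const lines = text.split(/\r?\n/).filter(line => line.trim() !== '');

return [{ json: { rawLineCount: lines.length } }];

If rawLineCount reports 106 while Extract from CSV emits 93 items, the rows are being dropped by the parser rather than lost in the download.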

What is the error message (if any)?

None

Please share your workflow

Share the output returned by the last node

Expected output: the full CSV, thus 106 rows (or 105 without counting the headers).

Information on your n8n setup

  • n8n version: 1.80.0
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): no idea
  • Running n8n via (Docker, npm, n8n cloud, desktop app): npm
  • Operating system: macOS Sonoma 14.6.1

My guess would be that it’s the comments at the end of the rows that are causing the problem. For example:

102,Mouse,,15.20,125,3  <-- Missing product_name
103,Keyboard,Electronics,,75,1  <-- Missing Price

I know you’re deliberately working with dodgy CSV, but it would be interesting to see if it works without the comments.

Looks like the problem is with lines 95/96, where an additional field is added at the end. If you remove the additional field it gets the whole list; it seems the node doesn’t like it when the number of columns differs from the header row.

You should remove the commas that appear after AAA and AA on lines 95-96 and test again.
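For context, a trailing comma after the last value makes the parser see one more field than the header declares. The column names are inferred from the sample rows above and the row values are made up for illustration; only the trailing comma after AAA matters:

id,product_name,category,price,stock,rating     <-- 6 header fields
95,Widget,Electronics,19.99,10,AAA,             <-- 7 fields because of the trailing comma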

Removing the comments did fix it, so it worked without them.
Nevertheless, the idea here is to have malformed data, and such a comment could be part of it.

That is part of the intended malformed data. Also, this did not have any impact.

Oddly enough, I think the comments had nothing to do with it. I re-created the initial CSV and updated the file (products.csv on Google Drive).
The content looks exactly the same, and now it’s picking up all the rows properly.


You have to add an additional “,” at the end of the header row and of any row that does not have malformed data in the new CSV file, though. That kind of defeats the purpose, no?
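To spell that out: padding the header with a trailing comma declares an empty-named 7th column, so a trailing comment field no longer exceeds the header width; but every clean row then needs the same padding. The row values below are illustrative:

id,product_name,category,price,stock,rating,    <-- trailing comma = 7th, unnamed column
101,Laptop,Electronics,999.00,5,4,              <-- clean rows also need the extra comma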

That’s good!
I think if your goal is to sanitize CSV that may break a parser somewhere along the line, then feeding it into a parser is a good way to find out what may get broken.
If you want this to be part of a pipeline that identifies the malformed bits, though, you will probably need to do something more manual and complex with a Code node, along these lines:
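As a starting point for that Code node, here is a minimal sketch that flags rows whose field count differs from the header, or that contain empty fields. It splits naively on commas, so it assumes no quoted commas in the data; the binary property name data and the getBinaryDataBuffer helper are the same assumptions as in the sketch above:

// n8n Code node ("Run Once for All Items"): flag malformed CSV rows.
// Naive split on ","; fine for a simple test file, not for quoted CSV.
const buffer = await this.helpers.getBinaryDataBuffer(0, 'data');
const lines = buffer.toString('utf8').split(/\r?\n/).filter(l => l.trim() !== '');

const header = lines[0].split(',');
const problems = [];

lines.slice(1).forEach((line, i) => {
  const fields = line.split(',');
  const rowNumber = i + 2; // 1-based, counting the header as row 1

  if (fields.length !== header.length) {
    problems.push({ rowNumber, line, reason: `expected ${header.length} fields, got ${fields.length}` });
  } else if (fields.some(f => f.trim() === '')) {
    problems.push({ rowNumber, line, reason: 'one or more empty fields' });
  }
});

return problems.map(p => ({ json: p }));

Because this works on the raw text, it will report lines 95/96 with the extra trailing field as well as the rows with missing values, regardless of whether Extract from CSV would have dropped them.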

Honestly, I didn’t add anything. The initial data was in a Google Sheet; I duplicated the sheet once to remove the comments (as suggested by @nickv), downloaded it as .csv, tried it in the workflow, and the extract node read all the rows properly.
Then I tried to re-download the initial sheet as .csv, and it just worked as well.

Here is the full final workflow. We can close this topic.
