CSV to JSON conversion size limitations?

Describe the problem/error/question

Hi, I am trying to convert a CSV file (an extraction from another solution) to JSON so I can manipulate the data in a workflow.

The file is about 100 MB and uses “;” as the delimiter.

Each time I try to convert it, my n8n instance breaks and I need to restart it.

I have tried with a smaller file and it works, so I think the size is the issue. Does anyone know more about this and how to manage it? What is the root cause? Do I need to change something in my instance configuration? Or split the file (not easy in production)?

The instance is running via Docker Compose on Windows.

Thanks

What is the error message (if any)?

Docker instance crash error message:

# Fatal error in , line 0
# Check failed: has_exception().
#
#
#
#FailureMessage Object: 0x7ffe33199230

(very helpful)

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

  • n8n version: 1.109.2
  • Database (default: SQLite): default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker Compose
  • Operating system: Windows 11

Hey @Viper95, hope all is well. It appears you may be running out of memory there, especially since smaller files do work. Your options could be to either offload the task to a third-party service (external or internal) or attempt to chunk up the input so that you only work with part of the data at a time.
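As a rough illustration of the chunking idea (done outside n8n in plain Python; the file name `export.csv` and the `username` column are assumptions about your data), something like this reads the file a slice at a time instead of loading all 100 MB at once:

```python
# Rough sketch of chunked processing of a ;-delimited CSV.
# "export.csv" and the "username" column are assumptions about your data.
import pandas as pd

counts = {}
for chunk in pd.read_csv("export.csv", sep=";", chunksize=50_000):
    # Only ~50k rows live in memory at any one time.
    for username, group in chunk.groupby("username"):
        counts[username] = counts.get(username, 0) + len(group)

print(f"{len(counts)} distinct usernames found")
```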

Since the resulting JSON will be even larger in size, I guess the big question is what are you planning to do with this data after you extract it? How are you going to use that huge JSON?

Hi, yes. The use case is to analyse the data from the CSV on a daily basis and manage actions depending on the content (typically, group the data by type and assign actions accordingly).

I mean a bit more literally - what is the next action you will take with the data? Can you work on a chunk of data at a time, or do you have to have all the data at once?

Yes, I need it all because I have to identify all events (rows) linked to the same usernames. The actions after that will be to regroup the data by username (and then split it into different sub-files).
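For reference, a minimal streaming sketch of that group-by-username-and-split step (again outside n8n, with assumed file and column names, and assuming usernames are safe to use as file names) could look like this:

```python
# Streaming sketch of "regroup by username, then split into sub-files".
# File and column names are assumptions about your data.
import os
import pandas as pd

os.makedirs("by_username", exist_ok=True)

for chunk in pd.read_csv("export.csv", sep=";", chunksize=50_000):
    for username, rows in chunk.groupby("username"):
        out = os.path.join("by_username", f"{username}.csv")
        # Append rows from later chunks to the same sub-file; write the
        # header only when the file is first created.
        rows.to_csv(out, sep=";", mode="a", index=False,
                    header=not os.path.exists(out))
```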

If you have to hold and operate on the whole dataset at once, I would think about offloading this to a third-party data processor. n8n is more of an automation orchestrator with data processing features, not a heavy-duty big data engine.

Can you please tell me how to do this (or at least point me to the right docs)?

This can range widely, from a self-hosted Python script with an API wrapper around it to full data processing engines (using Hadoop here would be overkill).
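To make the "Python script with an API wrapper" idea more concrete, here is a minimal sketch. The endpoint name, port and `username` column are my assumptions, not an n8n feature; n8n would just upload the file with an HTTP Request node and get back a small JSON summary:

```python
# Minimal sketch of a self-hosted "CSV service" that n8n could call via an
# HTTP Request node. Endpoint name, port and the "username" column are
# assumptions.
# Requires: fastapi, uvicorn, pandas, python-multipart
from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()

@app.post("/group-by-username")
async def group_by_username(file: UploadFile = File(...)):
    # Parse the uploaded ;-delimited CSV and group rows per username.
    df = pd.read_csv(file.file, sep=";")
    summary = df.groupby("username").size().to_dict()
    # Hand back only a small aggregated summary, not the full data set.
    return {"usernames": len(summary), "events_per_username": summary}

# Run with: uvicorn csv_service:app --port 8000
```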

What I encourage you to think about, though, is whether you are using the right tools. Is a CSV file the best option? Most of the time the answer is no once the size exceeds a few megabytes. You really want this data in some sort of database, where extracting and grouping are native functionality.

If getting the CSV is the only option (sometimes you have no choice), feeding it into a database almost always yields the best results if you want to work on this data with the standard methods for fetching with filters, sorting or grouping.
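As a minimal sketch of that route (using SQLite purely as an example; file, table and column names are assumptions), the grouping then becomes a single query that a workflow can trigger or consume:

```python
# Sketch of the "CSV into a database" route, using SQLite as an example.
# File, table and column names are assumptions about your data.
import sqlite3
import pandas as pd

conn = sqlite3.connect("events.db")

# Load in chunks so the import itself stays memory-friendly.
for chunk in pd.read_csv("export.csv", sep=";", chunksize=50_000):
    chunk.to_sql("events", conn, if_exists="append", index=False)

# Grouping is now a native database operation instead of an in-memory
# JSON transformation inside n8n.
for username, count in conn.execute(
    "SELECT username, COUNT(*) FROM events GROUP BY username"
):
    print(username, count)

conn.close()
```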
