Running large Excel workflows

Describe the problem/error/question

Help with Large Excel File Processing in n8n Workflows

What is the error message (if any)?

We’re currently evaluating n8n for automating Excel-based workflows within our organization. We’ve developed a sample workflow that uses standard Excel nodes along with some custom nodes (which execute Python code via spawn) for data transformation.

While the workflow performs well with smaller files (around 1,000 to 5,000 rows), we run into issues when processing larger files, for example an Excel sheet with 300,000 rows (the file itself is about 37 MB, well under 200 MB). In these cases, the workflow doesn’t throw an error, but it also never completes; it just keeps executing indefinitely.

For context, we are currently hosting the n8n Community Edition using Docker.

Custom Node Details

Some of our custom nodes are implemented in JavaScript and call a Python process for complex data transformations. The JavaScript code spawns the Python process like this:

// Requires: const { spawn } = require('child_process'); plus fs, os and path.
return new Promise((resolve, reject) => {
  // Pass the form parameters and a temp output path to the Python code as arguments
  const tempOutputPath = path.join(os.tmpdir(), `n8n-output-${Date.now()}.json`);
  const py = spawn('python3', ['-c', pythonCode, form_inputs, tempOutputPath]);

  // The Python script reads the items from the previous node on stdin (simplified here)
  py.stdin.write(JSON.stringify(this.getInputData().map((item) => item.json)));
  py.stdin.end();

  py.on('error', reject);

  // After the Python script finishes, read its output from the temp file
  py.on('close', (code) => {
    if (code !== 0) {
      return reject(new Error(`Python exited with code ${code}`));
    }
    const output = fs.readFileSync(tempOutputPath, 'utf8');
    fs.unlinkSync(tempOutputPath);
    const parsed = JSON.parse(output);
    resolve(this.prepareOutputData(parsed));
  });
});

In our Python code, we read the input from the previous node like this:

const pythonCode = `
import sys
import json
import pandas as pd

# form parameters arrive as sys.argv[1]; the temp output path as sys.argv[2]
temp_output_path = sys.argv[2]

# read the items from the previous node on stdin
input_json = json.load(sys.stdin)
df = pd.DataFrame(input_json)

# data processing on the dataframe
# ...

# write the processed data to the temp file in n8n's item format
records = df.to_dict(orient='records')
n8n_output = [{"json": record} for record in records]
with open(temp_output_path, 'w') as f:
    json.dump(n8n_output, f)
`;

This approach involves:

  • Running the Python code via spawn.
  • Writing the processed data to a temporary JSON file.
  • Reading and parsing the JSON output in the node.
  • Returning the transformed data to the workflow.

Questions:

  1. Are there recommended ways to handle large datasets efficiently in n8n workflows, especially with custom nodes like ours? For example, should we consider batch processing, streaming, or chunking? (A rough sketch of the kind of chunking we mean follows this list.)
  2. Would upgrading to the Enterprise version help with performance and scaling issues, such as better vertical scaling or optimized execution of long-running workflows?
  3. Are there any best practices or architectural recommendations for processing very large Excel files reliably and efficiently in n8n?
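
To make question 1 concrete, this is roughly the kind of chunking we have in mind on the Python side. It is only a sketch, not our current code: the chunk size is arbitrary and transform_chunk is a placeholder for our actual pandas transformation.

import sys
import json
import pandas as pd

CHUNK_SIZE = 10_000  # arbitrary; would need tuning

def transform_chunk(df: pd.DataFrame) -> pd.DataFrame:
    # placeholder for the real transformation
    return df

temp_output_path = sys.argv[2]
input_records = json.load(sys.stdin)

# Transform and write the rows in fixed-size chunks so the fully transformed
# 300,000-row result never has to sit in memory all at once.
with open(temp_output_path, 'w') as f:
    f.write('[')
    first = True
    for start in range(0, len(input_records), CHUNK_SIZE):
        chunk_df = pd.DataFrame(input_records[start:start + CHUNK_SIZE])
        for record in transform_chunk(chunk_df).to_dict(orient='records'):
            if not first:
                f.write(',')
            json.dump({"json": record}, f)
            first = False
    f.write(']')

Whether something like this would actually help, or whether the bottleneck is elsewhere (for example the single json.load of all 300,000 rows, or n8n keeping every item in the execution data), is exactly what we’re unsure about.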

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

  • n8n version: 1.93.0
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): main
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker
  • Operating system: Linux