N8n Scalability & Performance

We’re currently leveraging n8n at scale for both batch and real-time data processing, and we’d appreciate your guidance on a few architectural and performance-related considerations we’re encountering in production-like workloads.

Below is a high-level overview of our use cases and the specific challenges where we’d value your insights:


1. High-Volume Data Normalization & Deduplication

We have multiple workflows responsible for:

· Pulling data from external data sources

· Normalising the payloads

· Performing deduplication logic

· Persisting the final data into MongoDB

While we are already:

· Using bulk inserts

· Processing data in controlled batches

…we are still observing long execution times for some workflows.

Key question:
Are there recommended workflow patterns, node configurations, or architectural best practices in n8n to optimise large-scale normalisation and deduplication workloads beyond batching and bulk writes?
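The pattern being asked about here can be sketched roughly like this: do normalisation and deduplication in a single pass (e.g. inside one n8n Code node) keyed on a composite identity, so only unique, already-clean records reach the MongoDB write. All field names below (`source`, `externalId`, `updatedAt`) are illustrative assumptions, not our actual schema.

```javascript
// Illustrative sketch: normalise records and drop duplicates by a
// composite key in one pass, keeping the most recent version of each
// logical record. Field names are assumptions for the example.
function normaliseAndDedupe(records) {
  const byKey = new Map();
  for (const raw of records) {
    const record = {
      source: String(raw.source).trim().toLowerCase(),
      externalId: String(raw.externalId).trim(),
      updatedAt: new Date(raw.updatedAt),
    };
    const key = `${record.source}:${record.externalId}`;
    const existing = byKey.get(key);
    // Only replace an existing entry if this version is newer.
    if (!existing || existing.updatedAt < record.updatedAt) {
      byKey.set(key, record);
    }
  }
  return [...byKey.values()];
}
```

Doing this before the database step keeps the dedup cost at O(n) in memory instead of paying for per-record existence checks against MongoDB.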


2. WebSocket-Based Live Data & Latency

We also operate real-time workflows driven by WebSocket feeds.

In certain scenarios, even for a single incoming event, we’ve observed:

· End-to-end processing delays of up to ~13 seconds

Key question:
Are there known bottlenecks or best practices when using n8n with WebSockets to minimise latency—especially for near-real-time event processing?


3. Handling High-Frequency Data Streams

Some of our incoming data streams are:

· High-frequency

· Time-sensitive

· Continuous in nature

Key question:
What is the recommended approach in n8n for:

· Maintaining performance under high-frequency event loads?

· Horizontally or vertically scaling such workflows?

Are there patterns where n8n should primarily orchestrate rather than process every event synchronously?
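The "orchestrate rather than process every event" idea can be sketched as a micro-batcher sitting upstream of n8n: the ingestion service buffers high-frequency events and hands n8n one batch per trigger (size- or time-bounded) instead of one workflow execution per event. The flush target (an n8n webhook, a queue, etc.) and the thresholds below are assumptions for illustration.

```javascript
// Illustrative micro-batcher: buffers events and flushes a whole batch
// to a single downstream trigger when either a size or a time
// threshold is reached. Thresholds here are example values.
class MicroBatcher {
  constructor({ maxSize = 100, maxWaitMs = 500, onFlush }) {
    this.maxSize = maxSize;
    this.maxWaitMs = maxWaitMs;
    this.onFlush = onFlush;
    this.buffer = [];
    this.timer = null;
  }
  push(event) {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxSize) {
      this.flush();
    } else if (!this.timer) {
      // First event of a new batch starts the time-based flush window.
      this.timer = setTimeout(() => this.flush(), this.maxWaitMs);
    }
  }
  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.onFlush(batch);
  }
}
```

This trades a bounded amount of latency (at most `maxWaitMs`) for far fewer workflow executions under load.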


4. Large Batch Inserts Taking >2 Minutes

In some cases, when inserting very large datasets into MongoDB (even in batches), workflows are taking 2+ minutes to complete.

Key question:
Are there any known n8n-level limits, execution time thresholds, or configuration optimisations (execution mode, queue mode, worker setup, etc.) that could help reduce execution time for large batch operations?
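For context, the batching we do ahead of the MongoDB write looks roughly like the sketch below: chunk the dataset and build unordered upsert operations for `bulkWrite`. With `{ ordered: false }` the server can process operations in parallel and a single failing document does not abort the rest of the batch. The key field `externalId` and the batch size are assumptions for the example.

```javascript
// Split a large dataset into fixed-size chunks.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Build unordered upsert operations for MongoDB's bulkWrite.
// The `externalId` key field is illustrative.
function toUpsertOps(records) {
  return records.map(doc => ({
    updateOne: {
      filter: { externalId: doc.externalId },
      update: { $set: doc },
      upsert: true,
    },
  }));
}

// Usage with the official driver (not executed here):
// for (const batch of chunk(records, 500)) {
//   await collection.bulkWrite(toUpsertOps(batch), { ordered: false });
// }
```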


5. Redis for Frequently Accessed Data

We are already using Redis to cache frequently accessed data and reduce DB load.

Key question:
Are there recommended n8n patterns or anti-patterns when integrating Redis for:

· Caching

· Deduplication

· Rate-limiting

· High-throughput workflows?
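To make the dedup and rate-limiting cases concrete, here is a rough sketch of the two patterns we have in mind, written against a tiny in-memory stand-in so it is self-contained. In production the same semantics map onto real Redis commands, noted in the comments; the key shapes are assumptions.

```javascript
const store = new Map(); // in-memory stand-in for Redis

// Dedup: equivalent to SET key 1 NX EX <ttl> in Redis —
// only the first caller for a given key gets `false` back.
function seenBefore(key) {
  if (store.has(key)) return true;
  store.set(key, 1);
  return false;
}

// Fixed-window rate limit: equivalent to INCR on a per-window counter
// (with EXPIRE set on the first increment) in Redis.
function allowRequest(clientId, limit, windowMs, now = Date.now()) {
  const windowKey = `${clientId}:${Math.floor(now / windowMs)}`;
  const count = (store.get(windowKey) || 0) + 1;
  store.set(windowKey, count);
  return count <= limit;
}
```

In n8n terms, both checks would typically sit in a Code node (or a Redis node) right after the trigger, so duplicate or over-limit events are dropped before any expensive downstream work.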


What We’re Looking For

We’re primarily seeking:

· Best-practice architectural guidance

· Scaling recommendations

· Performance optimisation strategies

· Any real-world references or examples where n8n is used at similar scale

Hi @ShyamNama07,

Before I spend a lot of time answering your questions in detail, I have a few short questions and comments for each point, then we can dive in a little more detail.

High-Volume Data Normalization & Deduplication

How many records do you process per day or per batch, and how big is the data payload on average?

Generally, Node.js and JavaScript, which n8n is built on, are not very performant for large data-processing tasks, and you're better off building something in a more performant technology such as Java, Rust, Scala, etc. However, you could implement some architectural changes that could help this process run a little faster, using pub/sub (for queueing and handling volatility, retries, etc.) and in-memory DBs such as Redis. More on this later.
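The queueing idea above, sketched very roughly: producers enqueue work, a consumer processes it with bounded retries, and anything that exhausts its retry budget lands in a dead-letter list instead of blocking the stream. A real deployment would use Redis Streams, RabbitMQ, or similar; this in-memory version only illustrates the control flow.

```javascript
// Rough sketch of queue consumption with bounded retries and a
// dead-letter list. `maxRetries` and the job shape are assumptions.
function processQueue(queue, handler, maxRetries = 2) {
  const deadLetter = [];
  while (queue.length > 0) {
    const job = queue.shift();
    try {
      handler(job.payload);
    } catch (err) {
      job.attempts = (job.attempts || 0) + 1;
      if (job.attempts <= maxRetries) {
        queue.push(job); // retry later instead of failing the batch
      } else {
        deadLetter.push({ job, error: err.message });
      }
    }
  }
  return deadLetter;
}
```

The point is decoupling: ingestion stays fast and volatile spikes or flaky records are absorbed by the queue rather than by the workflow's execution time.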


WebSocket-Based Live Data & Latency

How have you implemented this already, as I have not seen or used sockets in n8n before? Is this something you have done or want to do in n8n?


Handling High-Frequency Data Streams

This one is probably related to the websocket question.


Large Batch Inserts Taking >2 Minutes

I would be interested to see one of these workflows to check for any inefficiencies that could reduce the insert time. Again, this relates to my first question: how big are these batches, and what is the payload size?

Hi @Wouter_Nigrini,

Thanks for the follow-up questions. Adding a bit more concrete context below.

Data Ingestion Patterns

Our platform processes data in three distinct forms:

  1. Static data – ingested via a pull-based model at defined intervals.

  2. Semi-static data – a hybrid model where part of the data is periodically pulled, with incremental updates pushed when changes occur.

  3. Live data – delivered in real time via a push model using WebSockets.

Static & Semi-Static Data Processing

  • Static and semi-static payload sizes vary by feed, but for static data we typically receive ~2k–4k records per execution.

  • To control execution time and memory usage, we split these into smaller insert batches, usually ~50–150 records per batch, before persisting to the database.

  • n8n currently orchestrates ingestion, normalization, deduplication, and persistence, with caching mechanisms in place to reduce redundant processing.

Live Data & WebSockets

  • WebSocket connections are not maintained inside n8n.

  • We operate a separate Node.js service that connects directly to the provider’s WebSocket feed and handles real-time ingestion.

  • n8n is triggered downstream from this service to manage orchestration tasks such as routing, transformation, persistence, and retry logic.

  • The latency observed in some scenarios appears to be related to workflow execution and downstream processing, rather than the WebSocket transport layer itself.

For the live workflow, I will share a screenshot here.
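Since the delay appears to sit in workflow execution rather than the WebSocket transport, one cheap way we could confirm that is to stamp each event at ingestion in the Node.js service and record per-stage deltas downstream in n8n. The field and stage names below are illustrative, not our actual payload shape.

```javascript
// Stamp an event at ingestion time (in the WebSocket service).
function stampEvent(payload, now = Date.now()) {
  return { payload, timings: { ingestedAt: now } };
}

// Record when a downstream stage (e.g. 'normalised', 'persisted')
// finished handling the event.
function markStage(event, stage, now = Date.now()) {
  event.timings[stage] = now;
  return event;
}

// Milliseconds elapsed between ingestion and a given stage.
function latencyTo(event, stage) {
  return event.timings[stage] - event.timings.ingestedAt;
}
```

Logging `latencyTo(event, 'persisted')` per event would show whether the ~13 s sits in trigger dispatch, transformation, or the database write.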

Hi @ShyamNama07,

I'll reply in more detail when I have some spare time.

In the meantime, I did some research, and for WebSockets you can handle this directly in n8n using the community node below, which exposes a WebSocket endpoint from your n8n instance. This way you can avoid running a separate Node.js WebSocket server that pushes data to n8n via REST.

npm community node package:

https://www.npmjs.com/package/n8n-nodes-websocket-standalone

Sample Workflow:

Node.js ws client:

const WebSocket = require('ws');

// Configuration
const HOST = 'localhost';
const PORT = 5681;
const PATH = '/ws/test';
const WS_URL = `ws://${HOST}:${PORT}${PATH}`;

console.log(`Attempting to connect to ${WS_URL}...`);

// Create WebSocket connection
const ws = new WebSocket(WS_URL);

// Connection opened
ws.on('open', () => {
  console.log(`✓ Connected to WebSocket server at ${WS_URL}`);

  // Send a test message to the server
  ws.send('Hello from Node.js WebSocket client!');
});

// Listen for messages from server
ws.on('message', (data) => {
  console.log('Received from server:', data.toString());

  // Optionally close the connection after receiving data
  // ws.close();
});

// Handle connection close
ws.on('close', () => {
  console.log('WebSocket connection closed');
});

// Handle errors
ws.on('error', (err) => {
  console.error('WebSocket error:', err.message);
});

// Graceful shutdown on CTRL+C
process.on('SIGINT', () => {
  console.log('\nClosing WebSocket connection...');
  ws.close();
  process.exit(0);
});

Results:

SERVER:

CLIENT:


Hello @Wouter_Nigrini, thank you for your suggestions. I need your help with one more scenario regarding scalability: can we horizontally scale n8n workflows to run them across multiple different pods?

Hi @ShyamNama07,

I would be happy to discuss this further over a call. Please DM me with your email address, then we can set something up.