Compute Sizing for Syncing 100k Contacts from Salesforce to DB on n8n

Hi All,

We need to compute the requirements for syncing 100k contacts daily from Salesforce to a database. How can we estimate or benchmark the compute sizing for self-hosted n8n?

Thanks
@hackerway

Hey @hackerway,

Welcome to the community :tada:

There is no exact answer to this one. We do have a bit of benchmarking info, but it isn't specific to what you are doing.

The best thing to do would be to start off with, say, 2 cores and 4GB of memory plus a Postgres database, then start running the data through, monitoring the output, and tweaking as needed.
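
Something like the compose sketch below would be a reasonable starting point, assuming you deploy with Docker Compose; the image tags, credentials and volume names are placeholders to adjust:

```yaml
# Minimal starting point: n8n capped at ~2 CPUs / 4GB, with Postgres alongside.
# Image tags, passwords and volume names are placeholders.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: change-me
      POSTGRES_DB: n8n
    volumes:
      - pg_data:/var/lib/postgresql/data

  n8n:
    image: n8nio/n8n
    ports:
      - "5678:5678"
    cpus: 2          # the 2-core starting point
    mem_limit: 4g    # the 4GB starting point
    environment:
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: change-me
    depends_on:
      - postgres

volumes:
  pg_data:
```

Watching the container with `docker stats` during the first few full runs will tell you quickly whether 4GB is enough or the limits need raising.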

Thanks for the info, @Jon. I appreciate your support.

Could you please guide me on the suitable n8n mode (standard or queue) to run for the above use case? I am not clear on which mode to use.

Hey @hackerway,

That would all depend on how you build your workflow. Are you doing all the syncing at once on a schedule, in chunks, or is it maybe triggered from a webhook?

To be honest, you can start from one instance and, if needed, implement scaling later without really impacting anything. If you can share more about how you will be handling the data, I can let you know what I would do.

The flow will use the built-in n8n app nodes for Salesforce and Postgres, and will sync approximately 100k contacts once a day on a schedule.

Hey @hackerway,

So is that 100k contacts at, say, 9am once a day, rather than spacing it out over the 24-hour period?

As a side note, it would be very helpful to the community if you'd share some of your benchmark findings. Even a high-level rollup would be great, in case you don't have time for a detailed write-up :slight_smile:

Another thing to keep in mind is that while most nodes are efficient in terms of how they use RAM, each instance of the Code node has to make a copy of its incoming data in memory. That's not usually a problem for single-item, or even 100-item, workflows, but it might be a factor in a use case like this if you're expecting to do all 100k in one execution.

So if you are hitting bottlenecks and looking at ways to reduce cost, a great first step is to limit the number of Code nodes, as in the sketch below.
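
As a rough illustration (the field names here are made up, not from this thread): one Code node that renames, filters and reshapes in a single pass is cheaper than three chained Code nodes, each of which would hold its own copy of every item.

```javascript
// One Code node ("Run Once for All Items") doing all the item shaping in a
// single pass. Chaining separate Code nodes for rename, filter and reshape
// would give each node its own in-memory copy of every item.
// Field names are illustrative only.
const out = [];
for (const item of $input.all()) {
  const c = item.json;
  if (!c.Email) continue; // filter: skip contacts without an email
  out.push({
    json: {
      salesforce_id: c.Id, // rename Salesforce fields to DB column names
      full_name: `${c.FirstName ?? ''} ${c.LastName ?? ''}`.trim(),
      email: c.Email,
    },
  });
}
return out;
```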

Hi @Jon, yes, 100k contacts at, say, 9am once a day.

In that case, if that is your only task, using queue mode won't help, as the whole process will still happen on the same instance. I would recommend processing the data in smaller chunks, though, just to spread the load over the course of the day (see the sketch below), but for now one instance is the place to start.
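
As a rough sketch of the chunking idea (the chunk size and field list are just examples, not something from this thread): a Code node on an hourly Schedule Trigger can build the SOQL for the next chunk using keyset pagination on Id. Plain LIMIT/OFFSET paging won't work here, because SOQL caps OFFSET at 2,000 rows.

```javascript
// Code node on an hourly Schedule Trigger: build the SOQL for the next chunk.
// Keyset pagination on Id is used because SOQL's OFFSET clause is capped at
// 2,000 rows, so LIMIT/OFFSET cannot page through 100k contacts.
// Chunk size and field list are illustrative only.
const staticData = $getWorkflowStaticData('global'); // persists between runs

const lastId = staticData.lastContactId ?? ''; // empty → start a fresh pass
const chunkSize = 10000; // ~10 chunks spread over the day

const where = lastId ? `WHERE Id > '${lastId}'` : '';
const query =
  `SELECT Id, FirstName, LastName, Email FROM Contact ` +
  `${where} ORDER BY Id LIMIT ${chunkSize}`;

// Feed this to the Salesforce node as a custom SOQL query. After the Postgres
// write succeeds, a second Code node should save the Id of the last item back
// to staticData.lastContactId (and clear it once a chunk comes back short,
// i.e. the pass over all contacts is complete).
return [{ json: { query } }];
```

One caveat: workflow static data only persists across production executions of an active workflow, not across manual test runs.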

1 Like

Understood, thanks @Jon.
