Working execution shows as failed

Hi,

I have a workflow that ingests data from Mailchimp into BigQuery and updates Mailchimp profiles with data from a BigQuery table. The execution lasts around 10 minutes and ends with a memory error even though all nodes execute successfully and all batches seem processed. The error shows up when the batch node switches to the done output.

The last part, which uploads data from BigQuery to Mailchimp, works if I execute it separately.

I have tried different batch sizes and the behaviour is the same.

Would a sub-workflow help here? For example, use a Set node with no items after ingesting the data into BigQuery, and then trigger the workflow that reads from BigQuery and updates Mailchimp.

Do you see any other issue with the workflow overall?

Thank you in advance!

Information on your n8n setup

  • n8n version: 1.6.1
  • Running n8n via: cloud

Hi @xristos :wave: Breaking this up into subworkflows would definitely be recommended if you’re getting memory errors :+1: It might just be slightly too much data at once :bowing_man:

A quick, sneaky edit:

You may also want to ensure the sub-workflow doesn’t return all of its data to the parent - you can do that with a Set node, like this:

The “Execute Once” setting will return only a single (very small or even empty) item :+1:


Hi @EmeraldHerald,

thank you for your response! I actually tried putting a Set node at the end of the sub-workflow so that no data is transferred to the parent one, but I still get an error in the parent workflow. The sub-workflow finishes successfully and the parent workflow fails.

Parent:

Subworkflow:

Hi @xristos - thanks for sharing that! Out of curiosity, how much data is this processing? :thinking:

@EmeraldHerald currently it is importing ~30k contacts from Mailchimp and updating 2,700 contacts. The data size is not big: in BigQuery the whole contacts table is 5 MB and the update table is less than 20 MB.

In production there will be about 10x more data in the update node, so the workflow should be robust. Does it make sense to split the workflow into more sub-workflows, e.g. after the IF node?

@xristos that might be too many calls at once - you could split into another sub-workflow or split this into batches if you want to make absolutely certain :bowing_man:


@EmeraldHerald could you elaborate more on this?
I am already using Split In Batches in the update part to update 100 contacts at a time. I have tried smaller batches and it just takes more time.

In the HTTP Request node there is also an offset for pagination, and I am requesting 500 contacts each time. There is a loop with an IF node. I guess only the 500 contacts are stored in memory on each iteration, right?
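For reference, the loop does roughly this (a plain-Python sketch of the HTTP pagination, with the API key, data centre and list ID as placeholders):

```python
import requests

# Placeholders - the real values come from the Mailchimp account/credentials.
API_KEY = "xxxx-us21"
DC = "us21"                      # data centre suffix of the API key
LIST_ID = "abc123"
PAGE_SIZE = 500                  # contacts requested per HTTP call

def fetch_all_members():
    """Fetch list members page by page using count/offset pagination."""
    members = []
    offset = 0
    while True:
        resp = requests.get(
            f"https://{DC}.api.mailchimp.com/3.0/lists/{LIST_ID}/members",
            params={"count": PAGE_SIZE, "offset": offset},
            auth=("anystring", API_KEY),
        )
        resp.raise_for_status()
        page = resp.json()["members"]
        members.extend(page)     # every page is appended, so the list keeps growing
        if len(page) < PAGE_SIZE:
            return members       # last page reached - the IF node equivalent
        offset += PAGE_SIZE
```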

Hi @xristos - sorry, I didn’t mean to be confusing!

The sub-workflow may not be helping with memory issues as it’s still processing a large amount of data (such as thousands of items), so this may need an overall change in workflow architecture to ensure that you’re processing everything in smaller chunks :bowing_man:

One issue that you may want to take a peek at - it looks like you’re using the HTTP Request node to call the Mailchimp API in your parent workflow, and then looping until you have all the items in the parent, which might be where some of the memory issues start.

A suggestion here would be to fully process only one page of data from Mailchimp at a time in the sub-workflow - as in downloading the Mailchimp data, inserting it into BigQuery, querying the profiles in BigQuery, and then updating Mailchimp, all per page. You could then return a single response item to the parent with information such as the processed page (and perhaps the total number of results), and have the parent workflow decide whether another sub-workflow execution is needed, if that makes sense?
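Sketched in Python with the page fetch simulated (in n8n these steps would of course be nodes, and the helper names here are hypothetical), the control flow I mean would look something like this:

```python
PAGE_SIZE = 500
TOTAL = 1700            # pretend total number of contacts, just for the simulation

def process_one_page(offset):
    """Stand-in for the sub-workflow: it handles exactly one Mailchimp page."""
    page = list(range(offset, min(offset + PAGE_SIZE, TOTAL)))  # "download" one page
    # ... insert the page into BigQuery, query the profiles, update Mailchimp ...
    return {"offset": offset, "fetched": len(page)}   # tiny summary item only

def parent_workflow():
    """The parent keeps just an offset and decides whether to run the child again."""
    offset = 0
    while True:
        summary = process_one_page(offset)
        print(summary)
        if summary["fetched"] < PAGE_SIZE:            # last page reached, stop
            break
        offset += PAGE_SIZE

parent_workflow()
```

That way no single execution ever holds more than one page of contacts in memory.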

I’m not sure what n8n cloud plan you’re currently using, but an upgrade to that plan (if possible) might help too, as each tier of n8n cloud has double the resources of the previous tier. :bowing_man:


Hi @EmeraldHerald ,

I am already in the Pro plan :slight_smile:
Unfortunately querying and updating in batches does not really fit my case, but I split the workflow into 2 sub-workflows and it is working when updating ca. 3k profiles.
Parent


Workflow 1

Workflow 2

The issue I am having now is that the profiles table can be quite large, i.e. 30k-50k profiles that need to be updated in production. After 45 min Workflow 2 fails, which I suppose is some timeout error because it takes too long. 3k profiles take around 15 minutes, so at 10x the size the execution would last very long.

Do you have any idea how to speed it up? Is there a sweet spot for the update batch size in the Mailchimp node? Currently I am using 50. I found that with larger batches, e.g. 500, it took longer. I fear this could be a dealbreaker for using n8n.

Hi @xristos :wave: Thanks for going into that detail.

Is there any possibility for you to break this down further into smaller batches? By that I mean having the workflows handle things in even smaller batches than they do currently - possibly even through an additional sub-workflow that has the empty Set node at the end as well?

If you truly can’t break this down further, then looking into self-hosting may unfortunately be a good shout :bowing_man:

I see a couple of options here. One option is to use a global variable to store the offset value (of the pagination). The logic would be:
A master flow that receives one page of contacts at a time (e.g. the first/next 500 contacts) and passes those contacts to the next workflow to process. That master flow then saves the next offset value (or next page URL) to the global variable and exits (and thus releases the memory). The master flow should run on a schedule trigger with enough time between runs so that executions don’t overlap (e.g. run every 15 minutes if the flow takes 10-12 minutes).

So the overall time for 10k contacts would be:
10,000 / 500 × 15 = 300 minutes in that setup,
and the memory use will be limited to 500 contacts plus the processing of them (two workflows at a time).

It can be adjusted to pass X contacts to the sub-flow and exit.

It has two disadvantages:

  1. It is bound to a fixed interval (every 15 minutes in the example above), even if the sub-flow can process contacts faster.
  2. If the contacts take longer to process (more than 15 minutes), then overlapping may occur (e.g. the first execution is still running when the second one starts with the same set of data).
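A minimal sketch of that first option in Python, using a small JSON file as a stand-in for the global variable (the actual contact fetch and hand-off to the sub-flow are left as a comment):

```python
import json
import os

PAGE_SIZE = 500
TOTAL_CONTACTS = 10_000            # assumed total, for illustration only
STATE_FILE = "offset_state.json"   # stand-in for the global variable

def load_offset():
    """Read the offset persisted by the previous scheduled run (0 on the first run)."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["offset"]
    return 0

def save_offset(offset):
    """Persist the offset for the next scheduled run."""
    with open(STATE_FILE, "w") as f:
        json.dump({"offset": offset}, f)

def scheduled_run():
    """One execution of the master flow (triggered e.g. every 15 minutes)."""
    offset = load_offset()
    if offset >= TOTAL_CONTACTS:
        print("Nothing left to process")
        return
    # Fetch contacts [offset, offset + PAGE_SIZE) here and pass them to the sub-flow.
    print(f"Processing contacts {offset}..{min(offset + PAGE_SIZE, TOTAL_CONTACTS)}")
    save_offset(offset + PAGE_SIZE)   # save the next offset, then exit (memory freed)

scheduled_run()
```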

The other option is to use a queue service like RabbitMQ to store the contacts, plus a workflow with a trigger node to process them. The main problem here is how to get all 30-50k contacts into the queue, as the workflow will definitely run out of memory at some point. I use an AWS Lambda function with S3 storage for such cases.
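The producer side could look something like this in Python with pika (assuming a RabbitMQ broker on localhost and a hypothetical queue name); a workflow with the RabbitMQ Trigger node would then consume one message - i.e. one small batch of contacts - at a time:

```python
import json
import pika

QUEUE = "mailchimp_contacts"   # hypothetical queue name

def enqueue_contacts(contacts, batch_size=500):
    """Push contacts onto a RabbitMQ queue in small batches so that the
    consuming workflow never has to hold the whole table in memory."""
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    for i in range(0, len(contacts), batch_size):
        batch = contacts[i:i + batch_size]
        channel.basic_publish(
            exchange="",
            routing_key=QUEUE,
            body=json.dumps(batch),
            properties=pika.BasicProperties(delivery_mode=2),  # persist the message
        )
    connection.close()

# Example: enqueue 30k dummy contacts as 60 messages of 500 each.
enqueue_contacts([{"email": f"user{i}@example.com"} for i in range(30_000)])
```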