Help/Advice - How to use Batch to reduce API call load

spessex · August 3, 2023, 7:40am

Hi All

It would be great to receive some advice on the ‘experimental’ process I have below (see image).

What I’m trying to do is:

1. Call in a Table of unstructured addresses from an AirTable Base - THIS IS SUCCESSFUL AND CALLS IN JUST UNDER 9000 fields. This is the first node in the image (Airtable1)
2. I want to pass these to a Geocoder (HERE.com) to pass the addresses back to me in a Structured format (which I have done successfully by testing one field), but I don’t want to hammer their API with all 9000 at once, so thought I would need something like the Batch node, but I’m struggling to understand how I should implement it. Ideally I would like to Batch them to send 500 addresses to geocode at a time. This is the ‘experimental’ part of the process nodes 2 & 3 in the image (SplitinBatches & IF).
3. After Batching I imagine sending them to the Geocoder (shown in Node 4 - HTTP Request) - THIS IS SUCCESSFUL AND CORRECTLY SPITS OUT MY ADDRESSES IN THE NEW STRUCTURED FORMAT.
4. I then Set the fields in the Node 5 (Set) - THIS IS SUCCESSFUL
5. And then Update the Airtable with the new structured address data for those original Airtable IDs - THIS IS SUCCESSFUL

So my main issue is point 2, how to actually Batch the Airtable List, and ensure it (1) loops through all the 9000 addresses until they are complete and (2) only processes addresses/airtable IDs that have not already been processed.

It would be useful to know (1) if there is an API failure at some point (maybe because of contravening Here.com’s API rate limit for instance) and (2) what then happens? Do I then have to run the whole thing again? How do I ensure that those that were successful are not included in the re-run?
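
One way to make a re-run skip addresses that already succeeded is to filter the records before they reach the batching step, keeping only those that have an address but no structured data yet. The following is a minimal sketch in plain Python rather than n8n node code. It assumes the usual Airtable record shape (an `id` plus a `fields` object), and the field names `Address` and `Postcode` are hypothetical placeholders.

```python
def is_filled(fields, name):
    """True if the named field exists and is not blank."""
    value = fields.get(name)
    return value is not None and str(value).strip() != ""


def needs_geocoding(record, source_field="Address", result_field="Postcode"):
    """True if the record has an address but no structured result yet.

    Both field names are hypothetical; substitute the columns in your own base.
    """
    fields = record.get("fields", {})
    return is_filled(fields, source_field) and not is_filled(fields, result_field)


records = [
    {"id": "rec1", "fields": {"Address": "1 High St, London"}},
    {"id": "rec2", "fields": {"Address": "2 Main Rd, Leeds", "Postcode": "LS1 1AA"}},
    {"id": "rec3", "fields": {}},
]

# Only rec1 still needs work: rec2 is already structured, rec3 has no address.
pending_ids = [r["id"] for r in records if needs_geocoding(r)]
print(pending_ids)
```

In a workflow, the equivalent check could probably sit in an IF node placed between the Airtable list and the batching step, so that each run only picks up records that are still unprocessed.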

Any help or advice would be greatly appreciated.

PS I’m not sure how to add the actual n8n workflow without exposing my confidential data, so if there’s a way, and it’s easier to help me (as opposed to looking at an image) please advise

Many thanks

Stephen

MutedJam · August 3, 2023, 10:57am

my main issue is point 2, how to actually Batch the Airtable List, and ensure it (1) loops through all the 9000 addresses until they are complete

Hi @spessex, it seems you’re on a good path already. The only thing you’d need to add in order to loop through the rest of the items (and not just the first batch) would be to close your loop.

So something like this:

This of course uses dummy data and an IF condition that won’t work for your specific data structure, but the basic idea of this flow should still work for you.

To avoid hitting rate limits, you can reduce the batch size and increase the Wait time as needed. As for retrying, n8n lets you retry a failed execution from the failed node onwards via the execution list, so you wouldn’t have to re-run everything.
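
The batch-then-wait idea can be sketched outside n8n as follows. This is an illustrative Python model only, not the workflow itself: `process_in_batches` and `fake_geocode` are hypothetical names, and the batch size of 500 and the 60-second pause are just the figures mentioned in this thread.

```python
import time


def process_in_batches(items, handler, batch_size=500, wait_seconds=60, sleep=time.sleep):
    """Call handler(batch) for each slice of items, pausing between batches."""
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        results.extend(handler(batch))
        # Pause only if another batch is still to come.
        if start + batch_size < len(items):
            sleep(wait_seconds)
    return results


def fake_geocode(batch):
    # Stand-in for the HTTP Request node; a real call would hit the geocoder here.
    return [{"address": address, "structured": True} for address in batch]


addresses = [f"address {i}" for i in range(1200)]
pauses = []
# Record the pauses instead of actually sleeping, so the demo runs instantly.
out = process_in_batches(addresses, fake_geocode, sleep=pauses.append)
print(len(out), pauses)
```

Lowering `batch_size` or raising `wait_seconds` spreads the same number of requests over a longer period, which is the lever for staying under a rate limit.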

Hope this helps

spessex · August 3, 2023, 1:31pm

Many thanks @MutedJam. I’ve copied your flow and am going to take a look to see what I can learn.

spessex · August 4, 2023, 10:05am

Hi @MutedJam

I’ve used your Workflow to try and suit it to my own Workflow. See image attached.

It looks like it works, but my question is, what is the Merge node actually doing? Do I need it in my case or can it just go straight to set data?

So, to be sure, here is how the current Workflow goes:

1. List all the Airtable Addresses (from Address Field). Just under 9000 are returned.
2. Send those to a Batch process to split it into 500s.
3. Check IF any of them have the Address Field completed (in hindsight maybe I could put this before the Batch and after the Airtable List).
   3A. If no Address is found (FALSE) it goes to DO NOTHING.
   3B. If an Address is found (TRUE) it goes to the HTTP Request of the Here.com Geocoder for processing (in the 500 batch).
4. It then goes to MERGE node (and again I’m not sure if I need this?).
5. And then goes to Set node to set all the fields against the returned Here.com location data e.g. Country, State, Postcode, etc, etc.
6. And then updates those Airtable fields with the new data, for those associated Airtable IDs.
7. It then WAITS (for 1 min) before it loops back to the BATCH process (point 2).

Thanks

Stephen

MutedJam · August 7, 2023, 8:15am

It looks like it works, but my question is, what is the Merge node actually doing? Do I need it in my case or can it just go straight to set data?

Hi @spessex, I suspect this workflow wouldn’t work if everything goes into the false branch - the workflow would just stop in this case. Connecting the false branch to the Merge node should ensure that looping continues even in this scenario.
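
To illustrate why the false branch matters, here is a toy Python model of the loop. It is not how n8n is implemented. It simply assumes, as described above, that the loop only moves on to the next batch if at least one item makes it back to the Merge step. The name `run_loop` and the sample data are hypothetical.

```python
def run_loop(batches, has_address, connect_false_branch):
    """Toy model of the loop: return how many batches were started.

    Assumption of the model: the loop advances only if some item reaches the
    Merge step, mirroring the behaviour described in the reply above.
    """
    started = 0
    for batch in batches:
        started += 1
        true_items = [item for item in batch if has_address(item)]
        false_items = [item for item in batch if not has_address(item)]
        reaching_merge = true_items + (false_items if connect_false_branch else [])
        if not reaching_merge:
            break  # nothing flows back to the batching step, so the loop ends early
    return started


def has_address(item):
    return bool(item.strip())


batches = [["1 High St"], [""], ["2 Main Rd"]]  # the middle batch has no address

without_link = run_loop(batches, has_address, connect_false_branch=False)
with_link = run_loop(batches, has_address, connect_false_branch=True)
print(without_link, with_link)
```

In this model the run without the false-branch connection stops at the second batch and never reaches the third, while the connected version starts all three.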

spessex · August 7, 2023, 8:37am

Thanks @MutedJam, so just to confirm, is this what I should be doing (see image)?

MutedJam · August 7, 2023, 8:47am

Yep, that’s what I had in mind. You can also connect the NoOp node to the Merge node instead in case you want to keep it for workflow readability/documentation.

spessex · August 7, 2023, 8:59am

So to confirm the following is also viable?

MutedJam · August 7, 2023, 12:07pm

Yep, that’s how I think it should work, though without knowing your actual data and n8n version it’ll be hard to say with certainty. I tested my example with [email protected] fwiw

spessex · August 8, 2023, 12:02pm

Thanks @MutedJam.

PS Also for some reason I’ve just started to get an error, even though it was all working okay before.

An error has suddenly started occurring on the Batch process, and the workflow now won’t run (even though it did before). The error is “Problem running workflow: Cannot read properties of undefined (reading '0')”. See image below. I’ve even restarted the n8n instance to see if that would sort it, but to no avail.

systemaddict · August 8, 2023, 12:06pm

Couldn’t you just do this with the native batch option in the http node?

spessex · August 8, 2023, 12:28pm

Hi @systemaddict. I wasn’t even aware you could do this, but thanks for bringing it to my attention. I’ll have a play with it.
