Struggling with pagination

Hi,

I’ve been working with the HubSpot API, using the HTTP Request node, and struggling to combine my paginated results for further processing.

I’ve put together a prototype flow that uses the HubSpot search endpoint to find “Companies” that have changed recently. I’ve managed to figure out how to use the pagination, and have an IF node that stores up each of the pages, but I’m struggling to merge the pages together.

I’ve cobbled together something from other forum topics, but it isn’t working, and I’m at the limit of my JavaScript capability (as in I have none :slight_smile:).

Here is the workflow

and here is a visual of the flow:

Any guidance on how to make the final function work to combine the pages would be helpful.

Thanks
Scott

Hey @scottjscott,

I don’t have the answer to your question, but why not use the HubSpot node, which has an option to get recently created / modified companies from a certain timestamp?

Hey @Jon, the short answer is that I will probably utilise the HubSpot connectors somewhere in my workflows.

The longer answer, for another time, is that I need to get an HTTP-style solution working that will allow me to deal with pagination because of data volumes, and it won’t be enough to rely on webhooks / triggers.

Regards
Scott


That makes sense. I just wasn’t sure if taking the longer route was worth the effort if it’s something you need now, but for learning how to deal with the pages and merging, it makes sense.

What does the data hitting the Function node look like? Is it just the last page, or everything, and it just needs cleaning up?

I’ve got a working solution, which I’m just pulling together and will post here shortly - as ever, I have borrowed from a solution in another forum post :smiley:


Ok, I’ve got myself a solution that works now - I’m sure there are quicker / more optimal ways of doing this, but this will do me for now, and it was a learning exercise too :smiley:

My aim was to set up a one-way sync of new Companies created in HubSpot to a staging table in our data warehouse (Postgres). To make this work in this workflow, I’ve created an offsets table in the data warehouse to store a timestamp offset and a batch number.
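For anyone trying something similar, here is roughly how the stored offset can drive the search request body in a Function node (the property and field names here are assumptions rather than my exact workflow - adjust them to whatever your offsets table and HTTP Request node actually use):

```javascript
// A sketch of building the HubSpot CRM search body from the offsets lookup.
// Assumes the previous node returned one item holding the stored timestamp
// offset and (optionally) a paging cursor - both field names are assumed.
const offset = items[0].json.timestamp_offset; // ms timestamp from the offsets table
const after = items[0].json.after || "0";      // paging cursor from the previous call, if any

return [{
  json: {
    body: {
      filterGroups: [{
        filters: [{ propertyName: "createdate", operator: "GT", value: offset }],
      }],
      limit: 50,   // 50 items per page
      after,       // cursor for the next page
    },
  },
}];
```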

The pagination works nicely now - I’m pulling 50 items per page when I search for new companies, and I’ve managed to combine the pages using a Function node (thanks @RicardoE105, @jan for the inspiration in other forum posts).
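The combining Function node looks something like the sketch below, borrowed from the pattern posted elsewhere on the forum. It walks every run of the node that produced a page and flattens them into one list. “Fetch Companies” is just a placeholder - swap in whatever your HTTP Request node is actually called:

```javascript
// Collect the items from every run of the paging node into a single list.
const allData = [];
let runIndex = 0;

do {
  try {
    // $items(nodeName, outputIndex, runIndex) returns the items of one run.
    // If each page arrives as a single item whose json holds a `results`
    // array (as the raw HubSpot search response does), push
    // ...item.json.results instead.
    const pageItems = $items("Fetch Companies", 0, runIndex).map(item => item.json);
    allData.push(...pageItems);
    runIndex++;
  } catch (error) {
    // No more runs to read: return everything collected so far.
    return allData.map(data => ({ json: data }));
  }
} while (true);
```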

Before I insert the items into the data warehouse, I also assign them a UUID and associate them with the batch number created earlier in the workflow (this will help with data processing downstream).
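Something along these lines works for that step, assuming the crypto built-in is allowed in Function nodes (NODE_FUNCTION_ALLOW_BUILTIN=crypto) and that the batch number from the offsets lookup is already on each item as item.json.batch_number - both are assumptions, not my exact workflow:

```javascript
// Assign each company a surrogate UUID before the Postgres insert.
const { randomUUID } = require('crypto');

return items.map(item => {
  item.json.uuid = randomUUID();  // surrogate key for the staging row
  // item.json.batch_number is assumed to already be set earlier in the flow
  return item;
});
```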

There are a couple of points where I introduce waiting time so I avoid hitting the HubSpot API limits (150 requests every 10 seconds on the Professional edition, lower on the entry-level tier, so I’ve gone for 100).
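A simple way to add that pause is a Function node that resolves its items after a delay - a minimal sketch, assuming the Function node accepts a returned Promise (which is how other forum posts have done delays):

```javascript
// Pause ~10 seconds before passing the items on, to stay under the
// per-10-second request limit.
return new Promise(resolve => {
  setTimeout(() => resolve(items), 10 * 1000);
});
```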

This data is going to be part of a data model build that uses dbt, so I’ve just inserted the HubSpot properties retrieved from the API into a single JSON column in the database (I’ll disassemble it later).
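The shaping before the Postgres node is roughly the sketch below - the column names (properties_json in particular) are assumptions, so rename them to match your staging table:

```javascript
// Flatten each company into the columns the staging table expects, with the
// raw HubSpot properties kept as one JSON string for dbt to unpack later.
return items.map(item => ({
  json: {
    uuid: item.json.uuid,
    batch_number: item.json.batch_number,
    properties_json: JSON.stringify(item.json.properties),
  },
}));
```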

Now that I’ve gone back over this, I can see a couple of areas with room for improvement, especially when the API returns no items: the workflow just stops at the Combine Companies node. It still works, but I need to improve on that in the future.

Anyway, I’ve learnt more about n8n in the process and I’ve got a working solution that, with a few tweaks, will get us synchronising HubSpot entities to our data warehouse for use in real customer analytics.

I also learnt that when you see workflows you want to try from the forums, you can simply copy them and then paste them onto the workflow canvas! That has saved a lot of time!!!


Wow, this is super helpful. And the copy function is so useful!
