HTTP Request : Looping over REST JSON paginated results

Hi,

I use HTTP Request node to get JSON from a rest API looking like the following :

{
  "count": 0,
  "first": "string",
  "prev": "string",
  "next": "string",
  "api_version": "string",
  "data": [
    {...},
    {...},
    ...
  ]
}

This API replies with paginated results and expose the URL of the next results page in a “next” property of the current JSON page. If I want to get all results I need to loop over all pages using the URL located in “next” property

I looked at the topic Looping http request node but it seems does not fit this case.

What would be the good approach in my case ?

Thanks a lot.

In this case you can simply create an actual loop like this:

So you first set the “next” property yourself, for the first request/iteration and then you keep on looping over as long as it returns a “next” property (in the example bellow it never returns it as the API does not work that way, really just an example).

Here the example workflow:

Great @jan thank you,

Here’s a full working example :

The only problem I stumbled upon from your example is that the IF node expression …

 "value1": "={{$node[\"Do Something with data\"].json[\"response\"].hasOwnProperty('next')}}",

… looked to need to return an acutal boolean to work :

 "value1": "={{$node[\"Do Something with data\"].json[\"response\"].hasOwnProperty('next') !== null}}",

Thanks a lot for your great work

1 Like

Thank you for this example and @slavielle for updating.

I have a follow-up question: Is there currently a nice way to aggregate/append/push/merge the data?

Let’s say HTTP Request runs three times, returning this each iteration:

{
  "data": [
    {
      "id": "1",
      "attributes": {
        "title": "Post 1"
      }
    },
    {

      "id": "2",
      "attributes": {
        "title": "Post 2"
      }
    }
  ]
}

Can you think of an easy way to combine those into a JSON list of 6 titles to be used further down the workflow?

Since we are dealing with a large (in reality unknown) number of iterations, the Merge node does not seem to be the right thing. I tried to rig something together using $items in a functions node to get the response data from all the runs, but couldn’t get it to work.

I would greatly appreciate a pointer in the right direction! :slight_smile: Thanks for your awesome work!

1 Like

I actually use a DB server to just hold all data (have not managed to do this w/ only Merge as the workflow stops as the data is very big), and once all data is captured, process all of it from the DB

Hey @simon, I did it with a function node. Check the example below.

{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [
        250,
        300
      ]
    },
    {
      "parameters": {
        "functionCode": "return [\n  {\n    json: {\n      url: 'https://mockup-pj5l0yxsjrbr.runkit.sh/'\n    } \n  },\n      {\n    json: {\n      url: 'https://mockup-pj5l0yxsjrbr.runkit.sh/'\n    } \n  },\n    {\n    json: {\n      url: 'https://mockup-pj5l0yxsjrbr.runkit.sh/'\n    } \n  }\n];"
      },
      "name": "Function",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        450,
        300
      ]
    },
    {
      "parameters": {
        "url": "={{$node[\"Function\"].json[\"url\"]}}",
        "options": {}
      },
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 1,
      "position": [
        650,
        300
      ]
    },
    {
      "parameters": {
        "functionCode": "const response = [];\nfor (const item of items) {\n    for (const results of item.json.data) {\n      response.push({\n        json: {\n          title: results.attributes.title\n        }\n      })\n    }\n}\n\nreturn response;"
      },
      "name": "Function1",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        850,
        300
      ]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          {
            "node": "Function",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Function": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "Function1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Hey everyone, thanks for your responses!

@tony: yeah, I figured I could do something like that but it just seems to be unnecessarily complicated. But thanks for posting an example schema, will certainly help if I have to go down that route…

@RicardoE105: Let’s say we have a similar API like jan and slavielle were discussing above that responds with links to the next page of entries. Jan suggested the looping between HTTP Request and If, which certainly works but I couldn’t figure out how to merge the resulting six responses into one. I’ve prepared an example. If you have any idea I would love to hear it! :slight_smile:

Best, Simon

{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [
        250,
        300
      ]
    },
    {
      "parameters": {
        "url": "={{$json[\"links\"][\"next\"]}}",
        "options": {},
        "queryParametersUi": {
          "parameter": []
        }
      },
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 1,
      "position": [
        860,
        300
      ]
    },
    {
      "parameters": {
        "conditions": {
          "boolean": [],
          "string": [
            {
              "value1": "={{Object.keys($node[\"HTTP Request\"].json[\"links\"]).join(', ')}}",
              "operation": "contains",
              "value2": "next"
            }
          ]
        }
      },
      "name": "IF",
      "type": "n8n-nodes-base.if",
      "typeVersion": 1,
      "position": [
        860,
        530
      ],
      "alwaysOutputData": false
    },
    {
      "parameters": {
        "values": {
          "string": [
            {
              "name": "links.next",
              "value": "https://discuss.flarum.org/api/discussions?filter[q]=+tag:blog&fields[discussions]=title&page[size]=10"
            }
          ]
        },
        "options": {}
      },
      "name": "Set Initial URL",
      "type": "n8n-nodes-base.set",
      "typeVersion": 1,
      "position": [
        560,
        300
      ]
    },
    {
      "parameters": {},
      "name": "All pages fetched",
      "type": "n8n-nodes-base.noOp",
      "typeVersion": 1,
      "position": [
        1220,
        550
      ]
    },
    {
      "parameters": {
        "functionCode": "const response = [];\nfor (const item of items) {\n    for (const results of item.json.data) {\n      response.push({\n        json: {\n          title: results.attributes.title\n        }\n      })\n    }\n}\n\nreturn response;"
      },
      "name": "merge results ??",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        1220,
        300
      ]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          {
            "node": "Set Initial URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "IF",
            "type": "main",
            "index": 0
          },
          {
            "node": "merge results ??",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "IF": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "All pages fetched",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Initial URL": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

I opened up my own topic discussing this, and the only solution I could come up with was to have the If node that does the pagination loop on the True branch, linked to the aggregation node from the False branch. Then the code in the aggregation branch can refer to all the executions of the HTTP Request node.

It’s non-intuitive, but it works.