Need Looping Idea

Hi Everyone,

My team and I created a workflow to scrape webpage data using the ScrapingBee API key. The issue we’re facing is that the website we’re scraping uses a bot detection mechanism. To bypass this, we use cookies.

Here’s the problem: for the first URL, we provide our own cookie, and the scraping is successful. In the output, we receive a cookie from ScrapingBee. For the next URL, the workflow should use the cookie from the previous output instead of our original cookie.

In simple terms:

  1. Get a cookie from the browser for the first request.
  2. Use the cookie returned from the first request in your second request.
  3. Use the cookie from the second request in the third one, and so on.

Could anyone please share how to set that up?

n8n version: 1.71.3
Running n8n via: n8n cloud

Hey @Prem7, try adding an additional “hanging” node after the HTTP Request node (say, called “Cookie”) to collect the returned cookie for use in the next iteration. Then, if it is not the first run, you can reference that cookie with an expression like this: {{ $runIndex ? $('Cookie').first(0, $runIndex - 1).json.cookie : 'INITIAL_COOKIE' }}.

Here’s just a visual to demonstrate the idea. You can run it to observe how the session cookie gets updated, taking its value from what the “Cookie” node stored in the previous iteration.
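For readers who want to see the lookup logic outside n8n, here is a minimal plain-JavaScript sketch of what the expression is meant to do. It is a model only, not n8n API code: `cookieForRun`, `simulate`, and `fakeRequest` are hypothetical names, and `cookieRuns[i]` stands in for whatever the “Cookie” node stored on run `i`.

```javascript
// Plain-JavaScript model of the run-indexed lookup (not n8n API code).
// cookieRuns[i] stands for the cookie the "Cookie" node stored on run i.
function cookieForRun(runIndex, cookieRuns, initialCookie) {
  // Run 0 has no previous run, so fall back to the browser cookie.
  if (runIndex === 0) return initialCookie;
  // Every later run reads the cookie stored by the run before it.
  return cookieRuns[runIndex - 1];
}

// Hypothetical stand-in for the HTTP Request node: returns a new cookie.
function fakeRequest(url, cookie) {
  return `cookie-from-${url}`;
}

// Walk through the URLs and record which cookie each request would send.
function simulate(urls, initialCookie) {
  const cookieRuns = [];
  const sent = [];
  urls.forEach((url, runIndex) => {
    const cookie = cookieForRun(runIndex, cookieRuns, initialCookie);
    sent.push(cookie);
    cookieRuns.push(fakeRequest(url, cookie));
  });
  return sent;
}

console.log(simulate(["a", "b", "c"], "INITIAL_COOKIE"));
```

In this model the first request sends the initial cookie, and each later request sends the cookie produced by the request before it, which is the chaining described in the original question.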

Thanks for the help, @ihortom!

In the output, the cookies appear at a different index each time, so I used the cookie's name to find it regardless of its position, but the result comes back as null/undefined. What should I do here?

@ihortom I have figured out how to get the specific cookie, but now the “Scrapingbee Cookie” node (a Set node) is not executing.

I used this expression to get the specific cookie I needed:

{{ $item("0").$node["ScrapingBee - Get page screenshot"].json["cookies"].find(cookie => cookie.name === 'datadome' && !cookie.session).value }}
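One thing to keep in mind with this kind of lookup: `.find()` returns `undefined` when nothing matches, so reading `.value` directly fails in that case. Below is a minimal plain-JavaScript sketch of a guarded version. The helper name `findCookieValue` and the sample data are made up for illustration; the `{ name, value, session }` shape is assumed from the expression above.

```javascript
// Return the value of a named, non-session cookie, or null when absent.
// The { name, value, session } shape is assumed from the expression above.
function findCookieValue(cookies, name) {
  // Tolerate a missing cookies array instead of throwing.
  const match = (cookies ?? []).find((c) => c.name === name && !c.session);
  // match is undefined when no cookie qualifies, so guard the access.
  return match?.value ?? null;
}

// Illustrative data only; the position of the cookie does not matter.
const sample = [
  { name: "other", value: "x", session: true },
  { name: "datadome", value: "abc123", session: false },
];

console.log(findCookieValue(sample, "datadome")); // abc123
console.log(findCookieValue(sample, "missing")); // null
```

Returning `null` instead of throwing makes it easier to tell a genuinely missing cookie apart from a wiring problem in the workflow.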

Here is the expression used in the Set Fields node:

datadome={{ $runIndex ? $('Scrapingbee Cookie').first(0, $runIndex - 1).json.cookie : "INITIAL COOKIE" }}

I'm getting this error:

Referenced node is unexecuted
An expression references the node ‘Scrapingbee Cookie’, but it hasn’t been executed yet. Either change the expression, or re-wire your workflow to make sure that node executes first.

The solution I offered was based on your statement “for the first URL, we provide our own cookie”. That is what “INITIAL COOKIE” is for. Your screenshot does not match my solution with the “hanging” node: for some reason you looped it back to the Set node. It also matters that the hanging node is on the top branch.
