Can't loop site scrape workflow

Describe the problem/error/question

I have a basic site scraping process set up that does work. Here is the workflow code:

However, when I ask the process to loop back through, it fails on the HTTP request every time with an error.

The way the workflow works is that it pulls in a list of URLs via Google Sheets. The Edit Fields node sets each of those URLs as a new field called “ScrapeURL.”

The Wait node waits 30 seconds in between requests and then leads into the HTTP Request node, which is set to “Execute Once.”

The HTTP Request node pulls the URL in from the ScrapeURL field.
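In the URL field of that node, that translates to an expression along these lines (assuming the field is named exactly ScrapeURL):

{{ $json.ScrapeURL }}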

Like I said, it works once, but it fails when the loop comes back around to the HTTP Request node after completing the first URL in the list.

What is the error message (if any)?

This is the error message:

{
  "errorMessage": "Bad request - please check your parameters",
  "errorDescription": "Bad Request",
  "errorDetails": {
    "rawErrorMessage": [
      "400 - \"{\\\"success\\\":false,\\\"error\\\":\\\"Bad Request\\\",\\\"details\\\":[{\\\"validation\\\":\\\"url\\\",\\\"code\\\":\\\"invalid_string\\\",\\\"message\\\":\\\"Invalid url\\\",\\\"path\\\":[\\\"url\\\"]},{\\\"code\\\":\\\"custom\\\",\\\"message\\\":\\\"URL must have a valid top-level domain or be a valid path\\\",\\\"path\\\":[\\\"url\\\"]},{\\\"code\\\":\\\"custom\\\",\\\"message\\\":\\\"Invalid URL\\\",\\\"path\\\":[\\\"url\\\"]}]}\""
    ],
    "httpCode": "400"
  },
  "n8nDetails": {
    "nodeName": "HTTP Request",
    "nodeType": "n8n-nodes-base.httpRequest",
    "nodeVersion": 4.2,
    "itemIndex": 0,
    "time": "7/2/2025, 12:47:25 PM",
    "n8nVersion": "1.100.1 (Cloud)",
    "binaryDataMode": "filesystem",
    "stackTrace": [
      "NodeApiError: Bad request - please check your parameters",
      "    at ExecuteContext.execute (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-nodes-base@file+packages+nodes-base_@[email protected]_asn1.js@5_1af219c3f47f2a1223ec4ccec249a974/node_modules/n8n-nodes-base/nodes/HttpRequest/V3/HttpRequestV3.node.ts:780:15)",
      "    at processTicksAndRejections (node:internal/process/task_queues:105:5)",
      "    at WorkflowExecute.runNode (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@[email protected][email protected][email protected][email protected]_/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1193:9)",
      "    at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@[email protected][email protected][email protected][email protected]_/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1542:27",
      "    at /usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@[email protected][email protected][email protected][email protected]_/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:2108:11"
    ]
  }
}

Basically, it keeps acting like there’s something wrong with the URL it’s pulling in. But the basic workflow works, so I know the URLs themselves aren’t the problem.

The AI assistant keeps suggesting there is an error with the URLs.

Please share your workflow

I pasted the code above… here is a picture of the basic workflow:


Share the output returned by the last node

Information on your n8n setup

  • n8n version: 1.100.1
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

Why do you need the loop over items in the workflow?
Can you provide a few URLs to test with?

Obvious follow-up question!

I have a list of URLs I want to scrape, but I don’t want to process them all at once for a few reasons:

  1. I have a free account with Firecrawl, so I’m limited in my requests
  2. I want to stand up a prototype of this scraper before I subscribe to Firecrawl

Also, wouldn’t the scraping be more likely to succeed if I spaced the requests out? That’s the main reason I want to loop it and process the URLs one at a time.

Well, the error from the original post says:
URL must have a valid top-level domain or be a valid path

so this isn’t likely to be caused by rate limiting.

As for spacing the requests out - there is an option for that in the HTTP Request node.

This configuration will send requests in batches of 1, with 30-second gaps in between.
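In the node’s exported JSON, those batching options end up looking roughly like this (interval is in milliseconds; exact key names can vary between node versions, so double-check against your own export):

{
  "parameters": {
    "url": "={{ $json.ScrapeURL }}",
    "options": {
      "batching": {
        "batch": {
          "batchSize": 1,
          "batchInterval": 30000
        }
      }
    }
  }
}

With batching handling the spacing, the separate Wait node shouldn’t be needed for that purpose.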


At first your suggestion resulted in just ONE URL being scraped… but then I remembered I had that “Execute Once” setting turned on, which limits the node to only the first incoming item. Turned that off, and boom! Working like a charm.

Thanks so much!