HTTP Request Node Encountering Error When URL Returns 3XX Redirect Status

Make a workflow with the HTTP Request node and use any of the below websites for the URL:

If you inspect the network traffic when requesting these websites in a browser, you’ll notice that the first request returns a 3XX status, but the website still loads successfully.

In the n8n HTTP Request node, a request to these websites fails with the following error:

ERROR: Forbidden - perhaps check your credentials?
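403 - "<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body>\r\n<center><h1>403 Forbidden</h1></center>\r\n<hr><center>cloudflare</center>\r\n</body>\r\n</html>\r\n"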

We are using the default redirect settings: Follow Redirects is enabled, with a maximum of 21 redirects.

For me, both links work fine. Maybe the sites have IP restrictions.

Hey @First_Spark_Digital - Can you provide the JSON output for your workflow, or specifically for that node? I can take a look and see whether any configuration issues would be causing that.

Hi @workflowsy,

Here’s the JSON output for that node:

[
  {
    "error": {
      "message": "403 - \"<html>\\r\\n<head><title>403 Forbidden</title></head>\\r\\n<body>\\r\\n<center><h1>403 Forbidden</h1></center>\\r\\n<hr><center>cloudflare</center>\\r\\n</body>\\r\\n</html>\\r\\n\"",
      "name": "AxiosError",
      "stack": "AxiosError: Request failed with status code 403\n    at settle (/usr/local/lib/node_modules/n8n/node_modules/axios/lib/core/settle.js:19:12)\n    at RedirectableRequest.handleResponse (/usr/local/lib/node_modules/n8n/node_modules/axios/lib/adapters/http.js:537:9)\n    at RedirectableRequest.emit (node:events:529:35)\n    at RedirectableRequest.emit (node:domain:489:12)\n    at RedirectableRequest._processResponse (/usr/local/lib/node_modules/n8n/node_modules/follow-redirects/index.js:398:10)\n    at ClientRequest.RedirectableRequest._onNativeResponse (/usr/local/lib/node_modules/n8n/node_modules/follow-redirects/index.js:91:12)\n    at Object.onceWrapper (node:events:632:26)\n    at ClientRequest.emit (node:events:529:35)\n    at ClientRequest.emit (node:domain:489:12)\n    at HTTPParser.parserOnIncomingClient (node:_http_client:700:27)\n    at HTTPParser.parserOnHeadersComplete (node:_http_common:119:17)\n    at TLSSocket.socketOnData (node:_http_client:541:22)\n    at TLSSocket.emit (node:events:517:28)\n    at TLSSocket.emit (node:domain:489:12)\n    at addChunk (node:internal/streams/readable:368:12)\n    at readableAddChunk (node:internal/streams/readable:341:9)\n    at TLSSocket.Readable.push (node:internal/streams/readable:278:10)\n    at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)\n    at TLSWrap.callbackTrampoline (node:internal/async_hooks:128:17)\n    at Axios.request (/usr/local/lib/node_modules/n8n/node_modules/axios/lib/core/Axios.js:45:41)\n    at processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at requestFn (/usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/NodeExecuteFunctions.js:551:33)\n    at proxyRequestToAxios (/usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/NodeExecuteFunctions.js:554:26)\n    at Object.request (/usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/NodeExecuteFunctions.js:1917:50)",
      "code": "ERR_BAD_REQUEST",
      "status": 403
    }
  }
]

And here’s a sample of the flow that I’m using for this:

{
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "cec86eeddef9ce9510df5fa3d831594fc68385b4da5fcc5a2c862e33863b9ffe"
  },
  "nodes": [
    {
      "parameters": {
        "operation": "appendOrUpdate",
        "documentId": {
          "__rl": true,
          "value": "",
          "mode": "list",
          "cachedResultName": "Website Clarity Analyzer",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d//edit?usp=drivesdk"
        },
        "sheetName": {
          "__rl": true,
          "value": "gid=0",
          "mode": "list",
          "cachedResultName": "Sheet1",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d//edit#gid=0"
        },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "URL": "={{ $json.data['Website Address'] }}",
            "Email": "={{ $json.data['Email Address'] }}",
            "Name": "={{ $json.data.Name }}",
            "_id": "={{ $json._id }}",
            "Source": "={{ $json.data.utm_source }}",
            "Medium": "={{ $json.data.utm_medium }}",
            "Campaign": "={{ $json.data.utm_campaign }}",
            "Date": "={{ $json.d }}"
          },
          "matchingColumns": [
            "_id"
          ],
          "schema": [
            {
              "id": "Date",
              "displayName": "Date",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "_id",
              "displayName": "_id",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Name",
              "displayName": "Name",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Email",
              "displayName": "Email",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "URL",
              "displayName": "URL",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Source",
              "displayName": "Source",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Medium",
              "displayName": "Medium",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Campaign",
              "displayName": "Campaign",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Slug",
              "displayName": "Slug",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            },
            {
              "id": "Scores",
              "displayName": "Scores",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true,
              "removed": false
            }
          ]
        },
        "options": {}
      },
      "id": "",
      "name": "Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4,
      "position": [
        560,
        500
      ],
      "alwaysOutputData": true,
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "",
          "name": "Google Sheets account"
        }
      }
    },
    {
      "parameters": {
        "authentication": "oAuth2",
        "site": "65424675d0dca456228f1713"
      },
      "id": "",
      "name": "Webflow Trigger",
      "type": "n8n-nodes-base.webflowTrigger",
      "typeVersion": 1,
      "position": [
        380,
        500
      ],
      "webhookId": "",
      "credentials": {
        "webflowOAuth2Api": {
          "id": "",
          "name": "Webflow account 2"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "return items.map(item => {\n  if (item.json.URL) {\n    let url = item.json.URL; // Corrected to 'URL'\n    // 'www.' in the URL causes cloudflare 403 error\n    url = url.replace('www.', '');\n    if (!url.startsWith('http://') && !url.startsWith('https://')) {\n      url = 'https://' + url;\n    }\n    return { json: { ...item.json, URL: url } }; // Updated to 'URL'\n  } else {\n    // If URL is not provided, return the item as is\n    return { json: item.json };\n  }\n});\n"
      },
      "id": "fef9970f-4f86-4a49-b924-ed1b24203dc9",
      "name": "Code",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        780,
        700
      ]
    },
    {
      "parameters": {
        "url": "={{ $json.URL }}",
        "options": {}
      },
      "id": "",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [
        960,
        700
      ],
      "retryOnFail": true,
      "alwaysOutputData": false,
      "onError": "continueRegularOutput"
    }
  ],
  "connections": {
    "Google Sheets": {
      "main": [
        [
          {
            "node": "Code",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Webflow Trigger": {
      "main": [
        [
          {
            "node": "Google Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "pinData": {}
}
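The normalization step inside the Code node is hard to read as an escaped JSON string. As a rough illustration only, here is what that step does, rewritten in Python. The function name `normalize_url` and the example URLs are made up for this sketch; the logic mirrors the `jsCode` above (drop the first `www.`, then make sure a scheme is present).

```python
def normalize_url(raw_url: str) -> str:
    """Mirror the Code node: drop the first 'www.' and ensure a scheme.

    Hypothetical helper for illustration; it is not part of n8n.
    """
    # JavaScript's String.replace with a string pattern replaces only the
    # first match, so count=1 keeps the behavior the same here.
    url = raw_url.replace("www.", "", 1)
    # Prepend https:// only when no scheme is present.
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    return url


if __name__ == "__main__":
    for example in ("www.example.com", "http://www.example.com/page", "example.com"):
        print(example, "->", normalize_url(example))
```

Note that, like the original, this removes the first `www.` wherever it occurs in the string, not only at the start of the hostname, so parsing the hostname first would be the safer variant.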

Hey @barn4k, did you access via a simple HTTP Request? Because if you do and you use URL = https://majorcustomcable.com/ it will fail.

But if you input that into your URL bar (browser), it will work.

I used the HTTP Request node and it works. If you check the error, you will see a message in the body that mentions Cloudflare. So it seems it is blocking your n8n instance’s IP, or it somehow detects that the request is programmatic.
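The check described here (look at the error body for a Cloudflare marker) can be sketched as a small heuristic. This is only an assumption-laden illustration: the function name is hypothetical, the item shape is taken from the node output posted earlier in the thread, and a 403 that mentions "cloudflare" is a hint rather than proof of bot protection.

```python
from typing import Any, Mapping


def looks_like_cloudflare_block(item: Mapping[str, Any]) -> bool:
    """Heuristic: 403 status plus 'cloudflare' somewhere in the error message.

    Hypothetical helper; the {"error": {"status", "message"}} shape is
    assumed from the HTTP Request node output shown in this thread.
    """
    error = item.get("error") or {}
    status = error.get("status")
    message = str(error.get("message", "")).lower()
    return status == 403 and "cloudflare" in message


if __name__ == "__main__":
    sample = {
        "error": {
            "message": '403 - "<html><hr><center>cloudflare</center></html>"',
            "status": 403,
        }
    }
    print(looks_like_cloudflare_block(sample))
```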

Hey @First_Spark_Digital - So I did a bunch of digging and @barn4k is right that it is Cloudflare’s bot protection that is causing the issues here.

Essentially, Cloudflare is blocking the axios library that the HTTP Request node uses. I thought setting a user agent would do the trick (I tested a similar method using Python and it worked), but changing the user agent via headers in n8n still doesn’t work. I’m testing from a local self-hosted Docker version of n8n, so I know it’s not an IP issue.

My next thought was to use Python’s requests library for this, but n8n’s site explicitly says you can’t make HTTP requests through Python code and that you need to use the HTTP Request node for that.

So, where does that leave you? There are a few options. There are web scraping services that you could use to get around Cloudflare’s bot protection, but most have a cost associated with them. Or you could create something like an AWS Lambda function that runs your code via an API call and passes the response back for you to use in your flow. Overengineered? Yeah, probably, but there’s no subscription pricing or anything of the sort for something like that.

Either way, I’m sorry this is such a hassle. I’d be happy to answer questions about how to get the AWS setup (or, more generally, an external function) working, or how to use a third-party scraping service to get this done!

Reference Working (just not in n8n) Python Request Code:

import requests

url = 'https://www.majorcustomcable.com/'

headers = {
    'content-type': 'application/json',
    'accept': '*/*',
    'sec-fetch-site': 'same-origin',
    'accept-language': 'en-US,en;q=0.9',
    'sec-fetch-mode': 'cors',
    'origin': 'https://google.com',
    'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 16_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.4 Mobile/15E148 Safari/604.1',
    'referer': 'https://google.com/',
    'sec-fetch-dest': 'empty',
}

response = requests.get(url, headers=headers)

# To check if the request was successful
if response.status_code == 200:
    print("Request was successful.")
    # To print the content of the response
    # print(response.text)
else:
    print(f"Request failed with status code: {response.status_code}")
    print(response.text)

Yeesh, what a headache just for a simple GET. Alright, the AWS option sounds enticing if I can get help, thank you.

And are we sure this has nothing to do with https://www.sampleurl.com vs. https://sampleurl.com, i.e. the site being blocked merely because of a redirect issue?

The reason I ask is that we suspected this, since testing with the HTTP Request node gives different results for the two forms.

This fails: https://majorcustomcable.com/ and gives a 403 ERROR
This succeeds: https://www.majorcustomcable.com/

So we built an IF/THEN node that checks whether the HTTP Request failed with a 403 error and, if it did, retries on another HTTP Request node with www. inserted. This works, but only when I manually “push” the node.
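Outside n8n, the retry idea described above can be sketched as follows. This is a minimal illustration under stated assumptions: `toggle_www` and `fetch_with_www_fallback` are hypothetical names, the `fetch` callable stands in for whatever performs the real request, and the URL is assumed to have a plain hostname (no credentials or port).

```python
from typing import Callable, Tuple
from urllib.parse import urlsplit, urlunsplit

# A fetch function takes a URL and returns (status_code, body).
Fetch = Callable[[str], Tuple[int, str]]


def toggle_www(url: str) -> str:
    """Add 'www.' to the hostname if missing, or strip it if present."""
    parts = urlsplit(url)
    host = parts.netloc
    host = host[4:] if host.startswith("www.") else "www." + host
    return urlunsplit(parts._replace(netloc=host))


def fetch_with_www_fallback(url: str, fetch: Fetch) -> Tuple[str, int, str]:
    """Try the URL as given; on a 403, retry once with 'www.' toggled."""
    status, body = fetch(url)
    if status == 403:
        alternate = toggle_www(url)
        status, body = fetch(alternate)
        return alternate, status, body
    return url, status, body


if __name__ == "__main__":
    # Stub that imitates the behavior reported in this thread: the bare
    # domain is rejected, the www. form is accepted. No network is used.
    def stub_fetch(url: str) -> Tuple[int, str]:
        return (200, "ok") if "//www." in url else (403, "blocked")

    print(fetch_with_www_fallback("https://majorcustomcable.com/", stub_fetch))
```

The single retry is deliberate: if both forms return 403, the caller gets the second 403 back and can decide what to do next.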

It doesn’t work when it’s running automatically. Any thoughts?

Here’s that workflow broken out:

@First_Spark_Digital It honestly just depends on how they have their Cloudflare bot protection rules configured. www may work for some sites but not for others, depending on how they’ve got things set up.

I think it really comes down to how critical it is that this thing works consistently, as opposed to a best-effort attempt where if it works, great, and if not, you move on to the next site in the sheet.

In regard to AWS, there’s a fair amount of setup. If it’s a path you want to go down, let me know and I can try to put together a high-level guide of what that looks like.

OK. Well, is there any reason that IF/Else wouldn’t work correctly, and would only work when I manually click “play” on it vs. when it runs automatically?

Otherwise, this is unfortunately the cornerstone of the tool we’ve built. Being able to ping the site and then extract the data is the basis of it, so it needs to be the one thing that has a 100% hit rate.

Yeah, if you can help with AWS and that would get us closer to 100%, that would be excellent. Let me know where to send the coffee :wink:

@workflowsy, what if I up it to 2 coffees? :slight_smile:

Ahh shoot! My apologies, let me see what I can put together in the next day or so if that works!

@First_Spark_Digital - haven’t forgotten about you! I was able to get the whole thing up and running! That said, it’s not for the faint of heart. At its core, it takes the URL and passes it to an AWS Lambda function, which runs a GET request and returns the response. This Lambda invocation can be integrated into any workflow in n8n pretty easily.

Workflow:

Response from Lambda in n8n:

AWS Components:

I created a GitHub repo (Workflowsy-io/spark-digital-lambda-n8n-poc) with all the necessary code; it outlines all the steps I took (beyond creating an AWS account initially).
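For readers who want a feel for the shape of such a function before opening the repo, here is a minimal sketch. It is not the code from that repo. The event key `url`, the response fields, and the injectable `fetcher` parameter are assumptions made for illustration, and it uses only the Python standard library. Whether a given site accepts the request still depends on how its Cloudflare rules are configured.

```python
import json
import urllib.error
import urllib.request

# Browser-like headers; the user agent is the one from the Python snippet
# earlier in this thread. Which headers a site requires is site-specific.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 16_4_1 like Mac OS X) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.4 "
        "Mobile/15E148 Safari/604.1"
    ),
    "Accept": "*/*",
}


def fetch(url, timeout=10.0):
    """GET the URL and return (status, body); HTTP errors are not raised."""
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status and a body worth returning.
        return exc.code, exc.read().decode("utf-8", errors="replace")


def lambda_handler(event, context, fetcher=fetch):
    """Assumed event shape: {"url": "https://..."} (hypothetical)."""
    url = (event or {}).get("url")
    if not url or not url.startswith(("http://", "https://")):
        return {"statusCode": 400, "body": json.dumps({"error": "missing or invalid url"})}
    try:
        status, body = fetcher(url)
    except urllib.error.URLError as exc:
        # DNS failures, refused connections, timeouts, and similar.
        return {"statusCode": 502, "body": json.dumps({"error": str(exc.reason)})}
    return {"statusCode": 200, "body": json.dumps({"url": url, "status": status, "html": body})}
```

The `fetcher` parameter exists only so the handler can be exercised locally with a stub, for example `lambda_handler({"url": "https://example.com/"}, None, fetcher=lambda u: (200, "<html></html>"))`. AWS itself calls the handler with just `event` and `context`.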

I’d be happy to help get you all set up with this if it’s of interest, as AWS in itself can be a bit much. Just let me know if you have any questions or how I can help, and we can go from there!

You’re the man! I’ve created the AWS account to start. Going to see if I can get my coder friend to assist. Will let you know if I need more assistance (or he can’t help). Thanks again and will report back when I get it working!

Love it, definitely keep me posted how it all goes and don’t hesitate to reach out!

So looks like we’re getting some errors when we’re trying to upload the zip file.

This part of the .yaml file seems to be an issue:

LambdaPolicyAttachment:
  Type: 'AWS::IAM::ManagedPolicyAttachment'
  Properties:
    PolicyArn: 'arn:aws:iam::aws:policy/AWSLambda_FullAccess'
    Target: !Ref LambdaUser

Any ideas why we might be encountering this error?

Ahh shoot, I’m sorry. It looks like my last commit to GitHub didn’t go up. I’ve now committed the latest code and it should work. Just note: if you’re doing this through the AWS console, it probably will not work, because the AWS S3 bucket and Lambda layer are created by the shell script in the repo, and without those the Lambda function will not run properly.

@First_Spark_Digital - I’m happy to deploy it for you all if you are able to provide credentials for the AWS account. Feel free to message me directly here if that is easiest.

@First_Spark_Digital - What’s the good word? Were you able to get it all figured out?