I'm trying to execute a GET request in an HTTP Request node to scrape a website. This has worked flawlessly for as long as I can remember, and I have been able to scrape data off 99% of websites this way.
Last night I suddenly started receiving 403 errors when executing it, and since then I cannot get this node working no matter what I try. I have troubleshot with Claude for hours but have not found a solution.
What's even weirder is that the exact same flow on Make.com works flawlessly, which makes me believe it is an n8n-specific issue.
What is the error message (if any)?
403 Error
Share the output returned by the last node
Shown above.
Information on your n8n setup
n8n version: 1.76.1
Database (default: SQLite): Not necessary for this
n8n EXECUTIONS_PROCESS setting (default: own, main): not necessary for this
Running n8n via (Docker, npm, n8n cloud, desktop app): Docker on DigitalOcean
Site owners generally don’t like their sites being scraped, and in most cases they come up with ways to protect themselves from it as much as possible. Because of that, scraping tools try their best to hide from websites that they are scrapers, but bot detection systems are getting better. If you share the website, we can take a look, but I am pretty sure this is going to be exactly that - bot detection.
Is it possible for you to share the HTTP GET request with us, so we can check it on our setup? If it's working on Make.com, then it might be an issue with the IP address.
Are you using a static or dynamic IP on your MacBook M3 Pro? Most likely your IP got blocked by the website, since you're using a personal IP address and sending requests in bulk.
Try to refresh your IP address from the terminal and try again, or use a VPN for testing.
I get that, no doubt, but it had been flawless for an extremely long time (several months) and just randomly stopped working. I will try a VPN and see if that fixes it; I'm hoping this isn't something that will be an annoyance long term, though.
A lot of things are... until they aren’t. For instance, when the amount of bot traffic makes the business question its approach to securing its pages and endpoints.
This may or may not help, to be honest. Sites detect bots with a range of different techniques, not only the source IP.
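Since the source IP is only one signal, it can help to compare what the working client sends with what the failing one sends. Below is a minimal Python sketch for that comparison. The function name diff_headers and the sample header values are made up for illustration; the real values would have to be captured from your own Make.com and n8n requests.

```python
def diff_headers(working: dict, failing: dict) -> dict:
    """Return the headers whose values differ between two requests.

    Header names are compared case-insensitively. Each entry maps the
    lowercased header name to a (working_value, failing_value) tuple,
    with None standing in for a header that one side did not send.
    """
    w = {name.lower(): value for name, value in working.items()}
    f = {name.lower(): value for name, value in failing.items()}
    return {
        name: (w.get(name), f.get(name))
        for name in sorted(w.keys() | f.keys())
        if w.get(name) != f.get(name)
    }


# Hypothetical captured headers, for illustration only.
working_headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "text/html",
    "Accept-Language": "en-US",
}
failing_headers = {
    "user-agent": "example-client/1.0",
    "accept": "text/html",
}

for name, (good, bad) in diff_headers(working_headers, failing_headers).items():
    print(f"{name}: working={good!r} failing={bad!r}")
```

Any header that shows up here is a candidate to set explicitly on the failing request. Matching headers will not help if the site relies on signals that are not visible in the headers at all.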
If you’ve been using it for the past months and only hit this issue recently, then please take another look at the HTTP GET request. The site may now expect new parameters or headers. Try this:
Go to the website → open the dev console
Send the GET request → look for the request in the Network tab
Right-click the request and click “Copy as PowerShell”
Ask an AI to generate Python code based on the PowerShell command, including headers and cookies.
Send the request again with the generated code.
I am confident that the steps above will solve your 403 problem. Thanks
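For reference, the generated code from the steps above would look roughly like this. It is only a sketch using Python's standard library: the URL, header values, and cookie names are placeholders (assumptions, not taken from any real site), and you would paste in the ones copied from the Network tab.

```python
import urllib.error
import urllib.request


def build_request(url, headers, cookies=None):
    """Build a GET request carrying browser-like headers and cookies."""
    req = urllib.request.Request(url, method="GET")
    for name, value in headers.items():
        req.add_header(name, value)
    if cookies:
        # Cookies travel in a single "Cookie" header as name=value pairs.
        cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
        req.add_header("Cookie", cookie_header)
    return req


def fetch_status(req, timeout=15.0):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses such as 403; report the code instead.
        return err.code


# Placeholder values: replace with what the browser actually sent.
request = build_request(
    "https://example.com/page",
    headers={
        "User-Agent": "Mozilla/5.0 (placeholder copied from the browser)",
        "Accept": "text/html,application/xhtml+xml",
        "Accept-Language": "en-US,en;q=0.9",
    },
    cookies={"session": "abc", "theme": "dark"},
)
```

Calling fetch_status(request) then shows whether the copied headers and cookies are enough. If it returns 200, the same headers can be set in the n8n HTTP Request node. If it still returns 403, the block is probably based on something other than headers and cookies.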