I’m trying to scrape a website. I can fetch the content with curl (using the --compressed flag), but when I set up an HTTP Request node and import the curl command, it always runs without returning any results.
curl 'https://www.kobo.com/' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:107.0) Gecko/20100101 Firefox/107.0' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8' \
  -H 'Accept-Language: zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  --compressed
Please share the workflow
Information on your n8n setup
n8n version: 0.200.1
Database you’re using (default: SQLite): SQLite
Running n8n with the execution process [own(default), main]: own
Running n8n via [Docker, npm, n8n.cloud, desktop app]: Docker
I just tried running your workflow and never got a response from the server when executing it on my web server, but I did get the expected response when running it on a local n8n instance.
So it seems the server you are trying to scrape is rather picky about which clients it accepts requests from. Is there a chance you ran your curl command locally but have your n8n instance running on a remote server?
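One way to test this theory is to run the same request from the machine that hosts n8n (for example via SSH, or `docker exec` into the n8n container) and compare the result with a run from your laptop. A minimal sketch, assuming curl is available on both machines; the `probe` helper name is made up for illustration:

```shell
# Print only the HTTP status and downloaded size for a URL, discarding the body.
probe() {
  curl -sS -o /dev/null --connect-timeout 10 --compressed \
    -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:107.0) Gecko/20100101 Firefox/107.0' \
    -w 'HTTP %{http_code}, %{size_download} bytes\n' "$1"
}

# Run this on both machines and compare the output.
probe 'https://www.kobo.com/'
```

If the remote machine prints `HTTP 000` or times out while your local run returns `HTTP 200` with a non-zero size, the site is most likely blocking the server’s IP range rather than anything n8n does.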
This is by no means a recommendation for a specific product, but I have used Webshare proxies in the past to scrape certain websites that have geo-blocking in place. This worked reasonably well with n8n, using a value such as socks5://$username:$password@p.webshare.io:8780 in the “Proxy” field of the HTTP Request node. I can’t promise this will work for you, but it might be worth a shot.
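For reference, a sketch of how that proxy value is assembled and how you might sanity-check it outside n8n before pasting it into the node. The credentials are placeholders, and the host/port are assumed from the value quoted above, so substitute your own account details:

```shell
# Placeholder credentials — replace with your own Webshare username/password.
PROXY_USER='username'
PROXY_PASS='password'
# Host and port as quoted above; your account may use different ones.
PROXY_URL="socks5://${PROXY_USER}:${PROXY_PASS}@p.webshare.io:8780"
echo "$PROXY_URL"

# The resulting value goes into the HTTP Request node's "Proxy" field.
# You can verify the proxy itself works with curl first, e.g.:
#   curl --proxy "$PROXY_URL" --compressed 'https://www.kobo.com/'
```

If the curl check through the proxy succeeds but the node still returns nothing, the problem is elsewhere in the node configuration rather than the proxy.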