Scraping with n8n?

Hi,

I tried to scrape this page: https://www.malt.fr/s?q=growth with an HTTP request, but I couldn’t get it to work.

Then I added some filters on the page, and in the web console I found this API call: https://www.malt.fr/search/api/profiles/v2?q=growth&exp=ENTRY&p=1&origin=https%3A%2F%2Fwww.malt.fr%2Fs which returns only the first 24 profiles, but I would like to scrape this request with pagination. Do you have any idea if that’s possible with n8n? I did it with another tool, but I was wondering because I would like to do some segmentation on the results. When I put the API call into an HTTP request in n8n, I get an error, but when I open it in my browser, I can see the JSON. Thanks!
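To make the pagination idea concrete, here is a minimal sketch in Python that rebuilds the API URL above for a range of pages. It assumes that the `p` query parameter is the page number (it is `p=1` in the call that returns the first 24 profiles); that is a guess from the URL, not something documented. The function names `build_page_url` and `page_urls` are hypothetical, and the sketch only builds URLs, it does not send any request.

```python
from urllib.parse import urlencode

# Endpoint copied from the API call seen in the browser console.
BASE_URL = "https://www.malt.fr/search/api/profiles/v2"


def build_page_url(query, page, exp="ENTRY", origin="https://www.malt.fr/s"):
    """Rebuild the search URL for one page.

    Assumption: `p` is a 1-based page index, as suggested by `p=1`
    in the original call.
    """
    params = {"q": query, "exp": exp, "p": page, "origin": origin}
    return f"{BASE_URL}?{urlencode(params)}"


def page_urls(query, first_page, last_page):
    """Return the URLs for pages first_page..last_page (inclusive)."""
    return [build_page_url(query, p) for p in range(first_page, last_page + 1)]


if __name__ == "__main__":
    for url in page_urls("growth", 1, 3):
        print(url)
```

In n8n the same idea would be a loop that feeds an incrementing page number into the URL of the HTTP request, stopping when a page comes back empty.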

Hey @jeremy_FRANCOIS,

I tried making a request to the API URL in Postman, and it also doesn’t return the JSON data. I am assuming that it requires some headers that are sent by the browser, and that we need to send with our request as well.

Thanks Harshil,

Is there any way to find the headers? Thanks
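One common way to see the headers is the Network tab of the browser developer tools: select the API request there and look at its request headers. As a rough sketch, assuming the headers were copied from that panel as plain `name: value` lines, the hypothetical helper `parse_copied_headers` below turns the pasted text into a dictionary that can be reused in another tool. The header values in the example are placeholders, not the ones the site actually expects.

```python
def parse_copied_headers(raw):
    """Parse 'name: value' lines copied from the browser into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ":authority",
        # which are not regular headers and should not be re-sent.
        if not line or line.startswith(":"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    # Placeholder text standing in for what was copied from the browser.
    copied = """
    accept: application/json
    user-agent: Mozilla/5.0
    referer: https://www.malt.fr/s?q=growth
    """
    print(parse_copied_headers(copied))
```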

When you’re accessing the URL in the browser, are you signed in to the platform? I can’t access the links anymore, and it is asking me to sign in.

Can you try accessing the URL when you’re not signed in? I am assuming that you need to be signed in to access the URL.

Without being signed in, I can do it once, and then not anymore.

I guess that’s the issue. You need to be authenticated to access it. If they have an official public API, I suggest you use that. If not, you can try passing the cookies in the headers. However, this is not a secure method, and when the cookies expire, you will have to set them again.
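As an illustration of passing the cookies in the headers, here is a minimal Python sketch using only the standard library. The helper name `build_authenticated_request`, the `Accept` and `User-Agent` values, and the cookie string are all assumptions for the example; the real cookie has to be copied from a signed-in browser session and replaced whenever it expires. The sketch only builds the request object, and the actual call is left commented out.

```python
import urllib.request


def build_authenticated_request(url, cookie_header, user_agent="Mozilla/5.0"):
    """Build a GET request that carries the browser's session cookie.

    `cookie_header` is the raw value of the Cookie header copied from a
    signed-in browser session (placeholder in the example below).
    """
    headers = {
        "Accept": "application/json",
        "User-Agent": user_agent,
        "Cookie": cookie_header,
    }
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    request = build_authenticated_request(
        "https://www.malt.fr/search/api/profiles/v2?q=growth&exp=ENTRY&p=1",
        "session=PLACEHOLDER",  # hypothetical cookie, not a real value
    )
    print(request.full_url)
    # Sending the request is deliberately left out of the sketch:
    # with urllib.request.urlopen(request) as response:
    #     body = response.read()
```

In n8n, the equivalent would be adding the same header names and values to the headers of the HTTP request.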

okay, thank you @harshil1712 !

Hey @jeremy_FRANCOIS,

I am closing this topic since we know what the real issue is. For anyone who is viewing this topic, the URL we are trying to scrape requires us to log in to access the data. A possible solution is to log in and use the cookies. However, this is not secure and one will have to update the values often. Another approach is to use nodes like Phantombuster or other scraping services.

Yes, thank you @harshil1712
