I can't get the pagination to run in the correct order. The base URL pagination produces the correct pages in the input, but the output still uses the pages from base URL 1.
My goal is to take base URL 1, fetch agents' profiles until there are none left, and then move on to base URL 2 and repeat the same process. Can anyone help me?
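The intended order can be sketched outside n8n as a nested loop. This is only a minimal Python illustration, not the actual workflow: `crawl_in_order`, `fetch_page`, `fake_fetch`, and the `example.com` URLs are hypothetical names, and it assumes an empty page signals that a base URL has no more profiles.

```python
from typing import Callable, Dict, List


def crawl_in_order(
    base_urls: List[str],
    fetch_page: Callable[[str, int], List[str]],
) -> List[Dict[str, object]]:
    """Exhaust every page of one base URL before starting the next."""
    results: List[Dict[str, object]] = []
    for base_url in base_urls:
        page = 1  # the page counter restarts for each base URL
        while True:
            profiles = fetch_page(base_url, page)
            if not profiles:  # assumption: an empty page means "done"
                break
            for profile in profiles:
                results.append(
                    {"base_url": base_url, "page": page, "profile": profile}
                )
            page += 1
    return results


# Hypothetical stand-in for the real HTTP request.
FAKE_SITE = {
    "https://example.com/a": [["a1", "a2"], ["a3"]],
    "https://example.com/b": [["b1"]],
}


def fake_fetch(base_url: str, page: int) -> List[str]:
    pages = FAKE_SITE[base_url]
    return pages[page - 1] if page <= len(pages) else []


if __name__ == "__main__":
    for row in crawl_in_order(list(FAKE_SITE), fake_fetch):
        print(row)
```

The key point is that the page counter is reset inside the outer loop, so each base URL starts from its own first page instead of reusing the pages of base URL 1.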
@jabbson Thank you so much for your help. I have a small question: what if I want to add a wait after each base URL completes, so that the requests don't get blocked?
@jabbson Is it possible to use an open-source web scraper with it? Since I am self-hosting n8n, I am curious whether I can use an open-source web scraper API instead of the native HTTP Request node.
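As a rough illustration of the idea (again plain Python rather than an n8n workflow), a pause can be placed between consecutive base URLs. `process_with_pause`, `process`, and the default of 5 seconds are hypothetical choices for this sketch; `sleep` is injectable only so the behaviour can be checked without actually waiting.

```python
import time
from typing import Callable, Iterable


def process_with_pause(
    base_urls: Iterable[str],
    process: Callable[[str], None],
    pause_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run `process` for each base URL, pausing between consecutive URLs."""
    for index, base_url in enumerate(base_urls):
        if index > 0:
            # Wait only between base URLs, not before the first one.
            sleep(pause_seconds)
        process(base_url)
```

Inside n8n, the same effect would presumably come from a Wait node at the end of each outer-loop iteration rather than from code like this.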
Well, you see, the people who run these services like their data, and they like it when you can't get it, at least not easily. While one group of smart people is thinking about how to scrape all the data, make it available, and make money off it, another group is thinking about how to protect themselves from exactly that. Bot detection is getting as sophisticated as web scraping, and it is a never-ending battle.
While on this topic, both services you've mentioned strictly prohibit data scraping and the use of automated tools to access or extract data from their platforms without explicit written permission. Doing so is unethical and can carry legal consequences.
And that is exactly the point: if you want the data, they want you to pay for it, which is why they will try their best to detect and stop any bot activity on their resources.