Hey @Ahmed_M_Ibrahim hope all is good. Welcome to the community.
The short answer is “it depends”.
The long answer is…
It very much depends on the website. Most of the time it is possible, but some sites take aggressive measures against scraping, and then it is not as easy.
Generally, you would configure an HTTP Request node with pagination to iterate over the pages and collect the HTML, then pass it through the HTML node to extract the parts that are of interest.
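To make the "fetch each page, then extract" loop concrete, here is a rough sketch of the same idea outside n8n, in plain Python with only the standard library. It is not how n8n implements it. The class names `product-name` and `product-price`, the `fetch_page` callable and the `FAKE_SITE` data are all made up for illustration; a real site will have its own markup and its own way of signalling the last page.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collects (name, price) pairs from elements with hypothetical class names."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._field = None  # which field the next text chunk belongs to
        self._name = None   # last product name seen, waiting for its price

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "product-name" in classes:      # assumed class name
            self._field = "name"
        elif "product-price" in classes:   # assumed class name
            self._field = "price"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "name":
            self._name = text
        elif self._field == "price" and self._name is not None:
            self.products.append((self._name, text))
            self._name = None
        self._field = None


def scrape_all_pages(fetch_page, max_pages=50):
    """Call fetch_page(1), fetch_page(2), ... until a page yields no products."""
    results = []
    for page in range(1, max_pages + 1):
        parser = ProductParser()
        parser.feed(fetch_page(page))
        if not parser.products:
            break  # an empty page is treated as the end of the listing
        results.extend(parser.products)
    return results


# Stand-in for real HTTP calls so the sketch runs offline.
FAKE_SITE = {
    1: '<li><span class="product-name">Mug</span><span class="product-price">$8</span></li>',
    2: '<li><span class="product-name">Lamp</span><span class="product-price">$25</span></li>',
}

products = scrape_all_pages(lambda page: FAKE_SITE.get(page, ""))
print(products)
```

In a real run you would replace the lambda with a function that downloads the page for a given page number. Stopping on the first empty page is only one possible rule; some sites expose a "next" link or a total page count instead.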
Another challenge is when the site uses JavaScript to render its pages. In that case, fetching the page with the HTTP Request node alone is not enough: you will need one of the solutions that take the raw HTML and run it through a JS engine (a headless browser, for example) to populate the page, and only then can you scrape it…
Hey, @Ahmed_M_Ibrahim so you wanna scrape a product list with prices from a website that has multiple pages? That’s doable! One way to do it is by using a tool like n8n. You can create a workflow that fetches each page, extracts the product info and prices, and then saves it to a spreadsheet or something.
I’ve done something similar before, and it’s pretty straightforward. You’d use the HTTP Request node to fetch the pages, and then the HTML node (called HTML Extract in older versions) to pull out the product info. Multiple pages can be handled with the pagination option on the HTTP Request node itself; as far as I know there is no separate Pagination node.
If you’re not comfortable with n8n, you could also try Python tools like Scrapy or BeautifulSoup. Scrapy can crawl multiple pages on its own, while BeautifulSoup only parses HTML, so you’d pair it with something like requests to fetch the pages.
Just make sure to check the website’s terms of use and robots.txt file to ensure web scraping is allowed. Don’t wanna get blocked or anything!
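For the robots.txt part, Python ships a parser in the standard library, so the check can be sketched roughly like this. The `ROBOTS_TXT` content, the `example.com` URLs and the `is_allowed` helper are made up for illustration; the site's terms of use still need to be read by a human.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "https://example.com/products?page=2"))
print(is_allowed(ROBOTS_TXT, "https://example.com/checkout/cart"))
```

Against a live site you would call `set_url()` with the site's robots.txt address and then `read()` instead of `parse()`, so the rules are downloaded rather than inlined.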
If that helped, kindly mark the answer as the solution.