Dear Experts,
I am new to n8n and want to scrape data from a website. The main page, which I open from a URL in the browser, contains links to multiple posts. I want to scrape the main page and every linked post page, and write the data to Google Sheets with a single button click. Can someone help me?
Thanks for your reply. (I am only doing this for practice.) In the image, the links highlighted in yellow are post links; clicking one opens another site or a detail page. I want to scrape the main site and all of these linked post pages into Google Sheets using a click trigger.
As I understand it, you want to scrape the website along with the post links on each page and save the results in Google Sheets?
What issue are you facing? You could write a scraper in Python, expose it through a Flask API, and then send a request to it from n8n for whatever further actions you need.
Look for the exact CSS class that contains the data you need
Add ‘Extract HTML Content’ node in n8n (built-in) and configure the extraction values
Attach a ‘Code’ node to filter the data and keep only the URLs (generate the code with AI if needed)
This is the easiest way to scrape sites. It is not AI-driven, so it doesn’t waste resources, and it’s precise. I prefer this to paying for overpriced external scraper APIs.
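For illustration, the filtering step described above (keep only the post URLs) can be sketched in Python. This is a minimal sketch under stated assumptions, not the workflow's actual code: the class name `post-link`, the sample HTML, and the helper `extract_post_links` are all hypothetical, and the real site will use its own class names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PostLinkParser(HTMLParser):
    """Collect href values from <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # An element may have several classes, so split before comparing.
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.css_class in classes:
            self.links.append(href)


def extract_post_links(html, base_url, css_class="post-link"):
    """Return absolute, de-duplicated post URLs in page order.

    `css_class` is an assumption; inspect the real page to find the right one.
    """
    parser = PostLinkParser(css_class)
    parser.feed(html)
    unique = []
    for href in parser.links:
        absolute = urljoin(base_url, href)  # resolve relative links
        if absolute not in unique:
            unique.append(absolute)
    return unique


# Hypothetical markup standing in for the main page.
SAMPLE_HTML = """
<div class="posts">
  <a class="post-link" href="/posts/1">First post</a>
  <a class="post-link featured" href="https://example.com/posts/2">Second post</a>
  <a class="nav" href="/about">About</a>
  <a class="post-link" href="/posts/1">First post again</a>
</div>
"""

if __name__ == "__main__":
    for url in extract_post_links(SAMPLE_HTML, "https://example.com/"):
        print(url)
```

The same logic (match on the class, resolve relative links, drop duplicates) is what the ‘Code’ node would need to do with the values extracted from the page.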
Thanks to both of you. “As I understand it, you want to scrape the website along with the post links on each page and save the results in Google Sheets?” This is exactly what I want. Can you explain it step by step, please?
If the website triggers an API request in the back end when you open the page [as stated by Mookie_Lian], just call that API with caches inside n8n, or write a scraper in Python [Flask endpoint] and call it from n8n using the HTTP Request node.
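A rough sketch of the Python option might look like the following. It is only an illustration under assumptions: the `/scrape` route, the `{"url": ...}` request body, the port, and the helper names are hypothetical, and it collects just each page's title, so the extraction would need adapting to the real site. Flask is imported inside `create_app` so the scraping helpers can be tried without Flask installed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class PageParser(HTMLParser):
    """Collect the page <title> and every <a href> on the page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html, base_url):
    parser = PageParser()
    parser.feed(html)
    links = [urljoin(base_url, href) for href in parser.links]
    return {"url": base_url, "title": parser.title.strip(), "links": links}


def fetch(url):
    """Download a page as text (no JavaScript rendering)."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=15) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape(main_url, fetch_page=fetch, max_posts=20):
    """Scrape the main page, then each linked page; one row per page."""
    main = parse_page(fetch_page(main_url), main_url)
    rows = [{"url": main["url"], "title": main["title"]}]
    for link in main["links"][:max_posts]:
        post = parse_page(fetch_page(link), link)
        rows.append({"url": post["url"], "title": post["title"]})
    return rows


def create_app():
    # Imported lazily so the helpers above work without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/scrape", methods=["POST"])  # hypothetical route name
    def scrape_endpoint():
        payload = request.get_json(silent=True) or {}
        url = payload.get("url")
        if not url:
            return jsonify({"error": "missing 'url'"}), 400
        return jsonify(scrape(url))

    return app


if __name__ == "__main__":
    create_app().run(port=5000)
```

In n8n, a manual trigger followed by an HTTP Request node could then POST a JSON body such as `{"url": "https://example.com/"}` to `http://localhost:5000/scrape`, and the returned rows could be passed on to a Google Sheets node.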