Need help scraping product specs

Hey everyone!
So, I have this big school project where I need to grab product details from a bunch of different websites. Basically, I’m trying to look up specs using either the product name or the EAN code (that barcode number).
The tricky part is, I have a huge list—like, 300 to 1,000 products—and I need to do them all at once, not one by one (I’d be here forever :sweat_smile:).
I need to get a really clean list of these specific specs for each product:
EAN number
Depth, Height, and Width
Color and Material
Seat depth, Seat height, and Seat width
Armrest height
Backrest height
Maximum load capacity
How many people it holds
Does it swivel? (Yes/No)
Product weight
Has anyone here done anything like this before, maybe using some kind of scraping tool or script? Any tips on the best way to handle this big batch of lookups would be a lifesaver! :folded_hands:


Hey @logidy The best way to handle your product spec retrieval task (getting data for 300 to 1,000 products in one batch) is to set up an automated pipeline. This is much faster and more reliable than trying to copy-paste.

You have two main technical options, both using Python:

1. For simple, static websites: Use a library like Scrapy to quickly grab data. It’s designed specifically for high-speed, large-scale scraping, letting you process many products in parallel.

2. For complex, dynamic websites (like modern e-commerce stores): Use a headless browser tool like Playwright. This simulates a real web browser, allowing it to “see” and interact with all the JavaScript elements that load the specifications, even if they aren’t visible right away.
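Whichever fetcher you use (Scrapy for static pages, Playwright for dynamic ones), the extraction step looks similar: pull label/value pairs out of the product's spec table. Here's a minimal, hypothetical sketch using only Python's standard library — it assumes the specs sit in a simple two-column `<table>`, which you'd need to adapt to each site's actual markup:

```python
from html.parser import HTMLParser

class SpecTableParser(HTMLParser):
    """Collects the text of <td> cells; we pair them up afterwards."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td and data.strip():
            self.cells.append(data.strip())

def parse_specs(html: str) -> dict:
    """Turn a two-column spec table into {label: value}."""
    parser = SpecTableParser()
    parser.feed(html)
    # Cells alternate label, value, label, value, ...
    return dict(zip(parser.cells[0::2], parser.cells[1::2]))

# Tiny made-up example of what a product page's spec table might look like
sample = """
<table>
  <tr><td>EAN</td><td>4006381333931</td></tr>
  <tr><td>Depth</td><td>55 cm</td></tr>
  <tr><td>Swivel</td><td>Yes</td></tr>
</table>
"""
specs = parse_specs(sample)
```

In a real Scrapy spider you'd use its CSS/XPath selectors instead, but the idea of reducing each page to one flat dict per product stays the same.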

To manage the list of 1,000 products and the final output, you’d integrate your scraping script with a workflow automation tool like n8n. You’d start by putting all your product EANs/titles into a Google Sheet. The n8n workflow would read this list, feed each product into your scraping script (Scrapy or Playwright), and then automatically write the clean, extracted data (Depth, Color, Max Load, etc.) back into the corresponding rows of your Google Sheet.
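Before the data goes back into the sheet, it helps to normalize every scraped dict into the same fixed column set, so rows line up even when a site omits some specs. A sketch of that step (column names taken from the original post; CSV is used here as a stand-in for the Google Sheets node):

```python
import csv
import io

# The fixed columns the original post asks for; anything a site
# doesn't list is left blank so every row has the same shape.
COLUMNS = ["EAN", "Depth", "Height", "Width", "Color", "Material",
           "Seat depth", "Seat height", "Seat width", "Armrest height",
           "Backrest height", "Maximum load capacity", "Seats",
           "Swivel", "Product weight"]

def to_row(specs: dict) -> dict:
    """Keep only the wanted columns, blank out missing specs."""
    return {col: specs.get(col, "") for col in COLUMNS}

def write_rows(scraped: list, fh) -> None:
    writer = csv.DictWriter(fh, fieldnames=COLUMNS)
    writer.writeheader()
    for specs in scraped:
        writer.writerow(to_row(specs))

# Example: one product where the site only listed three of the specs
buf = io.StringIO()
write_rows([{"EAN": "4006381333931", "Depth": "55 cm", "Swivel": "Yes"}], buf)
```

In the n8n workflow, this normalization would be the last step before the Google Sheets "update row" node, keyed on the EAN column.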

This method handles the batch processing and ensures your final data is neatly organized and standardized.
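For the batch itself, a small worker pool with a politeness delay is usually enough to get through 300–1,000 lookups without hammering any one site. A sketch with a stubbed fetcher (`fetch_specs` is a made-up placeholder — you'd swap in your actual Scrapy or Playwright call):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_specs(ean: str) -> dict:
    """Stub for the real scraper lookup; replace with your own."""
    time.sleep(0.01)  # stand-in for network latency / politeness delay
    return {"EAN": ean, "Swivel": "No"}

def scrape_batch(eans: list, workers: int = 5) -> list:
    """Run lookups concurrently; keep workers low to be polite."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_specs, eans))

results = scrape_batch([f"400638133393{i}" for i in range(10)])
```

Five workers is an arbitrary starting point; tune it down if the target sites rate-limit you.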


Thanks mate, will look into it.

You are welcome @logidy, let me know how things go, and kindly mark this as the solution if you found the info helpful.


Do these websites offer a REST API?

Do they have a limit on requests?


If you want to “scrape” data from HTML documents and the site is dynamic, I hope your skills and nerves are strong.

Cheers!

The Apify Store has many e-commerce store scrapers with a free tier, and also general e-commerce scrapers.

And you can call it directly from the Apify node in n8n. IMO this is the simplest way.

This is so true, I once checked the Apify Store.


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.