Does the 'Extract HTML content'-node support advanced CSS selectors?

LanderC July 17, 2025, 3:25pm 1

Hello,

I’m trying to bring in data from a website using the ‘Extract HTML Content’-node.

I’m using a CSS selector that targets a data attribute:
.class[data-list='XXXX']

This doesn’t seem to result in any data.

I’m forced to work like this as the website uses generic classes that repeat themselves multiple times.

Anyone have experience with this? Any workarounds?

jabbson July 17, 2025, 3:30pm 2

Hey @LanderC hope all is good!

Would you like to share the website and what you are trying to scrape?

LanderC July 18, 2025, 1:45pm 3

Sure Jabbson!

It’s a Dutch news website. They have a dedicated section for ‘most read’ news articles:
→ hln.be

The section with title ‘Meest gelezen’ uses generic classes found on different parts of the homepage.

This CSS selector should work:
.col--secondary .fjs-sticky-container .widget__content .ankeiler__link[data-list='meest gelezen']

But it doesn’t in n8n somehow.

Thx for the response!

jabbson July 18, 2025, 2:11pm 4

It appears the website has a firewall that kicks in after several attempts to fetch the page. Is this something you have come across already?

Forbidden - perhaps check your credentials?
Access Denied Access Denied
Your request was blocked by DPG Media's Web Application Firewall.

Nonetheless, if you are after the articles in the ‘Meest gelezen’ section, the selector you are looking for is simply

a[data-list='meest gelezen']
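
To illustrate what that attribute selector matches, here is a minimal sketch in plain Python, outside n8n. The sample markup, the `AttributeLinkCollector` class and the `collect_links` helper are all hypothetical and made up for this example. The markup only borrows the class and attribute names quoted above and is not the real page. It does not run a CSS engine. It just reproduces the rule "every `a` element whose `data-list` attribute equals `meest gelezen`".

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on the names quoted in this thread.
SAMPLE_HTML = """
<div class="widget__content">
  <a class="ankeiler__link" data-list="meest gelezen" href="/article-1">First</a>
  <a class="ankeiler__link" data-list="net binnen" href="/article-2">Second</a>
  <a class="ankeiler__link" data-list="meest gelezen" href="/article-3">Third</a>
</div>
"""


class AttributeLinkCollector(HTMLParser):
    """Collects the href of every <a> tag whose attribute `name` equals `value`,
    i.e. the elements that a[name='value'] would select."""

    def __init__(self, name, value):
        super().__init__()
        self.name = name
        self.value = value
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with quotes removed.
        attributes = dict(attrs)
        if tag == "a" and attributes.get(self.name) == self.value:
            self.hrefs.append(attributes.get("href", ""))


def collect_links(html, name, value):
    """Return the hrefs of all matching links, in document order."""
    parser = AttributeLinkCollector(name, value)
    parser.feed(html)
    parser.close()
    return parser.hrefs


print(collect_links(SAMPLE_HTML, "data-list", "meest gelezen"))
# prints ['/article-1', '/article-3']
```

Note that the attribute value is compared as a plain string, so the quotes in the selector have to be straight quotes (`'` or `"`). Curly quotes pasted from a formatted text would become part of the value and match nothing.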
