HTML Extract does not return the element itself with its attributes

chrisbol · January 5, 2023, 4:43pm

Describe the issue/error/question

Hi all,
Just discovered n8n, very happy with my first steps with this great tool.
But I am scraping the following website for concerts: Agenda — EKKO
To get the list of gig items I use the HTML Extract node with the CSS selector “div.pb-8 > a”. As the return value I chose HTML, since I need to loop over each gig, and I set Return Array to true.
The output shows the HTML of each gig in an array, but without the a element itself and its attributes.
The a element carries some information (such as the href and data-filter-value attributes) that I also want to use.
How can I solve this?
I could of course extract each value under a separate key in this HTML Extract node, but that feels fragile, as the keys would then be linked ‘only’ by index. Or am I missing something?

My workflow

Jon · January 6, 2023, 9:10am

Hey @chrisbol,

Welcome to the community

The selector is working as expected, but I can see the problem when I run it. I can’t think of a nice clean way to handle it; the best I can come up with is to add 3 extraction values in the node and use the attributes.
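
For illustration, here is a minimal sketch of how separately extracted arrays could be recombined by index, for example in a Function node placed after HTML Extract. The key names `href`, `filterValue`, and `html` and the helper `zipGigs` are hypothetical, and the sketch assumes all three values use the same selector so that each array holds one entry per matched element.

```typescript
// Hypothetical shape of the HTML Extract output when three extraction
// values are configured with Return Array enabled.
interface ExtractedArrays {
  href: string[];        // assumed key for the href attribute
  filterValue: string[]; // assumed key for the data-filter-value attribute
  html: string[];        // assumed key for the inner HTML of each gig
}

interface Gig {
  href: string;
  filterValue: string;
  html: string;
}

// Zip the parallel arrays into one object per gig, pairing entries by index.
function zipGigs(data: ExtractedArrays): Gig[] {
  return data.html.map((html, i) => ({
    href: data.href[i],
    filterValue: data.filterValue[i],
    html,
  }));
}
```

Once zipped, each gig is a single object, so later nodes no longer depend on matching indexes across keys.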

chrisbol · January 6, 2023, 12:30pm

Thanks for the reply and the suggestion.
I found a solution, but it is kind of ugly and probably not great for performance.
I can add a node before HTML Extract that replaces <a with <dummy><a and </a> with </a></dummy>, and then use the CSS selector dummy in HTML Extract.
That gives me an array with the different gigs, including the opening <a …> tag and the closing a tag.
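
A rough sketch of that replacement step, assuming it runs on the raw HTML string in a node before HTML Extract. The function name `wrapAnchors` is hypothetical. Unlike a plain text replace of `<a`, the pattern below only matches when `<a` is followed by whitespace or `>`, so tags such as `<article>` or `<abbr>` are left alone.

```typescript
// Wrap every <a ...>...</a> in a <dummy> element so that a selector on
// `dummy` returns the anchor itself, attributes included, as inner HTML.
function wrapAnchors(html: string): string {
  return html
    // Opening tags: "<a" followed by whitespace or ">" (not <abbr>, <article>, ...).
    .replace(/<a(?=[\s>])/gi, "<dummy><a")
    // Closing tags: the literal "</a>".
    .replace(/<\/a>/gi, "</a></dummy>");
}
```

Note that this wraps every anchor on the page, so a narrower selector such as `div.pb-8 > dummy` may be needed to keep only the gig links.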
