Extract 2nd value (class) from extracted HTML

Hello,

I have extracted the following from a website:

I need to extract the 2nd class (isopen or isclosed, marked in yellow) for each row, but I have no idea how to do it.

I have tried nth-child(), but it's only giving me the “lift name”.

Any help is much appreciated.

<tr><th colspan="3">LIFTS</th></tr>
<tr><td class="liftdate">16/01</td><td class="liftdate">17/01</td><td></td></tr>
<tr><td><div class="status isopen"></div></td><td><div class="status isopen"></div></td><td>Gondola lift Afrodite</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Aiolos</td></tr>
<tr><td><div class="status isopen"></div></td><td><div class="status isopen"></div></td><td>Βάκχος</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Pericles</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Hercules</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Odysseus</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Hermes</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Baby 1</td></tr>
<tr><td><div class="status isopen"></div></td><td><div class="status isopen"></div></td><td>Baby 2</td></tr>
<tr><td><div class="status isopen"></div></td><td><div class="status isopen"></div></td><td>Baby 3</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Gondola lift Hera</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Iniohos</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Pan</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Pythia</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Zeus</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Sahara</td></tr>
<tr><td><div class="status isclosed"></div></td><td><div class="status isclosed"></div></td><td>Baby 4</td></tr>

Hey @xewonder,

Have you tried using the inspect option in the browser dev tools to get the selection? I suspect this one will be a case of using that then some regex to get the value.

Do you have a link to the full page that contains the table?
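For what it's worth, the regex idea could look roughly like the sketch below. It is written in Python rather than inside an n8n node, and it assumes every row keeps exactly the markup shown in the question (two status cells followed by the name cell). The names `ROW`, `html` and `rows` are made up for illustration.

```python
import re

# Two sample rows copied from the question; a real page would supply the full table.
html = (
    '<tr><td><div class="status isopen"></div></td>'
    '<td><div class="status isopen"></div></td><td>Gondola lift Afrodite</td></tr>'
    '<tr><td><div class="status isclosed"></div></td>'
    '<td><div class="status isclosed"></div></td><td>Aiolos</td></tr>'
)

# Capture the second class name of each status div, plus the lift name in the last cell.
ROW = re.compile(
    r'<tr><td><div class="status (\w+)"></div></td>'
    r'<td><div class="status (\w+)"></div></td>'
    r'<td>([^<]*)</td></tr>'
)

# findall returns one (first_day, second_day, name) tuple per matching row.
rows = ROW.findall(html)
for first_day, second_day, name in rows:
    print(name, first_day, second_day)
```

A pattern like this is brittle: any change in whitespace or attribute order on the site would stop it from matching, so a selector-based extraction is usually the safer choice.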

Hi @Jon,

Thank you very much for the reply.

You need to click this:
image

I am trying to extract the number of “lifts” and “slopes” (just a count of “isopen”) open for the current day.
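To make the goal concrete, here is a minimal Python sketch of that count using only the standard library's `html.parser`. It assumes the first status column is the current day, which the thread does not confirm, and the class name `StatusCounter` is hypothetical. The sample is the first few rows from the question.

```python
from html.parser import HTMLParser


class StatusCounter(HTMLParser):
    """Counts status divs, keyed by (column index, second class name)."""

    def __init__(self):
        super().__init__()
        self.col = -1      # index of the current <td> within its row
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.col = -1              # new row: reset the column index
        elif tag == "td":
            self.col += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if "status" in classes:
                for name in classes:
                    if name != "status":
                        key = (self.col, name)
                        self.counts[key] = self.counts.get(key, 0) + 1


sample = (
    '<tr><th colspan="3">LIFTS</th></tr>'
    '<tr><td class="liftdate">16/01</td><td class="liftdate">17/01</td><td></td></tr>'
    '<tr><td><div class="status isopen"></div></td>'
    '<td><div class="status isopen"></div></td><td>Gondola lift Afrodite</td></tr>'
    '<tr><td><div class="status isclosed"></div></td>'
    '<td><div class="status isclosed"></div></td><td>Aiolos</td></tr>'
    '<tr><td><div class="status isopen"></div></td>'
    '<td><div class="status isopen"></div></td><td>Βάκχος</td></tr>'
)

counter = StatusCounter()
counter.feed(sample)

# Assumption: column 0 holds the current day's status.
open_today = counter.counts.get((0, "isopen"), 0)
print(open_today)
```

Feeding the slopes table instead of the lifts table would give the slopes count the same way.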

Hi @xewonder, I believe what @Jon meant here was getting a suitable selector like so, after right-clicking on the container of the respective table:

Personally, I would approach this data extraction task roughly as described here (extracting each table row individually, then wrapping them into a basic HTML structure again to then extract the individual cell values). Or as a workflow:

This would produce a result like this:

You can of course beautify this a bit more, for example by replacing the status text with something more readable such as “Open” and “Closed”, or by throwing away the second date if you don’t need it. But it should hopefully help you get started.
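The clean-up mentioned above might look something like this in Python. It is only a sketch: the field names (`name`, `day1`, `day2`) are assumptions about what the extracted rows look like, not what the workflow actually produces.

```python
# Hypothetical shape of the extracted rows; the field names are assumptions.
rows = [
    {"name": "Gondola lift Afrodite", "day1": "status isopen", "day2": "status isopen"},
    {"name": "Aiolos", "day1": "status isclosed", "day2": "status isclosed"},
]

# Readable labels for the second class name.
LABELS = {"isopen": "Open", "isclosed": "Closed"}


def beautify(row):
    """Keep the name and the first day's status only, with a readable label."""
    parts = row["day1"].split()
    state = parts[-1] if parts else ""
    # Unknown class names pass through unchanged instead of raising.
    return {"name": row["name"], "status": LABELS.get(state, state)}


clean = [beautify(row) for row in rows]
print(clean)
```

Dropping `day2` here corresponds to throwing away the second date; keeping it would just mean mapping it through `LABELS` as well.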

To get the slopes data, simply replace LIFTS with SLOPES on the first HTML Extract (“Extract lifts table”) node.

Hope this helps! Let me know if you have any questions on this :)

This is awesome!!!

Thank you so much!!

This is great; however, in “Extract lifts table” I get all the lifts twice?

Oh, sorry for that, I didn’t realize that the HTML has each lift twice. I suppose the easiest way to sort this out would be adding another Item Lists node that removes duplicates:

No more duplicates:
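Outside n8n, the same de-duplication step could be sketched in Python as below. The row shape and the helper name `dedupe` are assumptions for illustration; this is not how the Item Lists node is implemented.

```python
# Hypothetical rows in which each lift appears twice, as described above.
rows = [
    {"name": "Gondola lift Afrodite", "status": "Open"},
    {"name": "Aiolos", "status": "Closed"},
    {"name": "Gondola lift Afrodite", "status": "Open"},
    {"name": "Aiolos", "status": "Closed"},
]


def dedupe(items):
    """Return items with exact duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for item in items:
        # Dicts are not hashable, so compare on a sorted tuple of their fields.
        key = tuple(sorted(item.items()))
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


unique_rows = dedupe(rows)
print(unique_rows)
```

This treats two rows as duplicates only when every field matches, so a lift listed twice with different statuses would be kept twice.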

You are a STAR! Thank you!
