How to extract an unordered list from HTML into an array

I have a workflow where I read email notifications using the Microsoft Outlook node. I want to extract the unordered list element from the HTML body, which contains a lot of useful info.
Here is the HTML structure, which I can get from the Outlook email contents using a Set node:

<html>
<head>
</head>
<body lang="EN-GB" link="blue" vlink="purple" style="word-wrap:break-word">
      <div class="WordSection1">
         <ul type="disc">
            <li class="MsoNormal" style="">DEVICE TYPE - <a href="https://server.com/private">DEVICE NAME</a> (SerialNumber) - Device Last appeared time is 2023-07-19 16:01 </li>
         </ul>
         <p class="MsoNormal" style="margin-left:36.0pt">Tag: Primary, SiteID:1223, gateway </p>
      </div>
   </body>
</html>

In a perfect world, I would like to extract the data from the email so I can use it to create a new email template. I would need these items:
Device Type
Device Link
Device Name
Device Serial
Device Time
Array of Tags

The contents of the tags array would also be used for Switch routing (i.e. if ‘gateway’ exists).

Can anyone point me in the right direction?

Hi @MartinDevon, welcome to the n8n community :tada:

Assuming the tags and the rest of the email structure won’t change, this will work.

Here is what you might be looking for.

[image]
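
For reference, here is a minimal sketch of how this kind of extraction could be done with regular expressions in plain JavaScript, for example inside a Code node. It is not necessarily the expression used in the answer above. The helper name parseDeviceEmail and the output field names are made up for illustration, and the patterns assume the email keeps exactly the layout shown in the question.

```javascript
// Sample body copied from the question (outer html/body tags left out).
const html = `<div class="WordSection1">
  <ul type="disc">
    <li class="MsoNormal" style="">DEVICE TYPE - <a href="https://server.com/private">DEVICE NAME</a> (SerialNumber) - Device Last appeared time is 2023-07-19 16:01 </li>
  </ul>
  <p class="MsoNormal" style="margin-left:36.0pt">Tag: Primary, SiteID:1223, gateway </p>
</div>`;

// parseDeviceEmail is a hypothetical helper name, not part of n8n.
function parseDeviceEmail(body) {
  // Expected shape: <li ...>TYPE - <a href="LINK">NAME</a> (SERIAL) - ... time is TIME </li>
  const li = body.match(
    /<li[^>]*>\s*(.*?)\s*-\s*<a[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>\s*\((.*?)\)\s*-.*?time is\s*(.*?)\s*<\/li>/is
  );
  if (!li) return null; // layout changed or no list item found

  // Expected shape: <p ...>Tag: a, b, c </p>
  const tagLine = body.match(/<p[^>]*>\s*Tag:\s*(.*?)\s*<\/p>/is);

  const [, deviceType, deviceLink, deviceName, deviceSerial, deviceTime] = li;
  const tags = tagLine
    ? tagLine[1].split(',').map((t) => t.trim()).filter(Boolean)
    : [];

  return { deviceType, deviceLink, deviceName, deviceSerial, deviceTime, tags };
}

const result = parseDeviceEmail(html);
console.log(result);
// A check like this could drive the routing on the 'gateway' tag:
console.log(result.tags.includes('gateway')); // prints true
```

In a real workflow the body would come from the incoming item rather than a constant, e.g. something like `$input.first().json.body`, where the field name `body` is an assumption that depends on how the Set node is configured. Regex on HTML is brittle, so this only holds as long as the email layout stays the same.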

This is amazing!! Thanks very much. Your regex capabilities terrify and delight me in equal measure :wink:

Worked like a charm!

You’re welcome. :sparkles:
