Isolate HTML content inside <body> tag

I am grabbing HTML data from the IMAP node via textHTML. The issue is that the email is essentially an entire HTML page, and when I manipulate it later for an HTTP POST, I need just the content inside <body>. There appear to be a few ways to do this via JavaScript, but they seem overly complex for my needs. Any ideas on how best to do this? Thank you!

Hi @cleveradmin

I think I understand, but I'm not 100% sure. Could you give an example of before and after?
For example, do you want to keep everything within the Body tags, including the other HTML tags?

That’s correct. I can provide a before and after tomorrow, but at its simplest:

<html>
<Head></Head>
<Body>
<h1>This is the header</h1>
</body>
</html>

Becomes:

<h1>This is the header</h1>

In this case the HTML Extract node would be the best thing to use.

Basic example workflow:

{
  "nodes": [
    {
      "parameters": {
        "values": {
          "string": [
            {
              "name": "data",
              "value": "=<html><Head></Head><Body><h1>This is the header</h1> </body></html>"
            }
          ]
        },
        "options": {}
      },
      "name": "Set",
      "type": "n8n-nodes-base.set",
      "typeVersion": 1,
      "position": [
        460,
        300
      ]
    },
    {
      "parameters": {
        "extractionValues": {
          "values": [
            {
              "key": "text",
              "cssSelector": "body",
              "returnValue": "html"
            }
          ]
        },
        "options": {}
      },
      "name": "HTML Extract",
      "type": "n8n-nodes-base.htmlExtract",
      "typeVersion": 1,
      "position": [
        680,
        300
      ]
    }
  ],
  "connections": {
    "Set": {
      "main": [
        [
          {
            "node": "HTML Extract",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
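
For comparison, here is a rough sketch of the same idea in plain JavaScript, in case the extraction ever has to happen in custom code instead. The function name extractBodyHtml is made up for illustration and is not part of n8n, and this is not how the HTML Extract node works internally. It assumes the email contains a single, reasonably well-formed <body> element; a regex like this is not a real HTML parser.

```javascript
// Hypothetical helper (not an n8n API): returns the markup between
// <body ...> and </body>, matching the tag names case-insensitively.
function extractBodyHtml(html) {
  // [\s\S]*? matches any characters, including newlines, non-greedily.
  const match = /<body\b[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  // Assumption: if no <body> element is found, hand back the input unchanged.
  return match ? match[1].trim() : html;
}

const email = '<html><Head></Head><Body><h1>This is the header</h1> </body></html>';
console.log(extractBodyHtml(email)); // <h1>This is the header</h1>
```

For anything messier than a simple email, the HTML Extract node above is the safer choice, since it uses a CSS selector rather than pattern matching.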

Doh! So obvious, thank you Jon! Merry Christmas everyone!!