Extracting elements from the HTML DOM with optional tags

I’m trying to use the HTML node to extract several elements. For example, my HTML looks like this:

<div>
	<div class="name">item 1</div>
	<div class="review">here is the review</div>
</div>
<div>
	<div class="name">item 2</div>
</div>
<div>
	<div class="name">item 3</div>
	<div class="review">here is another review</div>
</div>

The goal is to end up with JSON that looks like this:


	[
		{
		"name": "item 1",
		"review": "here is the review"
		},
		{
		"name": "item 2"
		},
		{
		"name": "item 3",
		"review": "here is another review"
		}
	]

I’m using the HTML extract node and entering a CSS selector for each desired element.
The problem is with how the HTML extract node is designed to behave when an element (here, review) is missing.

How can I instruct the HTML node to enter a blank value or something similar if a tag is missing?

Alternatively, I’m comfortable using the Code node. I tried using cheerio to extract the data, but I had a difficult time getting cheerio to run. I am running on my own server (not n8n cloud).

Thank you for your help!

Information on your n8n setup

  • n8n version: LATEST
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): npm
  • Operating system: Cloud

I couldn’t find a short way to do this, but the result is exactly the output you want.
I think I took a bit of a roundabout approach, though.
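
One possible way to get that output in a Code node is sketched below. It is a minimal sketch and not a definitive solution. The helper name `extractItems` is hypothetical, and the code makes two assumptions: every item starts with its `name` div, and the class attributes are written exactly as in the question. It uses a regular expression instead of a real HTML parser, so it only suits simple, flat markup like the sample.

```javascript
// Hypothetical helper (not part of n8n): builds one object per item and
// only adds "review" when a review div follows the item's name div.
function extractItems(html) {
  const items = [];
  // Matches only the inner divs; the wrapper <div> has no class, so it is skipped.
  const pattern = /<div class="(name|review)">([\s\S]*?)<\/div>/g;

  for (const [, field, text] of html.matchAll(pattern)) {
    if (field === 'name') {
      // Assumption: a name div marks the start of a new item.
      // Use { name: text.trim(), review: '' } here to get a blank value instead.
      items.push({ name: text.trim() });
    } else if (items.length > 0) {
      // A review belongs to the most recently seen item.
      items[items.length - 1].review = text.trim();
    }
  }
  return items;
}

// Sample input taken from the question.
const sample = `
<div>
  <div class="name">item 1</div>
  <div class="review">here is the review</div>
</div>
<div>
  <div class="name">item 2</div>
</div>
<div>
  <div class="name">item 3</div>
  <div class="review">here is another review</div>
</div>`;

console.log(JSON.stringify(extractItems(sample), null, 2));
```

Inside a Code node, the last line would be replaced by something like `return extractItems($input.first().json.html).map((json) => ({ json }));`, where the field name `html` is an assumption about where the incoming item stores the page source.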


Thanks! You pointed me in the right direction. Do you know why cheerio doesn’t seem to work any more? Or maybe I need to open a new question for it?

As far as I know, n8n does not support cheerio as a built-in node.

There might be some complicated way to use it in the Code node.

Maybe others will have the answer. Opening a new question would be a better idea.
