How to extract text value inside html font dir?

Describe the problem/error/question

I want to extract the text located inside the footer #slider > div.slider-page > font > font but I don’t know how. Can anyone show me how?

What is the error message (if any)?

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

  • n8n version: 2.2.3
  • Database (default: SQLite): default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): npm
  • Operating system: Ubuntu 24.04 LTS

This is an AI-generated solution based on the content of your question; I hope it helps.

Here’s how you can fix that problem:


:puzzle_piece: Goal

You want to extract the text inside nested <font> tags like:

<div id="slider">
  <div class="slider-page">
    <font><font>Extract this text</font></font>
  </div>
</div>

:white_check_mark: Solution 1 – Use an HTML Extract Node

Steps:

  1. Add an HTML Extract node (in current n8n versions, the HTML node with the Extract HTML Content operation) after the node that outputs the HTML.
  2. Set the Property Name to the field containing the HTML (e.g., html).
  3. In the CSS Selector field, enter:
     #slider > div.slider-page > font > font
  4. Choose Text Content as the extraction mode.

With the extraction key set to text, this will output:

{
  "text": "Extract this text"
}

:white_check_mark: Solution 2 – Use a Code Node (formerly the Function Node, for more control)

If the HTML Extract node doesn’t give you the result you need, use this JavaScript in a Code node:

// Runs in a Code node ("Run Once for All Items" mode).
// Assumes the HTML is in the "html" field of the first input item.
const cheerio = require('cheerio');

const html = $input.first().json.html;
const $ = cheerio.load(html);

// Same child-combinator selector as in Solution 1; trim surrounding whitespace
const text = $('#slider > div.slider-page > font > font').text().trim();

// n8n items wrap their data in a "json" key
return [{ json: { text } }];

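:brain: Note: cheerio is not available to this node by default. On self-hosted n8n it has to be allowed through the NODE_FUNCTION_ALLOW_EXTERNAL environment variable (for example NODE_FUNCTION_ALLOW_EXTERNAL=cheerio), and the module must be installed where n8n can resolve it. n8n Cloud does not allow importing external npm modules.
If cheerio can’t be loaded, use the HTML Extract node instead.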


:white_check_mark: Solution 3 – Use an Expression (if you only need one value)

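If the HTML is in a field like $json["html"], you can try extracting the text with a Set node using the expression below. Note that DOMParser is a browser API that Node.js does not provide, so the expression may fail when the workflow runs on the n8n server; if it does, use Solution 1 instead: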

{{ (() => {
  const html = $json.html;
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, "text/html");
  const target = doc.querySelector("#slider > div.slider-page > font > font");
  return target ? target.textContent.trim() : null;
})() }}
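
If neither DOMParser nor cheerio is available, a dependency-free fallback is a regular expression. The sketch below is only an illustration: extractSliderText is a hypothetical helper name, and it assumes the markup is as simple as the sample in the Goal section (a div with class slider-page directly followed by two nested font tags). A regular expression is not a general HTML parser, so prefer Solution 1 for anything more complex.

```javascript
// Hypothetical helper (not part of n8n): returns the text inside
// <div class="slider-page"><font><font>...</font></font></div>, or null.
// Assumes simple, well-formed markup like the sample in this thread.
function extractSliderText(html) {
  const pattern =
    /<div[^>]*class="[^"]*\bslider-page\b[^"]*"[^>]*>\s*<font[^>]*>\s*<font[^>]*>([\s\S]*?)<\/font>/i;
  const match = pattern.exec(html);
  if (!match) return null;
  // Drop any leftover inline tags and collapse whitespace.
  return match[1].replace(/<[^>]+>/g, '').replace(/\s+/g, ' ').trim();
}

// Sample input taken from the Goal section above.
const sample = `
<div id="slider">
  <div class="slider-page">
    <font><font>Extract this text</font></font>
  </div>
</div>`;

console.log(extractSliderText(sample));
```

Inside a Code node, the same function could be called on the incoming HTML field instead of the sample string.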

Source: ChatGPT - New chat

Quick tip:
You can also use this shortcut to start a new ChatGPT chat (free) and continue from there:

ask.n8n.community/ + Question = Enter Link.

Example:

https://ask.n8n.community/Help me fix this problem: https://community.n8n.io/t/how-to-extract-text-value-inside-html-font-dir/250088#p-477107-describe-the-problemerrorquestion-1

Hey @Ruriko, hope all is well!

I think I know what you need: the page numbers and image URLs.

If that is the case, you can find them in the page source; everything you need is located here:

At this point, it is just a scraping job: extract this part from the HTML, parse the JSON, and you’ll have the data ready for whatever you want to do with it, such as counting the images or downloading them.
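
The "extract, then parse" step can be sketched roughly as follows. Everything specific here is an assumption for illustration, since the actual page source isn't shown in this thread: the script id page-data, the images array, and its page and url fields are hypothetical, as is the helper name extractEmbeddedJson. Adjust them to whatever the real page contains.

```javascript
// Hypothetical helper: finds <script id="..."> in the HTML and parses
// its contents as JSON. Returns null when the script tag is not found.
function extractEmbeddedJson(html, scriptId) {
  const pattern = new RegExp(
    `<script[^>]*id="${scriptId}"[^>]*>([\\s\\S]*?)</script>`,
    'i'
  );
  const match = pattern.exec(html);
  return match ? JSON.parse(match[1]) : null;
}

// Made-up sample page; the real structure will differ.
const sampleHtml = `
<html><body>
<script id="page-data" type="application/json">
{"images": [
  {"page": 1, "url": "https://example.com/1.jpg"},
  {"page": 2, "url": "https://example.com/2.jpg"}
]}
</script>
</body></html>`;

const data = extractEmbeddedJson(sampleHtml, 'page-data');
const imageUrls = data.images.map((image) => image.url);

console.log(`Found ${imageUrls.length} images`);
console.log(imageUrls);
```

Once the JSON is parsed, counting or downloading the images is ordinary array handling.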

Here is the workflow:
