How to remove HTML tags from an RSS feed or from the text output

Hi to all people here.

I am not a programmer and not a native English speaker, but I will do my best to explain:

(IN SHORT) I need to remove HTML tags from a text: <a href…>, <p>, and so on.

(LONG DESCRIPTION) I retrieve the feed from an RSS aggregator. I extract the content with n8n and translate it with DeepL in order to post the content to a blog.
The result of the feed extraction contains only HTML code with all the formatting, anchors, and so on. The translated output of the DeepL API is also full of HTML tags.
I need only the formatted text or, at least (not as good), the plain text.

I tried almost every solution in n8n.
I am only a casual coder in C#, but I know how to parse HTML text and get rid of the unwanted HTML tags.
I do not know JavaScript, which I saw is used in n8n, but maybe I could implement some JS code.
Any directions or suggestions, please?
Thank you

I run the desktop app of n8n.
There are no errors; the output text is just not clean, but full of the original anchors and nasty tags.
I also tried the HTML Extract node, but maybe I did something wrong: there I get different errors depending on the internal configuration.
One is: ERROR: No property named " some text here " exists!

Hey @sheiku,
welcome to the community :tada:

I just looked into the RSS feed and I see a contentSnippet field that seems to be the content field without any HTML tags. Is that what you were looking for?


If you need to remove HTML tags yourself, you could use a Set node expression with a regex like this.
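
A minimal sketch of the kind of regex such an expression could use, assuming the feed HTML sits in a field named content. The helper name stripHtmlTags is made up for illustration and is not part of n8n. Inside a Set node expression, the same replace call would presumably be written inline, for example {{ $json["content"].replace(/<[^>]*>/g, '') }}.

```javascript
// Hypothetical helper (not an n8n built-in): removes anything that
// looks like an HTML tag, i.e. "<", then any characters up to the next ">".
function stripHtmlTags(html) {
  return String(html)
    .replace(/<[^>]*>/g, '') // drop every <...> sequence
    .trim();                 // remove leading/trailing whitespace
}

// Made-up sample input standing in for the feed's content field.
const sampleContent = '<p>Hello <a href="https://example.com">world</a>!</p>';
console.log(stripHtmlTags(sampleContent));
```

This is a crude approach: it does not decode entities such as &amp;amp; and it removes the formatting along with the tags, so it only covers the plain-text case.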


Here is an example workflow to illustrate.


Many thanks, @marcus,
I will try your example.
Yes, I tried to clean the text like in your example, but I get errors or the text is not cleaned.
I will come back with an answer…

@marcus, it seems that your approach is working!
Wow, it was easier for me (as a non-coder) to learn C# than to understand regex… for me, regex is an unknown field.
I added the DeepL translation and selected only the title and the content fields of the RSS feed as the text to translate… {{$json["content"]}} {{$json["title"]}}
and it is working.
I will tweak the results some more.
Thank you again
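
For anyone following the same route, here is a rough sketch of cleaning both the title and the content fields in one step before they go to the translation. It assumes the items arrive in n8n's usual shape, an array of objects with a json property; cleanFeedItems and stripHtmlTags are hypothetical names, and the sample item is made up.

```javascript
// Hypothetical helper: strips tags and collapses leftover whitespace.
// Missing fields (undefined/null) become an empty string.
function stripHtmlTags(html) {
  return String(html ?? '')
    .replace(/<[^>]*>/g, '') // drop every <...> sequence
    .replace(/\s+/g, ' ')    // collapse runs of whitespace into one space
    .trim();
}

// Hypothetical helper: returns new items whose title and content are
// plain text; all other fields are copied unchanged.
function cleanFeedItems(items) {
  return items.map((item) => ({
    json: {
      ...item.json,
      title: stripHtmlTags(item.json.title),
      content: stripHtmlTags(item.json.content),
    },
  }));
}

// Made-up sample standing in for one RSS item.
const sampleItems = [
  {
    json: {
      title: '<b>News</b>',
      content: '<p>Hello <a href="#">world</a></p>',
      link: 'https://example.com',
    },
  },
];

const cleanedItems = cleanFeedItems(sampleItems);
console.log(cleanedItems[0].json);
```

In a code-style node, the body of cleanFeedItems would be applied to the incoming items and the result returned; the exact variable that holds those items depends on the node type and n8n version.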