Sanitize HTML: Has anyone used dompurify within the Code node?

Hi,

I need to sanitize some HTML coming from the HTML Extract node. Has anyone done this before?

First idea:

  • use code node and regex → bad idea

Hi @Kool_Baudrillard :wave:

I’m not entirely sure on this one, but could you post a sample of the HTML you’re looking to sanitise? Make sure there’s nothing sensitive in it, of course, but a sample would help in building something, especially if you could also mention why regex wasn’t working so well for you!

If you specifically want to use dompurify, you could add it as an external module to the Code node, as long as you’re self-hosting.

Hi,

e.g.

\n Banner\n \n

[bite.jobposting.title]

\n \n \n \n
\n
\n

WAS UNS ÜBERZEUGT

\n
\n [bite.jobposting.customValues_siekoennenmehr]
\n

WAS SIE BEI UNS BEWEGEN

\n
\n [bite.jobposting.customValues_siewollenmehr]
\n

WAS WIR BIETEN

\n
\n [bite.jobposting.customValues_wirbietenmehr]
\n
\n
\n
\n Standort\n

STANDORT

\n
\n [bite.jobposting.address_postCode] [bite.jobposting.address_city]
[bite.jobposting.address_street] [bite.jobposting.address_houseNumber]
[bite.jobposting.customValues_html_neueroeffnung_ab]
\n
\n
\n
\n Beschäftigungsart\n

BESCHÄFTIGUNGSART

\n
\n [bite.jobposting.customValues_beschaeftigungsart]\n
\n
\n
\n Eintrittstermin\n

EINTRITTSTERMIN

\n
\n [bite.jobposting.customValues_eintrittstermin]\n
\n
\n
\n Ansprechpartner\n

KONTAKT

\n
\n [bite.jobposting.customValues_ansprechpartner]
[bite.jobposting.customValues_ansprechpartner_tel]
[bite.jobposting.customValues_ansprechpartner_zusatz]
[bite.jobposting.customValues_ansprechpartner_zusatz2]
\n
\n
\n \n
\n \n \n \n \n \n \n
\n BEWERBEN.\n
\n
\n
\n
\n
\n \n \n Drucken\n \n \n
\n \n
\n \n \n E-Mail\n \n \n
\n
\n \n Icon\n \n \n Icon\n \n \n Icon\n \n \n Icon\n \n\n \n Icon\n \n
\n
\n
\n
\n
\n
\n
\n
\n \n \n
\n [bite.jobposting.customValues_html_auszeichnungen]\n
\n
\n
\n \n [bite.jobposting.customValues_html_qrcode_image]\n \n
\n
\n
\n \n [bite.jobposting.customValues_html_qrcode_detail]\n \n
\n
\n
\n \n

Basically, I’d like to get rid of attributes and of div, span, script, etc., and keep only basic p, ul, and li tags.

I did get results with regex, but they are not as good as what a sanitizer would produce.
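
For the goal described here (keep only p, ul, and li, drop all attributes), a dompurify allowlist in the Code node might look roughly like the sketch below. It is untested against this exact template and rests on a few assumptions: the instance is self-hosted with both `dompurify` and `jsdom` installed and permitted (for example via `NODE_FUNCTION_ALLOW_EXTERNAL=dompurify,jsdom`), and the markup sits in a field called `html` on each item, which is a made-up name for illustration. `createPurifier` and `sanitizeItems` are hypothetical helper names, not part of n8n or dompurify.

```javascript
// Allowlist taken from the post: only p, ul, li survive, and no attributes.
// data-* and aria-* attributes are allowed by default, so they are switched off too.
const SANITIZE_CONFIG = {
  ALLOWED_TAGS: ['p', 'ul', 'li'],
  ALLOWED_ATTR: [],
  ALLOW_DATA_ATTR: false,
  ALLOW_ARIA_ATTR: false,
};

// Node has no DOM of its own, so dompurify is given a jsdom window.
// Assumption: both packages are installed and allowed on the instance.
function createPurifier() {
  const createDOMPurify = require('dompurify');
  const { JSDOM } = require('jsdom');
  return createDOMPurify(new JSDOM('').window);
}

// Returns new items with the `html` field (assumed name) sanitized.
// Other fields on each item are passed through unchanged.
function sanitizeItems(items, purifier) {
  return items.map((item) => ({
    json: {
      ...item.json,
      html: purifier.sanitize(String(item.json.html ?? ''), SANITIZE_CONFIG),
    },
  }));
}

// In a Code node set to "Run Once for All Items", the last line would be:
// return sanitizeItems($input.all(), createPurifier());
```

With this kind of allowlist, dompurify should unwrap div and span while keeping their text, and drop script elements together with their content. The leftover `\n` sequences and the `[bite.jobposting.…]` placeholders are plain text to a sanitizer, so they would still need a separate cleanup step.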
