tre
December 8, 2023, 6:44pm
1
I’m using a third-party service to scrape a URL, then the HTML Extract node in combination with the Code node to remove all the HTML tags.
When I select JSON in the node output, the text is shown without any tags.
But when I switch to Schema, it shows all kinds of HTML tags.
When I pass this on to the next node, all of these tags are still present in it, for example: <link rel="stylesheet"
How can I pass the output shown in the JSON view to the next node, instead of the one shown under the Schema option?
n8n
December 8, 2023, 6:44pm
2
It looks like your topic is missing some important information. Could you provide the following, if applicable?
n8n version:
Database (default: SQLite):
n8n EXECUTIONS_PROCESS setting (default: own, main):
Running n8n via (Docker, npm, n8n cloud, desktop app):
Operating system:
tre
December 13, 2023, 5:45am
3
Version: 1.18.2
Running n8n via elest.io
Jon
December 13, 2023, 11:00am
4
Hey @tre ,
It looks like the Table view tries to render the HTML, which hides the tags, while the Schema and JSON views will likely show the raw data.
You may need to remove the HTML from the fields you want to use. That could possibly be done with a Code node, but I don’t have an example to hand for doing this.
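As a rough illustration of the Code node idea above, here is a minimal sketch in plain JavaScript. The function name `stripHtml` is made up for this example, and the regex-based approach is an assumption on my part: it is a heuristic that handles typical scraped markup, not a full HTML parser, so unusual input may need extra rules.

```javascript
// Hypothetical helper: strip HTML from a string using only built-in
// string methods. This is a heuristic, not a real HTML parser.
function stripHtml(html) {
  return String(html)
    // Drop whole <script> and <style> blocks, including their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, ' ')
    // Drop any remaining tags, e.g. <link rel="stylesheet" ...>.
    .replace(/<\/?[a-zA-Z][^>]*>/g, ' ')
    // Decode a few common entities (&amp; last to avoid double decoding).
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&')
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, ' ')
    .trim();
}
```

Inside a Code node, a function like this could be applied to whichever field holds the scraped markup before the item is returned to the next node.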
tre
December 20, 2023, 4:49pm
5
I am using the HTML Extract node followed by the Clean Content node from template #1862.
The title of the template is:
OpenAI GPT-3: Company Enrichment from website content
The screenshots I shared above are the result of the Clean Content node.
I noticed this is still extracting all sorts of random URLs and the tags shown above. Is there any way I can get only the text?
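For the stray URLs specifically, one possible approach is a second pass in a Code node that removes anything that looks like a link. The sketch below is only an assumption about how this could be done: `removeUrls` and `cleanItems` are hypothetical helper names, and `content` is a placeholder for whatever field the Clean Content node actually outputs.

```javascript
// Hypothetical helper: remove anything that looks like a URL from plain text.
function removeUrls(text) {
  return String(text)
    // Match http(s):// links and bare www. links up to the next whitespace.
    .replace(/\b(?:https?:\/\/|www\.)[^\s<>"']+/gi, ' ')
    // Collapse the gaps left behind.
    .replace(/\s+/g, ' ')
    .trim();
}

// Hypothetical helper: apply a cleaner function to one field of each
// n8n-style item (objects shaped like { json: { ... } }).
function cleanItems(items, field, cleaner) {
  return items.map((item) => ({
    json: { ...item.json, [field]: cleaner(item.json[field] ?? '') },
  }));
}

// In a Code node set to "Run Once for All Items", the last line would be
// roughly the following ('content' is an assumed field name):
// return cleanItems($input.all(), 'content', removeUrls);
```

Because the cleaner is passed in as an argument, the same `cleanItems` wrapper can be reused with a different clean-up function if the text needs more than URL removal.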
jan
January 3, 2024, 1:08pm
6
New version [email protected] got released, which includes GitHub PR 8126.
system
Closed
April 2, 2024, 1:08pm
7
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.