Writing HTML to a file: duplicate content

sgw · February 7, 2023, 8:41am

Describe the issue/error/question

From my converter API I receive a JSON object (is that the term?) containing an HTML document.

[
  {
    "data": "<!DOCTYPE html>\n<html lang=\"en\">\n  <head>\n    <meta charset=\"UTF-8\" />\n    <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n    <title>Document</title>\n  </head>\n  <body>\n    <h1 align=\"center\" style=\"margin: 0; padding: 0\">Testanfrage Weichinger</h1>\n    

[..]

</body>\n</html>\n"
  }
]

I copied that content into vim and checked that it appears there only once, not twice in a row.
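The same check can be scripted instead of done by eye in vim. The following is a minimal sketch, assuming the API response has the shape shown above (a list with one object whose `data` field holds the document). The sample document is a shortened, hypothetical stand-in, and `count_documents` is a helper name made up for this illustration.

```python
import json

# Hypothetical, shortened stand-in for the converter API response;
# the real document body is elided in the post above.
payload = json.dumps([
    {"data": "<!DOCTYPE html>\n<html lang=\"en\">\n  <body>\n    <h1>Test</h1>\n  </body>\n</html>\n"}
])


def count_documents(html: str) -> int:
    """Count the doctype declarations in a string (case-insensitive)."""
    return html.lower().count("<!doctype html")


items = json.loads(payload)
html = items[0]["data"]

print(count_documents(html))         # 1: the API output holds one document
print(count_documents(html + html))  # 2: what a duplicated file would report
```

Running the same function over the contents of the written file would show whether the duplication happens in the API output or only in the file-writing step.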

I then pipe that into a “Spreadsheet File” node: I need to write it to “index.html”.

What is the error message (if any)?

The output HTML file somehow contains the content twice.
After the nicely formatted, parsable HTML content, the input “string” appears once more on a single line (see screenshot).

Information on your n8n setup

n8n version: 0.214.2
Database you’re using (default: SQLite): postgres
Running n8n with the execution process [own(default), main]: default
Running n8n via [Docker, npm, n8n.cloud, desktop app]: Docker

MutedJam · February 14, 2023, 2:14pm

Hi @sgw, I am sorry for the trouble. I wouldn’t be super surprised if writing HTML into an HTML spreadsheet has unforeseen side-effects, though I was unable to reproduce the problem. Perhaps you can share an example workflow allowing me to see this first hand?

sgw · February 14, 2023, 3:40pm

I will try to provide an anonymized example asap. Thanks.

sgw · February 15, 2023, 8:14am

See the pasted workflow.
I mock the output of the API-container (which I can’t share here) in the Code-node.

For me this workflow also returns that “wrong” output. Tested right now with my local n8n:0.215.1

MutedJam · February 16, 2023, 4:01pm

Thank you @sgw, I could now see the issue.

Unfortunately, I don’t have a great solution for this. The library powering the Spreadsheet node just seems to struggle with HTML content in an HTML table.

I also can’t think of a manual way of wrapping your HTML into an HTML table that would preserve the full original HTML structure. Your data includes an html tag, for example, which defines the root of an HTML document. Nesting this in another HTML document would render the document invalid.

Perhaps you want to convert your existing HTML data into a different format before creating your spreadsheet?

sgw · February 16, 2023, 6:45pm

@MutedJam Thanks for looking closer.

The HTML is supposed to be piped into an HTML-to-PDF converter. Currently my workflow uses Gotenberg in a Docker container to do that conversion. Its API expects a file named index.html.

So my approach was to write my html object(?) to a file, name it accordingly and then send it to the API via http request.

If there is another way of naming that file accordingly, fine.
Otherwise I will look for another PDF converter that I can run on-premises in my own environment.
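For illustration, the file name does not have to come from a file on disk: in a multipart upload it is just a field in the part header. The sketch below builds such a request with the Python standard library. It assumes a Gotenberg 7 container on `localhost:3000`, the Chromium HTML route `/forms/chromium/convert/html`, and the form field name `files`; none of these appear in the thread, so treat them as assumptions to check against your Gotenberg version. `build_multipart` is a hypothetical helper.

```python
import urllib.request
import uuid

# Assumption: local Gotenberg 7 container and its Chromium HTML route.
GOTENBERG_URL = "http://localhost:3000/forms/chromium/convert/html"


def build_multipart(html: str, filename: str = "index.html"):
    """Return (body, content_type) for a multipart form with one file part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        # The file name the API sees is set here, not by a file on disk.
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        "Content-Type: text/html\r\n"
        "\r\n"
    )
    tail = f"\r\n--{boundary}--\r\n"
    body = head.encode("utf-8") + html.encode("utf-8") + tail.encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


# Hypothetical, shortened document standing in for the real HTML.
html = "<!DOCTYPE html>\n<html><body><h1>Test</h1></body></html>\n"
body, content_type = build_multipart(html)

request = urllib.request.Request(
    GOTENBERG_URL,
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)

# Sending is left commented out so the sketch runs without a Gotenberg instance:
# with urllib.request.urlopen(request) as response:
#     pdf_bytes = response.read()
```

Inside n8n the equivalent would be an HTTP Request node sending binary data whose file name is index.html.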

Thanks for any help here.

MutedJam · February 20, 2023, 7:36am

Hi @sgw, since you already have a full HTML document, you could simply write the existing data into an HTML file.

Hope this helps!
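A rough sketch of what “writing the existing data into an HTML file” amounts to follows; it is not the workflow from the reply. It assumes n8n's usual item layout, where a file travels as a binary property holding base64-encoded `data` plus `mimeType` and `fileName`, and mimics that shape in plain Python. The property name `data` and the helper `html_to_binary_item` are assumptions made for this example.

```python
import base64


def html_to_binary_item(html: str, file_name: str = "index.html") -> dict:
    """Wrap an HTML string in a dict shaped like an n8n item with one binary property."""
    encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return {
        "json": {},
        "binary": {
            "data": {  # binary property name; "data" is an assumption
                "data": encoded,  # file content, base64-encoded
                "mimeType": "text/html",
                "fileName": file_name,
            }
        },
    }


# Hypothetical, shortened document standing in for the real HTML.
html = "<!DOCTYPE html>\n<html><body><h1>Test</h1></body></html>\n"
item = html_to_binary_item(html)

# Decoding the binary property gives back exactly the original string.
restored = base64.b64decode(item["binary"]["data"]["data"]).decode("utf-8")
print(restored == html)                          # True
print(restored.lower().count("<!doctype html"))  # 1
```

Because the string is encoded as-is rather than rendered into a table, the document ends up in the file exactly once.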

sgw · February 20, 2023, 8:58am

@MutedJam thanks, that works for me!
