Read html file content with format and hyperlinks

## Describe the problem/error/question

I have a Google document (.doc) that forms the message content that I wish to send by Gmail.

The document can be downloaded (Google Drive 1 node) as html and I can see that embedded hyperlinks are preserved.

When the html content is extracted (Extract from file node), all formatting and links are lost. Clearly the node extracts the text as a string.

I want the extracted content to include format and links.

## What is the error message (if any)?

There are no error messages per se but the functionality is not what I’m looking to achieve.

## Please share your workflow

This is my dev workflow with a manual trigger.



## Share the output returned by the last node

![image|396x417](upload://tn0i1owEllHhXuGMK1OYoO3xkDL.png)


## Information on your n8n setup
- **n8n version:** 1.57.0
- **Database (default: SQLite):** SQLite
- **n8n EXECUTIONS_PROCESS setting (default: own, main):** executionMode: regular, concurrency: -1
- **Running n8n via (Docker, npm, n8n cloud, desktop app):** Docker (self hosted)
- **Operating system:** Ubuntu

Hi @Alan57,

Thanks for posting! The Extract from File node's Extract from HTML operation returns only the text, not the links. To keep the formatting, including the hyperlinks, you'll have to use the 'Extract from Text File' operation instead and then parse out the part of the document you want to include in the email. You can try to do this with a regex, provided the formatting of the Google Doc is the same for all your executions.
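As a rough illustration of the regex approach, here is a minimal Python sketch that keeps everything inside the `<body>` element, tags and links included. The helper name `extract_body_html` and the sample markup are made up for this example and are not part of n8n; it also assumes the raw HTML is already available as a single string and that the document has one `<body>` element.

```python
import re

# Hypothetical helper for illustration only; not an n8n API.
# Matches the opening <body ...> tag (with any attributes) and lazily
# captures everything up to the closing </body> tag.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def extract_body_html(raw_html: str) -> str:
    """Return the markup inside <body>, leaving tags and links intact."""
    match = BODY_RE.search(raw_html)
    # Assumption: if no <body> element is found, fall back to the whole input.
    return (match.group(1) if match else raw_html).strip()


# Made-up sample standing in for the HTML downloaded from Google Drive.
sample = (
    "<html><head><style>p{margin:0}</style></head>"
    '<body class="doc"><p>Hello <a href="https://example.com">link</a></p></body></html>'
)

print(extract_body_html(sample))
```

Note that this drops the `<head>`, so any formatting defined there in a `<style>` block would not carry over. If the export relies on that, it may be simpler to pass the whole HTML string through unchanged.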

Could you share a bit more about the task you're trying to perform or what you're looking to automate? That way, the community may be able to offer better tips or workarounds for your use case :star2:
