I’ve been trying for a long time to solve the problem of opening an existing PDF document on Google Drive as a text file in Google Docs format. I tried configuring it via an HTTP Request node, changing the Content-Type, and creating a copy of the file in DOC format. I’ve asked all the smartest AI models multiple times (DeepSeek, GPTs from OpenAI, Grok3), but they keep talking about non-existent fields and checkboxes—they couldn’t help.
Please help!
Thanks in advance!

## Information on your n8n setup
- **n8n version:** 1.7
- **Database (default: SQLite):**
- **Running n8n via (Docker, npm, n8n cloud, desktop app):** npm
- **Operating system:** Windows 10
You can do it like this:
Download the PDF > convert it to JSON (extract the text) > create a new Google Doc > update it using the ID of the created file.
Will it look good? Probably not, since you have pretty limited formatting options with the Google Docs node.
I’ve tried using AI to transform the extracted text to Markdown, since I know Google Docs supports it (you have to enable it, though), but the update doesn’t paste it as Markdown, so you have to open the document, cut the text, and use ‘Paste from Markdown’.
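For anyone who wants to do the last step (update the doc by ID) from an HTTP Request node instead of the Google Docs node, here is a rough sketch of the request the Google Docs API expects for inserting plain text. The helper name `build_insert_text_request` and the `DOCUMENT_ID` placeholder are made up for illustration, and the call still needs OAuth credentials with a Docs scope, which the sketch does not handle.

```python
import json

DOCS_API = "https://docs.googleapis.com/v1/documents"


def build_insert_text_request(document_id: str, text: str) -> dict:
    """Describe the HTTP call that inserts plain text at the start of a Google Doc.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    return {
        "method": "POST",
        "url": f"{DOCS_API}/{document_id}:batchUpdate",
        "body": {
            "requests": [
                # Index 1 is the first position in the document body.
                {"insertText": {"location": {"index": 1}, "text": text}}
            ]
        },
    }


if __name__ == "__main__":
    # DOCUMENT_ID is a placeholder for the ID returned when the doc was created.
    req = build_insert_text_request("DOCUMENT_ID", "Extracted PDF text\n")
    print(json.dumps(req, indent=2))
```

The text goes in as plain text, so this does not solve the formatting problem described above.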
Thanks for the answer! But that’s the first thing I tried, and instead of text I see “\n\n\n\n\n\n\n\n\n\n.” The file must be protected. But if you right-click the file in Google Drive and choose “Open with Google Docs”, the text you need is there.
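For reference, the closest API equivalent of “Open with Google Docs” that I know of is a Drive API v3 `files.copy` call with the Google Docs MIME type as the target, followed by a `files.export` call to read the result as plain text. The sketch below only builds the two requests so they can be mapped onto HTTP Request nodes. The helper names are hypothetical, the credentials (OAuth with a Drive scope) are assumed to be configured separately, and whether the conversion recovers text from this particular PDF is something to verify rather than a given.

```python
from typing import Optional

DRIVE_API = "https://www.googleapis.com/drive/v3/files"
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def build_copy_as_doc_request(
    file_id: str, new_name: str, ocr_language: Optional[str] = None
) -> dict:
    """Describe a files.copy call that asks Drive to convert the copy to a Google Doc."""
    query = {"fields": "id,name,mimeType"}
    if ocr_language:
        # Optional hint for Drive's OCR, e.g. "en" (ISO 639-1 code).
        query["ocrLanguage"] = ocr_language
    return {
        "method": "POST",
        "url": f"{DRIVE_API}/{file_id}/copy",
        "query": query,
        "body": {"name": new_name, "mimeType": GOOGLE_DOC_MIME},
    }


def build_export_text_request(doc_id: str) -> dict:
    """Describe a files.export call that returns the Google Doc as plain text."""
    return {
        "method": "GET",
        "url": f"{DRIVE_API}/{doc_id}/export",
        "query": {"mimeType": "text/plain"},
    }


if __name__ == "__main__":
    # PDF_FILE_ID and NEW_DOC_ID are placeholders, not real IDs.
    print(build_copy_as_doc_request("PDF_FILE_ID", "Converted PDF", "en"))
    print(build_export_text_request("NEW_DOC_ID"))
```

The `id` returned by the copy call is what goes into the export request.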
I try not to post untested suggestions.
I also updated my previous comment to include my experience with trying to format it, because, as you can see, it looks terrible.
Are you sure the text in your PDF is actually text?
In many cases it is a picture.
If you can’t select it like this, it is probably a scanned document or something similar.
If it is a scan/picture, you will get something like \n\n\n\n, which are the new lines between the pictures. For pictures you will have to run it through OCR, but I haven’t done this with n8n yet.
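A quick way to check this automatically is to look at what is left of the extracted text once the whitespace is removed. This is only a heuristic sketch: the function name and the `min_chars` threshold of 20 are arbitrary choices for illustration, not anything n8n provides.

```python
def looks_like_scanned_pdf(extracted_text: str, min_chars: int = 20) -> bool:
    """Return True when the extracted text is (almost) only whitespace.

    min_chars is a hypothetical threshold: fewer visible characters than
    this suggests the PDF has no real text layer and needs OCR.
    """
    # str.split() with no arguments drops spaces, tabs and newlines.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars


if __name__ == "__main__":
    print(looks_like_scanned_pdf("\n\n\n\n\n\n\n\n\n\n."))
    print(looks_like_scanned_pdf("This PDF has a real text layer with words."))
```

In a workflow, the same check could route image-only files to an OCR step and send everything else straight to the Google Doc update.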
Yep, services can pile up. I’m not sure what the price is for OCR from PDF.io,
but I can see n8n has an AWS Textract node (pretty limited, though it can handle authentication for you while you send the actual request with an HTTP Request node).
AWS Textract has a free tier of 1000 pages per month, and after that the prices are pretty reasonable.
However, it introduces another service and a dependency, so even if the price is OK it still has some downsides.