Read PDF File Path

Hi,
How do I determine the file path of the PDF file?
I looked in:

but I do not see any relevant information on this.
I also searched the variables, but I don’t see a path:

Nodes
  Cron
  Gmail1
    Output Data
      JSON
        from
        headers
        labelIds
        to
        date:
        html:
        id: 17bc657f5119a701
        messageId:
        sizeEstimate: 564
        subject: test of n8n
        text: Addario’s
        textAsHtml:
        threadId: 17bc657c8c796a62
    Parameters
      additionalFields
        dataPropertyAttachmentsPrefixName: attachment_
        labelIds: =Label_5617014282744904841
      limit: 2
      operation: getAll
      resource: message
      returnAll: false
  Google Sheets
  Read PDF
    Parameters
      binaryPropertyName: ={{$node["Read Binary Files"].parameter["fileSelector"]}}

I do not really understand what the data after “I also searched the variables, but I don’t see a path:” is about, but the documentation page you linked answers exactly your question.
It even has an example workflow that does exactly that, and it also mentions this specifically:
Screenshot from 2021-09-12 21-20-59

Hello @jan, I think what @Josh_Fialkoff means is that he is not trying to read a PDF that he keeps on his disk or similar. He is trying to read the PDF attached to an email that was just downloaded, so he cannot set a custom file path, and he does not know where the PDF is downloaded by default. That means he cannot tell the Read PDF node where to find the file. I have hit the same wall :sweat_smile: Do you know how to find that path? PS: Sorry for my English, it is not my mother tongue.

Hey @Jairo,

Are you trying to just save an attachment to disk?

Hi @jon ,

No, I would like to get the text from the PDF file and also upload it to Google Drive. I tried uploading first, but I ran into the same problem. Thanks for answering :slightly_smiling_face:

Hey @Jairo,

So there are a few parts to it. If you are not saving the PDF to disk, you can just use the binary property, as you already have the file there ready to use.

Example Workflow
{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [
        250,
        300
      ]
    },
    {
      "parameters": {},
      "name": "Read PDF",
      "type": "n8n-nodes-base.readPDF",
      "typeVersion": 1,
      "position": [
        630,
        300
      ]
    },
    {
      "parameters": {
        "url": "http://www.africau.edu/images/default/sample.pdf",
        "responseFormat": "file",
        "options": {}
      },
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 1,
      "position": [
        450,
        300
      ]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "Read PDF",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

If you were saving the file to disk first using the Write Binary File node, you would need to use the Read Binary File node and give it the same path you used in the Write Binary File node.

Example Workflow
{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [
        250,
        300
      ]
    },
    {
      "parameters": {
        "binaryPropertyName": "myfile"
      },
      "name": "Read PDF",
      "type": "n8n-nodes-base.readPDF",
      "typeVersion": 1,
      "position": [
        1070,
        300
      ]
    },
    {
      "parameters": {
        "url": "http://www.africau.edu/images/default/sample.pdf",
        "responseFormat": "file",
        "options": {}
      },
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 1,
      "position": [
        450,
        300
      ]
    },
    {
      "parameters": {
        "fileName": "/home/node/.n8n/output.pdf"
      },
      "name": "Write Binary File",
      "type": "n8n-nodes-base.writeBinaryFile",
      "typeVersion": 1,
      "position": [
        660,
        300
      ]
    },
    {
      "parameters": {
        "filePath": "/home/node/.n8n/output.pdf",
        "dataPropertyName": "myfile"
      },
      "name": "Read Binary File",
      "type": "n8n-nodes-base.readBinaryFile",
      "typeVersion": 1,
      "position": [
        870,
        300
      ]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "Write Binary File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Write Binary File": {
      "main": [
        [
          {
            "node": "Read Binary File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read Binary File": {
      "main": [
        [
          {
            "node": "Read PDF",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
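
To spell out what has to line up in the disk-based workflow, here is a rough TypeScript sketch (not n8n code) that checks an exported node list for the two pairings: the Read Binary File path must be one that a Write Binary File node wrote, and the Read PDF property name must match the name under which the file was loaded. The `WorkflowNode` shape, the `checkDiskRoundTrip` helper, and the `"data"` fallback for unset parameters are assumptions made for illustration.

```typescript
// Minimal, assumed shape of a node entry in an exported n8n workflow.
interface WorkflowNode {
  name: string;
  type: string;
  parameters: Record<string, unknown>;
}

// Hypothetical helper: list mismatches between write/read paths and property names.
function checkDiskRoundTrip(nodes: WorkflowNode[]): string[] {
  const found: string[] = [];
  const byType = (type: string) => nodes.filter((n) => n.type === type);

  const writtenPaths = byType("n8n-nodes-base.writeBinaryFile").map((n) => n.parameters["fileName"]);
  const readers = byType("n8n-nodes-base.readBinaryFile");

  readers.forEach((reader) => {
    // The read path has to be a path that some Write Binary File node produced.
    if (writtenPaths.indexOf(reader.parameters["filePath"]) === -1) {
      found.push(reader.name + " reads a path that no Write Binary File node wrote");
    }
  });

  // Names under which the loaded files are stored ("data" assumed when unset).
  const loadedAs = readers.map((n) => n.parameters["dataPropertyName"] || "data");

  byType("n8n-nodes-base.readPDF").forEach((pdf) => {
    const wanted = pdf.parameters["binaryPropertyName"] || "data";
    if (readers.length > 0 && loadedAs.indexOf(wanted) === -1) {
      found.push(pdf.name + " expects binary property '" + String(wanted) + "', which no Read Binary File node sets");
    }
  });

  return found;
}

// The three relevant nodes from the workflow above.
const exampleNodes: WorkflowNode[] = [
  { name: "Write Binary File", type: "n8n-nodes-base.writeBinaryFile", parameters: { fileName: "/home/node/.n8n/output.pdf" } },
  { name: "Read Binary File", type: "n8n-nodes-base.readBinaryFile", parameters: { filePath: "/home/node/.n8n/output.pdf", dataPropertyName: "myfile" } },
  { name: "Read PDF", type: "n8n-nodes-base.readPDF", parameters: { binaryPropertyName: "myfile" } },
];

const roundTripProblems = checkDiskRoundTrip(exampleNodes);
```

For the workflow above the list comes back empty, because both the path and the `myfile` property name match on each side.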

The two example workflows above use an HTTP source instead of an email source, but the process and the theory are the same. Hopefully this helps.
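
For the email case, the point is that the Read PDF node wants the name of a binary property on the incoming item, not a file path. Below is a rough TypeScript sketch of that idea, not actual n8n code. It assumes the Gmail node stores attachments under the configured prefix plus an index (so `attachment_0`, `attachment_1`, and so on with the `attachment_` prefix from the first post); the `Item` and `BinaryEntry` shapes and the `attachmentProperties` helper are simplified stand-ins made up for illustration.

```typescript
// Simplified, assumed shape of one n8n item: JSON fields plus named binary properties.
interface BinaryEntry {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName?: string;
}

interface Item {
  json: Record<string, unknown>;
  binary?: Record<string, BinaryEntry>;
}

// Hypothetical helper: return the binary property names that start with the prefix.
function attachmentProperties(item: Item, prefix: string): string[] {
  const binary = item.binary || {};
  return Object.keys(binary)
    .filter((key) => key.indexOf(prefix) === 0)
    .sort();
}

// Made-up item, shaped like a Gmail message with one PDF attached.
const exampleItem: Item = {
  json: { subject: "test of n8n" },
  binary: {
    attachment_0: {
      data: "JVBERi0xLjQ=", // base64 for the bytes "%PDF-1.4"
      mimeType: "application/pdf",
      fileName: "invoice.pdf", // hypothetical file name
    },
  },
};

// These names are what would go into the Read PDF node's binary property field.
const pdfProperties = attachmentProperties(exampleItem, "attachment_");
```

Under those assumptions, the value to enter in the Read PDF node would be the plain name `attachment_0` rather than an expression pointing at a file selector.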

Hey @jon ,

Thank you so much for your help. I finally managed to read the PDF and even upload it to Drive. It was not exactly your solution, but it helped a lot :partying_face:. The difference between your workflow and mine (or @Josh_Fialkoff's) is that in yours the binary data is used directly after the node that fetches it. Once you add another node in between, the data has to be collected again. Since the email was the trigger and I first needed to check somewhere else which folder to upload the file to, I was losing the data along the way :sweat_smile: This may be a coding basic, sorry; I come from Zapier and similar no-code tools, and coding is not one of my skills yet :grimacing: So I will first have to check where to upload, and then how can I collect the binary data again? I guess I will need to write it to disk so I can get to the data whenever I need it :thinking:
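
The effect described here can be sketched in a few lines of TypeScript. This is only a model of the behaviour, not n8n internals: it assumes that a step in the middle builds its output items from scratch, so anything it does not copy over (including the binary part) is gone for the nodes after it. The `WorkflowItem` shape, both function names, and the folder ID are hypothetical.

```typescript
interface FileData {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName?: string;
}

interface WorkflowItem {
  json: Record<string, unknown>;
  binary?: Record<string, FileData>;
}

// An intermediate step that returns brand-new items: the binary part is lost.
function lookupFolderDroppingBinary(items: WorkflowItem[]): WorkflowItem[] {
  return items.map(() => ({ json: { folderId: "hypothetical-folder-id" } }));
}

// The same step, but each output item keeps the binary part of its input item.
function lookupFolderKeepingBinary(items: WorkflowItem[]): WorkflowItem[] {
  return items.map((item) => ({
    json: { folderId: "hypothetical-folder-id" },
    binary: item.binary,
  }));
}

// Made-up input: one email item carrying a PDF attachment.
const emailItems: WorkflowItem[] = [
  {
    json: { subject: "test of n8n" },
    binary: { attachment_0: { data: "JVBERi0xLjQ=", mimeType: "application/pdf" } },
  },
];

const withoutFile = lookupFolderDroppingBinary(emailItems);
const withFile = lookupFolderKeepingBinary(emailItems);
```

In the first variant the upload step after the lookup has nothing to upload; in the second the attachment travels along with the folder ID, without touching the disk.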

Replying to myself :grimacing: I guess that if I upload the file twice, once to the main GDrive folder and once to the folder where I really need to keep it, I can download the data as many times as I need and delete the first file at the end of the process… I just don’t want to use my server as storage for useless files.

That works, but if you have the file as a binary object, you can just reference it by its binary property name at any point in your workflow.
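
As a rough illustration of referencing the file by name later on, here is a TypeScript sketch that looks up an earlier node's output by node name and copies one binary property onto the current item. It is a model, not n8n's API: the `NodeOutputs` registry stands in for whatever lookup the workflow engine offers (the first post uses the `$node["…"]` expression form, and it is assumed here that binary output can be reached in a similar way), and `attachBinaryFrom` is a hypothetical helper.

```typescript
interface StoredFile {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName?: string;
}

interface NodeItem {
  json: Record<string, unknown>;
  binary?: Record<string, StoredFile>;
}

// Hypothetical registry of what each earlier node produced, keyed by node name.
type NodeOutputs = Record<string, NodeItem[]>;

// Copy one named binary property from an earlier node's item onto the current item.
function attachBinaryFrom(
  outputs: NodeOutputs,
  nodeName: string,
  propertyName: string,
  current: NodeItem,
  index: number
): NodeItem {
  const sourceItems = outputs[nodeName] || [];
  const source = sourceItems[index];
  const file = source && source.binary ? source.binary[propertyName] : undefined;
  if (!file) {
    throw new Error(
      "No binary property '" + propertyName + "' on item " + index + " of node '" + nodeName + "'"
    );
  }
  // Keep the current JSON and any binary already present, then add the looked-up file.
  return {
    json: current.json,
    binary: { ...(current.binary || {}), [propertyName]: file },
  };
}

// Made-up data: the Gmail node's output and an item that has since lost its binary part.
const earlierOutputs: NodeOutputs = {
  Gmail1: [
    {
      json: { subject: "test of n8n" },
      binary: { attachment_0: { data: "JVBERi0xLjQ=", mimeType: "application/pdf" } },
    },
  ],
};
const afterLookup: NodeItem = { json: { folderId: "hypothetical-folder-id" } };

const readyForUpload = attachBinaryFrom(earlierOutputs, "Gmail1", "attachment_0", afterLookup, 0);
```

The design choice is that nothing is written to disk or uploaded twice: the file stays in the workflow's own data and is picked up again by node name and property name when the upload step needs it.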