Hello, I would like to read a PDF and pack the content into a JSON file. Which node do I have to use for this?

Just to make sure, since you mentioned OCR :sweat_smile: Are you looking for optical character recognition, i.e. does your PDF contain pictures of text? If so, n8n can’t do this directly, but AWS Textract can, and there is an n8n node for it. Mindee can also do this for some documents.

If I’m off base, can you provide an example of what you’re looking for instead? :bowing_man:

If your PDF contains actual text rather than images of text (which is what I first suspected), the Read PDF node will do the job :slight_smile:
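In case it helps to picture the "pack the content into a JSON file" part: here is a rough sketch in plain Python, outside of n8n, assuming the text has already been extracted from the PDF (for example by the Read PDF node). The function name `pack_text_to_json` and the field names `source`, `text` and `lines` are made up for illustration and are not n8n's output format.

```python
import json
from pathlib import Path


def pack_text_to_json(text: str, source_name: str, out_path: str) -> dict:
    """Wrap already-extracted PDF text in a dict and write it to a JSON file."""
    # Hypothetical structure; rename or extend the fields as needed.
    record = {
        "source": source_name,
        "text": text,
        # Non-empty lines, stripped of surrounding whitespace.
        "lines": [line.strip() for line in text.splitlines() if line.strip()],
    }
    # ensure_ascii=False keeps non-ASCII characters readable in the file.
    Path(out_path).write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return record
```

Calling `pack_text_to_json("Invoice 123\n\nTotal: 42 EUR\n", "invoice.pdf", "invoice.json")` would write a file `invoice.json` holding the full text plus a list of its two non-empty lines.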

