Hi, I'm new to n8n. I have a workflow that extracts images from a PDF. Note: the PDF contains scanned invoices.
So I want to extract the images from the PDF and put them in Google Drive.
My first attempt was to use a Code node with Python, but that step requires installing dependencies, which is not recommended from what I read in How to Install dependencies in n8n? - #3 by Rishi_Chaudhary.
So I need to stick to community nodes, but I still haven't found a good process.
Overall, this is what I need to achieve:
[Trigger] → [Read PDF] → [Detect & Extract Images] → [Switch] →
└── [For Each Image] → [Rename/Process] → [Google Drive Upload]
The Detect & Extract Images step currently uses Python code, which I'm in the process of replacing.
I need [Read PDF] → [Detect & Extract Images] → [Google Drive Upload].
Could you give me some direction? I'm kind of lost.
Hello @polyglot_engineer, welcome!
So, is your question about how to extract specific images from a PDF page (this sounds challenging), or about converting each PDF page into an image (this is probably easier)?
Which is the most feasible way to do this? For converting each PDF page into an image, should I use a community node or service, or is there another way using a Code node?
So from what I've learned, you need to separate the images from the PDF first, because that is very hard to do inside the n8n environment. After that, you can use an n8n workflow for the OCR process and get the result in JSON form. That's the solution I can see so far. Is there any suggestion, or maybe another approach, for this kind of problem? I'm open to new suggestions.
Hi @polyglot_engineer, if you're willing to pay for a service to do this, you can install the pdf.co community nodes and then pay for the service to convert the PDF to images for you. This will be your easiest solution.
To install the community node, search for "pdf" in the right-hand panel, select pdf.co, and click Install.
Once installed, you should be able to use it to convert the PDF to images and store them wherever you need.
Then select the image type.
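If you'd rather not pay for a service, and your scans happen to be stored inside the PDF as plain JPEG streams (scanners often do this, but it isn't guaranteed), a dependency-free Python sketch like the one below might be enough. Treat it as a rough starting point under those assumptions: `extract_jpegs` and `image_names` are names I made up for illustration, it won't handle encrypted PDFs or images stored with other filters (CCITT, JBIG2, Flate), and it doesn't show how to read or write binary data in the n8n Code node.

```python
import re

# Matches the raw bytes between a PDF "stream" keyword and its "endstream".
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)\s*endstream", re.DOTALL)

# JPEG data always starts with these three bytes.
JPEG_MAGIC = b"\xff\xd8\xff"


def extract_jpegs(pdf_bytes: bytes) -> list[bytes]:
    """Return the JPEG payloads found in the PDF's streams, in file order.

    Assumption: the scanned pages are embedded as unencrypted JPEG
    (DCTDecode) streams. Anything else is silently skipped.
    """
    return [
        match.group(1)
        for match in _STREAM_RE.finditer(pdf_bytes)
        if match.group(1).startswith(JPEG_MAGIC)
    ]


def image_names(pdf_name: str, count: int) -> list[str]:
    """Build upload file names such as 'invoice_image_001.jpg' (hypothetical scheme)."""
    stem = pdf_name.rsplit(".", 1)[0]
    return [f"{stem}_image_{i:03d}.jpg" for i in range(1, count + 1)]
```

Each returned payload can be saved as a `.jpg` file as-is, so the rename and Google Drive upload steps could stay as regular n8n nodes. If this returns an empty list for your invoices, the images are probably stored in another format and a conversion service is the safer route.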