Home management system

I am developing a home management system which performs the following:

  1. I scan and upload a document to my Google Drive as a PDF
  2. The workflow is triggered when a new file appears in a Google Drive folder
  3. The PDF file is fed directly into an AI agent, which analyses the document, renames it accordingly and files it in the appropriate folder in Google Drive.
  4. If the AI agent determines that action needs to be taken then I receive an email notifying me of the action.
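
The renaming, filing and notification steps above can be sketched as follows. This is only an illustration and makes assumptions that are not in the original post: it assumes the AI agent is prompted to reply with a JSON object containing the hypothetical fields `new_name`, `folder` and `action`, and the names `FilingDecision`, `parse_decision` and `needs_email` are made up for this example.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilingDecision:
    new_name: str           # file name to apply in Google Drive
    folder: str             # target folder path, without leading/trailing slashes
    action: Optional[str]   # text of the action required, or None


def parse_decision(raw: str) -> FilingDecision:
    """Turn the agent's JSON reply (assumed format) into a filing decision."""
    data = json.loads(raw)

    # Replace characters that are awkward in file names with an underscore.
    name = re.sub(r"[^\w\-. ]+", "_", data["new_name"]).strip()
    if not name.lower().endswith(".pdf"):
        name += ".pdf"

    # Treat a missing, null or blank action as "nothing to do".
    action = (data.get("action") or "").strip() or None

    return FilingDecision(
        new_name=name,
        folder=data["folder"].strip("/"),
        action=action,
    )


def needs_email(decision: FilingDecision) -> bool:
    """Step 4: only send a notification when the agent reported an action."""
    return decision.action is not None
```

Keeping the agent's reply in a fixed structure like this means the rename, move and email steps can each read one field instead of parsing free text.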

I have all of my workflow working apart from the recognition of the content of the PDF. I assume this is because the PDF is just an image encapsulated within a PDF wrapper. I have tried using an OCR method, extracting the text before passing it to the AI agent, but with no success. I don’t really understand why I need to OCR the document, as I would have expected the AI agent to analyse the PDF.
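
One way to test the "image inside a PDF wrapper" assumption is a crude byte-level check on the file. The sketch below is a rough heuristic, not a proper PDF parser: it looks for font resources (`/Font`) and image objects (`/Subtype /Image`) in the raw bytes. The function name `classify_pdf` and its return labels are invented for this example. Font dictionaries can sit inside compressed object streams in newer PDFs, so an `image_only` result is a hint rather than proof.

```python
import re


def classify_pdf(data: bytes) -> str:
    """Guess whether a PDF is a pure scan, using a raw byte search.

    Returns one of: "not_pdf", "has_fonts", "image_only", "unknown".
    """
    # The PDF header is expected near the start of the file.
    if b"%PDF-" not in data[:1024]:
        return "not_pdf"

    # Any visible font resource suggests there is text to extract
    # (possibly an OCR text layer added by the scanner).
    if re.search(rb"/Font", data):
        return "has_fonts"

    # Image objects and no visible fonts: probably a scan that needs OCR.
    if re.search(rb"/Subtype\s*/Image", data):
        return "image_only"

    return "unknown"
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them in. If the result is `image_only`, text extraction will return nothing and the document needs OCR, or a model that accepts the page images directly.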

Please can someone advise on how to carry out this PDF analysis.

Did you try asking the AI agent to analyze the content as an image instead?

@sandy4v thanks for your response. What image formats are you referring to? When I scan the document using the Google Drive scan facility, the images of the document are placed into a PDF wrapper. I am therefore not asking the AI agent to analyse a JPG or PNG. I’m not sure why I have this problem with feeding the PDF into the AI agent, because if I upload the PDF to Grok, Gemini or ChatGPT, they are able to read and analyse the document with no problems. Within Google Drive, Gemini can also read the PDFs with no issue at all. Could I ask if you or anybody on this forum is performing a similar operation with success?

Can you paste your workflow code here? I can try to replicate it at my end.

Hi Sandy, I am very new to this forum and n8n. If I share my workflow in this forum how can I avoid sharing any credentials or private information?

If you are using the credential variables, they don't get exported. If you have API keys passed as parameters, they will be exposed. If in doubt, you can create a copy of the workflow and delete all the credentials, or set up a Google Meet and I will jump on it to see how I can help.
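
The "copy and delete the credentials" advice can be scripted against the exported workflow JSON. This is a sketch under assumptions: it assumes the export has a top-level `nodes` list in which each node may carry a `credentials` entry and a `parameters` object, and the key-name pattern used to spot secrets is a guess that will not catch everything. The names `scrub_workflow` and `SECRET_PATTERN` are made up for this example, so review the result by eye before posting it.

```python
import copy
import re
from typing import Any, Dict, List, Tuple

# Hypothetical pattern for parameter names that often hold secrets.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|token|secret|password|authorization)", re.IGNORECASE
)


def _redact(value: Any, path: str, found: List[str]) -> None:
    """Walk parameters in place, replacing suspicious string values."""
    if isinstance(value, dict):
        # Header-style entries such as {"name": "Authorization", "value": "..."}
        name = value.get("name")
        if (isinstance(name, str) and SECRET_PATTERN.search(name)
                and isinstance(value.get("value"), str)):
            value["value"] = "REDACTED"
            found.append(f"{path}.value")
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if isinstance(child, str) and SECRET_PATTERN.search(key):
                value[key] = "REDACTED"
                found.append(child_path)
            else:
                _redact(child, child_path, found)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            _redact(child, f"{path}[{index}]", found)


def scrub_workflow(workflow: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a cleaned copy of the workflow and the paths that were redacted."""
    cleaned = copy.deepcopy(workflow)  # leave the original export untouched
    found: List[str] = []
    for node in cleaned.get("nodes", []):
        node.pop("credentials", None)  # drop credential references entirely
        _redact(node.get("parameters", {}), node.get("name", "?"), found)
    return cleaned, found
```

Load the export with `json.load`, pass the dictionary to `scrub_workflow`, and check the returned list of redacted paths. Folder ids, email addresses and other personal details are not detected by this pattern and still need a manual look.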

@sandy4v thank you so much for your kind offer of help. I am currently working on a possible fix for my workflow by using Mistral to do the OCR before passing the content of the document to the AI agent. I am happy to share the outcome with you, whether it is successful or not, and to let you know if I need any further help.

I would love to know the result. Regardless of the outcome, I will learn something :)

Hey DanielS, I just came across your post, and this is also something I was looking to do for my document management. Did you have any luck with this and, if so, would it be possible to share your workflow?