I have a PDF invoice, and when I try to read the data from the PDF, the output JSON comes back as a single object.
What is the error message (if any)?
""\n\nInvoice\nPayment is due within 30 days from date of invoice. Late payment is subject to fees of 5% per month.\nThanks for choosing DEMO - Sliced Invoices | [email protected]\nPage 1/1\nFrom:\nDEMO - Sliced Invoices\nSuite 5A-1204\n123 Somewhere Street\nYour City AZ 12345\[email protected]\nInvoice NumberINV-3337\nOrder Number12345\nInvoice DateJanuary 25, 2016\nDue DateJanuary 31, 2016\nTotal Due$93.50\nTo:\nTest Business\n123 Somewhere St\nMelbourne, VIC 3000\[email protected]\nHrs/QtyServiceRate/PriceAdjustSub Total\n1.00\nWeb Design\nThis is a sample description...\n$85.000.00%$85.00\nSub Total$85.00\nTax$8.50\nTotal$93.50\nANZ Bank\nACC # 1234 1234\nBSB # 4321 432\nPaid"",
I am sorry you’re having trouble. I am afraid this is the expected behaviour: n8n’s Read PDF node does not automatically parse tables or other content of your PDF file, it just extracts the raw text.
Since parsing PDF invoices can be quite a challenge, you might want to consider using a dedicated third-party service focusing on this task. Mindee, for example, is a service integrated in n8n, and it can parse invoices like the example one you appear to be using.
Hi @Arudhra, to find out which node version you’re using, select your node on the n8n canvas, copy it using Ctrl+C, and then paste the copied data into a text editor using Ctrl+V.
You should then see a line saying typeVersion which I have highlighted below:
Hi @Arudhra - can you try accessing only one PDF file at a time in the HTTP Request node? I’ve not used Mindee myself, but have you renamed the binary file to something other than data? If you’ve changed the name, you’ll need to update this.
Are your invoices consistent in structure, or potentially totally different? I have a setup that uses the Read PDF node and a custom code block to retrieve the info I want. It works quite well for a free solution.
Otherwise I can recommend Eden AI, as they offer a whole host of different OCR services by Amazon, Google, Microsoft, etc.
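For anyone wondering what the "Read PDF plus custom code" route could look like, here is a minimal sketch in Python. It is not the setup described above, just an illustration that assumes every invoice shares the layout of the sample output earlier in the thread, where each label is fused to its value on one line (for example `Invoice NumberINV-3337`). The names `FIELDS`, `parse_invoice_text`, and `SAMPLE_TEXT` are made up for this example, and wiring it into an n8n workflow is left out.

```python
import re

# Hypothetical field map: one regex per value we want to pull out.
# Each pattern is anchored to the start of a line so that, for example,
# "Total" does not accidentally match "Sub Total" or "Total Due".
FIELDS = {
    "invoice_number": r"^Invoice Number(.+)$",
    "order_number": r"^Order Number(.+)$",
    "invoice_date": r"^Invoice Date(.+)$",
    "due_date": r"^Due Date(.+)$",
    "total_due": r"^Total Due\$([\d.,]+)$",
    "sub_total": r"^Sub Total\$([\d.,]+)$",
    "tax": r"^Tax\$([\d.,]+)$",
    "total": r"^Total\$([\d.,]+)$",
}

# Excerpt of the raw text from the sample invoice posted above.
SAMPLE_TEXT = (
    "Invoice NumberINV-3337\n"
    "Order Number12345\n"
    "Invoice DateJanuary 25, 2016\n"
    "Due DateJanuary 31, 2016\n"
    "Total Due$93.50\n"
    "Hrs/QtyServiceRate/PriceAdjustSub Total\n"
    "$85.000.00%$85.00\n"
    "Sub Total$85.00\n"
    "Tax$8.50\n"
    "Total$93.50\n"
)


def parse_invoice_text(text):
    """Return a dict of extracted fields; missing fields are None."""
    result = {}
    for name, pattern in FIELDS.items():
        # re.MULTILINE makes ^ and $ match at every line boundary.
        match = re.search(pattern, text, flags=re.MULTILINE)
        result[name] = match.group(1).strip() if match else None
    return result


if __name__ == "__main__":
    print(parse_invoice_text(SAMPLE_TEXT))
```

This only holds up while the invoices are consistent in structure. If the labels or their order change between invoices, the patterns would need adjusting, and a dedicated parsing service is probably the more robust choice.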