Need to extract multiline data from a multipage PDF to populate an Excel sheet

LeahLevin · July 3, 2025, 9:20am

Describe the problem/error/question

Hi all, I am a total beginner and enjoying this. Glad to meet you all.
My current issue: I receive multipage PDF invoices via email, and each page has multiple lines of data that I need to extract and then populate into an Excel spreadsheet. I have built my first workflow in n8n, and it successfully extracted data from a PDF in my email, but it only extracted 1 line entry from the multipage document. I am glad it populated the Excel sheet, but the result is incomplete: I need all the data from the PDF, not just 1 line entry. Any help for a beginner? Much appreciated!! Leah

What is the error message (if any)?

Please share your workflow

Current workflow:
Microsoft Outlook Trigger
Microsoft Outlook get attachment
Code
Microsoft Outlook download attachment
Extract from File
Information Extractor with an OpenAI node attached
Microsoft Excel append to worksheet

Share the output returned by the last node

Month: May
Date: 13/05/2025
Invoice No: LONR009656111
Amount: 25.53
COUNTRY: UK

Information on your n8n setup

n8n version: n8n Cloud version 1.95.3
Database (default: SQLite):
n8n EXECUTIONS_PROCESS setting (default: own, main):
Running n8n via: n8n Cloud
Operating system: MS Windows

Gallo_AIA · July 3, 2025, 9:53am

To extract multiple entries, try the following structured approach:

Split the multi-page PDF into individual pages or smaller chunks (e.g. one per invoice line) using a PDF splitter or an external service (such as PDF.co or ConvertAPI, called via an HTTP Request node) or a custom Code node.
Then use a Loop Over Items (Split in Batches) node to loop over each page or chunk and send each one individually into “Extract from File”. That way, each piece becomes its own item in the workflow.
Once you have each line or page as a separate item, pass the items to your AI extractor and then append them to Excel. You’ll get one row per entry, as intended.

In short, n8n’s PDF extractor doesn’t automatically break documents into lines. You need to split the PDF into individual pages or chunks, loop through them with Loop Over Items (Split in Batches), and then extract each one separately. Optionally, use a specialized OCR API if your workflow is invoice-heavy.
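
If you go the custom Code node route, here is a rough Python sketch of the "one item per invoice line" idea. It is not the PDF-splitting service approach from the steps above: it assumes you already have the extracted text of each page as a string and pulls the rows out with a regular expression. The line layout (`date invoice-number amount country` on one line), the function name `extract_line_items`, and the sample rows in the usage note are all my assumptions based on the output you posted, so the pattern will almost certainly need adjusting to your real invoices. It is plain standalone Python and would still need to be wired to the Code node's input and output items.

```python
import re

# English month names, indexed by month number - 1.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

# ASSUMPTION: each invoice line looks like
#   "13/05/2025 LONR009656111 25.53 UK"
# This layout is hypothetical; adapt the pattern to the real PDF text.
LINE_PATTERN = re.compile(
    r"(?P<date>\d{2}/\d{2}/\d{4})\s+"    # date as dd/mm/yyyy
    r"(?P<invoice_no>[A-Z]{4}\d+)\s+"    # four letters followed by digits
    r"(?P<amount>\d+\.\d{2})\s+"         # amount with two decimals
    r"(?P<country>[A-Z]{2})\b"           # two-letter country code
)


def extract_line_items(pages):
    """Return one dict per invoice line found across all page texts."""
    rows = []
    for page_text in pages:
        # finditer yields every matching line on the page, not just the first.
        for match in LINE_PATTERN.finditer(page_text):
            month_number = int(match["date"][3:5])
            rows.append({
                "Month": MONTHS[month_number - 1],
                "Date": match["date"],
                "Invoice No": match["invoice_no"],
                "Amount": float(match["amount"]),
                "COUNTRY": match["country"],
            })
    return rows
```

For example, calling `extract_line_items` with two page strings that contain three matching lines between them returns a list of three dicts, and each dict can then become one row in the Excel append step. Lines that do not match the pattern (headers, totals, footers) are skipped silently, so it is worth comparing the row count against the PDF while you tune the pattern.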

system · October 1, 2025, 9:53am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.