Extract From File node - How does it separate content by pages?

Hello guys,

I’m using the Extract from File node in n8n to extract PDF content. My PDF has 3 pages, but the content all comes back as a single string. How can I get the PDF content page by page?

Thank you.

Could you kindly share your workflow to provide additional context?

Hello, I just have an Extract from File node which extracts data from the PDF binary file. One thing I noticed is that every file I tried returned a \n\n between the pages. I was wondering whether this is the page separator n8n uses.

True, the \n\n (double newline) is typically the page separator when n8n extracts text from PDFs.

To split the content by pages, add a Code node after your Extract from File node with this:

const pdfText = $input.first().json.data; // adjust field name based on your output

// Split by double newline (page separator)
const pages = pdfText.split('\n\n');

// Return each page as a separate item
return pages.map((content, index) => ({
  json: {
    pageNumber: index + 1,
    content: content.trim()
  }
}));

If \n\n doesn’t work consistently (for example, if it also shows up between paragraphs within a page), you can also try splitting on the form feed character (\f), which some PDF text extractors emit between pages:

const pages = pdfText.split('\f');

This will give you each page as a separate item that you can process individually downstream.
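
If you are not sure which of the two separators your files contain, the two snippets above can be combined. This is only a minimal sketch: splitPdfPages is a hypothetical helper name, not an n8n function, and it assumes the extracted text is one string that uses either \f or \n\n between pages. It prefers \f when present and falls back to \n\n otherwise.

```javascript
// Hypothetical helper (not part of n8n): splits extracted PDF text into pages.
// Assumption: pages are separated by a form feed ('\f') or, failing that,
// by a double newline ('\n\n').
function splitPdfPages(pdfText) {
  // Guard against a missing or non-string field.
  if (typeof pdfText !== 'string' || pdfText.length === 0) {
    return [];
  }

  // Prefer the form feed when the text contains one.
  const separator = pdfText.includes('\f') ? '\f' : '\n\n';

  return pdfText
    .split(separator)
    // Number pages by their position in the split result.
    .map((content, index) => ({
      json: { pageNumber: index + 1, content: content.trim() },
    }))
    // Drop empty chunks (e.g. after a trailing separator) without renumbering.
    .filter((item) => item.json.content.length > 0);
}
```

In a Code node you would paste the function and end with return splitPdfPages($input.first().json.data); adjusting the field name to match your output, as before.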

Let me know if you need help adjusting this to your workflow.