Data Extraction/Document Processing in n8n – Easy 3-Step Setup [n8n + easybits]

Hi Community,

I’m Felix, and I’ve done a lot of research on data extraction because I wanted to build a simple workflow to parse data from PDF documents. After testing multiple tools that were too complex to set up and maintain, my team and I at easybits came up with what we believe is the easiest solution for data extraction, and it’s fully GDPR compliant as well.

Here’s how to connect easybits to n8n for automated document data extraction using the HTTP Request node:

Step 1: Get Your Credentials from easybits

Before configuring n8n, you need your Pipeline ID and API Key from the easybits Extractor app.

  1. Log in to easybits Extractor at https://extractor.easybits.tech/

  2. Create a pipeline by uploading an example document and mapping the fields you want to extract. For more details, visit our Quick Start Guide here.

  3. Once you’ve finalized your pipeline, go back to your dashboard and click Pipelines in the left sidebar.

  4. Click View Pipeline on the pipeline you want to connect.

  5. On the Pipeline Details page, you will find:

    • API URL: https://extractor.dev.easybits.tech/api/pipelines/[YOUR_PIPELINE_ID]

    • API Key: Your unique authentication token

  6. Copy both values. You will need them in the next step.

Important: Each pipeline has its own API Key and Pipeline ID. If you have multiple pipelines (for example, one for invoices and one for IDs), you will need separate credentials for each.
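If you call several pipelines from your own scripts as well, a small lookup table keeps each pipeline's ID and key together. This is only a minimal sketch: the names `PipelineCredentials`, `PIPELINES`, and `credentials_for` are made up for illustration, and the IDs and keys are placeholders, not real values.

```python
from dataclasses import dataclass

# Base URL as shown on the Pipeline Details page above.
API_BASE = "https://extractor.dev.easybits.tech/api/pipelines"


@dataclass(frozen=True)
class PipelineCredentials:
    """One pipeline's ID and API key (hypothetical helper type)."""
    pipeline_id: str
    api_key: str

    @property
    def api_url(self) -> str:
        # The API URL is the base URL followed by the pipeline ID.
        return f"{API_BASE}/{self.pipeline_id}"


# Placeholder values; replace them with the ones from your dashboard.
PIPELINES = {
    "invoices": PipelineCredentials("INVOICE_PIPELINE_ID", "INVOICE_API_KEY"),
    "ids": PipelineCredentials("ID_PIPELINE_ID", "ID_API_KEY"),
}


def credentials_for(document_type: str) -> PipelineCredentials:
    """Return the credentials configured for a document type."""
    try:
        return PIPELINES[document_type]
    except KeyError:
        raise ValueError(f"No pipeline configured for {document_type!r}") from None
```

In practice you would load the keys from environment variables or a secrets store rather than hard-coding them.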

Step 2: Create Credentials in n8n

  1. In n8n, go to Settings > Credentials

  2. Click Add Credential

  3. Search for Header Auth

  4. Configure:

    • Name: easybits - [Pipeline Name] (for example: “easybits - Invoices Pipeline”)

    • Header Name: Authorization

    • Header Value: Bearer [paste your API Key here]

  5. Click Save
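A common slip in this step is pasting the key with stray whitespace or with the Bearer prefix already included. The sketch below shows the format the Header Value should end up in; `header_auth_value` is a hypothetical helper name, not part of n8n or easybits.

```python
def header_auth_value(pasted: str) -> str:
    """Normalize a pasted API key into the 'Bearer <key>' header value."""
    key = pasted.strip()
    lowered = key.lower()
    # Drop a prefix the user may already have copied, so it is not doubled.
    if lowered == "bearer" or lowered.startswith("bearer "):
        key = key[len("bearer"):].strip()
    if not key:
        raise ValueError("API key is empty")
    return f"Bearer {key}"
```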

Step 3: Configure the HTTP Request Node

Add an HTTP Request node with these settings:

| Setting | Value |
| --- | --- |
| Method | POST |
| URL | https://extractor.dev.easybits.tech/api/pipelines/[YOUR_PIPELINE_ID] |
| Authentication | Generic Credential Type |
| Generic Auth Type | Header Auth |
| Credential | Select your easybits credential |
| Send Body | ON |
| Body Content Type | JSON |
| Specify Body | Using JSON |

Request body:

{
   "files": [
      "https://example.com/your-document.pdf"
   ]
}
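If you want to test the same call outside n8n first, it can be sketched with Python's standard library. The function names `build_extraction_request` and `send` are illustrative assumptions, and since the response format is not covered in this post, `send` simply returns the raw response text.

```python
import json
import urllib.request

# Base URL as shown on the Pipeline Details page.
API_BASE = "https://extractor.dev.easybits.tech/api/pipelines"


def build_extraction_request(pipeline_id, api_key, files):
    """Build the POST request that the HTTP Request node is configured to send."""
    body = json.dumps({"files": list(files)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{pipeline_id}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=30.0):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would look like `send(build_extraction_request("YOUR_PIPELINE_ID", "YOUR_API_KEY", ["https://example.com/your-document.pdf"]))`, with your own pipeline ID and key filled in.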

Using Base64 Encoded Files

If you receive files as binary data (for example, email attachments), send each file as a Base64-encoded data URI instead of a URL:

{
   "files": [
      "data:application/pdf;base64,JVBERi0xLjQK..."
   ]
}
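The data URI in the body above is just the MIME type plus the Base64-encoded file bytes. A minimal sketch of that conversion, where `to_data_uri` is a hypothetical helper name:

```python
import base64


def to_data_uri(content: bytes, mime_type: str = "application/pdf") -> str:
    """Encode raw file bytes as a 'data:<mime>;base64,<payload>' string."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first bytes of a PDF file produce the prefix seen in the example body.
print(to_data_uri(b"%PDF-1.4\n"))  # data:application/pdf;base64,JVBERi0xLjQK
```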

Supported File Types

  • PDF (.pdf)

  • PNG (.png)

  • JPEG (.jpg, .jpeg)
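When files arrive from mixed sources, it can help to check the extension before sending anything. Here is a small sketch that maps the supported extensions to their standard MIME types; `SUPPORTED_MIME_TYPES` and `mime_type_for` are illustrative names, and checking by extension alone is an assumption that may not fit every workflow.

```python
from pathlib import Path

# Standard MIME types for the file types listed above.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def mime_type_for(filename: str) -> str:
    """Return the MIME type for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {filename!r}") from None
```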

If you have any questions or feedback, feel free to drop a comment – I’m happy to help!

Nice! How does this work when the vendor changes their invoice format though? Do you have to redo the pipeline mapping each time?

I’ve been working on something similar but went a different route - no pipelines, just schema-based extraction. Works pretty well for dealing with a bunch of different vendors but setup takes a bit longer.

Does yours handle multi-page invoices okay? That was annoying to figure out on my end.

Hey @Truong, so when the invoice format changes, you won’t need to redo the mapping in easybits. As long as the data points you’ve mapped in your pipeline remain the same, our solution will still find them even if the layout changes.

And regarding multi-page invoices: yes, our solution handles those as well without any further setup needed.

If you have further questions, feel free to let me know.