Extract PDF content using Gemini

Problem/question

What node can I use to ask Gemini to extract text from a PDF? I tried the Basic LLM Chain with a binary image prompt, but it does not like PDFs (it seems biased towards images).

Error message

Bad request - please check your parameters

[GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-latest:generateContent: [400 Bad Request] Provided image is not valid.

Information on your n8n setup

  • n8n version: 1.75.2
  • Database (default: SQLite): Postgres
  • n8n EXECUTIONS_PROCESS setting (default: own, main): ?
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker
  • Operating system: Linux

The quick-and-dirty workaround (please tell me there is a way to do this with nodes… it’s so wrong):

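// Runs in a Code node; external modules must be allowed on the instance
// (e.g. NODE_FUNCTION_ALLOW_EXTERNAL=@google/generative-ai).
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI("APIKEY"); // placeholder: put your API key here
const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro-latest" });

// Send the PDF as inline base64 data together with the text prompt.
const result = await model.generateContent([
    {
        inlineData: {
            // Base64 content of the binary property named "data"
            // (assumes n8n keeps binary data in memory).
            data: $input.first().binary.data.data,
            mimeType: "application/pdf",
        },
    },
    "Extract content from this document",
]);

// Return a single n8n item with the extracted text.
return [{ json: { response: result.response.text() } }];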
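
For anyone who would rather not load the SDK inside a Code node, the same call can probably be made against the REST endpoint that appears in the error message above, for example from an HTTP Request node. The following is only a sketch of the request shape: the function name `buildPdfExtractionRequest` and its parameters are made up for illustration, and the API key is deliberately left out (it would need to be supplied separately, e.g. in an `x-goog-api-key` header).

```javascript
// Sketch: build a generateContent REST request that carries a PDF inline.
// buildPdfExtractionRequest is a hypothetical helper, not part of n8n or the Google SDK.
function buildPdfExtractionRequest(base64Pdf, prompt, model = "gemini-1.5-pro-latest") {
  if (typeof base64Pdf !== "string" || base64Pdf.length === 0) {
    throw new Error("Expected the PDF as a non-empty base64 string");
  }
  return {
    method: "POST",
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    headers: { "Content-Type": "application/json" },
    body: {
      contents: [
        {
          parts: [
            // The PDF goes in as inline data with an explicit PDF MIME type.
            { inline_data: { mime_type: "application/pdf", data: base64Pdf } },
            { text: prompt },
          ],
        },
      ],
    },
  };
}

// Example with a tiny stand-in payload instead of a real PDF.
const request = buildPdfExtractionRequest(
  Buffer.from("%PDF-1.4").toString("base64"),
  "Extract content from this document"
);
console.log(JSON.stringify(request.body, null, 2));
```

The point of the sketch is the explicit `application/pdf` MIME type on the inline part: the chain's error ("Provided image is not valid") suggests the file was being sent as an image instead.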

It’s a bummer that the basic AI chain only supports images as input. I was really excited to try Gemini 2.0 Flash for extracting/OCRing PDFs into structured output until I found that n8n can only upload images to an LLM.

@herskind_uk I built a custom n8n node that does just this. Here’s also a YouTube demo of the functionality.