Problem/question
What node can I use to ask Gemini to extract text from a PDF? I tried the Basic LLM Chain with a binary image prompt, but it doesn't like PDFs (it seems biased towards images).
Error message
Bad request - please check your parameters
[GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-latest:generateContent: [400 Bad Request] Provided image is not valid.
Information on your n8n setup
- n8n version: 1.75.2
- Database (default: SQLite): Postgres
- n8n EXECUTIONS_PROCESS setting (default: own, main): ?
- Running n8n via (Docker, npm, n8n cloud, desktop app): Docker
- Operating system: Linux
The quick-and-dirty workaround (please tell me there is a way to do this with nodes… it’s so wrong):
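// Note: require() of an external package in the Code node needs
// NODE_FUNCTION_ALLOW_EXTERNAL=@google/generative-ai set on the n8n instance.
const { GoogleGenerativeAI } = require("@google/generative-ai");

// "APIKEY" is a placeholder for your Gemini API key.
const genAI = new GoogleGenerativeAI("APIKEY");
const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro-latest" });

const result = await model.generateContent([
  {
    inlineData: {
      // PDF from the incoming item's binary property "data", expected to be Base64
      data: $input.first().binary.data.data,
      mimeType: "application/pdf",
    },
  },
  "Extract content from this document",
]);

// Return the extracted text as a single n8n item
return [{ json: { response: result.response.text() } }];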
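For anyone who would rather avoid the SDK, here is a minimal sketch of the request body that the same generateContent endpoint (the URL from the error message above) accepts over plain REST. The function name buildGeminiPdfRequest and the sample buffer are made up for illustration; the assumption is that you POST this JSON with your API key, for example from an HTTP Request node, which I have not verified end to end in n8n.

```javascript
// Sketch: build the JSON body for a Gemini generateContent REST call with an inline PDF.
// buildGeminiPdfRequest is a hypothetical helper, not part of n8n or the Google SDK.
function buildGeminiPdfRequest(pdfBuffer, prompt = "Extract content from this document") {
  if (!Buffer.isBuffer(pdfBuffer) || pdfBuffer.length === 0) {
    throw new TypeError("pdfBuffer must be a non-empty Buffer");
  }
  return {
    contents: [
      {
        parts: [
          // The PDF travels as Base64 with an explicit MIME type
          {
            inline_data: {
              mime_type: "application/pdf",
              data: pdfBuffer.toString("base64"),
            },
          },
          // The instruction for the model follows the document part
          { text: prompt },
        ],
      },
    ],
  };
}

// Example with a stand-in buffer (a real call would pass the actual PDF bytes)
const body = buildGeminiPdfRequest(Buffer.from("%PDF-1.4 example"));
console.log(JSON.stringify(body, null, 2));
```

The point of the explicit mime_type is that the document is declared as application/pdf rather than an image type, which seems to be what the Basic LLM Chain gets wrong here.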
It’s a bummer that the Basic LLM Chain only supports images as input. I was really excited to try Gemini Flash 2 for extracting/OCR-ing PDFs into structured outputs, until I found that n8n can only upload images to an LLM.
@herskind_uk I built a custom n8n node that does just this. Here’s also a YouTube demo of the functionality.