I’m trying to extract text from PDF files using OCR. I installed the n8n-nodes-tesseractjs community node. When I run it, I get:
Problem in node ‘Tesseract’
The API version “5.4.54” does not match the Worker version “5.3.31”.
If I restart my Docker containers with docker compose down && docker compose up -d, the Tesseract node works, but the built-in node Extract from File now fails:
Problem in node ‘Extract from File’
The API version “5.3.31” does not match the Worker version “5.4.54”.
I did some investigating, and here is what I found:
The API version “5.4.54” does not match the Worker version “5.3.31” error is thrown by PDF.js (pdfjs-dist) when the main PDF API bundle and the web worker are different versions.
This is because the Tesseract community node recently added PDF handling and started shipping/depending on pdfjs-dist 5.4.54 (the version reported in the first error), while n8n’s built-in Extract from File node is still using pdfjs-dist 5.3.31.
That mismatch is exactly what triggers “API version … vs Worker version …”.
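To illustrate, here is a minimal sketch of the kind of check that produces this message. It is a simplified model, not the actual PDF.js source: the function names assertSameVersion and checkVersions are made up for this example, and I am assuming the comparison is an exact string match, which fits the fact that two 5.x versions are rejected.

```typescript
// Simplified model of the API/worker version check; NOT PDF.js source code.
// `assertSameVersion` and `checkVersions` are hypothetical names for illustration.
function assertSameVersion(apiVersion: string, workerVersion: string): void {
  // Assumption: exact string equality, so 5.4.54 vs 5.3.31 fails
  // even though both share the same major version.
  if (apiVersion !== workerVersion) {
    throw new Error(
      `The API version "${apiVersion}" does not match the Worker version "${workerVersion}".`
    );
  }
}

// Returns the error message, or null when the two versions agree.
function checkVersions(apiVersion: string, workerVersion: string): string | null {
  try {
    assertSameVersion(apiVersion, workerVersion);
    return null;
  } catch (err) {
    return (err as Error).message;
  }
}

// The two combinations from the errors above, plus a matching pair.
console.log(checkVersions("5.4.54", "5.3.31"));
console.log(checkVersions("5.3.31", "5.4.54"));
console.log(checkVersions("5.3.31", "5.3.31"));
```

Under this model, any process in which an API bundle from one pdfjs-dist copy ends up talking to a worker from another copy fails, regardless of which node triggered the load first.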
Fastest, stable fix (what worked for me)
Roll back the Tesseract community node to version 1.3.0 (n8n-nodes-tesseractjs@1.3.0), since that version doesn’t use pdfjs-dist.
Long term, this should be solved once n8n core upgrades its pdfjs-dist to the same version that the Tesseract node now uses. Both are already on 5.x, so matching the major version alone is not enough. Once the versions are identical, the mismatch disappears.
That would actually be a great improvement, because the Tesseract node could then handle both images and PDFs directly, removing the need for a separate Extract from File step and simplifying OCR workflows.
Until then, rolling back to n8n-nodes-tesseractjs@1.3.0 is the most reliable workaround.