Version mismatch errors with Tesseract node in self-hosted n8n

Describe the problem/error/question

I’m trying to extract text from PDF files using OCR. I installed the n8n-nodes-tesseractjs community node. When I run it, I get:

What is the error message (if any)?

Problem in node ‘Tesseract’
The API version “5.4.54” does not match the Worker version “5.3.31”.

If I restart my Docker containers with docker compose down && docker compose up -d, the Tesseract node works, but the built-in node Extract from File now fails:

Problem in node ‘Extract from File’
The API version “5.3.31” does not match the Worker version “5.4.54”.

Please share your workflow

I’m self-hosting n8n on Windows 11 using this guide:
https://github.com/n8n-io/self-hosted-ai-starter-kit

It seems like a version mismatch between node API and worker versions. How can I fix this so that both nodes run correctly?

Thanks!

Hello, I have the same problem. Did you get it fixed?

Great news!

I did some investigating, and here is what I found:

The API version “5.4.54” does not match the Worker version “5.3.31” error is thrown by PDF.js (pdfjs-dist) when the main PDF API bundle and the web worker are different versions.

This happens because the Tesseract community node recently added PDF handling and started depending on pdfjs-dist@5.4.54, while n8n’s built-in Extract from File node still uses pdfjs-dist@5.3.31.

That mismatch is exactly what triggers “API version … vs Worker version …”.
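
To make the failure mode concrete, here is a minimal sketch of the kind of strict version comparison that produces this message. It is a simplified illustration, not the actual PDF.js source; the function name `assertMatchingVersions` is hypothetical.

```typescript
// Hypothetical illustration of a strict API/worker version check.
// Not the real PDF.js implementation; the name is made up for this example.
function assertMatchingVersions(apiVersion: string, workerVersion: string): void {
  // Any difference at all, even in the minor or patch part, is rejected.
  if (apiVersion !== workerVersion) {
    throw new Error(
      `The API version "${apiVersion}" does not match the Worker version "${workerVersion}".`
    );
  }
}
```

With a check like this, whichever node loads its copy of the worker first "wins", and the other node's API bundle is then rejected. That would explain why the error flips between the two nodes after a restart.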

Fastest, stable fix (what worked for me)

Roll back the Tesseract community node to version 1.3.0 (n8n-nodes-tesseractjs@1.3.0), since that release doesn’t use pdfjs-dist.

Long term, this should be solved once n8n core upgrades its pdfjs-dist to the same version that the Tesseract node now uses. Both copies are already on 5.x, so matching the major version alone is not enough. Once the versions are identical, the mismatch disappears.

That would actually be a great improvement, because the Tesseract node could then handle both images and PDFs directly, removing the need for a separate Extract from File step and simplifying OCR workflows.

Until then, rolling back to n8n-nodes-tesseractjs@1.3.0 is the most reliable workaround.
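
If you want to confirm which pdfjs-dist copies are installed before or after the rollback, a small script along these lines may help. This is only a sketch: `findPdfjsVersions` is a hypothetical helper, and the directory you pass in (for example the n8n install directory or the community nodes folder inside the container) is an assumption that depends on your setup.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: recursively collect the version of every
// pdfjs-dist copy found under `root`, keyed by the package directory.
function findPdfjsVersions(root: string): Map<string, string> {
  const found = new Map<string, string>();

  const walk = (dir: string): void => {
    let entries: fs.Dirent[];
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch {
      return; // missing or unreadable directory: skip it
    }
    for (const entry of entries) {
      // Symlinks are not reported as directories here, so they are skipped.
      if (!entry.isDirectory()) continue;
      const full = path.join(dir, entry.name);
      if (entry.name === "pdfjs-dist") {
        try {
          const raw = fs.readFileSync(path.join(full, "package.json"), "utf8");
          found.set(full, String(JSON.parse(raw).version));
        } catch {
          // no readable package.json: ignore this directory
        }
        continue; // do not descend into the package itself
      }
      walk(full);
    }
  };

  walk(root);
  return found;
}
```

Calling `findPdfjsVersions` on the relevant directory and printing the map shows every copy and its version. More than one distinct version in the result suggests the conflict is still present.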

Cheers,

Yes, this would really help with text recognition without complicating the workflows!

Please file a feature request (FR) with n8n for this if a core package needs to be updated.
