How can I transcribe large MP3/MP4 files in n8n

Zyna_Pasa · April 7, 2025, 7:53am

Hi everyone,

I’m trying to build an automated transcription workflow in n8n for large MP3 and MP4 files. I came across the ElevenLabs API, but it’s a bit expensive for my use case.

I’m looking for low-cost or free alternatives, ideally something like Whisper AI. I noticed that Hugging Face hosts Whisper models, and I tried to use them within n8n but couldn’t get it to work properly — either due to authentication issues or input size limits.

Does anyone here have experience with:

Running Whisper AI (via Hugging Face or another service) inside n8n?
Efficiently transcribing large files (over 100MB)?
Any tips for hosting Whisper locally or using an API that integrates well with n8n?

My ideal setup would:

Keep costs minimal
Handle large file sizes
Work smoothly with n8n’s workflow logic

Any working examples, custom node tips, or general guidance would be greatly appreciated!

Thanks in advance

Michael_G_Jackson · April 16, 2026, 11:49am

Hey!

Had this exact problem — large video file that just wouldn’t compress enough for Whisper.

The real issue is that the 25MB upload limit on OpenAI’s hosted Whisper API (the limit is on the API, not the model itself) forces you into a messy chain of converting, compressing, splitting, and stitching, and it still breaks on bigger files.
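
To put rough numbers on that limit, here is a back-of-the-envelope sketch. It assumes constant-bitrate audio and reads 25MB as 25 * 1024 * 1024 bytes; both are assumptions, and the function names are made up for illustration.

```python
import math

# Assumed reading of the 25MB upload limit, in bytes.
LIMIT_BYTES = 25 * 1024 * 1024


def max_minutes_under_limit(bitrate_kbps: int, limit_bytes: int = LIMIT_BYTES) -> float:
    """Longest constant-bitrate audio (in minutes) that fits under the limit."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return limit_bytes / bytes_per_second / 60


def chunks_needed(file_bytes: int, limit_bytes: int = LIMIT_BYTES) -> int:
    """Minimum number of pieces if a file is split purely by size."""
    return math.ceil(file_bytes / limit_bytes)


if __name__ == "__main__":
    for kbps in (32, 64, 128):
        print(f"{kbps} kbps: about {max_minutes_under_limit(kbps):.1f} min per upload")
    print("pieces for a 100MB file:", chunks_needed(100 * 1024 * 1024))
```

Under these assumptions, 64 kbps audio fits a bit under an hour per upload, so a multi-hour recording still has to be split unless you drop the bitrate a lot further.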

What fixed it for me was using a transcription API that takes the video URL directly. No downloading, no compression, no splitting at all. It processes the full video server-side and returns a clean transcript with speaker labels and timestamps.

Docs here: https://wayin.ai/api-docs/video-transcription/

Happy to share the simple n8n workflow JSON if it helps!

dima_automation · April 16, 2026, 12:24pm

trying to split a 100mb+ audio file into 25mb chunks inside n8n is a one-way ticket to out-of-memory crashes. the guy above has the right idea about using URLs, but you don’t need an expensive custom api for it.

the standard backend pattern for this is to stop passing massive binary payloads through n8n completely. just have n8n upload the raw file to a cheap bucket (like cloudflare r2 or aws s3), generate a temporary pre-signed url, and pass that url to a serverless gpu provider like replicate or fal.ai.

they host the exact same open-source whisper-large-v3 models, they accept direct urls to bypass standard file size limits, and you only pay for the literal seconds of gpu compute time (usually pennies per hour of audio). n8n just waits for the webhook back when the transcription is done.
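
a rough python sketch of the two steps after the upload, in case it helps. assumptions: `s3_client` is a boto3 S3 client pointed at your bucket (r2 exposes an s3-compatible api), and the endpoint plus the `audio_url` / `webhook` field names are hypothetical placeholders, not any provider's real schema, so check their docs for the actual request format.

```python
import json
import urllib.request


def presign_download_url(s3_client, bucket: str, key: str, expires_s: int = 3600) -> str:
    """Return a temporary GET url for an object already uploaded to the bucket.

    s3_client is assumed to be a boto3 S3 client.
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_s,
    )


def build_transcription_request(endpoint: str, api_token: str,
                                audio_url: str, webhook_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST asking a provider to transcribe audio_url.

    The endpoint and JSON field names are hypothetical placeholders.
    """
    body = json.dumps({"audio_url": audio_url, "webhook": webhook_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )
```

the point of the design is that only a short url string ever travels through the workflow, never the file bytes. sending the request is one `urllib.request.urlopen(req)` call, or inside n8n the same thing can be done with an HTTP Request node plus a Webhook node to catch the callback.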
