Voice messages transcribe fine — but uploaded audio files don’t (even with corrected MIME)

Konstantin_Dar · June 18, 2025, 2:54pm

Hi everyone,

I’m using n8n.cloud to build a Telegram → Whisper (OpenAI) transcription flow.

Here’s the weird part:

When I send a voice message in Telegram, everything works fine — Whisper transcribes it without issues.
But when I send an audio file (e.g., from a recorder app), transcription fails in the OpenAI node.

I’ve already:

Used Telegram → Get File with Download enabled and Binary Property = data
Inserted a Set node (or Function) to correct the MIME type to audio/mpeg
Verified that the binary output looks OK — fileName, fileExtension, and mimeType are all there
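
For reference, here is a minimal sketch of what a MIME-correction step like this could look like if it derived the type from the file name instead of hard-coding audio/mpeg. It is written as plain Python rather than as an actual n8n node. The field names fileName, fileExtension, and mimeType come from the binary output described above; the function name `normalize_audio_metadata` and the `EXT_TO_MIME` table are hypothetical and only illustrate the idea. A recorder app may well produce .m4a or .wav rather than .mp3, in which case forcing audio/mpeg would leave the metadata disagreeing with the actual file contents.

```python
import os

# Hypothetical extension -> MIME table (an assumption for illustration;
# extend it to cover whatever formats your recorder app produces).
EXT_TO_MIME = {
    "mp3": "audio/mpeg",
    "m4a": "audio/mp4",
    "wav": "audio/wav",
    "ogg": "audio/ogg",
    "oga": "audio/ogg",
    "webm": "audio/webm",
    "flac": "audio/flac",
}


def normalize_audio_metadata(entry: dict) -> dict:
    """Return a copy of a binary-metadata dict with mimeType and
    fileExtension derived from fileName.

    Unknown or missing extensions are left untouched, so nothing is
    guessed for files the table does not cover.
    """
    fixed = dict(entry)  # shallow copy; the input dict is not mutated
    name = fixed.get("fileName") or ""
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    if ext in EXT_TO_MIME:
        fixed["fileExtension"] = ext
        fixed["mimeType"] = EXT_TO_MIME[ext]
    return fixed


if __name__ == "__main__":
    sample = {"fileName": "Recording 12.M4A", "mimeType": "application/octet-stream"}
    print(normalize_audio_metadata(sample))
```

Note that a step like this only changes the metadata; it does not touch the binary payload itself, so it cannot help if the payload never reaches the OpenAI node under the expected property name.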

Still, the OpenAI node fails with one of two errors: either “Missing binary data” or a vague 400 Bad Request.

Is there a hidden difference between how Telegram treats voice messages vs uploaded audio files?

Do I need to handle them separately somehow — or trick Telegram/OpenAI with additional magic?
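
For context on the "handle them separately" idea: in the Telegram Bot API, a voice note arrives in the message's `voice` field (an OGG file encoded with Opus), while an uploaded file arrives in `audio` or, for some clients and formats, in `document`. Each of these objects carries its own `file_id` and an optional `mime_type`. The sketch below shows one way to pick whichever is present; the helper name `pick_audio_source` is hypothetical, and this is plain Python operating on a message dict, not n8n's own node logic.

```python
from typing import Optional, Tuple


def pick_audio_source(message: dict) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (kind, file_id, mime_type) for the first audio-like
    attachment found in a Telegram message dict, or None if there is none.

    Checks `voice`, then `audio`, then `document`, in that order.
    """
    for kind in ("voice", "audio", "document"):
        attachment = message.get(kind)
        if attachment and "file_id" in attachment:
            return kind, attachment["file_id"], attachment.get("mime_type")
    return None


if __name__ == "__main__":
    voice_msg = {"voice": {"file_id": "abc", "mime_type": "audio/ogg"}}
    file_msg = {"audio": {"file_id": "def", "mime_type": "audio/mp4"}}
    print(pick_audio_source(voice_msg))
    print(pick_audio_source(file_msg))
```

If a flow only ever reads the `voice` field to get the file ID, an uploaded file would yield nothing to download, which would be consistent with a "Missing binary data" error further down the chain.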