Hi everyone,
I’m using n8n.cloud to build a Telegram → Whisper (OpenAI) transcription flow.
Here’s the weird part:
When I send a voice message in Telegram, everything works fine — Whisper transcribes it without issues.
But when I send an audio file (e.g., from a recorder app), transcription fails in the OpenAI node.
I’ve already:
- Used Telegram → Get File with Download enabled and Binary Property = data
- Inserted a Set node (or a Function node) to set the MIME type to audio/mpeg
- Verified that the binary output looks OK: fileName, fileExtension, and mimeType are all present
Still, the OpenAI node throws an error:
either “Missing binary data” or a vague 400 Bad Request.
Is there a hidden difference between how Telegram treats voice messages vs uploaded audio files?
Do I need to handle them separately somehow — or trick Telegram/OpenAI with additional magic?
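For reference, here is a rough TypeScript sketch of how the two cases could be told apart before the file is downloaded. It assumes the Telegram Bot API field names `voice`, `audio`, and `document` on the incoming message (each carrying `file_id`, and optionally `file_name` and `mime_type`). The helper `pickAudioSource`, the fallback file names, and the extension table are hypothetical names made up for illustration, not part of n8n or OpenAI. The idea is to take the file name and MIME type from what Telegram reports rather than hard-coding audio/mpeg for everything.

```typescript
// Minimal shapes for the Telegram Bot API fields used in this sketch.
interface TelegramFile {
  file_id: string;
  file_name?: string;
  mime_type?: string;
}

interface TelegramMessage {
  voice?: TelegramFile;    // recorded in the Telegram app
  audio?: TelegramFile;    // uploaded as an audio file
  document?: TelegramFile; // uploaded as a generic file
}

interface AudioSource {
  kind: "voice" | "audio" | "document";
  fileId: string;
  fileName: string;
  mimeType: string;
}

// Assumption: a small, incomplete extension-to-MIME table for illustration.
const MIME_BY_EXTENSION: Record<string, string> = {
  mp3: "audio/mpeg",
  m4a: "audio/mp4",
  wav: "audio/wav",
  ogg: "audio/ogg",
  oga: "audio/ogg",
  webm: "audio/webm",
};

// Returns the lowercased extension without the dot, or "" if there is none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
}

// Hypothetical helper: finds the first audio-like attachment on a message and
// normalizes its file name and MIME type. Returns null if there is none.
function pickAudioSource(message: TelegramMessage): AudioSource | null {
  const candidates: Array<[AudioSource["kind"], TelegramFile | undefined]> = [
    ["voice", message.voice],
    ["audio", message.audio],
    ["document", message.document],
  ];
  for (const [kind, file] of candidates) {
    if (!file) continue;
    // Voice messages carry no file name, so a placeholder name is assumed.
    const fileName = file.file_name ?? (kind === "voice" ? "voice.ogg" : "audio");
    // Prefer the MIME type Telegram reports, then fall back to the extension.
    const mimeType =
      file.mime_type ??
      MIME_BY_EXTENSION[extensionOf(fileName)] ??
      "application/octet-stream";
    return { kind, fileId: file.file_id, fileName, mimeType };
  }
  return null;
}
```

In an n8n workflow, the same branching could probably be done with an IF or Switch node on the trigger output before Get File, so that each branch passes the right file ID, file name, and MIME type to the OpenAI node.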