Hello everyone,
I’m working on a workflow to automate the transcription of YouTube audio using OpenAI Whisper, and I’ve run into some persistent issues on my self-hosted n8n instance. I would greatly appreciate any insights or help.
My Setup:
- n8n Version: `1.91.3` (upgraded from 1.91.2 during troubleshooting)
- Installation: Docker Compose, building the image from a `Dockerfile` that installs Python 3, pip, ffmpeg, and yt-dlp on top of the `n8nio/n8n:1.91.3` base image.
- Host OS: Linux (personal PC)
- Python Scripts: I have two scripts run via `Execute Command` nodes:
  - `descargar_audio_youtube.py`: uses `yt-dlp` to download audio from a YouTube video (with `--restrict-filenames`) and returns the path to the local `.webm` file.
  - `convertir_audio.py`: uses `ffmpeg` to convert the `.webm` to `.mp3` (96 kbps, mono, 16 kHz), verifies the size is <25 MB, and the idea is then to pass this audio to Whisper.
- Docker Volumes:
  - `/home/kplok3/n8n_scripts:/opt/n8n_scripts_acceso` (for the scripts)
  - `n8n_data:/home/node/.n8n` (for n8n persistent data)
- Files are downloaded and converted within `/opt/n8n_scripts_acceso/descargados/` (inside the container).
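For reference, the conversion script is essentially the following. This is a simplified sketch, not the exact script; the helper names are illustrative and the real version has more error handling:

```python
import os
import subprocess

MAX_BYTES = 25 * 1024 * 1024  # Whisper's 25 MB upload limit

def check_size(path, limit=MAX_BYTES):
    """Return True if the file at `path` fits under the size limit."""
    return os.path.getsize(path) < limit

def convert_to_mp3(webm_path):
    """Convert a .webm download to a 96 kbps mono 16 kHz .mp3 via ffmpeg."""
    mp3_path = os.path.splitext(webm_path)[0] + ".mp3"
    subprocess.run(
        ["ffmpeg", "-y", "-i", webm_path,
         "-b:a", "96k", "-ac", "1", "-ar", "16000", mp3_path],
        check=True, capture_output=True,
    )
    if not check_size(mp3_path):
        raise RuntimeError("MP3 exceeds the 25 MB Whisper limit")
    return mp3_path

# In the real script, the WebM path arrives on stdin and the MP3 path is
# printed to stdout, e.g.:
#   print(convert_to_mp3(sys.stdin.read().strip()))
```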
The Goal:
The basic workflow is: Google Sheets (Trigger with Video ID)
→ Execute Command (Download WebM)
→ Execute Command (Convert to MP3)
→ Read MP3 File → OpenAI (Whisper to transcribe)
The Problems I’m Facing:
We’ve tried several strategies, and the main issues are:
`Read/Write Files from Disk` node doesn’t return binary file content:
- When I configure this node to read the path of the locally generated MP3 file (e.g., `/opt/n8n_scripts_acceso/descargados/my_audio.mp3`) with the “Put Output File in Field” option unchecked (expecting the standard `binary.data` output), the node only returns file metadata (mimeType, fileName, fileSize, etc.), but not the `binary` object with the `data` property.
- This happens consistently, even after upgrading n8n from 1.91.2 to 1.91.3. File permissions for the `node` user inside the container are correct (the `node` user owns the file and has read permissions).
- The n8n UI does show a download button when inspecting this node’s output, suggesting n8n can access the content internally, but it’s not exposing it to the workflow.
- We also tried configuring the node to read a text file (a `.b64` file containing the Base64 audio content) and put the content into a JSON field using “Put Output File in Field” = `base64Content`. The result was the same: only metadata, and the `base64Content` field did not appear in the output.
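For context, the `.b64` file in that test was produced roughly like this (an illustrative sketch; the actual helper in our script differs in details such as paths):

```python
import base64

def write_b64_copy(mp3_path, b64_path):
    """Write a text file containing the Base64 encoding of the MP3 bytes."""
    with open(mp3_path, "rb") as src, open(b64_path, "w") as dst:
        dst.write(base64.b64encode(src.read()).decode("ascii"))
```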
`Execute Command` node and `stdout maxBuffer length exceeded`:
- As a workaround for the above, we tried having the `convertir_audio.py` script print the Base64-encoded MP3 content directly to `stdout`.
- This resulted in the `stdout maxBuffer length exceeded` error.
- We added the environment variable `N8N_EXEC_COMMAND_MAX_OUTPUT_SIZE=52428800` (50 MB) to `docker-compose.yml` and verified with `docker inspect` that the variable was applied to the restarted n8n container.
- We measured the size of the Base64 string the script was trying to print: approximately 20.9 MB.
- Despite 20.9 MB < 50 MB, the `stdout maxBuffer length exceeded` error persisted, suggesting the environment variable is not having the expected effect, or that there’s another, lower limit.
Current Simplified Workflow (tested after updating to 1.91.3, still failing at `Read/Write`):
1. `Google Sheets Trigger` (provides `ID_VIDEO`)
2. `Execute Command` (Download): `printf "%s" {{ $json.ID_VIDEO }} | python3 /opt/n8n_scripts_acceso/descargar_audio_youtube.py`
   - (The Python script downloads the WebM with a safe filename and returns the WebM path on `stdout`.)
3. `Execute Command` (Convert): `printf "%s" {{ $items("Execute Command").first().json.stdout }} | python3 /opt/n8n_scripts_acceso/convertir_audio.py`
   - (The Python script converts the WebM to MP3 at 96 kbps, verifies size <25 MB, and returns the MP3 path on `stdout`.)
4. `Read/Write Files from Disk`:
   - Operation: `Read File(s) From Disk`
   - File(s) Selector: `{{ $items("Execute Command_Convert").first().json.stdout }}` (using the name of the preceding node)
   - “Put Output File in Field”: unchecked.
   - Expected result: output with `binary.data`.
   - Actual result: only metadata, no `binary.data`.
5. `OpenAI`:
   - File/Audio Data: `{{ $items("Read/Write Files from Disk").first().binary.data }}`
   - Current error (or a previous `[object Object]`-type error): fails because it doesn’t receive binary data.
Questions for the Community:
- Has anyone experienced similar issues with `Read/Write Files from Disk` not returning binary content in n8n v1.91.x (specifically 1.91.2 or 1.91.3) under Docker?
- Is there any special configuration or consideration for this node that I might be missing to ensure it includes `binary.data`?
- Regarding `N8N_EXEC_COMMAND_MAX_OUTPUT_SIZE`, is it known not to work as expected in certain versions, or are there other limits that might override it?
- Given that `Read/Write Files from Disk` is failing in my case, does anyone have suggestions for a robust alternative workflow to read a local file (generated by an `Execute Command` node) and pass its binary content to a node like OpenAI Whisper?
I’d be very grateful for any help or ideas. We’ve tried many things and are a bit stuck!